From deadline at eadline.org Mon Dec 1 10:31:29 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] Free Webinar: Cool Crunching: Understanding Green HPC In-Reply-To: <571f1a060811292333q4aa31ab6x5b7995baa4145445@mail.gmail.com> References: <571f1a060811292333q4aa31ab6x5b7995baa4145445@mail.gmail.com> Message-ID: <34891.192.168.1.213.1228156289.squirrel@mail.eadline.org> I'm moderating a webinar called: Cool Crunching: Understanding Green HPC on Wednesday (Dec 3) at 11AM EST. More info and registration: http://linux-mag.com/id/7172 -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From deadline at eadline.org Mon Dec 1 10:58:33 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] cli alternative to cluster top? In-Reply-To: <4932C58C.6020706@scalableinformatics.com> References: <4932C58C.6020706@scalableinformatics.com> Message-ID: <38215.192.168.1.213.1228157913.squirrel@mail.eadline.org> maybe Joe means this: http://www.basement-supercomputing.com/content/view/19/45/ which is being updated for SGE 6.* I basically hacked it together because I could not easily understand qstat. Source is available, be advised code is ugly as it started out as hack while I was debugging SGE parallel environments and test suites. I should have a new version "real soon now" there are also some web based tools that do this for SGE as well (http://xml-qstat.org/) -- Doug > Thomas Vixel wrote: >> I've been googling for a top-like cli tool to use on our cluster, but >> the closest thing that comes up is Rocks' "cluster top" script. That >> could be tweaked to work via the cli, but due to factors beyond my >> control (management) all functionality has to come from a pre-fab >> program rather than a software stack with local, custom modifications. >> >> I'm sure this has come up more than once in the HPC sector as well -- >> could anyone point me to any top-like apps for our cluster? > > We have a ctop we have written a while ago. Depends upon pdsh, though > with a little effort, even that could be removed (albeit being a > somewhat slower program as a result). Our version is Perl based, open > source, and quite a few of our customers do use it. I had looked at > hooking it into wulfstat at some point. > > Doug Eadline has a top he had written (is that correct Doug?) for > clusters some time ago. > >> >> For reference, wulfware/wulfstat was nixed as well because of the >> xmlsysd dependency. > > Sometimes I wonder about the 'logic' underpinning some of the decisions > I hear about. > > ctop could work with plain ssh, though you will need to make sure that > all nodes are able to be reached via passwordless ssh (shouldn't be an > issue for most of todays clusters), and you will need some mechanism to > tell ctop which nodes you wish to include in the list. We have used > /etc/cluster/hosts.cluster in the past to list hostnames/ip addresses of > the cluster nodes. > > Let me know if you have pdsh implemented. BTW: ctop is OSS (GPLv2). > It should be available on our download site as an RPM/source RPM > (http://downloads.scalableinformatics.com). If there is enough interest > in it, I'll put it into our public Mercurial repository as well. 
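
For anyone who wants to see how little is involved, here is a rough sketch of the ssh-only approach described above. It is emphatically not the ctop or SGE tool being discussed, just an illustration: it assumes passwordless ssh to every node and a flat node list (the /etc/cluster/hosts.cluster path mentioned above is used here purely as a placeholder).

  #!/usr/bin/perl
  # Minimal "cluster top" sketch: poll load averages over plain ssh.
  # Assumes passwordless ssh to every node and a one-hostname-per-line
  # node list (the path below is a placeholder -- adjust to your site).
  use strict;
  use warnings;

  my $hostfile = "/etc/cluster/hosts.cluster";   # placeholder path
  open(my $fh, '<', $hostfile) or die "cannot read $hostfile: $!";
  chomp(my @nodes = grep { /\S/ && !/^#/ } <$fh>);
  close($fh);

  while (1) {
      print "\033[2J\033[H";                     # clear screen, home cursor
      printf "%-20s %8s %8s %8s\n", "NODE", "1min", "5min", "15min";
      for my $node (@nodes) {
          # One ssh per node per sweep: simple, not fast; pdsh would fan out.
          my $out = `ssh -o BatchMode=yes $node cat /proc/loadavg 2>/dev/null`;
          if ($out =~ /^(\S+)\s+(\S+)\s+(\S+)/) {
              printf "%-20s %8s %8s %8s\n", $node, $1, $2, $3;
          } else {
              printf "%-20s %8s\n", $node, "DOWN?";
          }
      }
      sleep 5;
  }

A serial sweep like this gets slow beyond a few dozen nodes, which is exactly why ctop leans on pdsh to fan the queries out in parallel.
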
> > > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics LLC, > email: landman@scalableinformatics.com > web : http://www.scalableinformatics.com > http://jackrabbit.scalableinformatics.com > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From deadline at eadline.org Mon Dec 1 11:02:28 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] cli alternative to cluster top? In-Reply-To: <49330EC5.50700@scalableinformatics.com> References: <571f1a060811292333q4aa31ab6x5b7995baa4145445@mail.gmail.com> <20081130182839.GA17239@bx9> <49330EC5.50700@scalableinformatics.com> Message-ID: <54979.192.168.1.213.1228158148.squirrel@mail.eadline.org> > Robert G. Brown wrote: >> On Sun, 30 Nov 2008, Greg Lindahl wrote: >> >>> >>> On Sun, Nov 30, 2008 at 11:45:44AM -0500, Robert G. Brown wrote: >>> >>>> That's fine, but I'm curious. How do you expect to run a cluster >>>> information tool over a network without a socket at both ends? >>> >>> There's always "qstat". The OP didn't really say what sorts of >>> information he was looking for... >> >> :-) Hey, didn't think of that -- an enormous Quake cluster? >> >> Although I didn't realize that qstat worked by electronic telepathy;-) > > to bad we can't use EPR pairs for this ... Well maybe not in this universe ... -- Doug > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics LLC, > email: landman@scalableinformatics.com > web : http://www.scalableinformatics.com > http://jackrabbit.scalableinformatics.com > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From rgb at phy.duke.edu Mon Dec 1 15:33:40 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] cli alternative to cluster top? In-Reply-To: References: <571f1a060811292333q4aa31ab6x5b7995baa4145445@mail.gmail.com> Message-ID: On Mon, 1 Dec 2008, Thomas Vixel wrote: > The main requirements were that 1) "it must look like top", 2) it must > be cli-based, 3) it should not introduce another piece of (server) > software (and thus point of failure) into the system, 4) and it should > not require any local hacks on our part. > > Since this is a web cluster, I suppose the most logical transport > would be HTTP, but everything I've seen so far would require me to > violate (4) to satisfy 1-3. SSH & SNMP are also available, I could see > SNMP being problematic for a project like this. 
SSH *might* work and > *might* be scalable for a project like this since the main expense is > in the construction the connections (and they can be re-used across > the course of the execution), but I have yet to find a top-like > program that leverages it. > > xmlsysd & wulfstat would appear to be the least expensive solution to > this problem, but as I said, management already nixed it. Um, I don't know what you call a "local hack", but it IS called xmlsysd for a reason. You don't have to run it as an actual public daemon. It produces xml, and in inetd mode it reads from stdin and writes to stdout. Try this. Install xmlsysd on something. Then in the command line, enter: ./xmlsysd -i 7880 (the port number isn't important -- you're just telling it to use inetd mode). It will run and nothing will happen. Then WITHOUT BREAKING OUT just enter init send See? I'd think you could write a TRIVIAL PHP or perl wrapper and pop this out through a webserver. The other end is a bit trickier, but I think doable. wulfstat/wulflogger don't currently speak GET, but the parser should still work. If you only wanted a few objects (e.g. load average), you could again write a pretty trivial perl script to get them. Or if you don't need it today (or the next week or two) over break I could probably hack wulfstat OR wulflogger to connect to a web interface and just use GET to get an update, and maybe write a "permanent" CGI wrapper that puts xmlsysd output on a web address when called (in lieu of "send"). rgb > > Honestly, if it weren't for (4), I'd probably just grab the top source > and graft it onto a SSH library. It might not be the most efficient > solution, but for a HA cluster it doesn't necessarily HAVE to be. I'd think the solution above would be a lot easier. > > On 11/30/08, Robert G. Brown wrote: >> On Sat, 29 Nov 2008, Greg Kurtzer wrote: >> >>> Warewulf has a real time top like command for the cluster nodes and >>> has been known to scale up to the thousands of nodes: >>> >>> http://www.runlevelzero.net/images/wwtop-screenshot.png >>> >>> We are just kicking off Warewulf development again now that Perceus >>> has gotten to a critical mass and Caos NSA 1.0 has been released. We >>> should have our repositories for Warewulf-3 pre-releases up shortly >>> but if you need something ASAP, please contact me offline and I will >>> get you what you need. >>> >>> Thanks! >>> Greg >>> >>> On Wed, Nov 26, 2008 at 12:39 PM, Thomas Vixel wrote: >>>> I've been googling for a top-like cli tool to use on our cluster, but >>>> the closest thing that comes up is Rocks' "cluster top" script. That >>>> could be tweaked to work via the cli, but due to factors beyond my >>>> control (management) all functionality has to come from a pre-fab >>>> program rather than a software stack with local, custom modifications. >>>> >>>> I'm sure this has come up more than once in the HPC sector as well -- >>>> could anyone point me to any top-like apps for our cluster? >>>> >>>> For reference, wulfware/wulfstat was nixed as well because of the >>>> xmlsysd dependency. >> >> That's fine, but I'm curious. How do you expect to run a cluster >> information tool over a network without a socket at both ends? If not >> xmlsysd, then something else -- sshd, xinetd, dedicated or general >> purpose, where the latter almost certainly will have have higher >> overhead? Or are you looking for something with a kernel level network >> interface, more like scyld? 
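
Returning to the xmlsysd suggestion above: the "trivial perl wrapper" might look roughly like the sketch below -- drive a local xmlsysd in inetd mode over pipes, issue init and send as described, and pull a single figure out of the XML it returns. The element name matched at the end is a guess, and the script assumes xmlsysd exits when its input closes, as inetd-style services usually do; check both against real xmlsysd output before relying on it.

  #!/usr/bin/perl
  # Sketch of a wrapper around xmlsysd in inetd mode: send init/send on
  # its stdin, slurp the XML from its stdout, grep out a load figure.
  use strict;
  use warnings;
  use IPC::Open2;

  my $pid = open2(my $from_d, my $to_d, './xmlsysd', '-i', '7880');

  print {$to_d} "init\n";
  print {$to_d} "send\n";
  close($to_d);            # assume xmlsysd exits once its input closes

  my $xml = do { local $/; <$from_d> };   # slurp whatever it wrote
  close($from_d);
  waitpid($pid, 0);

  # Placeholder pattern: the real element/attribute names may differ.
  if ($xml =~ /load[^>]*>\s*([\d.]+)/i) {
      print "load average: $1\n";
  } else {
      print $xml;          # fall back to dumping the raw XML
  }

Popping the same output through a web server, as suggested, is then just a matter of wrapping something like this in a small CGI or PHP page instead of printing to a terminal.
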
>> >> rgb >> >>>> _______________________________________________ >>>> Beowulf mailing list, Beowulf@beowulf.org >>>> To change your subscription (digest mode or unsubscribe) visit >>>> http://www.beowulf.org/mailman/listinfo/beowulf >>>> >>> >>> >>> >>> -- >>> Greg Kurtzer >>> http://www.infiscale.com/ >>> http://www.runlevelzero.net/ >>> http://www.perceus.org/ >>> http://www.caoslinux.org/ >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> >> >> Robert G. Brown Phone(cell): 1-919-280-8443 >> Duke University Physics Dept, Box 90305 >> Durham, N.C. 27708-0305 >> Web: http://www.phy.duke.edu/~rgb >> Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php >> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 >> > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From prentice at ias.edu Tue Dec 2 07:24:15 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] InfiniBand VL15 error Message-ID: <4935531F.6040807@ias.edu> I'm getting this error when I run ibchecknet on my cluster: #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 Error check on lid 1 (aurora HCA-1) port 1: FAILED I've googled around this morning, but haven't found anything helpful. Most of the hits turn up code with the phrase "VL15Dropped", but nothing explaining what this error means, what causes it, or how to fix it. After clearing the counters with 'perfquery -r', the VL15Dropped count starts increasing from zero almost immediately. Any ideas what this error represents or how to fix? Could it be a bad cable? -- Prentice From Shainer at mellanox.com Tue Dec 2 11:01:49 2008 From: Shainer at mellanox.com (Gilad Shainer) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] InfiniBand VL15 error Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F017A3A81@mtiexch01.mti.com> I can try to help you here, and would need to understand your setup and on which port the drop is occurring on. Bad cable causing this seems very unlikely. Gilad. -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Prentice Bisbal Sent: Tuesday, December 02, 2008 7:24 AM To: Beowulf Mailing List Subject: [Beowulf] InfiniBand VL15 error I'm getting this error when I run ibchecknet on my cluster: #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 Error check on lid 1 (aurora HCA-1) port 1: FAILED I've googled around this morning, but haven't found anything helpful. Most of the hits turn up code with the phrase "VL15Dropped", but nothing explaining what this error means, what causes it, or how to fix it. After clearing the counters with 'perfquery -r', the VL15Dropped count starts increasing from zero almost immediately. Any ideas what this error represents or how to fix? Could it be a bad cable? -- Prentice _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tvixel at gmail.com Mon Dec 1 13:57:31 2008 From: tvixel at gmail.com (Thomas Vixel) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] cli alternative to cluster top? 
In-Reply-To: References: <571f1a060811292333q4aa31ab6x5b7995baa4145445@mail.gmail.com> Message-ID: The main requirements were that 1) "it must look like top", 2) it must be cli-based, 3) it should not introduce another piece of (server) software (and thus point of failure) into the system, 4) and it should not require any local hacks on our part. Since this is a web cluster, I suppose the most logical transport would be HTTP, but everything I've seen so far would require me to violate (4) to satisfy 1-3. SSH & SNMP are also available, I could see SNMP being problematic for a project like this. SSH *might* work and *might* be scalable for a project like this since the main expense is in the construction the connections (and they can be re-used across the course of the execution), but I have yet to find a top-like program that leverages it. xmlsysd & wulfstat would appear to be the least expensive solution to this problem, but as I said, management already nixed it. Honestly, if it weren't for (4), I'd probably just grab the top source and graft it onto a SSH library. It might not be the most efficient solution, but for a HA cluster it doesn't necessarily HAVE to be. On 11/30/08, Robert G. Brown wrote: > On Sat, 29 Nov 2008, Greg Kurtzer wrote: > >> Warewulf has a real time top like command for the cluster nodes and >> has been known to scale up to the thousands of nodes: >> >> http://www.runlevelzero.net/images/wwtop-screenshot.png >> >> We are just kicking off Warewulf development again now that Perceus >> has gotten to a critical mass and Caos NSA 1.0 has been released. We >> should have our repositories for Warewulf-3 pre-releases up shortly >> but if you need something ASAP, please contact me offline and I will >> get you what you need. >> >> Thanks! >> Greg >> >> On Wed, Nov 26, 2008 at 12:39 PM, Thomas Vixel wrote: >>> I've been googling for a top-like cli tool to use on our cluster, but >>> the closest thing that comes up is Rocks' "cluster top" script. That >>> could be tweaked to work via the cli, but due to factors beyond my >>> control (management) all functionality has to come from a pre-fab >>> program rather than a software stack with local, custom modifications. >>> >>> I'm sure this has come up more than once in the HPC sector as well -- >>> could anyone point me to any top-like apps for our cluster? >>> >>> For reference, wulfware/wulfstat was nixed as well because of the >>> xmlsysd dependency. > > That's fine, but I'm curious. How do you expect to run a cluster > information tool over a network without a socket at both ends? If not > xmlsysd, then something else -- sshd, xinetd, dedicated or general > purpose, where the latter almost certainly will have have higher > overhead? Or are you looking for something with a kernel level network > interface, more like scyld? > > rgb > >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> >> >> >> >> -- >> Greg Kurtzer >> http://www.infiscale.com/ >> http://www.runlevelzero.net/ >> http://www.perceus.org/ >> http://www.caoslinux.org/ >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > Robert G. Brown Phone(cell): 1-919-280-8443 > Duke University Physics Dept, Box 90305 > Durham, N.C. 
27708-0305 > Web: http://www.phy.duke.edu/~rgb > Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php > Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 > From tvixel at gmail.com Mon Dec 1 15:22:35 2008 From: tvixel at gmail.com (Thomas Vixel) Date: Wed Nov 25 01:08:00 2009 Subject: [Beowulf] cli alternative to cluster top? In-Reply-To: References: Message-ID: That does sound interesting, but more for some of my personal projects. It wouldn't work for the situation at hand because: 1) It sounds like it introduces a SPF (the head node). 2) Giving our developers cluster-wide 'killall' & 'kill' functionality makes me cringe. Most of them only know just enough about Linux to be dangerous. 3) It would require completely reworking our current cluster solution; a daunting task to say the least. 4) There isn't much love for commercial & non-OSS software at our company. On 11/30/08, Donald Becker wrote: > On Wed, 26 Nov 2008, Thomas Vixel wrote: > >> I've been googling for a top-like cli tool to use on our cluster, but >> the closest thing that comes up is Rocks' "cluster top" script. That >> could be tweaked to work via the cli, but due to factors beyond my >> control (management) all functionality has to come from a pre-fab >> program rather than a software stack with local, custom modifications. >> >> I'm sure this has come up more than once in the HPC sector as well -- >> could anyone point me to any top-like apps for our cluster? > > Most remote job mechanisms only think about starting remote processes, not > about the full create-monitor-control-report functionality. > > The Scyld system (currently branded "Clusterware") defaults to using a > built-in unified process space. That presents all of the processes > running over the cluster in a process space on the master machine, with > fully POSIX semantics. It neatly solves your need with... the standard > 'top' program. > > Most scheduling systems also have a way to monitor processes that they > start, but I haven't found one that takes advantage of all information > available and reports it quickly/efficiently. > > There are many advantages of the Scyld implementation > -- no new or modified process management tools need to be written. > Standard utilities such as 'top' and 'ps' work unmodified, > as well as tools we didn't specifically plan for e.g. GUI versions of > 'pstree'. > -- The 'killall' program works over the cluster, efficiently. > -- All signals work as expected, including 'kill -9'. (Most remote > process starting mechanisms will just kill off the local endpoint, > leaving the remote process running-but-confused.) > -- Process groups and controlling-TTY groups works properly for job > control and signals > -- Running jobs report their status and statistics accurately -- an > updated 'rusage' structure is sent once per second, and a final > rusage structure and exit status is sent when the process terminates. > > The "downside" is that we explicitly use Linux features and details, > relying on kernel-version-specific features. That's an issue if it's a > one-off hack, but we've been using this approach continuously for > a decade, since the Linux 2.2 kernel and over multiple > architectures. We've been producing supported commercial releases > since 2000, longer than anyone else in the business. 
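
For contrast with the unified-process-space behaviour described above, the hand-rolled equivalent of a cluster-wide killall on a conventional cluster looks roughly like the sketch below (assuming passwordless ssh and a placeholder node-list path). Note everything it does not give you: no collected exit status, no rusage, no proper process-group or controlling-tty semantics -- which is the point being made above.

  #!/usr/bin/perl
  # Poor-man's cluster-wide "killall": without a unified process space,
  # the signal has to be delivered on every node explicitly.
  use strict;
  use warnings;

  my ($pattern, $signal) = @ARGV;
  die "usage: $0 <process-name> [signal]\n" unless defined $pattern;
  $signal ||= 'TERM';

  my $hostfile = "/etc/cluster/hosts.cluster";   # placeholder path
  open(my $fh, '<', $hostfile) or die "cannot read $hostfile: $!";
  chomp(my @nodes = grep { /\S/ && !/^#/ } <$fh>);
  close($fh);

  for my $node (@nodes) {
      # pkill is present on most Linux nodes; quoting kept simple here.
      system('ssh', '-o', 'BatchMode=yes', $node,
             "pkill -$signal -x $pattern");
  }
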
> > -- > Donald Becker becker@scyld.com > Penguin Computing / Scyld Software > www.penguincomputing.com www.scyld.com > Annapolis MD and San Francisco CA > > From malcolm.croucher at gmail.com Tue Dec 2 03:05:57 2008 From: malcolm.croucher at gmail.com (malcolm croucher) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question Message-ID: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Hi Guys , I am still thinking about my cluster and most probably will only begin development next year june/july . Question : If i develop my system on 10 computers (nodes) which are all normal desktops and then would like to place this in data hosting facility which has access to real time information . I am going to need to buy new servers (thin 1 u servers ). Would this be the best choice as desktops take up more space and therefore will be more expensive . how do you guys get around this problem ? or dont you ? Regards Malcolm -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081202/694005b0/attachment.html From lindahl at pbm.com Tue Dec 2 13:18:29 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] InfiniBand VL15 error In-Reply-To: <4935531F.6040807@ias.edu> References: <4935531F.6040807@ias.edu> Message-ID: <20081202211829.GB2920@bx9> On Tue, Dec 02, 2008 at 10:24:15AM -0500, Prentice Bisbal wrote: > #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 > Error check on lid 1 (aurora HCA-1) port 1: FAILED IB is blissfully fading from my brain, but I think this refers to control packets being dropped due to resource limits on the recipient. That takes talent if you're using a Mellanox HCA, as pretty much all of the VL15 packets are interpreted by the processor in the HCA. -- greg From lindahl at pbm.com Tue Dec 2 13:25:29 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Second source for IB switch silicon Message-ID: <20081202212529.GC2920@bx9> As you folks probably know, there is only 1 source for InfiniBand switch silicon today. When IB was first mooted, several companies built switch silicon, but only 2 made it to market, and now only Mellanox makes switch chips. One of the startups doing an IB switch had just gotten their first chip samples back when Intel dropped out of IB, so the startup never powered on the chip. This startup was subsequently bought by QLogic, and did several generations of Fibre Channel switch chips, including some pretty big ones. Well, QLogic announced at SC that they're going to be producing IB switch silicon. No announcement of a ship date, but it'll be nice to have a second source for both HCAs and switches. -- greg Disclaimer: I used to work for QLogic, but don't have any financial interest in them anymore. From niftyompi at niftyegg.com Tue Dec 2 13:24:14 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] InfiniBand VL15 error In-Reply-To: <4935531F.6040807@ias.edu> References: <4935531F.6040807@ias.edu> Message-ID: <20081202212414.GA3175@compegg.wr.niftyegg.com> On Tue, Dec 02, 2008 at 10:24:15AM -0500, Prentice Bisbal wrote: > > I'm getting this error when I run ibchecknet on my cluster: > > #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 > Error check on lid 1 (aurora HCA-1) port 1: FAILED > > I've googled around this morning, but haven't found anything helpful. 
> Most of the hits turn up code with the phrase "VL15Dropped", but nothing > explaining what this error means, what causes it, or how to fix it. > > After clearing the counters with 'perfquery -r', the VL15Dropped count > starts increasing from zero almost immediately. > > Any ideas what this error represents or how to fix? Could it be a bad > cable? > Can you be specific about the hardware (HCA and switch) and software? How large is the fabric? What subnet manager is running and where? The host behind LID-1 is the one of interest. If I recall correctly, VL15 is reserved exclusively for subnet management and is not optional. Traffic to VL15 might be randomly dropped by the switch, SMA or interrupt handler. As long as the subnet is OK modest dropped traffic on VL15 may not be an issue. What is running on the fabric concurrently with ibchecknet (and on the LID-1 host)? Subnet management traffic should be light, very light. Tell us about the subnet manager situation on your fabric. There should only be one active subnet manager. Mixed and uncooperating SMs could cause this, as could basic IB errors (connectors, cables, connections). If the SM is running on LID-1 then traffic will reflect the fabric size. What other IB errors are you seeing.. If the port for LID-1 is not seeing IB errors other than VL15 you should be OK -- do look for multiple SMs. If you stop your subnet manager does the counter reflect the pause. -- T o m M i t c h e l l Found me a new hat, now what? From prentice at ias.edu Tue Dec 2 13:33:20 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] InfiniBand VL15 error In-Reply-To: <20081202211829.GB2920@bx9> References: <4935531F.6040807@ias.edu> <20081202211829.GB2920@bx9> Message-ID: <4935A9A0.2010005@ias.edu> Greg Lindahl wrote: > On Tue, Dec 02, 2008 at 10:24:15AM -0500, Prentice Bisbal wrote: > >> #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 >> Error check on lid 1 (aurora HCA-1) port 1: FAILED > > IB is blissfully fading from my brain, but I think this refers to > control packets being dropped due to resource limits on the recipient. > That takes talent if you're using a Mellanox HCA, as pretty much all > of the VL15 packets are interpreted by the processor in the HCA. > > -- greg > > Just my luck. I'm using Cisco HCAs, which are really Mellanox HCAs: # lspci | grep Infini 0b:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20) Fortunately, Gilad from Mellanox has offered me some assistance off-list. -- Prentice From prentice at ias.edu Tue Dec 2 14:02:59 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] InfiniBand VL15 error In-Reply-To: <20081202212414.GA3175@compegg.wr.niftyegg.com> References: <4935531F.6040807@ias.edu> <20081202212414.GA3175@compegg.wr.niftyegg.com> Message-ID: <4935B093.70006@ias.edu> See my answers inline. Nifty Tom Mitchell wrote: > On Tue, Dec 02, 2008 at 10:24:15AM -0500, Prentice Bisbal wrote: >> I'm getting this error when I run ibchecknet on my cluster: >> >> #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 >> Error check on lid 1 (aurora HCA-1) port 1: FAILED >> >> I've googled around this morning, but haven't found anything helpful. >> Most of the hits turn up code with the phrase "VL15Dropped", but nothing >> explaining what this error means, what causes it, or how to fix it. 
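
One concrete way to run the "stop the subnet manager and see if the counter pauses" test suggested above is to sample VL15Dropped on the affected port at a fixed interval and print the increment. A sketch only: the lid/port are taken from the error message quoted here, and the parsing pattern is a guess at perfquery's output format (something like "VL15Dropped:....476"), so adjust it to whatever your OFED version actually prints.

  #!/usr/bin/perl
  # Watch how fast VL15Dropped grows on one port, e.g. while stopping
  # and restarting the subnet manager.
  use strict;
  use warnings;

  my ($lid, $port, $interval) = (1, 1, 10);   # lid 1 port 1, sample every 10s
  my $last;

  while (1) {
      my $out = `perfquery $lid $port`;
      my ($now) = $out =~ /VL15Dropped\D*(\d+)/;
      if (defined $now) {
          printf "%s  VL15Dropped=%-8s %s\n", scalar localtime, $now,
                 defined $last ? "(+" . ($now - $last) . " in ${interval}s)" : "";
          $last = $now;
      } else {
          warn "could not find VL15Dropped in perfquery output\n";
      }
      sleep $interval;
  }

Run it in one window, stop and restart opensm in another, and see whether the per-interval increment tracks the subnet manager's activity.
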
>> >> After clearing the counters with 'perfquery -r', the VL15Dropped count >> starts increasing from zero almost immediately. >> >> Any ideas what this error represents or how to fix? Could it be a bad >> cable? >> > > Can you be specific about the hardware (HCA and switch) and software? > How large is the fabric? > What subnet manager is running and where? > > The host behind LID-1 is the one of interest. IB Switch: Cisco 7012 D, 144-port HCAs: Cisco, which is really Mellanox: # lspci | grep Infini 0b:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20) The subnet manager is OpenSM 3.1.8-1.el5, which is provided by my Linux Distro, PU_IAS 5.2, which is a rebuild of RHEL 5.2. It is running on the master node, aurora. The HCA with the error is on this node (see errors message in original post). > > If I recall correctly, VL15 is reserved exclusively for subnet management > and is not optional. Traffic to VL15 might be randomly dropped by the > switch, SMA or interrupt handler. As long as the subnet is OK modest > dropped traffic on VL15 may not be an issue. > > What is running on the fabric concurrently with ibchecknet (and on the LID-1 host)? Not sure what you mean. Do you want to see the output of ibchecknet? > > Subnet management traffic should be light, very light. Tell us about > the subnet manager situation on your fabric. There should only > be one active subnet manager. Mixed and uncooperating SMs could > cause this, as could basic IB errors (connectors, cables, connections). > If the SM is running on LID-1 then traffic will reflect the fabric size. There is only one SM running. It's running on the master node. The other nodes don't even have the OpenSM package installed. > > What other IB errors are you seeing.. If the port for LID-1 is not seeing > IB errors other than VL15 you should be OK -- do look for multiple SMs. I'm not seeing any other errors. This one is a new development, too. > If you stop your subnet manager does the counter reflect the pause. > Haven't tried yet. And since it's almost quitting time, I'm not going to try until tomorrow. -- Prentice From rgb at phy.duke.edu Tue Dec 2 14:12:09 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Message-ID: On Tue, 2 Dec 2008, malcolm croucher wrote: > Hi? Guys , > > I am still thinking about my cluster and most probably will only begin > development next year june/july . > > Question : > > If i develop my system on? 10 computers (nodes) which are all normal > desktops and then would like to place this in data hosting facility which > has access to real time information . I am going to need to buy new servers > (thin 1 u servers ). Would this be the best choice as desktops take up more > space and therefore will be more expensive . how do you guys get around this > problem ? or dont you ? I'm not sure I understand you, but let me try. You plan to buy 10 computers to use for development of a cluster application. I presume that this means ten actual boxes, not a single box with two quad core processors and a dual core or something like that (my own choice for development would be no more than four machines, each with one quad core processor or two dual core processors depending on just where I expected to be internally bottlenecked, OR a single dual quad). 
Then you expect to place this in a data hosting facility -- basically buy rack space and somebody to punch reset for you plus high speed access to -- something you need in production but not in development. Is that right? The usual rule is that 1U nodes will cost a bit more than equivalently equipped desktops, primarily because the 1U cases cost more and because one has to work a bit harder to ensure that e.g. the CPUs stay cool and so on. I haven't checked marginal differences recently, but would guestimate $250-500 per box. This is a bit of a hit on ten boxes (if that's what you plan on) but not a crazy one, and if you spend it up front you a) have a MUCH smaller stack of systems in your house or regular office -- ten 1U nodes in a stack is the size of a dorm refrigerator, where ten minitowers stacked up is 2-3 times as much space/volume -- then you don't have to buy new systems to move them into your hosting facility rack, where they sell you space and where hosting the towers -- if they let you put towers there at all -- will cost you more on the far side. To put it another way, I THINK that you should probably just get 1U systems from the beginning, but there are so many variables you haven't given us I can't be sure. For example, your money or somebody else's? Academic research or entrepreneurial? HPC cluster or HA cluster? Why "ten nodes" -- a bit of an unusual number. What are the relative costs on the hosting side? Would a much smaller system (like a single $3300 eight core dual quad) work just as well for prototyping, leaving you with money later to buy as many 1U nodes as you need AFTER prototyping and estimating capacity and so on? rgb > > Regards > > Malcolm > > > > > > > > > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From lindahl at pbm.com Tue Dec 2 14:35:26 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Message-ID: <20081202223526.GA12378@bx9> On Tue, Dec 02, 2008 at 05:12:09PM -0500, Robert G. Brown wrote: > The usual rule is that 1U nodes will cost a bit more than equivalently > equipped desktops, primarily because the 1U cases cost more and because > one has to work a bit harder to ensure that e.g. the CPUs stay cool and > so on. The single-socket 2U nodes that I buy cost the same as our developer desktops, which have a somewhat expensive case in order to be quiet. Since he's putting these into a hosting facility, it's likely that having 1U boxes doesn't gain him anything... the limit is generally power, not space. And 2U boxes are quieter and more reliable. -- greg From peter.st.john at gmail.com Tue Dec 2 14:52:04 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Message-ID: Malcolm, Just a plug for the recherche' space-minimization solution (I'm still thinking about my own hypothetical cluster, as you are). This article http://www.linuxjournal.com/article/8177 is Ron Minnich's "Beowulf in a Lunchbox"; 16 mini motherboards with risers. 
You could possibly consider a rackable system with diskless nodes (hand-me-down desktop motherboards, with onboard NIC, would fit in 1U slots, right?) and put the disks and power supplies and routers in a dense 2 or 3U space. Then maybe you could just load it all into a rack lafter the initial development in your basement. But as RGB mentions it depends on lots of things; heat, dust in your office, the budget, etc. Incidentally, if you built say 4 single-slot quad-core nodes with 2 dual-slot 8 core (per board) nodes, then you could find out which bandwidth/mem-channel/per-core configuration works better for your app, instead of anticipating; in the case that your initial prototype needn't be ideal. Peter P.S. incidentally, I think that Ron Minnich is the one I knew in High School; which is a bit odd, parallel to RGB from Duke. Next I'll meet a bewulfer from my kindergarden class :-) On Tue, Dec 2, 2008 at 6:05 AM, malcolm croucher wrote: > Hi Guys , > > I am still thinking about my cluster and most probably will only begin > development next year june/july . > > Question : > > If i develop my system on 10 computers (nodes) which are all normal > desktops and then would like to place this in data hosting facility which > has access to real time information . I am going to need to buy new servers > (thin 1 u servers ). Would this be the best choice as desktops take up more > space and therefore will be more expensive . how do you guys get around this > problem ? or dont you ? > > Regards > > Malcolm > > > > > > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081202/3cb7e326/attachment.html From dnlombar at ichips.intel.com Tue Dec 2 15:26:54 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Message-ID: <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> On Tue, Dec 02, 2008 at 02:12:09PM -0800, Robert G. Brown wrote: > On Tue, 2 Dec 2008, malcolm croucher wrote: > > > Hi? Guys , > > > > I am still thinking about my cluster and most probably will only begin > > development next year june/july . > > > > Question : > > > > If i develop my system on? 10 computers (nodes) which are all normal > > desktops and then would like to place this in data hosting facility which > > has access to real time information . I am going to need to buy new servers > > (thin 1 u servers ). Would this be the best choice as desktops take up more > > space and therefore will be more expensive . how do you guys get around this > > problem ? or dont you ? > > I'm not sure I understand you, but let me try. > > (my own choice for development would be no more than four machines, each > with one quad core processor or two dual core processors depending on > just where I expected to be internally bottlenecked, OR a single dual > quad). Agreed. Four nodes should see any issues that 10 would also see. Less and you can miss concurrency issues. > To put it another way, I THINK that you should probably just get 1U > systems from the beginning, but there are so many variables you haven't > given us I can't be sure. An acoustic concern. 
A 1U is quite a bit louder than the normal desktop as (1) they use itty-bitty fans and (b) there's no incentive to make them quiet, as nobody is expected to have to put up with their screaming... -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From niftyompi at niftyegg.com Tue Dec 2 16:04:47 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] InfiniBand VL15 error In-Reply-To: <4935B093.70006@ias.edu> References: <4935531F.6040807@ias.edu> <20081202212414.GA3175@compegg.wr.niftyegg.com> <4935B093.70006@ias.edu> Message-ID: <20081203000447.GA4279@compegg.wr.niftyegg.com> On Tue, Dec 02, 2008 at 05:02:59PM -0500, Prentice Bisbal wrote: > > See my answers inline. > > Nifty Tom Mitchell wrote: > > On Tue, Dec 02, 2008 at 10:24:15AM -0500, Prentice Bisbal wrote: > >> I'm getting this error when I run ibchecknet on my cluster: > >> > >> #warn: counter VL15Dropped = 476 (threshold 100) lid 1 port 1 > >> Error check on lid 1 (aurora HCA-1) port 1: FAILED > >> > >> I've googled around this morning, but haven't found anything helpful. > >> Most of the hits turn up code with the phrase "VL15Dropped", but nothing > >> explaining what this error means, what causes it, or how to fix it. > >> > >> After clearing the counters with 'perfquery -r', the VL15Dropped count > >> starts increasing from zero almost immediately. > >> > >> Any ideas what this error represents or how to fix? Could it be a bad > >> cable? > >> > > > > Can you be specific about the hardware (HCA and switch) and software? > > How large is the fabric? > > What subnet manager is running and where? > > > > The host behind LID-1 is the one of interest. > > IB Switch: Cisco 7012 D, 144-port > HCAs: Cisco, which is really Mellanox: > > # lspci | grep Infini > 0b:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex > (Tavor compatibility mode) (rev 20) > > The subnet manager is OpenSM 3.1.8-1.el5, which is provided by my Linux > Distro, PU_IAS 5.2, which is a rebuild of RHEL 5.2. It is running on the > master node, aurora. The HCA with the error is on this node (see errors > message in original post). > > > > > If I recall correctly, VL15 is reserved exclusively for subnet management > > and is not optional. Traffic to VL15 might be randomly dropped by the > > switch, SMA or interrupt handler. As long as the subnet is OK modest > > dropped traffic on VL15 may not be an issue. > > > > What is running on the fabric concurrently with ibchecknet (and on the LID-1 host)? > > Not sure what you mean. Do you want to see the output of ibchecknet? What I was thinking was that on a compute bound system the subnet manager process might not get enough time to service all the management packets. In the Mellanox case the card on a local node can have many SMA actions handled inside the card larger fabric wide actions need interrupts and system time. If this was an overloaded IO or compute node the subnet manager may not wake up often enough to handle all the management packets. i.e. it may be normal and OK with this load, card, software stack and SM to see VL15 drops. Your Mellanox contact can help answer this... > > > > Subnet management traffic should be light, very light. Tell us about > > the subnet manager situation on your fabric. There should only > > be one active subnet manager. Mixed and uncooperating SMs could > > cause this, as could basic IB errors (connectors, cables, connections). 
> > If the SM is running on LID-1 then traffic will reflect the fabric size. > > There is only one SM running. It's running on the master node. The other > nodes don't even have the OpenSM package installed. > > > > What other IB errors are you seeing.. If the port for LID-1 is not seeing > > IB errorsu other than VL15 you should be OK -- do look for multiple SMs. > > I'm not seeing any other errors. This one is a new development, too. > > > If you stop your subnet manager does the counter reflect the pause. > > > > Haven't tried yet. And since it's almost quitting time, I'm not going to > try until tomorrow. Pausing the subnet manager can be diagnostic. If you pause/ stop the SM and reboot a free node, the free node will not be assigned a LID. If you have another SM on the fabric it will get a LID. While multiple subnet managers are legal the interactions between different versions has too many permutations for good test coverage. It can be good to 'test' for unexpected subnet managers. Do revisit your Open SM settings. Sweeps for node status may just be too aggressive. There is a chance that your opensm is dated. It look like: opensm-3.1.8-1.el5.x86_64.rpm. Build Date: Mon Mar 17 14:12:13 2008 Inspect the change log dates ;-0 ftp://ftp.cs.stanford.edu/pub/mirrors/scientific/52/x86_64/SL/repodata/repoview/opensm-0-3.1.8-1.el5.html The current OFED version looks like: opensm-3.1.11-1.ofed1.3.1.src.rpm While OFED and rpm versions do not always track consider an update. Also note RH is slow picking up many OFED changes as the OFED process is a big bang release process. Other on this list might know if the delta from 3.1.8 to 3.1.11 is important in this regard. -- T o m M i t c h e l l Found me a new hat, now what? From rgb at phy.duke.edu Tue Dec 2 16:11:37 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> Message-ID: On Tue, 2 Dec 2008, Lombard, David N wrote: > An acoustic concern. A 1U is quite a bit louder than the normal desktop as > (1) they use itty-bitty fans and (b) there's no incentive to make them > quiet, as nobody is expected to have to put up with their screaming... A good point. I actually like Greg's suggestion best -- consider (fewer) 2U nodes instead -- quieter, more robust, cooler. Perhaps four, but that strongly depends on the kind of thing you are trying to do -- tell us what it is if you can do so without having to kill and we'll try to help you estimate your communications issues and likely bottlenecks. For some tasks you are best off getting as few actual boxes as possible with as many as possible CPU cores per box. For others, having more boxes and fewer cores per box will be right. The reason I like four nodes with at least a couple of cores each is that if you don't KNOW what you are likely to need, you can find out (probably) with this many nodes and then "fix" your design if/when you scale up into production. Otherwise you buy eight single core node (if they still make single cores:-) and then learn that you would have been much better off buying a single eight core node. Or vice versa. rgb > > -- > David N. Lombard, Intel, Irvine, CA > I do not speak for Intel Corporation; all comments are strictly my own. 
> _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From eugen at leitl.org Tue Dec 2 23:37:57 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <20081202223526.GA12378@bx9> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202223526.GA12378@bx9> Message-ID: <20081203073757.GN11544@leitl.org> On Tue, Dec 02, 2008 at 02:35:26PM -0800, Greg Lindahl wrote: > On Tue, Dec 02, 2008 at 05:12:09PM -0500, Robert G. Brown wrote: > > > The usual rule is that 1U nodes will cost a bit more than equivalently > > equipped desktops, primarily because the 1U cases cost more and because > > one has to work a bit harder to ensure that e.g. the CPUs stay cool and > > so on. Our staple box is the SunFire X2100 M2, which sells for about 400 EUR sans VAT (kit has 1.8 GHz dual-core Opteron, 512 MByte DDR2, no disk). I've measured 115 W at idle with 2x TByte SATA disk and 8 GByte DDR2 RAM which is even better than what Sun says http://www.sun.com/servers/entry/x2100/M2calc/index.jsp#calc > The single-socket 2U nodes that I buy cost the same as our developer > desktops, which have a somewhat expensive case in order to be quiet. > Since he's putting these into a hosting facility, it's likely that > having 1U boxes doesn't gain him anything... the limit is generally > power, not space. And 2U boxes are quieter and more reliable. From malcolm.croucher at gmail.com Tue Dec 2 23:25:54 2008 From: malcolm.croucher at gmail.com (malcolm croucher) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> Message-ID: <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> Its gonna be used for computational chemisty , not academic but more private / entrepreneurship. I been doing a lot of research in this area for a while and was hoping to do some more on my own. On Wed, Dec 3, 2008 at 2:11 AM, Robert G. Brown wrote: > On Tue, 2 Dec 2008, Lombard, David N wrote: > > An acoustic concern. A 1U is quite a bit louder than the normal desktop as >> (1) they use itty-bitty fans and (b) there's no incentive to make them >> quiet, as nobody is expected to have to put up with their screaming... >> > > A good point. I actually like Greg's suggestion best -- consider > (fewer) 2U nodes instead -- quieter, more robust, cooler. Perhaps four, > but that strongly depends on the kind of thing you are trying to do -- > tell us what it is if you can do so without having to kill and we'll try > to help you estimate your communications issues and likely bottlenecks. > For some tasks you are best off getting as few actual boxes as possible > with as many as possible CPU cores per box. For others, having more > boxes and fewer cores per box will be right. 
> > The reason I like four nodes with at least a couple of cores each is > that if you don't KNOW what you are likely to need, you can find out > (probably) with this many nodes and then "fix" your design if/when you > scale up into production. Otherwise you buy eight single core node (if > they still make single cores:-) and then learn that you would have been > much better off buying a single eight core node. Or vice versa. > > rgb > > >> -- >> David N. Lombard, Intel, Irvine, CA >> I do not speak for Intel Corporation; all comments are strictly my own. >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> > Robert G. Brown Phone(cell): 1-919-280-8443 > Duke University Physics Dept, Box 90305 > Durham, N.C. 27708-0305 > Web: http://www.phy.duke.edu/~rgb > Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php > Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 > -- Malcolm A.B Croucher -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081203/1792d1f6/attachment.html From rgb at phy.duke.edu Wed Dec 3 07:33:27 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> Message-ID: On Wed, 3 Dec 2008, malcolm croucher wrote: > Its gonna be used for computational chemisty , not academic but more private > / entrepreneurship. I been doing a lot of research in this area for a while > and was hoping to do some more on my own. Any idea of the specific software you plan to use? Or do you plan to write your own. There are lots of people on-list that can help you e.g. estimate the likely task granularity if you identify your toolset (only, not what you hope to invent:-). Basically, more important than the case you plan to put your system(s) in is the balance between computation (computer cores at some given clock), memory (bandwidth and contention between cores and memory), and interprocessor communications both within a system (between one core/thread and another) and between systems (network based IPCs). Each of the pathways from a core outward has an associated cost in latency and bandwidth, and very different investment strategies will yield the best bang for a limited supply of bucks for different "kinds" of parallel problems. So the very first step of cluster engineering is typically to analyze you tasks' patterns of computation, memory access and interprocessor communication. Once that is known, it is usually possible to identify (for example) whether it is better to have fewer processors and a faster network or more processors and a slow network. Since a really fast network can cost as much as two or more cores and since one has to balance network needs against ALL the cores per chassis, this can be a significant tradeoff. Ditto for tasks that tend to be memory bound -- in that case one might want to opt for fewer cores per box to ensure that each core can access memory at full speed with minimal lost efficiency due to contention. rgb > > On Wed, Dec 3, 2008 at 2:11 AM, Robert G. 
Brown wrote: > On Tue, 2 Dec 2008, Lombard, David N wrote: > > An acoustic concern. A 1U is quite a bit louder than > the normal desktop as > (1) they use itty-bitty fans and (b) there's no > incentive to make them > quiet, as nobody is expected to have to put up with > their screaming... > > > A good point. ?I actually like Greg's suggestion best -- consider > (fewer) 2U nodes instead -- quieter, more robust, cooler. ?Perhaps > four, > but that strongly depends on the kind of thing you are trying to do -- > tell us what it is if you can do so without having to kill and we'll > try > to help you estimate your communications issues and likely > bottlenecks. > For some tasks you are best off getting as few actual boxes as > possible > with as many as possible CPU cores per box. ?For others, having more > boxes and fewer cores per box will be right. > > The reason I like four nodes with at least a couple of cores each is > that if you don't KNOW what you are likely to need, you can find out > (probably) with this many nodes and then "fix" your design if/when you > scale up into production. ?Otherwise you buy eight single core node > (if > they still make single cores:-) and then learn that you would have > been > much better off buying a single eight core node. ?Or vice versa. > > ? rgb > > > -- > David N. Lombard, Intel, Irvine, CA > I do not speak for Intel Corporation; all comments are > strictly my own. > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > Robert G. Brown ? ? ? ? ? ? ? ? ? ? ? ? ? ?Phone(cell): 1-919-280-8443 > Duke University Physics Dept, Box 90305 > Durham, N.C. 27708-0305 > Web: http://www.phy.duke.edu/~rgb > Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php > Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 > > > > > -- > Malcolm A.B Croucher > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From mathog at caltech.edu Wed Dec 3 09:47:17 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question Message-ID: "Lombard, David N" wrote: > An acoustic concern. A 1U is quite a bit louder than the normal desktop as > (1) they use itty-bitty fans and (b) there's no incentive to make them > quiet, as nobody is expected to have to put up with their screaming... Amen to that - the screech from those tiny fans is unbearable. Even a 2U system with larger (usually 80 mm) fans is going to be substantially louder than a typical desktop (mostly using 120mm now). The 2U, like the 1U, is designed for machine rooms, so cooling trumps quiet every time. To the OP: do not assume that your developer and 10 CPU system will happily work in the same room. (Unless your developer happens to be deaf and not at all sensitive to unusual air temperatures.) 
Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From eugen at leitl.org Wed Dec 3 10:02:18 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: Message-ID: <20081203180218.GG11544@leitl.org> On Wed, Dec 03, 2008 at 09:47:17AM -0800, David Mathog wrote: > To the OP: do not assume that your developer and 10 CPU system will > happily work in the same room. (Unless your developer happens to be > deaf and not at all sensitive to unusual air temperatures.) Arguably, it is too distracting even if you're in the room next to it, behind two doors. When switching them on it's a fair approximation of a starting jet. Less an issue in a cube farm, of course. From hearnsj at googlemail.com Wed Dec 3 10:30:59 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: <20081203180218.GG11544@leitl.org> References: <20081203180218.GG11544@leitl.org> Message-ID: <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> I would think about a blade system for this particular application. (Say) one of the Supermicro blade enclosures. You could start small for development with one or two blades, then expand. Should be easy enough to transport to the hosting location when you are ready. You can get acoustically quiet racks to put them in if it is going in an office, but that will inevitably cost more. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081203/3f9c85cf/attachment.html From hahn at mcmaster.ca Wed Dec 3 11:03:20 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: > I would think about a blade system for this particular application. aren't blades still a significant premium? > (Say) one of the Supermicro blade enclosures. You could start small for > development with one or two blades, well, nothing beats the incremental expandability of a stack of separate boxes. given that modern cpus are drastically cooler (esp per flop) than even 2 years ago, fans rarely run fast. > then expand. Should be easy enough to transport to the hosting location when > you are ready. a blade chassis usually requires power other than the usual 15A circuit; nothing custom of course (L6-30 is my favorite) but something like a dual-board supermicro 1U is pretty convenient and flexible. > You can get acoustically quiet racks to put them in if it is going in an > office, but that will inevitably cost more. if office, I'd certainly just get minitowers. but I wouldn't consider putting more than a few in an office, since a ~250W minitower already corresponds to the power dissipation of two people... From Bogdan.Costescu at iwr.uni-heidelberg.de Wed Dec 3 12:17:00 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: On Wed, 3 Dec 2008, Mark Hahn wrote: > if office, I'd certainly just get minitowers. 
For office, I would recommend barebones in SFF (small form factor) cases, like those commonly advertised for HTPC (Home theater PC). I have built a cluster of 80 of those (Shuttle SB75G2) in 2004, you can see a rather bad picture in the "IWR Cluster 4 part a" section of: http://www.iwr.uni-heidelberg.de/services/equipment/parallel/ Because of their number, they are located in a well cooled computer room, the total consumption being close to 9KW under load. These barebones often contain what is needed for a simple cluster node, with only CPU, memory and possibly a disk to be added. The cooling is often better designed than in a normal (mini-)tower case and because of their intended usage they can be even quiter - but the difference between the different models or manufacturers can be huge. The price is slightly higher than a comparable (mini)tower; the impression that it makes on visitors is however much greater ;-) -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From hahn at mcmaster.ca Wed Dec 3 12:46:57 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: >> if office, I'd certainly just get minitowers. > > For office, I would recommend barebones in SFF (small form factor) cases, > like those commonly advertised for HTPC (Home theater PC). I have built a well, SFF shoeboxes would be a good idea, but they tend to be somewhat specialized. for instance, they normally only have a single cpu socket. but they're good because the PSU is often sized modestly (the sweet spot for PSU efficiency is somewhere around 75% load.) I don't know whether there would be any problem putting a real interconnect card (10G, IB, etc) into one of these - some are designed for GPU cards, so would have 8 or 16x pcie slots. if you're only using gigabit, an SFF shoebox might be quite a good fit, including low-power integrated video. there are "book" format SFF's that might do well for the gigabit/integrated approach too. I wouldn't mention HTPC, though - in stores around here at least that term implies a box specialized to look like AV components, often with milled aluminum bezel, fancy displays, etc. > cluster of 80 of those (Shuttle SB75G2) in 2004, you can see a rather bad > picture in the "IWR Cluster 4 part a" section of: > http://www.iwr.uni-heidelberg.de/services/equipment/parallel/ nice. I think SFF's would be very nice, though probably would mean giving up any pretense of server-ish-ness, such as dual sockets or IPMI, and probably sticking to onboard gigabit. for an office, though, a stack of 10 of them would still probably be a problem in total dissipation. From ajt at rri.sari.ac.uk Wed Dec 3 13:40:16 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: <4936FCC0.5050902@rri.sari.ac.uk> Mark Hahn wrote: >>> if office, I'd certainly just get minitowers. >> For office, I would recommend barebones in SFF (small form factor) cases, >> like those commonly advertised for HTPC (Home theater PC). I have built a > > well, SFF shoeboxes would be a good idea, but they tend to be somewhat > specialized. 
for instance, they normally only have a single cpu socket. > but they're good because the PSU is often sized modestly (the sweet spot > for PSU efficiency is somewhere around 75% load.) I don't know whether > there would be any problem putting a real interconnect card (10G, IB, etc) > into one of these - some are designed for GPU cards, so would have 8 or 16x > pcie slots. if you're only using gigabit, an SFF shoebox might be quite a > good fit, including low-power integrated video. there are "book" format > SFF's that might do well for the gigabit/integrated approach too. Hello, Mark. I bought some IWill Zmaxdp and Zmaxd2 SFF dual Opteron servers with registered ECC memory to use in our Beowulf, but the heat dissipation problems were really *terrible* and the boxes were incredibly noisy. > [...] > I think SFF's would be very nice, though probably would mean giving up any > pretense of server-ish-ness, such as dual sockets or IPMI, and probably > sticking to onboard gigabit. for an office, though, a stack of 10 of them > would still probably be a problem in total dissipation. The IWill Zmaxdp/d2 were the only SFF Opteron servers I could find that support registered ECC memory. To cut a long story short, I've had to replace the standard 80W Opterons with 55W Opteron HE's to get the heat burden under control. These are great little boxes, but when you make them work hard they are extremely noisy, and none of my colleagues will put up with them in an office environment! The end of this story for me was that Flextronics bought IWill for their 1U server designs and IWill immediately ceased production of Zmax retail products. They are still around on eBay and one or two vendors if anyone is interested + mine are working fine now :-) http://www.flextronics.com/iwill/product_2.asp?p_id=36 http://www.flextronics.com/iwill/product_2.asp?p_id=105 There was a lot of interest in these IWill SFF servers when they were launched, but many people said it was impossible to keep the systems cool under load. They were right: IWill said that standard 80W Opterons were supported - no way: These SFF servers do work fine with 55W HE's though and if that had been more widely known at the time they may have been more successful. Bye, Tony. -- Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk mailto:a.travis@abdn.ac.uk, http://bioinformatics.rri.sari.ac.uk/~ajt From Bogdan.Costescu at iwr.uni-heidelberg.de Wed Dec 3 13:54:43 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: On Wed, 3 Dec 2008, Mark Hahn wrote: > I don't know whether there would be any problem putting a real > interconnect card (10G, IB, etc) into one of these - some are > designed for GPU cards, so would have 8 or 16x pcie slots. Yes, they often have 2 slots, a PCIe 16x one for a graphics card and a PCIe 1x or PCI one for a TV card (for HTPC use ;-)). > there are "book" format SFF's that might do well for the > gigabit/integrated approach too. If raw performance is not your main interest, yes. These often have previous generation or significantly lower speed CPUs and memory and a rather bad cooling solution dictated by the space contraints. 
In comparison, most "normal" SFFs can take current generation CPUs and memory. One advantage coming from their lower speed/power components is that some of the "book" ones use an external power supply, laptop style, which generates no noise and eliminates the possibility of broken fans. > I wouldn't mention HTPC, though - in stores around here at least > that term implies a box specialized to look like AV components, > often with milled aluminum bezel, fancy displays, etc. Is there a law against sexy cluster nodes ? :-) How about making those fancy displays show a job ID and the current CPU load ? Would there be any more need for nagios or ganglia ? :-) > I think SFF's would be very nice, though probably would mean giving up any > pretense of server-ish-ness, such as dual sockets or IPMI I have looked some years ago at such a SFF barebone with 2 CPU sockets from Iwill. For some reason I couldn't (maybe still can't) get easily Iwill products here, so I quickly lost interest. But this shows that someone did think of it and that it's technically possible. However today, with multi-core CPUs being mainstream, I can't really see a market case for a SFF with more than one CPU socket. IPMI is the one thing that I really miss in these SFF computers. Not so much for sensors monitoring as for power on/power off/reset and console redirection. I was hoping that at least the Intel vPro would be adopted for SFFs... -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From gdjacobs at gmail.com Wed Dec 3 15:42:13 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: <49371955.20200@gmail.com> Bogdan Costescu wrote: > On Wed, 3 Dec 2008, Mark Hahn wrote: > >> I don't know whether there would be any problem putting a real >> interconnect card (10G, IB, etc) into one of these - some are designed >> for GPU cards, so would have 8 or 16x pcie slots. > > Yes, they often have 2 slots, a PCIe 16x one for a graphics card and a > PCIe 1x or PCI one for a TV card (for HTPC use ;-)). > >> there are "book" format SFF's that might do well for the >> gigabit/integrated approach too. > > If raw performance is not your main interest, yes. These often have > previous generation or significantly lower speed CPUs and memory and a > rather bad cooling solution dictated by the space contraints. In > comparison, most "normal" SFFs can take current generation CPUs and memory. > > One advantage coming from their lower speed/power components is that > some of the "book" ones use an external power supply, laptop style, > which generates no noise and eliminates the possibility of broken fans. > >> I wouldn't mention HTPC, though - in stores around here at least that >> term implies a box specialized to look like AV components, often with >> milled aluminum bezel, fancy displays, etc. > > Is there a law against sexy cluster nodes ? :-) > > How about making those fancy displays show a job ID and the current CPU > load ? Would there be any more need for nagios or ganglia ? :-) > >> I think SFF's would be very nice, though probably would mean giving up >> any >> pretense of server-ish-ness, such as dual sockets or IPMI > > I have looked some years ago at such a SFF barebone with 2 CPU sockets > from Iwill. 
For some reason I couldn't (maybe still can't) get easily > Iwill products here, so I quickly lost interest. But this shows that > someone did think of it and that it's technically possible. However > today, with multi-core CPUs being mainstream, I can't really see a > market case for a SFF with more than one CPU socket. > > IPMI is the one thing that I really miss in these SFF computers. Not so > much for sensors monitoring as for power on/power off/reset and console > redirection. I was hoping that at least the Intel vPro would be adopted > for SFFs... > What is the capability of EDAC on AM2 and AM2+ CPUs? Does the motherboard chipset impose any limitations? -- Geoffrey D. Jacobs From jan.heichler at gmx.net Thu Dec 4 00:19:27 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: References: <20081203180218.GG11544@leitl.org> <9f8092cc0812031030j4827b3fej93ad2fb528f7957d@mail.gmail.com> Message-ID: <1722728979.20081204091927@gmx.net> Hello Bogdan, on Wednesday, 3 December 2008, you wrote: BC> On Wed, 3 Dec 2008, Mark Hahn wrote: >> I don't know whether there would be any problem putting a real >> interconnect card (10G, IB, etc) into one of these - some are >> designed for GPU cards, so would have 8 or 16x pcie slots. BC> Yes, they often have 2 slots, a PCIe 16x one for a graphics card and a BC> PCIe 1x or PCI one for a TV card (for HTPC use ;-)). But be careful here. It wouldn't be the first PCIe x16 slot where nothing but a graphics card works properly. If you browse through web-forums you see a lot of people trying to get a RAID controller (for example) working in a PCIe x16 slot - some boards just don't allow that. So better test before you buy.... Ja -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081204/363f5a2e/attachment.html From diep at xs4all.nl Thu Dec 4 04:23:22 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Message-ID: Oh you mention realtime information. I assume that's stock exchange information. In these volatile markets there is a LOT to earn with that. Last time I discussed this with some sysadmins who are busy with this subject, it was all about latency to the RAM, and a LOT of RAM, plus I/O that needs to get replaced within 2 minutes, real time, when a disk fails. What you need most likely is 2 identical machines A and B. The fastest latency you get for now is with quad socket AMDs. The quad socket Intels with CSI aren't there yet, regrettably. Maybe end of 2009. Get a quad socket AMD with a mainboard on which you can later upgrade the CPUs to the Shanghai core. Initially you could maybe even start with high-clocked dual cores. The memory controller is on die, so the higher the clock frequency of each core, the lower the latency to RAM. Equip each box with 64-128 GB of ECC DDR2 RAM and a BIG U320 RAID10 array. Most here will know how to deal with I/O in the best manner. Then build a cluster of 2 nodes, so that machine B runs as a backup of A; in case of a problem with A, B can take over gluelessly. You can build these nodes pretty cheaply nowadays; I saw these quad socket mainboards for like 800 euro and really a lot of DIMM slots. The question is how fast you want to replace the disks.
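Since the argument above is that latency to RAM is what this kind of box lives or dies by, a quick sanity check on a candidate node is a pointer chase through a buffer much larger than cache. The sketch below is only an illustration, not a real benchmark (lmbench's lat_mem_rd is the usual tool for serious numbers): the file name, buffer size and step count are arbitrary choices, it runs a single thread, and it does no NUMA pinning. Compile with something like "gcc -O2 -o latchase latchase.c" and read the output as a rough average load-to-use latency.

/* latchase.c -- toy pointer chase to estimate average memory load latency.
 * A sketch only: arbitrary sizes, single thread, no NUMA pinning, no huge
 * pages. For real measurements use lmbench's lat_mem_rd.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define NELEM (64UL * 1024 * 1024 / sizeof(size_t))  /* ~64 MB working set */
#define STEPS (4UL * NELEM)

int main(void)
{
    size_t *next, i, j, tmp, pos;
    struct timeval t0, t1;
    double ns;

    next = malloc(NELEM * sizeof(size_t));
    if (next == NULL) { perror("malloc"); return 1; }

    /* Sattolo's algorithm: turn the identity array into one big random
     * cycle, so every load depends on the previous one and the hardware
     * prefetcher cannot guess the next address. */
    for (i = 0; i < NELEM; i++)
        next[i] = i;
    srand(12345);
    for (i = NELEM - 1; i > 0; i--) {
        j = (size_t)rand() % i;          /* rand() is fine for a sketch */
        tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    gettimeofday(&t0, NULL);
    pos = 0;
    for (i = 0; i < STEPS; i++)
        pos = next[pos];                 /* serialized dependent loads */
    gettimeofday(&t1, NULL);

    ns = ((t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec)) * 1e3;
    printf("~%.1f ns per dependent load (ignore: %lu)\n",
           ns / (double)STEPS, (unsigned long)pos);
    free(next);
    return 0;
}

Numbers much lower than expected usually mean the buffer did not escape cache; numbers much higher often mean the allocation landed on a remote NUMA node. Either way it is worth repeating the run a few times before drawing conclusions.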
If you really want personnel that replaces 'em within 2 minutes and not a second slower, if a disk breaks, that's gonna be costly. In any case the network between the 2 nodes is pretty important. Good luck, Vincent On Dec 2, 2008, at 12:05 PM, malcolm croucher wrote: > Hi Guys , > > I am still thinking about my cluster and most probably will only > begin development next year june/july . > > Question : > > If i develop my system on 10 computers (nodes) which are all > normal desktops and then would like to place this in data hosting > facility which has access to real time information . I am going to > need to buy new servers (thin 1 u servers ). Would this be the best > choice as desktops take up more space and therefore will be more > expensive . how do you guys get around this problem ? or dont you ? > > Regards > > Malcolm > > > > > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Thu Dec 4 09:58:34 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> Message-ID: <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> On Dec 3, 2008, at 8:25 AM, malcolm croucher wrote: > Its gonna be used for computational chemisty , not academic but > more private / entrepreneurship. I been doing a lot of research in > this area for a while and was hoping to do some more on my own. > That's most interesting, if i google for your name i just get hits in the financial world. How's that possible? Vincent > On Wed, Dec 3, 2008 at 2:11 AM, Robert G. Brown > wrote: > On Tue, 2 Dec 2008, Lombard, David N wrote: > > An acoustic concern. A 1U is quite a bit louder than the normal > desktop as > (1) they use itty-bitty fans and (b) there's no incentive to make them > quiet, as nobody is expected to have to put up with their screaming... > > A good point. I actually like Greg's suggestion best -- consider > (fewer) 2U nodes instead -- quieter, more robust, cooler. Perhaps > four, > but that strongly depends on the kind of thing you are trying to do -- > tell us what it is if you can do so without having to kill and > we'll try > to help you estimate your communications issues and likely > bottlenecks. > For some tasks you are best off getting as few actual boxes as > possible > with as many as possible CPU cores per box. For others, having more > boxes and fewer cores per box will be right. > > The reason I like four nodes with at least a couple of cores each is > that if you don't KNOW what you are likely to need, you can find out > (probably) with this many nodes and then "fix" your design if/when you > scale up into production. Otherwise you buy eight single core node > (if > they still make single cores:-) and then learn that you would have > been > much better off buying a single eight core node. Or vice versa. > > rgb > > > -- > David N. Lombard, Intel, Irvine, CA > I do not speak for Intel Corporation; all comments are strictly my > own. 
> _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > Robert G. Brown Phone(cell): 1-919-280-8443 > Duke University Physics Dept, Box 90305 > > Durham, N.C. 27708-0305 > Web: http://www.phy.duke.edu/~rgb > Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php > Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 > > > > -- > Malcolm A.B Croucher > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From peter.st.john at gmail.com Thu Dec 4 11:55:07 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> Message-ID: Vincent, All the guys I know, myself, in finance, started in mathematics. Including one Cole medal logician. I'm sure there are some chemists and physicists who got into finance also. I did myself, that was what sucked me into software engineering long ago. we don't necessarily lose our theoretical interests. Peter On 12/4/08, Vincent Diepeveen wrote: > > > On Dec 3, 2008, at 8:25 AM, malcolm croucher wrote: > > Its gonna be used for computational chemisty , not academic but more >> private / entrepreneurship. I been doing a lot of research in this area for >> a while and was hoping to do some more on my own. >> >> > That's most interesting, if i google for your name i just get hits in the > financial world. How's that possible? > > Vincent > > On Wed, Dec 3, 2008 at 2:11 AM, Robert G. Brown wrote: >> On Tue, 2 Dec 2008, Lombard, David N wrote: >> >> An acoustic concern. A 1U is quite a bit louder than the normal desktop as >> (1) they use itty-bitty fans and (b) there's no incentive to make them >> quiet, as nobody is expected to have to put up with their screaming... >> >> A good point. I actually like Greg's suggestion best -- consider >> (fewer) 2U nodes instead -- quieter, more robust, cooler. Perhaps four, >> but that strongly depends on the kind of thing you are trying to do -- >> tell us what it is if you can do so without having to kill and we'll try >> to help you estimate your communications issues and likely bottlenecks. >> For some tasks you are best off getting as few actual boxes as possible >> with as many as possible CPU cores per box. For others, having more >> boxes and fewer cores per box will be right. >> >> The reason I like four nodes with at least a couple of cores each is >> that if you don't KNOW what you are likely to need, you can find out >> (probably) with this many nodes and then "fix" your design if/when you >> scale up into production. Otherwise you buy eight single core node (if >> they still make single cores:-) and then learn that you would have been >> much better off buying a single eight core node. Or vice versa. >> >> rgb >> >> >> -- >> David N. Lombard, Intel, Irvine, CA >> I do not speak for Intel Corporation; all comments are strictly my own. 
>> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> >> Robert G. Brown Phone(cell): 1-919-280-8443 >> Duke University Physics Dept, Box 90305 >> >> Durham, N.C. 27708-0305 >> Web: http://www.phy.duke.edu/~rgb >> Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php >> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 >> >> >> >> -- >> Malcolm A.B Croucher >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081204/7379f5d3/attachment.html From prentice at ias.edu Thu Dec 4 12:53:19 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> Message-ID: <4938433F.1030002@ias.edu> Vincent Diepeveen wrote: > > On Dec 3, 2008, at 8:25 AM, malcolm croucher wrote: > >> Its gonna be used for computational chemisty , not academic but more >> private / entrepreneurship. I been doing a lot of research in this >> area for a while and was hoping to do some more on my own. >> > > That's most interesting, if i google for your name i just get hits in > the financial world. How's that possible? > Vincent, You clearly haven't heard of David E. Shaw or know what he's up these days. Here's the Cliff Notes version: He made billions on Wall Street using computer models of the market, then started Schrodinger, a leading vendor of computational chemistry software (my previous employer used it heavily), where he actually wrote some of the code in their products himself. A couple of years ago, his company (D.E Shaw REsearch)had large (~1/4 page) ads in Linux Magazine looking for top Linux HPC admins to build/maintain a highly specialized computer that would be the fastest computer in the world for biochemistry applications (protein folding, etc.). He's very secretive, but the word on the street is that it will use custom processors (FPGAs?) specially designed for molecular bio/comp chem calculations, and is being built at a site in upstate NY. Google him. He's an interesting fellow. Here's a few links to get you started: http://en.wikipedia.org/wiki/David_E._Shaw http://www.deshawresearch.com/ http://www.deshawresearch.com/chiefscientist.html http://www.motherjones.com/news/special_reports/mojo_400/43_shaw.html -- Prentice From rgb at phy.duke.edu Thu Dec 4 17:44:37 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <4938433F.1030002@ias.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> <4938433F.1030002@ias.edu> Message-ID: On Thu, 4 Dec 2008, Prentice Bisbal wrote: > You clearly haven't heard of David E. Shaw or know what he's up these > days. Here's the Cliff Notes version: He has had ads in computer magazines -- at least small ads in the back -- for years and years. Usually for physicists and mathematicians. One of the few people it looked like it would be interesting to work for, actually. rgb > > He made billions on Wall Street using computer models of the market, > then started Schrodinger, a leading vendor of computational chemistry > software (my previous employer used it heavily), where he actually wrote > some of the code in their products himself. > > A couple of years ago, his company (D.E Shaw REsearch)had large (~1/4 > page) ads in Linux Magazine looking for top Linux HPC admins to > build/maintain a highly specialized computer that would be the fastest > computer in the world for biochemistry applications (protein folding, > etc.). He's very secretive, but the word on the street is that it will > use custom processors (FPGAs?) specially designed for molecular bio/comp > chem calculations, and is being built at a site in upstate NY. > > Google him. He's an interesting fellow. Here's a few links to get you > started: > > http://en.wikipedia.org/wiki/David_E._Shaw > http://www.deshawresearch.com/ > http://www.deshawresearch.com/chiefscientist.html > http://www.motherjones.com/news/special_reports/mojo_400/43_shaw.html > > > -- > Prentice > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From hearnsj at googlemail.com Fri Dec 5 00:56:09 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> <4938433F.1030002@ias.edu> Message-ID: <9f8092cc0812050056l3f7c65c8ob676bc132a2bb97@mail.gmail.com> 2008/12/5 Robert G. Brown > > He has had ads in computer magazines -- at least small ads in the back > -- for years and years. Usually for physicists and mathematicians. One > of the few people it looked like it would be interesting to work for, > actually. > > I was very interested in working for DE Shaw - they have an office in London, and it would have been an easy commute for me. It sounded an interesting place to work, and they obviously have some very bright people there. Sadly I didn't make it past the application stage. Que sera sera. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081205/b9dc4a65/attachment.html From eugen at leitl.org Fri Dec 5 04:48:43 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers Message-ID: <20081205124843.GM11544@leitl.org> (Well, duh). http://www.spectrum.ieee.org/nov08/6912 Multicore Is Bad News For Supercomputers By Samuel K. Moore Image: Sandia Trouble Ahead: More cores per chip will slow some programs [red] unless there's a big boost in memory bandwidth [yellow] With no other way to improve the performance of processors further, chip makers have staked their future on putting more and more processor cores on the same chip. Engineers at Sandia National Laboratories, in New Mexico, have simulated future high-performance computers containing the 8-core, 16-core, and 32-core microprocessors that chip makers say are the future of the industry. The results are distressing. Because of limited memory bandwidth and memory-management schemes that are poorly suited to supercomputers, the performance of these machines would level off or even decline with more cores. The performance is especially bad for informatics applications -- data-intensive programs that are increasingly crucial to the labs' national security function. High-performance computing has historically focused on solving differential equations describing physical systems, such as Earth's atmosphere or a hydrogen bomb's fission trigger. These systems lend themselves to being divided up into grids, so the physical system can, to a degree, be mapped to the physical location of processors or processor cores, thus minimizing delays in moving data. But an increasing number of important science and engineering problems -- not to mention national security problems -- are of a different sort. These fall under the general category of informatics and include calculating what happens to a transportation network during a natural disaster and searching for patterns that predict terrorist attacks or nuclear proliferation failures. These operations often require sifting through enormous databases of information. For informatics, more cores doesn't mean better performance [see red line in "Trouble Ahead"], according to Sandia's simulation. "After about 8 cores, there's no improvement," says James Peery, director of computation, computers, information, and mathematics at Sandia. "At 16 cores, it looks like 2." Over the past year, the Sandia team has discussed the results widely with chip makers, supercomputer designers, and users of high-performance computers. Unless computer architects find a solution, Peery and others expect that supercomputer programmers will either turn off the extra cores or use them for something ancillary to the main problem. At the heart of the trouble is the so-called memory wall -- the growing disparity between how fast a CPU can operate on data and how fast it can get the data it needs. Although the number of cores per processor is increasing, the number of connections from the chip to the rest of the computer is not. So keeping all the cores fed with data is a problem. In informatics applications, the problem is worse, explains Richard C. Murphy, a senior member of the technical staff at Sandia, because there is no physical relationship between what a processor may be working on and where the next set of data it needs may reside.
Instead of being in the cache of the core next door, the data may be on a DRAM chip in a rack 20 meters away and need to leave the chip, pass through one or more routers and optical fibers, and find its way onto the processor. In an effort to get things back on track, this year the U.S. Department of Energy formed the Institute for Advanced Architectures and Algorithms. Located at Sandia and at Oak Ridge National Laboratory, in Tennessee, the institute's work will be to figure out what high-performance computer architectures will be needed five to 10 years from now and help steer the industry in that direction. "The key to solving this bottleneck is tighter, and maybe smarter, integration of memory and processors," says Peery. For its part, Sandia is exploring the impact of stacking memory chips atop processors to improve memory bandwidth. The results, in simulation at least, are promising [see yellow line in "Trouble Ahead From hahn at mcmaster.ca Fri Dec 5 05:44:43 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <20081205124843.GM11544@leitl.org> References: <20081205124843.GM11544@leitl.org> Message-ID: > (Well, duh). yeah - the point seems to be that we (still) need to scale memory along with core count. not just memory bandwidth but also concurrency (number of banks), though "ieee spectrum online for tech insiders" doesn't get into that kind of depth :( I still usually explain this as "traditional (ie Cray) supercomputing requires a balanced system." commodity processors are always less balanced than ideal, but to varying degrees. intel dual-socket quad-core was probably the worst for a long time, but things are looking up as intel joins AMD with memory connected to each socket. stacking memory on the processor is a red herring IMO, though they appear to assume that the number of dram banks will scale linearly with cores. to me that sounds more like dram-based per-core cache. From prentice at ias.edu Fri Dec 5 05:47:53 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <9f8092cc0812050056l3f7c65c8ob676bc132a2bb97@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> <4938433F.1030002@ias.edu> <9f8092cc0812050056l3f7c65c8ob676bc132a2bb97@mail.gmail.com> Message-ID: <49393109.40300@ias.edu> John Hearns wrote: > > > 2008/12/5 Robert G. Brown > > > > He has had ads in computer magazines -- at least small ads in the back > -- for years and years. Usually for physicists and mathematicians. One > of the few people it looked like it would be interesting to work for, > actually. > > I was very interested in working for DE Shaw - they have an office in > London, and it would > have been an easy commute for me. It sounded an interesting place to > work, and they obviously > have some very bright people there. > Sadly I didn't make it past the application stage. Que sera sera. I read an article about D.E. Shaw that I wanted to link to, but couldn't find it. In that article, it said that not only is his company very secretive, but you don't apply there as much as they find you. They allegedly read all the academic journals and find the top scientists in the world, and then try to court them to work for D.E. Shaw.
Being offered a job there is like winning a Nobel, allegedly. I guess that doesn't necessarily apply to sys admins, since they were advertising very heavily for them a couple of years ago. -- Prentice From rgb at phy.duke.edu Fri Dec 5 05:58:07 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <20081205124843.GM11544@leitl.org> References: <20081205124843.GM11544@leitl.org> Message-ID: On Fri, 5 Dec 2008, Eugen Leitl wrote: > > (Well, duh). Good article, though, thanks. Of course the same could have been written (and probably was) back when dual processors came out sharing a single memory bus, and for every generation since. The memory lag has been around forever -- multicores simply widen the gap out of step with Moore's Law (again). Intel and/or AMD people on list -- any words you want to say about a "road map" or other plan to deal with this? In the context of ordinary PCs the marginal benefit of additional cores after (say) four seems minimal as most desktop users don't need all that much parallelism -- enough to manage multimedia decoding in parallel with the OS base function in parallel with "user activity". Higher numbers of cores seem to be primarily of interest to H[A,PC] users -- stacks of VMs or server daemons, large scale parallel numerical computation. In both of these general arenas increasing cores/processor/memory channel beyond a critical limit that I think we're already at simply ensures that a significant number of your cores will be idling as they wait for memory access at any given time... rgb > > http://www.spectrum.ieee.org/nov08/6912 > > Multicore Is Bad News For Supercomputers > > By Samuel K. Moore > > Image: Sandia > > Trouble Ahead: More cores per chip will slow some programs [red] unless > there?s a big boost in memory bandwidth [yellow > > With no other way to improve the performance of processors further, chip > makers have staked their future on putting more and more processor cores on > the same chip. Engineers at Sandia National Laboratories, in New Mexico, have > simulated future high-performance computers containing the 8-core, 16?core, > and 32-core microprocessors that chip makers say are the future of the > industry. The results are distressing. Because of limited memory bandwidth > and memory-management schemes that are poorly suited to supercomputers, the > performance of these machines would level off or even decline with more > cores. The performance is especially bad for informatics > applications?data-intensive programs that are increasingly crucial to the > labs? national security function. > > High-performance computing has historically focused on solving differential > equations describing physical systems, such as Earth?s atmosphere or a > hydrogen bomb?s fission trigger. These systems lend themselves to being > divided up into grids, so the physical system can, to a degree, be mapped to > the physical location of processors or processor cores, thus minimizing > delays in moving data. > > But an increasing number of important science and engineering problems?not to > mention national security problems?are of a different sort. These fall under > the general category of informatics and include calculating what happens to a > transportation network during a natural disaster and searching for patterns > that predict terrorist attacks or nuclear proliferation failures. 
These > operations often require sifting through enormous databases of information. > > For informatics, more cores doesn?t mean better performance [see red line in > ?Trouble Ahead?], according to Sandia?s simulation. ?After about 8 cores, > there?s no improvement,? says James Peery, director of computation, > computers, information, and mathematics at Sandia. ?At 16 cores, it looks > like 2.? Over the past year, the Sandia team has discussed the results widely > with chip makers, supercomputer designers, and users of high-performance > computers. Unless computer architects find a solution, Peery and others > expect that supercomputer programmers will either turn off the extra cores or > use them for something ancillary to the main problem. > > At the heart of the trouble is the so-called memory wall?the growing > disparity between how fast a CPU can operate on data and how fast it can get > the data it needs. Although the number of cores per processor is increasing, > the number of connections from the chip to the rest of the computer is not. > So keeping all the cores fed with data is a problem. In informatics > applications, the problem is worse, explains Richard C. Murphy, a senior > member of the technical staff at Sandia, because there is no physical > relationship between what a processor may be working on and where the next > set of data it needs may reside. Instead of being in the cache of the core > next door, the data may be on a DRAM chip in a rack 20 meters away and need > to leave the chip, pass through one or more routers and optical fibers, and > find its way onto the processor. > > In an effort to get things back on track, this year the U.S. Department of > Energy formed the Institute for Advanced Architectures and Algorithms. > Located at Sandia and at Oak Ridge National Laboratory, in Tennessee, the > institute?s work will be to figure out what high-performance computer > architectures will be needed five to 10 years from now and help steer the > industry in that direction. > > ?The key to solving this bottleneck is tighter, and maybe smarter, > integration of memory and processors,? says Peery. For its part, Sandia is > exploring the impact of stacking memory chips atop processors to improve > memory bandwidth. > > The results, in simulation at least, are promising [see yellow line in > ?Trouble Ahead > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb at phy.duke.edu Fri Dec 5 06:22:12 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <49393109.40300@ias.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> <4938433F.1030002@ias.edu> <9f8092cc0812050056l3f7c65c8ob676bc132a2bb97@mail.gmail.com> <49393109.40300@ias.edu> Message-ID: On Fri, 5 Dec 2008, Prentice Bisbal wrote: > John Hearns wrote: >> >> >> 2008/12/5 Robert G. Brown > >> >> >> He has had ads in computer magazines -- at least small ads in the back >> -- for years and years. 
Usually for physicists and mathematicians. One >> of the few people it looked like it would be interesting to work for, >> actually. >> >> I was very interested in working for DE Shaw - they have an office in >> London, and it would >> have been an easy commute for me. It sounded an interesting place to >> work, and they obviously >> have some very bright people there. >> Sadly I didn't make it past the application stage. Que sera sera. > > I read an article about D.E. Shaw that I wanted to link to, but couldn't > find it. In that article, it said that not only is his company very > secretive, but you don't apply there as much as they find you. They > allegedly read all the academic journals and find the top scientists in > the world, and then try to court them to work for D.E. Shaw. Being > offered a job there is like winning a Nobel, allegedly. I guess that > doesn't necessarily apply to sys admins, since they were advertising > very heavily for them a couple of years ago. Sure, but remember the numbers. There are a LOT of scientists, and most of them are busy, too busy to move to NY and work for DES. I think that they used the ad as one way of learning about physics, math, etc Ph.D's who were self-selected interested and demonstrably computer geeks, because the ads appeared as one tiny box at the end of e.g. Sun Expert or Byte -- a cheap ad in the classified section, no pictures, very mysterious, clearly a think-tank sort of thing -- in every issue. Clearly looking for e.g. physicists who were into neural networks and complex systems and so on. Which I was and am, but a) I don't publish in the field -- they're the basis for my own entrepreneurial activities and "secret"; and b) I wouldn't live in NYC for literally any money in the world. If one made a million a year and spent most of it one could probably live decently and not save very much. I really like living in one of the most civilized enclaves in the world in NC. Wait, forget I said that. North Carolina is a TERRIBLE place to live. Nobody should ever move here. It's just awful, hot in the summer and cold in the winter, and who cares about basketball and the arts anyway? Yes, you're really better off living where you do already...;-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From james.p.lux at jpl.nasa.gov Fri Dec 5 06:53:54 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <49393109.40300@ias.edu> Message-ID: On 12/5/08 5:47 AM, "Prentice Bisbal" wrote: > John Hearns wrote: >> >> >> 2008/12/5 Robert G. Brown > >> >> >> He has had ads in computer magazines -- at least small ads in the back >> -- for years and years. Usually for physicists and mathematicians. One >> of the few people it looked like it would be interesting to work for, >> actually. >> >> I was very interested in working for DE Shaw - they have an office in >> London, and it would >> have been an easy commute for me. It sounded an interesting place to >> work, and they obviously >> have some very bright people there. >> Sadly I didn't make it past the application stage. Que sera sera. > > I read an article about D.E. Shaw that I wanted to link to, but couldn't > find it. In that article, it said that not only is his company very > secretive, but you don't apply there as much as they find you. 
They > allegedly read all the academic journals And this list? > and find the top scientists in > the world, and then try to court them to work for D.E. Shaw. Being > offered a job there is like winning a Nobel, allegedly. I guess that > doesn't necessarily apply to sys admins, since they were advertising > very heavily for them a couple of years ago. > From rgb at phy.duke.edu Fri Dec 5 07:08:33 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: References: Message-ID: On Fri, 5 Dec 2008, Lux, James P wrote: >> I read an article about D.E. Shaw that I wanted to link to, but couldn't >> find it. In that article, it said that not only is his company very >> secretive, but you don't apply there as much as they find you. They >> allegedly read all the academic journals > > And this list? Omygawsh. All right, will the D. E. Shaw spy please raise his or her hand? rgb > >> and find the top scientists in >> the world, and then try to court them to work for D.E. Shaw. Being >> offered a job there is like winning a Nobel, allegedly. I guess that >> doesn't necessarily apply to sys admins, since they were advertising >> very heavily for them a couple of years ago. >> > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From Dan.Kidger at quadrics.com Fri Dec 5 07:12:44 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <9f8092cc0812050056l3f7c65c8ob676bc132a2bb97@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <20081202232654.GA16003@nlxdcldnl2.cl.intel.com> <386fa5610812022325g3b8eb4fl185aa8936399beea@mail.gmail.com> <2A28BD5B-CD38-4DA4-AF77-8308848FEB7D@xs4all.nl> <4938433F.1030002@ias.edu> <9f8092cc0812050056l3f7c65c8ob676bc132a2bb97@mail.gmail.com> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A064D922B98@quadbrsex1.quadrics.com> I too had an interview with DE Shaw Research a while back before I took up my current position At the time they did not have a UK office, and moving to NY was out of the question for me. In recent times they have been more open - and even gave a Keynote talk at this years' ISC in Dresden. As presented, they are designing custom hardware for computational chemistry. The aim is not how many teraflops they can do, but how quickly they can do each timestep of the simulation (irrespective of the molecule size). Since to model reactions with computation chemistry you need billions of timesteps - simulations would take months/years of wallclock - even for a tiny molecule. The target is to push down the wallclock of a single timestep from say 1ms wallclock to perhaps 0.1us. With current interconnects being no better than 1us latency this must be quite a challenge. Daniel. From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of John Hearns Sent: 05 December 2008 08:56 To: beowulf@beowulf.org Subject: Re: [Beowulf] Intro question 2008/12/5 Robert G. Brown > He has had ads in computer magazines -- at least small ads in the back -- for years and years. Usually for physicists and mathematicians. 
One of the few people it looked like it would be interesting to work for, actually. I was very interested in working for DE Shaw - they have an office in London, and it would have been an easy commute for me. It sounded an interesting place to work, and they obviously have some very bright people there. Sadly I didn't make it past the application stage. Que sera sera. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081205/b7f55bbc/attachment.html From larry.stewart at sicortex.com Fri Dec 5 08:17:30 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: References: Message-ID: <4939541A.4000600@sicortex.com> I've been to a couple of DE Shaw talks and I always come away puzzled. It's tempting to conclude that they are just smarter than I am, but maybe they are just wrong. My understanding is they are building a special purpose molecular dynamics machine because it will be far faster than a general purpose machine programmed to do MD. In principle this might work, if you get the problem statement right, and you can design and build the machine before the general purpose machines catch up, and you don't make any mistakes, and after it is built you can keep designing new ones. In practice it always seems to take longer than you expected and cost more, and maybe that 7 bit ALU really has to be changed to an 8 bit ALU to keep the precision up. The most effective example I know of are the QCD machines like QCDOC that led to BlueGene, but it was far more general purpose than Shaw's machine. Trying it seems harmless, and a better use of excess capital than buying basketball teams or yachts, but it does divert smart people from other activities. Of course if they succeed I'll have been behind it all the way. -- -Larry / Sector IX From Dan.Kidger at quadrics.com Fri Dec 5 08:43:19 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <4939541A.4000600@sicortex.com> References: <4939541A.4000600@sicortex.com> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> If I had that much money, I too would try and buy a Nobel Prize in preference to a yacht. D. -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Lawrence Stewart Sent: 05 December 2008 16:18 To: Robert G. Brown Cc: Beowulf Mailing List; Lux, James P Subject: Re: [Beowulf] Intro question I've been to a couple of DE Shaw talks and I always come away puzzled. It's tempting to conclude that they are just smarter than I am, but maybe they are just wrong. My understanding is they are building a special purpose molecular dynamics machine because it will be far faster than a general purpose machine programmed to do MD. In principle this might work, if you get the problem statement right, and you can design and build the machine before the general purpose machines catch up, and you don't make any mistakes, and after it is built you can keep designing new ones. In practice it always seems to take longer than you expected and cost more, and maybe that 7 bit ALU really has to be changed to an 8 bit ALU to keep the precision up. The most effective example I know of are the QCD machines like QCDOC that led to BlueGene, but it was far more general purpose than Shaw's machine. 
Trying it seems harmless, and a better use of excess capital than buying basketball teams or yachts, but it does divert smart people from other activities. Of course if they succeed I'll have been behind it all the way. -- -Larry / Sector IX _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From james.p.lux at jpl.nasa.gov Fri Dec 5 08:59:22 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <4939541A.4000600@sicortex.com> Message-ID: On 12/5/08 8:17 AM, "Lawrence Stewart" wrote: > I've been to a couple of DE Shaw talks and I always come away puzzled. > > It's tempting to conclude that they are just smarter than I am, but > maybe they are just wrong. > > My understanding is they are building a special purpose molecular > dynamics machine because it will be far faster than a general purpose > machine programmed to do MD. > > In principle this might work, if you get the problem statement right, > and you can design and build the machine before the general purpose > machines catch up, and you don't make any mistakes, and after it is > built you can keep designing new ones. In practice it always seems to > take longer than you expected and cost more, and maybe that 7 bit ALU > really has to be changed to an 8 bit ALU to keep the precision up. If the machine is built of reconfigurable FPGAs, then such a change is pretty quick. If you have a basic hardware infrastructure, and just respin an ASIC, and that's a bit more time consuming, but not particularly expensive. e.g. Say it costs, in round numbers, $1M to do an ASIC. That's 2-3 work years labor costs, so in the overall scheme of things, it's not very expensive, in a relative way. If your overall research effort is, say, $20M/yr (which is big, but not huge), then budgeting for a complete machine rebuild every year is only 5-10%. If that gives you a factor of 3 speed increase, it's probably worth it. Think about it.. You check out your design in FPGAs to make sure it works, then do FPGA>ASIC and crank out a quick 10,000 customized processors, have them assembled into boards, fire it up and go. There are all sorts of economies of scale possible (if you're building 1000 PC boards, on an automated line, it's just not that expensive. For comparison, we regularly have prototype boards made with more than 20 layers and a dozen or so fairly high density parts (a couple Xilinx Virtex II FPGAs, RAMs, CPUs, etc.) and all the stuff around them. In single quantities, it might cost around $15K-$20K each to do these (parts cost included). If we were doing 100 of them, so we could spread the cost of the pick-and-place programming over all of them, etc., it would probably be down in the $5-10K/each range. Get into the 1000 unit quantities where it pays to go to a higher volume house, and you might be down in the few hundred bucks each to fab the board, and now you're just talking parts cost. Consider PC mobos.. The manufacturing cost (including parts) is well under $100. Now consider using that nifty compchem box to go examine thousands of possible drugs. Get a hit, and it can be a real money maker. Consider that Claritin was responsible for about $2B of Schering-Plough's revenue in just 2001. Plavix was almost $4B in 2005. That ED drug that starts with a V that we all get mail about was in the $1B/yr area, although its dropping. 
(One article comments that when it comes off patent in 2012 that they'll see a bump in sales:"Recreational use of the product could also be expected to generate substantial revenues.") In this context, spending $100M isn't a huge sum, now, is it. Jim From landman at scalableinformatics.com Fri Dec 5 09:00:18 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Intro question In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> References: <4939541A.4000600@sicortex.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> Message-ID: <49395E22.6090707@scalableinformatics.com> Dan.Kidger@quadrics.com wrote: > If I had that much money, I too would try and buy a Nobel Prize in preference to a yacht. > > D. > > > -----Original Message----- > From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Lawrence Stewart > Sent: 05 December 2008 16:18 > To: Robert G. Brown > Cc: Beowulf Mailing List; Lux, James P > Subject: Re: [Beowulf] Intro question > > I've been to a couple of DE Shaw talks and I always come away puzzled. > > It's tempting to conclude that they are just smarter than I am, but > maybe they are just wrong. > > My understanding is they are building a special purpose molecular > dynamics machine because it will be far faster than a general purpose > machine programmed to do MD. > > In principle this might work, if you get the problem statement right, > and you can design and build the machine before the general purpose > machines catch up, and you don't make any mistakes, and after it is > built you can keep designing new ones. In practice it always seems to > take longer than you expected and cost more, and maybe that 7 bit ALU > really has to be changed to an 8 bit ALU to keep the precision up. The MDGrape guys might have a thing or three to say. They have been demonstrating some pretty awesome performance for years. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From coutinho at dcc.ufmg.br Fri Dec 5 09:09:12 2008 From: coutinho at dcc.ufmg.br (Bruno Coutinho) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: References: <20081205124843.GM11544@leitl.org> Message-ID: 2008/12/5 Robert G. Brown > On Fri, 5 Dec 2008, Eugen Leitl wrote: > > >> (Well, duh). >> > > Good article, though, thanks. > > Of course the same could have been written (and probably was) back when > dual processors came out sharing a single memory bus, and for every > generation since. The memory lag has been around forever -- multicores > simply widen the gap out of step with Moore's Law (again). > > Intel and/or AMD people on list -- any words you want to say about a > "road map" or other plan to deal with this? In the context of ordinary > PCs the marginal benefit of additional cores after (say) four seems > minimal as most desktop users don't need all that much parallelism -- > enough to manage multimedia decoding in parallel with the OS base > function in parallel with "user activity". Higher numbers of cores seem > to be primarily of interest to H[A,PC] users -- stacks of VMs or server > daemons, large scale parallel numerical computation. 
Datamining is useful for both the commercial and scientific worlds and is very data-intensive, so I think this issue will be addressed, or at least someone (Sun, for example) will build processors for data-intensive applications that are more balanced, but several times more expensive. > In both of these > general arenas increasing cores/processor/memory channel beyond a > critical limit that I think we're already at simply ensures that a > significant number of your cores will be idling as they wait for > memory access at any given time... > > rgb > > > >> http://www.spectrum.ieee.org/nov08/6912 >> >> Multicore Is Bad News For Supercomputers >> >> By Samuel K. Moore >> >> Image: Sandia >> >> Trouble Ahead: More cores per chip will slow some programs [red] unless >> there's a big boost in memory bandwidth [yellow] >> >> With no other way to improve the performance of processors further, chip >> makers have staked their future on putting more and more processor cores >> on >> the same chip. Engineers at Sandia National Laboratories, in New Mexico, >> have >> simulated future high-performance computers containing the 8-core, >> 16-core, >> and 32-core microprocessors that chip makers say are the future of the >> industry. The results are distressing. Because of limited memory bandwidth >> and memory-management schemes that are poorly suited to supercomputers, >> the >> performance of these machines would level off or even decline with more >> cores. The performance is especially bad for informatics >> applications -- data-intensive programs that are increasingly crucial to the >> labs' national security function. >> >> High-performance computing has historically focused on solving >> differential >> equations describing physical systems, such as Earth's atmosphere or a >> hydrogen bomb's fission trigger. These systems lend themselves to being >> divided up into grids, so the physical system can, to a degree, be mapped >> to >> the physical location of processors or processor cores, thus minimizing >> delays in moving data. >> >> But an increasing number of important science and engineering problems -- not >> to >> mention national security problems -- are of a different sort. These fall >> under >> the general category of informatics and include calculating what happens >> to a >> transportation network during a natural disaster and searching for >> patterns >> that predict terrorist attacks or nuclear proliferation failures. These >> operations often require sifting through enormous databases of >> information. >> >> For informatics, more cores doesn't mean better performance [see red line >> in >> "Trouble Ahead"], according to Sandia's simulation. "After about 8 cores, >> there's no improvement," says James Peery, director of computation, >> computers, information, and mathematics at Sandia. "At 16 cores, it looks >> like 2." Over the past year, the Sandia team has discussed the results >> widely >> with chip makers, supercomputer designers, and users of high-performance >> computers. Unless computer architects find a solution, Peery and others >> expect that supercomputer programmers will either turn off the extra cores >> or >> use them for something ancillary to the main problem. >> >> At the heart of the trouble is the so-called memory wall -- the growing >> disparity between how fast a CPU can operate on data and how fast it can >> get >> the data it needs. Although the number of cores per processor is >> increasing, >> the number of connections from the chip to the rest of the computer is >> not. 
>> So keeping all the cores fed with data is a problem. In informatics >> applications, the problem is worse, explains Richard C. Murphy, a senior >> member of the technical staff at Sandia, because there is no physical >> relationship between what a processor may be working on and where the next >> set of data it needs may reside. Instead of being in the cache of the core >> next door, the data may be on a DRAM chip in a rack 20 meters away and >> need >> to leave the chip, pass through one or more routers and optical fibers, >> and >> find its way onto the processor. >> >> In an effort to get things back on track, this year the U.S. Department of >> Energy formed the Institute for Advanced Architectures and Algorithms. >> Located at Sandia and at Oak Ridge National Laboratory, in Tennessee, the >> institute's work will be to figure out what high-performance computer >> architectures will be needed five to 10 years from now and help steer the >> industry in that direction. >> >> "The key to solving this bottleneck is tighter, and maybe smarter, >> integration of memory and processors," says Peery. For its part, Sandia is >> exploring the impact of stacking memory chips atop processors to improve >> memory bandwidth. >> >> The results, in simulation at least, are promising [see yellow line in >> "Trouble Ahead >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email: rgb@phy.duke.edu > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > From diep at xs4all.nl Fri Dec 5 09:15:01 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:01 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: References: <20081205124843.GM11544@leitl.org> Message-ID: <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> Well, for every scientist who says he needs a lot of RAM now: ECC-DDR2 RAM has a cost of near nothing right now. Very cheaply you can build nodes now with like 4 cheapo cpu's and 128 GB ram inside. There is no excuse for those who beg for big RAM to not buy a bunch of those nodes. What happens each time is that at the moment the price of some sort of RAM finally drops (note that ECC-Registered DDR RAM never has gotten cheap, much to my disappointment), a newer generation of RAM is there which again is really expensive. I tend to believe that many algorithms that require really a lot of ram can do with a bit less and profit from today's huge cpu power, using some clever tricks and enhancements and/or new algorithms (sometimes it is difficult to define what is a new algorithm, if it looks so much like a previous one with just a few new enhancements), which probably are far from trivial. 
Usually programming the 'new' algorithm efficiently low level is the big killer problem why it doesn't get used yet (as there is no budget to hire people who are specialized here, or simply because they work for some other company or other government body). I would really argue that sometimes you have to give industry some time to mass produce memory, just design a new generation cpu based upon the RAM that's there now and just read massively parallel from that RAM. That also gives a HUGE bandwidth. If some older GPU based upon DDR3 ram claims 106GB/s bandwidth to RAM, versus today's Nehalem, which claims 32GB/s and achieves 17 to 18GB/s, then obviously it wasn't important enough for intel to give us more bandwidth to the RAM. If nvidia/amd GPU's could do it years before, and the latest cpu is a factor 4+ off, then discussions about bandwidth to RAM are quite artificial. The reason for that is the limitations of SPEC to RAM consumption. They design a benchmark years beforehand to use an amount of RAM that is "common" now. I would argue that those most hungry for bandwidth/core crunching power are the scientific world and/or safety research (air and car industry). Note that I'm speaking of streaming bandwidth above. Most scientists do not know the difference between bandwidth and latency, basically because they are right that in the end it is all bandwidth related from theoretical viewpoint. Yet in practice there are so many factors influencing the latency. Intel/AMD/IBM are doing big efforts of course to reduce latency a lot. Maybe 95% of all their work on a cpu (blindfolded guess from a computer science guy - so not hardware designer)? In the end it is all about the testsets in spec. If we manage to get a bunch of real WELL OPTIMIZED low level codes that eat gigabytes of RAM finally into that spec then within years AMD and Intel will show up with some real fast cpu's for scientific workloads. If all the "professor" types like RGB make a lot of noise worldwide to get that done, then they have to follow. Any criticism against intel and amd with respect to: "why not do this and that", I'm doing it also all the time, but at the same time if you look to what happens in spec, spec is only about "who has the best compiler and the biggest L2 cache that nearly can contain the entire working set size of this tiny RAM program". Get some serious software into SPEC I'd argue. To start looking at myself: the reason I didn't donate Diep is because competitors can also obtain my code, whereas all those compiler and hardware manufacturers I don't care if they have my proggies source code. Vincent On Dec 5, 2008, at 2:44 PM, Mark Hahn wrote: >> (Well, duh). > > yeah - the point seems to be that we (still) need to scale memory > along with core count. not just memory bandwidth but also concurrency > (number of banks), though "ieee spectrum online for tech insiders" > doesn't get into that kind of depth :( > > I still usually explain this as "traditional (ie Cray) supercomputing > requires a balanced system." commodity processors are always less > balanced > than ideal, but to varying degrees. intel dual-socket quad-core > was probably the worst for a long time, but things are looking up > as intel > joins AMD with memory connected to each socket. > > stacking memory on the processor is a red herring IMO, though they > appear > to assume that the number of dram banks will scale linearly with > cores. > to me that sounds more like dram-based per-core cache. 
> _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From gdjacobs at gmail.com Fri Dec 5 10:24:10 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: References: <20081205124843.GM11544@leitl.org> Message-ID: <493971CA.9000307@gmail.com> Bruno Coutinho wrote: > Datamining is useful for both commercial and scientific world and is > very data-intensive, so I think this issue will be addressed, or at least > someone (Sun, for example) will build processors for data intensive > applications that are more balanced, but several times more expensive. Here's some current hardware which is superior to the norm in terms of I/O (so the manufacturers would claim). http://en.wikipedia.org/wiki/POWER6 http://en.wikipedia.org/wiki/Ultrasparc http://en.wikipedia.org/wiki/Itanium For those with really deep pockets: http://en.wikipedia.org/wiki/NEC_SX-9 Q: Do Hitachi and Fujitsu still do vector machines? I guess my point is that the article itself is a little fluffy. This is just the old problem of the kernel size overflowing cache/memory boundaries inconveniently. The answer is always more I/O and tighter integration. -- Geoffrey D. Jacobs From prentice at ias.edu Fri Dec 5 10:30:36 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> References: <20081205124843.GM11544@leitl.org> <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> Message-ID: <4939734C.9060604@ias.edu> Vincent Diepeveen wrote: > Very cheaply you can build nodes now with like 4 cheapo cpu's > and 128 GB ram inside. > Not exactly. 2 GB DIMMs are cheap, but as soon as you go to larger DIMMs (4 GB, 8 GB, etc.), the price goes up exponentially. Less than a year ago, we purchased a couple of server with 32 GB RAM. We then wanted to purchase one with 64 GB RAM. The cost of the system tripled! Instead, we bought 3 more 32 GB systems. Dell and others advertise systems that support up to 128 GB RAM, but I have yet to meet someone who can afford to put all 128 GB RAM in a single box. -- Prentice From lindahl at pbm.com Fri Dec 5 10:53:04 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <4939734C.9060604@ias.edu> References: <20081205124843.GM11544@leitl.org> <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> <4939734C.9060604@ias.edu> Message-ID: <20081205185304.GA27201@bx9> On Fri, Dec 05, 2008 at 01:30:36PM -0500, Prentice Bisbal wrote: > Not exactly. 2 GB DIMMs are cheap, but as soon as you go to larger DIMMs > (4 GB, 8 GB, etc.), the price goes up exponentially. My last quote for 2GB and 4GB dimms was linear. 
-- greg From landman at scalableinformatics.com Fri Dec 5 10:57:25 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <4939734C.9060604@ias.edu> References: <20081205124843.GM11544@leitl.org> <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> <4939734C.9060604@ias.edu> Message-ID: <49397995.4080701@scalableinformatics.com> Prentice Bisbal wrote: > Vincent Diepeveen wrote: > >> Very cheaply you can build nodes now with like 4 cheapo cpu's >> and 128 GB ram inside. >> > > Not exactly. 2 GB DIMMs are cheap, but as soon as you go to larger DIMMs > (4 GB, 8 GB, etc.), the price goes up exponentially. > > Less than a year ago, we purchased a couple of server with 32 GB RAM. We > then wanted to purchase one with 64 GB RAM. The cost of the system > tripled! Instead, we bought 3 more 32 GB systems. The cost of 64GB went down quite recently. We have sold quite a few of these at this size due to memory cost drops. > > Dell and others advertise systems that support up to 128 GB RAM, but I > have yet to meet someone who can afford to put all 128 GB RAM in a > single box. We have seen a few. It's not as expensive as you think. 256 GB ... yeah, that's more. And again, if you don't mind a quick plug for ScaleMP (we are not currently a reseller of theirs, we make no money from them), you can tie a few machines together (various constraints) with lower cost memory/components, and get some nice sized single system images. Of course YMMV. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From lindahl at pbm.com Fri Dec 5 11:03:10 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Intro question In-Reply-To: <4939541A.4000600@sicortex.com> References: <4939541A.4000600@sicortex.com> Message-ID: <20081205190309.GB27201@bx9> On Fri, Dec 05, 2008 at 11:17:30AM -0500, Lawrence Stewart wrote: > In principle this might work, if you get the problem statement right, > and you can design and build the machine before the general purpose > machines catch up, and you don't make any mistakes, and after it is > built you can keep designing new ones. In practice it always seems to > take longer than you expected and cost more, and maybe that 7 bit ALU > really has to be changed to an 8 bit ALU to keep the precision up. I've seen David talk about this machine a couple of times, and he addressed this issue: he realizes it's risky, and he was hoping to advance the state of the art by 5 years over a commodity cluster. While I was at D. E. Shaw (1996), the most effective headhunter for the systems department was the guy who cold-called sysadmins at good computer science departments. My office-mate was formerly a sysadmin at Princeton. He still lived there, too; rgb might want to keep mass transit in mind when he's dissing living near NYC. For the strategies, they mostly hired folks who'd just finished degrees in the hard sciences. 
-- greg From prentice at ias.edu Fri Dec 5 11:31:15 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <49397995.4080701@scalableinformatics.com> References: <20081205124843.GM11544@leitl.org> <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> <4939734C.9060604@ias.edu> <49397995.4080701@scalableinformatics.com> Message-ID: <49398183.3060901@ias.edu> Joe Landman wrote: > Prentice Bisbal wrote: >> Vincent Diepeveen wrote: >> >>> Very cheaply you can build nodes now with like 4 cheapo cpu's >>> and 128 GB ram inside. >>> >> >> Not exactly. 2 GB DIMMs are cheap, but as soon as you go to larger DIMMs >> (4 GB, 8 GB, etc.), the price goes up exponentially. >> >> Less than a year ago, we purchased a couple of server with 32 GB RAM. We >> then wanted to purchase one with 64 GB RAM. The cost of the system >> tripled! Instead, we bought 3 more 32 GB systems. > > The cost of 64GB went down quite recently. We have sold quite a few of > these at this size due to memory cost drops. > My experience was 6-9 months ago. >> >> Dell and others advertise systems that support up to 128 GB RAM, but I >> have yet to meet someone who can afford to put all 128 GB RAM in a >> single box. > > We have seen a few. It's not as expensive as you think. 256 GB ... yeah > that's more. > > And again, if you don't mind a quick plug for ScaleMP (we are not > currently a reseller of theirs, we make no money from them), you can tie > a few machines together (various constraints) with lower cost > memory/components, and get some nice sized single system images. Of > course YMMV. > > Funny you mention ScaleMP. I was thinking the same thing, but forgot to mention it in my last e-mail. I was talking to them at SC08 about that sort of thing. I wonder if the cost of ScaleMP is less than the large RAM premium. -- Prentice From rpnabar at gmail.com Fri Dec 5 16:21:50 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] mpi error: mca_oob_tcp_accept: accept() failed: Too many open files (24). Message-ID: I'm getting huge logs with repeated errors of this sort: mca_oob_tcp_accept: accept() failed: Too many open files (24). I googled a bit and see that this seems to be an MPI complaint about too many files open. I checked ulimit and that says "unlimited". Any tips about what I ought to be looking at; I'm a bit lost as to how I can get to the source of this particular error. Furthermore it does not occur systematically but only once in a while. Unfortunately once it happens it seems to be a catastrophe! Any suggestions? -- Rahul From niftyompi at niftyegg.com Fri Dec 5 16:36:21 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <20081205124843.GM11544@leitl.org> References: <20081205124843.GM11544@leitl.org> Message-ID: <20081206003621.GB3134@compegg.wr.niftyegg.com> On Fri, Dec 05, 2008 at 01:48:43PM +0100, Eugen Leitl wrote: > > (Well, duh). > > http://www.spectrum.ieee.org/nov08/6912 > > Multicore Is Bad News For Supercomputers > Where do GPUs fit in this? 
On the surface a handful of cores in a system with decent cache would quickly displace the need for GPUs and would have about as simple a programming+ compiler model as can be had today. Additional cores are not magic but can set the stage for better math and IO libraries. Me I would rather see more transistors thrown at 128+ bit math. In the back of my mind I suspect that current 64bit IEEE math is getting in the way of global science (Weather, Global warming...). Perhaps 128 integer math ops would be a better place to start. And, Any day now we may need 256 bit integers to manage the national debt. -- T o m M i t c h e l l Found me a new hat, now what? From jhh3851 at yahoo.com Fri Dec 5 18:01:10 2008 From: jhh3851 at yahoo.com (Joseph Han) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Re: Intro question In-Reply-To: <200812051709.mB5H9iZl030028@bluewest.scyld.com> Message-ID: <692200.60027.qm@web55007.mail.re4.yahoo.com> > > Message: 1 > Date: Fri, 5 Dec 2008 08:59:22 -0800 > From: "Lux, James P" > Subject: Re: [Beowulf] Intro question > To: Lawrence Stewart , "Robert G. Brown" > > Cc: Beowulf Mailing List > Message-ID: > Content-Type: text/plain; charset="iso-8859-1" > > > > > On 12/5/08 8:17 AM, "Lawrence Stewart" wrote: > > > I've been to a couple of DE Shaw talks and I always come away puzzled. > > > > It's tempting to conclude that they are just smarter than I am, but > > maybe they are just wrong. > > > > My understanding is they are building a special purpose molecular > > dynamics machine because it will be far faster than a general purpose > > machine programmed to do MD. > > > > In principle this might work, if you get the problem statement right, > > and you can design and build the machine before the general purpose > > machines catch up, and you don't make any mistakes, and after it is > > built you can keep designing new ones. In practice it always seems to > > take longer than you expected and cost more, and maybe that 7 bit ALU > > really has to be changed to an 8 bit ALU to keep the precision up. > > If the machine is built of reconfigurable FPGAs, then such a change is > pretty quick. > > If you have a basic hardware infrastructure, and just respin an ASIC, and > that's a bit more time consuming, but not particularly expensive. e.g. Say > it costs, in round numbers, $1M to do an ASIC. That's 2-3 work years labor > costs, so in the overall scheme of things, it's not very expensive, in a > relative way. If your overall research effort is, say, $20M/yr (which is > big, but not huge), then budgeting for a complete machine rebuild every year > is only 5-10%. If that gives you a factor of 3 speed increase, it's > probably worth it. > > Think about it.. You check out your design in FPGAs to make sure it works, > then do FPGA>ASIC and crank out a quick 10,000 customized processors, have > them assembled into boards, fire it up and go. There are all sorts of > economies of scale possible (if you're building 1000 PC boards, on an > automated line, it's just not that expensive. For comparison, we regularly > have prototype boards made with more than 20 layers and a dozen or so fairly > high density parts (a couple Xilinx Virtex II FPGAs, RAMs, CPUs, etc.) and > all the stuff around them. In single quantities, it might cost around > $15K-$20K each to do these (parts cost included). If we were doing 100 of > them, so we could spread the cost of the pick-and-place programming over all > of them, etc., it would probably be down in the $5-10K/each range. 
Get into > the 1000 unit quantities where it pays to go to a higher volume house, and > you might be down in the few hundred bucks each to fab the board, and now > you're just talking parts cost. > > Consider PC mobos.. The manufacturing cost (including parts) is well under > $100. > > Now consider using that nifty compchem box to go examine thousands of > possible drugs. Get a hit, and it can be a real money maker. Consider that > Claritin was responsible for about $2B of Schering-Plough's revenue in just > 2001. Plavix was almost $4B in 2005. That ED drug that starts with a V that > we all get mail about was in the $1B/yr area, although it's dropping. (One > article comments that when it comes off patent in 2012 that they'll see a > bump in sales:"Recreational use of the product could also be expected to > generate substantial revenues.") > > In this context, spending $100M isn't a huge sum, now, is it. > > > > Jim > > They've actually become quite a bit more transparent lately because I think that they are close to "releasing" a product. Their website actually has quite a bit of detail now: http://www.deshawresearch.com/publications.html This paper was a good introduction IMHO: David E. Shaw, Martin M. Deneroff, Ron O. Dror, Jeffrey S. Kuskin, Richard H. Larson, John K. Salmon, Cliff Young, Brannon Batson, Kevin J. Bowers, Jack C. Chao, Michael P. Eastwood, Joseph Gagliardo, J.P. Grossman, C. Richard Ho, Douglas J. Ierardi, István Kolossváry, John L. Klepeis, Timothy Layman, Christine McLeavey, Mark A. Moraes, Rolf Mueller, Edward C. Priest, Yibing Shan, Jochen Spengler, Michael Theobald, Brian Towles, and Stanley C. Wang, "Anton, A Special-Purpose Machine for Molecular Dynamics Simulation," Communications of the ACM, vol. 51, no. 7, July 2008, pp. 91-97. And there is even a free link on their website. Joseph From bill at cse.ucdavis.edu Fri Dec 5 18:32:24 2008 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: References: <20081205124843.GM11544@leitl.org> Message-ID: <4939E438.4040803@cse.ucdavis.edu> Mark Hahn wrote: >> (Well, duh). > > yeah - the point seems to be that we (still) need to scale memory > along with core count. Which seems to be happening. Suddenly designers can get more real world performance by adding bandwidth. This isn't new in the GPU world of course where ATI and Nvidia have been selling devices for $250-$600 with 70-140GB/sec. This is however rather new for CPUs, Intel's been dominating the market with sub 10GB/sec memory systems for some time now, while AMD has had > 10GB/sec for er, 3 generations now to little effect. So problems that on the older machines with fewer cores were more sensitive to latency (and the resulting nasty laws of physics) are transforming into bandwidth-limited problems that are very friendly to multicore. So now Intel's shipping a CPU that can run 8 threads and suddenly has 2-3 times the memory bandwidth. Suddenly Intel's gone from trailing AMD by a factor of 2 or more to matching AMD dual sockets with a single socket. AMD dual socket shanghai:
Number of Threads requested = 8
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:         21638.5276     0.0370     0.0370     0.0371
Scale:        21605.3675     0.0371     0.0370     0.0371
Add:          21451.1315     0.0560     0.0559     0.0562
Triad:        21399.5102     0.0562     0.0561     0.0563
techreport.com reports 21GB/sec on Sandra memory bandwidth with a core i7 and 3 x 1333 MHz. If anyone has a core i7 around I'd be interested in the stream numbers. 
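(For context, the Copy/Scale/Add/Triad rates above come from STREAM-style loops. The sketch below is a minimal triad-only version of that idea, not the actual STREAM source; the array size, repeat count and OpenMP timing are illustrative assumptions.)

/* Minimal triad-style bandwidth sketch -- not the real STREAM benchmark.
 * Array size and repeat count are arbitrary illustrative choices. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (32L * 1024 * 1024)   /* 32M doubles per array, ~256 MB per array */
#define NTIMES 10

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    if (!a || !b || !c) { fprintf(stderr, "allocation failed\n"); return 1; }

    #pragma omp parallel for
    for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.5; }

    double best = 1e30;
    for (int k = 0; k < NTIMES; k++) {
        double t = omp_get_wtime();
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];          /* triad: a = b + scalar*c */
        t = omp_get_wtime() - t;
        if (t < best) best = t;
    }

    /* three arrays of doubles move per iteration: two reads plus one write */
    double mb = 3.0 * N * sizeof(double) / 1.0e6;
    printf("Triad: %10.1f MB/s (min time %.4f s)\n", mb / best, best);

    free(a); free(b); free(c);
    return 0;
}

Built with something like gcc -O2 -fopenmp, the reported MB/s tracks what the socket can sustain from memory rather than its peak flops, which is exactly the imbalance this thread is about.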
> not just memory bandwidth but also concurrency Indeed, so now AMD dual sockets have 4 memory systems, Intel single sockets have 3. Not familiar with ATI/Nvidia details but I assume that to make use of 100-140GB/sec memory systems they must have a high degree of parallelism. AMD dual socket shanghai:
min threads=1 max threads=8 pagesize=4096 cacheline=64
Each threads will access a 262144 KB array 20 times
1 thread(s), a random cacheline per 73.31 ns, 73.31 ns per thread
2 thread(s), a random cacheline per 37.45 ns, 74.90 ns per thread.
4 thread(s), a random cacheline per 19.28 ns, 77.11 ns per thread.
8 thread(s), a random cacheline per 9.84 ns, 78.74 ns per thread.
> (number of banks), though "ieee spectrum online for tech insiders" > doesn't get into that kind of depth :( > > I still usually explain this as "traditional (ie Cray) supercomputing > requires a balanced system." commodity processors are always less balanced > than ideal, but to varying degrees. If you ignore balance, multicore bandwidth and the effective use of bandwidth (read that as application performance) are going up. Who cares if the "unbalanced" machines are running at 5% of peak, as long as HPC application performance (more closely tied to bandwidth) keeps increasing. > intel dual-socket quad-core was > probably the worst for a long time, but things are looking up as intel > joins AMD with memory connected to each socket. Indeed, so maybe bandwidth will become more of a design constraint. Possibly a fixed amount of memory per CPU, surface-mounted memory, and memory busses wider than is practical with the traditional socket with 4-6 dimms a few inches away.... till it's feasible to put ram and CPU on the same die anyways... IRAM here we come. In the meantime maybe motherboards will start looking more like video cards. So maybe something like:
* 32-64 cores per socket, less than 5 GHz
* 4 GB of high-speed RAM ( > 150GB/sec) per socket
* multiple HyperTransport-like connections to slower memory
From lindahl at pbm.com Fri Dec 5 19:20:25 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <4939E438.4040803@cse.ucdavis.edu> References: <20081205124843.GM11544@leitl.org> <4939E438.4040803@cse.ucdavis.edu> Message-ID: <20081206032025.GA26076@bx9> On Fri, Dec 05, 2008 at 06:32:24PM -0800, Bill Broadley wrote: > This is however rather new for CPUs, Intel's been dominating the market with > sub 10GB/sec memory systems for some time now, while AMD has had > 10GB/sec > for er, 3 generations now to little effect. Hey, now, that's a huge overgeneralization. The HPC people who bought AMD after Core2 came out mostly did so for memory bandwidth reasons. Before that, AMD was better on flops as well as stream. -- greg From richard.walsh at comcast.net Fri Dec 5 20:52:35 2008 From: richard.walsh at comcast.net (richard.walsh@comcast.net) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <20081205124843.GM11544@leitl.org> Message-ID: <1216473113.715371228539155366.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> All, Yes, the stacked DRAM stuff is interesting. Anyone visit the siXis booth at SC08? They are stacking DRAM and FPGA dies directly onto SiCBs (Silicon Circuit Boards). This allows for dramatically more IOs per chip and finer traces throughout the board which is small, but made entirely of silicon. They promise better byte/flop ratios and more total memory per unit volume. 
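(As a rough worked example of the byte/flop ratios in question, using the Shanghai triad figure above and otherwise illustrative assumptions rather than any siXis numbers: a dual-socket, eight-core node sustaining about 21 GB/s on triad while peaking at roughly 74 Gflop/s (8 cores x 4 flops/cycle x 2.3 GHz) delivers only about 0.3 sustained bytes per peak flop, whereas the "balanced system" target for traditional vector machines is usually put at around 1 byte/flop or more.)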
rbw ----- Original Message ----- From: "Eugen Leitl" To: info@postbiota.org, Beowulf@beowulf.org Sent: Friday, December 5, 2008 7:48:43 AM GMT -05:00 US/Canada Eastern Subject: [Beowulf] Multicore Is Bad News For Supercomputers (Well, duh). http://www.spectrum.ieee.org/nov08/6912 
_______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From james.p.lux at jpl.nasa.gov Fri Dec 5 21:18:38 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <1216473113.715371228539155366.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> Message-ID: On 12/5/08 8:52 PM, "richard.walsh@comcast.net" wrote: > Yes, the stacked DRAM stuff is interesting. Anyone visit the siXis booth at > SC08? They are stacking DRAM and FPGA dies directly onto SiCBs (Silicon > Circuit Boards). This allows for dramatically more IOs per chip and finer > traces throughout the board which is small, but made entirely of silicon. > They > promise better byte/flop ratios and more total memory per unit volume. > > 3dplus also does this sort of thing. They take standard ICs and machine the package and stack them with little PC boards on the leads. We use this sort of thing in space applications to get the density up. Particularly if you're looking at something like Flash or DRAM, where the power dissipation isn't huge, it's a very clever idea. 3dplus has done much more sophisticated stacks, too. From csamuel at vpac.org Sat Dec 6 04:03:07 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <460867941.3143441228564946208.JavaMail.root@mail.vpac.org> Message-ID: <1195007640.3143461228564987619.JavaMail.root@mail.vpac.org> ----- "Eugen Leitl" wrote: > (Well, duh). Hmm, they seem to be rehashing the whole SC'06 "Multicore: Breakthrough or Breakdown?" session which looked at this through various people's eyes. Presentations by the various speakers (including a certain list admin) here: http://www.cct.lsu.edu/~tron/SC06.html I'm wondering though if we're starting to see a subtle shift in direction with more and more emphasis getting placed on accelerators (mainly GPGPU, but including Cell, FPGA's, etc) ? If OpenCL can deliver on its promise of a hardware-independent platform that's open source then perhaps it could assist with the proliferation of cores, considering that GPGPUs aren't known for a large RAM:core ratio ? Caveat: I'm a sysadmin, not a programmer, so be gentle.. 
:) -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Dec 6 04:21:35 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <1064990643.3143521228565902389.JavaMail.root@mail.vpac.org> Message-ID: <276455194.3143541228566095953.JavaMail.root@mail.vpac.org> ----- "Prentice Bisbal" wrote: > Dell and others advertise systems that support up > to 128 GB RAM, but I have yet to meet someone who > can afford to put all 128 GB RAM in a single box. The geophysics people at Monash University who we do a lot of work with got one back in mid 2007 for the AuScope project. :-) http://www.xenon.com.au/press/releases/?i=5 cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From franz.marini at mi.infn.it Sat Dec 6 06:13:15 2008 From: franz.marini at mi.infn.it (Franz Marini) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <1195007640.3143461228564987619.JavaMail.root@mail.vpac.org> References: <1195007640.3143461228564987619.JavaMail.root@mail.vpac.org> Message-ID: <1228572795.15695.15.camel@merlino.mi.infn.it> On Sat, 2008-12-06 at 23:03 +1100, Chris Samuel wrote: > I'm wondering though if we're starting to see a > subtle shift in direction with more and more > emphasis getting placed on accelerators (mainly > GPGPU, but including Cell, FPGA's, etc) ? Starting ? Am I the only one remembering accelerator boards (based on FPGA, Transputers, Motorola 88k, Intel i960, various DSPs and other processors) being produced and advertised in, e.g., Byte magazine back in the 80s and early 90s ? The problems with those solutions have always been the extremely proprietary nature of the products, and therefore the lack of libraries and (community) support, and last but not least, cost. Things are better now with, say, CUDA, mainly because of the huge installed base and the low cost. OpenCL may shape up to be an interesting solution. Should someone develop a, e.g., FPGA-based accelerator board, he would (only) need to support OpenCL to overcome all, except maybe cost, the problems that plagued the older solutions I mentioned before... Interesting times ahead ;) F. --------------------------------------------------------- Franz Marini Prof. R. A. Broglia Theoretical Physics of Nuclei, Atomic Clusters and Proteins Research Group Dept. of Physics, University of Milan, Italy. web : http://merlino.mi.infn.it/proteins/ email : franz.marini@mi.infn.it phone : +39 02 50317226 --------------------------------------------------------- From james.p.lux at jpl.nasa.gov Sat Dec 6 07:26:20 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <1228572795.15695.15.camel@merlino.mi.infn.it> Message-ID: On 12/6/08 6:13 AM, "Franz Marini" wrote: > > > On Sat, 2008-12-06 at 23:03 +1100, Chris Samuel wrote: >> I'm wondering though if we're starting to see a >> subtle shift in direction with more and more >> emphasis getting placed on accelerators (mainly >> GPGPU, but including Cell, FPGA's, etc) ? > > Starting ? 
Am I the only one remembering accelerator boards (based on > FPGA, Transputers, Motorola 88k, Intel i960, various DSPs and other > processors) being produced and advertised in, e.g., Byte magazine back > in the 80s and early 90s ? > > The problems with those solutions have always been the extremely > proprietary nature of the products, and therefore the lack of libraries > and (community) support, and last but not least, cost. I don't think "proprietary" is quite the right word here, at least in the sense of a closed architecture. A lot of those coprocessor boards had complete documentation and anyone who knew how to program, say, a TMS320, could use them. I think the real problem was that they were always sort of niche products (often, a commercial product derived from a specific custom device meeting a specific custom need) and unless you had just the right problem to solve, they didn't buy you very much in performance. The other problem was toolchains. Back then, there was no gnu tool chain. The FPGA folks (like xilinx and altera) were using the ASIC design model for their tools (i.e. Charge a huge amount, because they save enough engineer time over graph paper and rubylith that you can charge an FTE's wages as an annual license fee and still come out ahead). The boards themselves weren't particularly expensive compared to other add-on boards for your PC or (dare I say it) S-100 chassis. (I note that some of these things are really still available, at least in functionally similar form. A lot of FPGA development is done on various cards that plug into a PCI bus.. See the offerings from, e.g., Nallatech) > > Things are better now with, say, CUDA, mainly because of the huge > installed base and the low cost. That's exactly it. The special purpose hardware has become commodity. > > OpenCL may shape up to be an interesting solution. Should someone develop > a, e.g., FPGA-based accelerator board, he would (only) need to support > OpenCL to overcome all, except maybe cost, the problems that plagued the > older solutions I mentioned before... My general impression is that it is an order of magnitude more difficult to build an FPGA solution for a given computational problem than for a general purpose CPU/VonNeumann style machine. So, you're not going to see compilers that take an algorithm description (at a high level) and crank out optimized FPGA bitstreams any time soon. After all, we've had 50 years to do compilers for conventional architectures. (I'm not talking here about generating code for a CPU instantiated on an FPGA.. I'm talking purpose specific gate designs). There are high level design tools for FPGAs (Signal Processing Workbench, etc.) but they're hardly common or cheap. For all intents and purposes, doing FPGA designs today is basically like coding in assembler on a bare machine with no operating system, etc. There are libraries of standard components available under GPL (e.g. Gaisler's GRLIB), but it's still pretty low level. (in software terms: Oh, we've got MACROS in our assembler! And include files! And a linker!) Jim From mathog at caltech.edu Sat Dec 6 11:05:34 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] cloning issue, hidden module dependency Message-ID: Short version: there is an odd problem cloning a Mandriva 2008.1 system (2.6.24.7 kernel) from one type of motherboard to another. (Source Tyan S2466 Athlon MP, destination Gigabyte Athlon XP). On previous releases all that was needed was:
1. image /boot and / (S2466->Gigabyte)
2. put in appropriate /etc/modprobe.conf and /etc/modprobe.preload
3. run lilo
4. reboot
Here that doesn't work. First, I obtained the proper files by doing a fresh install (with as few packages as possible) on the gigabyte system, tested that it would reboot ok, then saving /etc before doing the preceding steps, copying the two modprobe files from the saved /etc. Once treated as above, the new system (gigabyte) tries to start using a module from the old system (amd74xx, it should use via82cxxx) and it is all downhill from there. I tried making a zero length /etc/sysconfig/harddrake2/previous_hw (which initially contained the amd74 string), and replacing the entire /etc/udev directory structure from the saved one, neither of which made any difference. I could not even find the string amd74 elsewhere in the cloned /etc, other than in modprobe.conf.s2466 and modprobe.preload.s2466, which were also present but should not have had any effect. More details here: http://groups.google.com/group/alt.os.linux.mandriva/browse_thread/thread/e83093381c57058d?hl=en&q=mathog+modprobe.conf#50fbb0e96cc09180 Or google in groups for: mathog "where is Mandriva 2008.1 hiding"
Contents of modprobe.conf:
alias eth0 8139too
install usb-interface /sbin/modprobe uhci_hcd; /bin/true
install ide-controller /sbin/modprobe via82cxxx; /bin/true
alias pci:v000010ECd00008139sv000010ECsd00008139bc02sc00i00 8139too
Contents of modprobe.preload:
via_agp
Contents of lilo.conf:
default="linux"
boot=/dev/hda
map=/boot/map
install=menu
menu-scheme=wb:bw:wb:bw
compact
prompt
nowarn
timeout=100
message=/boot/message
image=/boot/vmlinuz
    label="linux"
    root="/dev/hda3"
    initrd=/boot/initrd.img
    append=" resume=/dev/hda2"
image=/boot/vmlinuz
    label="failsafe"
    root="/dev/hda3"
    initrd=/boot/initrd.img
    append=" failsafe"
Am I missing something obvious? Any ideas where else the amd74xx module load command might be hidden away? Thanks, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From hahn at mcmaster.ca Sat Dec 6 11:33:21 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] cloning issue, hidden module dependency In-Reply-To: References: Message-ID: > Any ideas where else the amd74xx module load command might be hidden away? in the initrd, I bet... From csamuel at vpac.org Sun Dec 7 19:33:57 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1678446336.3242931228707093445.JavaMail.root@mail.vpac.org> Message-ID: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Hi folks, We've been tearing our hair out over this for a little while and so I'm wondering if anyone else has seen anything like this before, or has any thoughts about what could be happening ? Very occasionally we find one of our Barcelona nodes with a SuperMicro H8DM8-2 motherboard powered off. IPMI reports it as powered down too. No kernel panic, no crash, nothing in the system logs. Nothing in the IPMI logs either, it's just sitting there as if someone has yanked the power cable (and we're pretty sure that's not the cause!). There had not been any discernible pattern to the nodes affected, and we've only a couple nodes where it's happened twice, the rest only have had it happen once and scattered over the 3 racks of the cluster. 
For the longest time we had no way to reproduce it, but then we noticed that for 3 of the power-offs there was a particular user running Fluent on there. They've provided us with a copy of their problem and we can (often) reproduce it now with that problem. Sometimes it'll take 30 minutes or so, sometimes it'll take 4-5 hours, sometimes it'll take 3 days or so and sometimes it won't do it at all. It doesn't appear to be thermal issues as (a) there's nothing in the IPMI logs about such problems and (b) we inject CPU and system temperature into Ganglia and we don't see anything out of the ordinary in those logs. :-( We've tried other codes, including HPL, and Advanced Clustering's Breakin PXE version, but haven't managed to (yet) get one of the nodes to fail with anything except Fluent. :-( The only oddity about Fluent is that it's the only code on the system that uses HP-MPI, but we used the command line switches to tell it to use the Intel MPI it ships with and it did the same then too! I just cannot understand what is special about Fluent, or even how a user code could cause a node to just turn off without a trace in the logs. Obviously we're pursuing this through the local vendor and (through them) SuperMicro, but to be honest we're all pretty stumped by this. Does anyone have any bright ideas ? cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From award at uda.ad Mon Dec 8 02:39:39 2008 From: award at uda.ad (Alan Ward) Date: Wed Nov 25 01:08:02 2009 Subject: RS: [Beowulf] Odd SuperMicro power off issues References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: Hi. Dunno if this is a bright idea, but what about the power supply temperature? There are usually no measurements done in there, and a hot power supply could easily have a thermal fuse that gets tripped. It may be worthwhile trying with a different power box, if possible with a higher power rating. Cheers, -Alan -----Original Message----- From: beowulf-bounces@beowulf.org on behalf of Chris Samuel Sent: Mon 08/12/2008 04:33 To: Beowulf List Cc: David Bannon; Brett Pemberton Subject: [Beowulf] Odd SuperMicro power off issues Hi folks, We've been tearing our hair out over this for a little while and so I'm wondering if anyone else has seen anything like this before, or has any thoughts about what could be happening ? Very occasionally we find one of our Barcelona nodes with a SuperMicro H8DM8-2 motherboard powered off. IPMI reports it as powered down too. No kernel panic, no crash, nothing in the system logs. Nothing in the IPMI logs either, it's just sitting there as if someone has yanked the power cable (and we're pretty sure that's not the cause!). There had not been any discernible pattern to the nodes affected, and we've only a couple nodes where it's happened twice, the rest only have had it happen once and scattered over the 3 racks of the cluster. For the longest time we had no way to reproduce it, but then we noticed that for 3 of the power-offs there was a particular user running Fluent on there. They've provided us with a copy of their problem and we can (often) reproduce it now with that problem. Sometimes it'll take 30 minutes or so, sometimes it'll take 4-5 hours, sometimes it'll take 3 days or so and sometimes it won't do it at all. 
It doesn't appear to be thermal issues as (a) there's nothing in the IPMI logs about such problems and (b) we inject CPU and system temperature into Ganglia and we don't see anything out of the ordinary in those logs. :-( We've tried other codes, including HPL, and Advanced Clustering's Breakin PXE version, but haven't managed to (yet) get one of the nodes to fail with anything except Fluent. :-( The only oddity about Fluent is that it's the only code on the system that uses HP-MPI, but we used the command line switches to tell it to use the Intel MPI it ships with and it did the same then too! I just cannot understand what is special about Fluent, or even how a user code could cause a node to just turn off without a trace in the logs. Obviously we're pursuing this through the local vendor and (through them) SuperMicro, but to be honest we're all pretty stumped by this. Does anyone have any bright ideas ? cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081208/e40e8550/attachment.html From atchley at myri.com Mon Dec 8 04:10:06 2008 From: atchley at myri.com (Scott Atchley) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: <03CF324A-B4E7-4942-9595-97E9EB8AF5CF@myri.com> Hi Chris, We had a customer with Opterons experience reboots with nothing in the logs, etc. The only thing we saw with "ipmitool sel list" was: 1 | 11/13/2007 | 10:49:44 | System Firmware Error | We traced to a HyperTransport deadlock, which by default reboots the node. Our engineer found this AMD note: reset through sync-flooding is described in chapter "13.15 Error Handling" in the following document: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/32559.pdf When we changed the default PCI setting for this option (0x50) to off (i.e. no reboot, 0x40), the node did not reboot but it did hang and required a IPMI reboot. Our working assumption is that the traffic of one particular application running over our NICs induced some pattern of traffic that caused a flow-control deadlock in HT. Scott On Dec 7, 2008, at 10:33 PM, Chris Samuel wrote: > Hi folks, > > We've been tearing our hair out over this for a little > while and so I'm wondering if anyone else has seen anything > like this before, or has any thoughts about what could be > happening ? > > Very occasionally we find one of our Barcelona nodes with > a SuperMicro H8DM8-2 motherboard powered off. IPMI reports > it as powered down too. > > No kernel panic, no crash, nothing in the system logs. > > Nothing in the IPMI logs either, it's just sitting there > as if someone has yanked the power cable (and we're pretty > sure that's not the cause!). > > There had not been any discernible pattern to the nodes > affected, and we've only a couple nodes where it's happened > twice, the rest only have had it happen once and scattered > over the 3 racks of the cluster. 
> > For the longest time we had no way to reproduce it, but then > we noticed that for 3 of the power-offs there was a particular > user running Fluent on there. They've provided us with a copy > of their problem and we can (often) reproduce it now with that > problem. Sometimes it'll take 30 minutes or so, sometimes it'll > take 4-5 hours, sometimes it'll take 3 days or so and sometimes > it won't do it at all. > > It doesn't appear to be thermal issues as (a) there's nothing in > the IPMI logs about such problems and (b) we inject CPU and system > temperature into Ganglia and we don't see anything out of the > ordinary in those logs. :-( > > We've tried other codes, including HPL, and Advanced Clustering's > Breakin PXE version, but haven't managed to (yet) get one of the > nodes to fail with anything except Fluent. :-( > > The only oddity about Fluent is that it's the only code on > the system that uses HP-MPI, but we used the command line > switches to tell it to use the Intel MPI it ships with and > it did the same then too! > > I just cannot understand what is special about Fluent, > or even how a user code could cause a node to just turn > off without a trace in the logs. > > Obviously we're pursuing this through the local vendor > and (through them) SuperMicro, but to be honest we're > all pretty stumped by this. > > Does anyone have any bright ideas ? > > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From larry.stewart at sicortex.com Mon Dec 8 04:10:21 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: <3D3A6B5B-4C93-4ADA-87A6-2048C0C3B210@sicortex.com> I agree with Alan, this sort of sounds like power. Proving it might be difficult, but some ideas are:
* Use a different PS on a unit you can make fail
* Reduce the power demand somehow: unplug memories, disks, whatever is unpluggable that you don't need
The ugliest idea I have is that fluent might have a pattern of power demand that is resonant with something in the power system, so it causes cyclical voltage or current demand that trips the power supply. Proving that could be really hard. These are multicore processors so does it depend on how many of them are running fluent and how many are doing something else? -L From smulcahy at aplpi.com Mon Dec 8 04:59:00 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: <493D1A14.7060300@aplpi.com> Chris Samuel wrote: > Very occasionally we find one of our Barcelona nodes with > a SuperMicro H8DM8-2 motherboard powered off. IPMI reports > it as powered down too. Hi Chris, We had a similar experience with one of our compute nodes - intermittent power-offs when running our model and absolutely nothing in the logs. 
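(A minimal sketch of the kind of node-side sensor scrape that can be fed into a monitoring system such as Ganglia for this sort of debugging; the hwmon sysfs path below is an assumption and differs per driver and board:)

/* Read one hwmon temperature and print it in degrees C.
 * The sysfs path is an illustrative assumption; adjust for your board. */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/class/hwmon/hwmon0/temp1_input"; /* assumed path */
    FILE *f = fopen(path, "r");
    long millideg;

    if (!f || fscanf(f, "%ld", &millideg) != 1) {
        fprintf(stderr, "could not read %s\n", path);
        return 1;
    }
    fclose(f);
    /* hwmon reports millidegrees C; print something a cron job could pass
     * to e.g. gmetric --name node_temp --type float --value <this> */
    printf("%.1f\n", millideg / 1000.0);
    return 0;
}

The point is simply to have a per-node time series to line up against the power-off times, whether it is pushed with Ganglia's gmetric or graphed some other way.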
I modified Ganglia to track voltage and temp in an effort to see if anything unusual happened to those before-hand, but there were no discernible trends. I ran memtest86+ a number of times on the problem node and neither it nor mcelog showed any problems. Subsequent to that, I found a BIOS upgrade for those systems which included an Opteron microcode update to fix an AMD processor erratum - I can dig out the details if the specific problem is of interest. Around the same time, we finally started to see memory errors, so we also replaced the bad memory in the system. Unfortunately I can't tell you which was responsible for fixing the problem. My understanding is that Fluent is quite memory- and I/O-intensive - do you run other equally intensive models without seeing the failure? Anyway, in summary - if you're totally stumped - try swapping out the memory and/or rolling to the latest firmware and see if that improves the stability. -stephen -- Stephen Mulcahy Applepie Solutions Ltd. http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway)
From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Dec 8 05:29:45 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: On Mon, 8 Dec 2008, Chris Samuel wrote: > Very occasionally we find one of our Barcelona nodes with > a SuperMicro H8DM8-2 motherboard powered off. IPMI reports > it as powered down too. > > No kernel panic, no crash, nothing in the system logs. So IPMI still works ? Then this is _not_ like yanking the power cable, in which case IPMI would not work anymore. I've seen this exact behaviour (computer is off, IPMI works and reports that the computer is off) being triggered by computational loads on SuperMicro H8QC8. I've had several nodes and I was able to swap power supplies - the problem moved with the power supplies, so exchanging the "faulty" ones made this behaviour disappear. There is no Fluent running here, but other codes like Gromacs that are known to load the system quite well. The power supplies are supposed to deliver a max. of 1KW for a system with 4 Opteron 875, 8GB RAM and 2 internal disks. The "turning off" behaviour was also quite random, sometimes appearing within an hour, sometimes taking hours-days; it started to appear about 5-6 months after the nodes were purchased. I still have one node where this occurs so rarely (about once a month) that it's not accepted as an excuse for exchange ;-( -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de
From gerry.creager at tamu.edu Mon Dec 8 05:51:32 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: <493D2664.2030703@tamu.edu> Bogdan Costescu wrote: > On Mon, 8 Dec 2008, Chris Samuel wrote: > >> Very occasionally we find one of our Barcelona nodes with >> a SuperMicro H8DM8-2 motherboard powered off. IPMI reports >> it as powered down too. >> >> No kernel panic, no crash, nothing in the system logs. > > So IPMI still works ?
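The Ganglia voltage/temperature tracking Chris and Stephen describe is usually just a cron job wrapping gmetric around the local IPMI sensors. A rough sketch; note that the sensor names ipmitool reports (CPU Temp, Vcore, +12V, ...) differ from board to board, so the filter below is only illustrative:

    #!/bin/bash
    # Feed IPMI temperature and voltage readings into Ganglia via gmetric.
    # Meant to run from cron on every compute node.
    ipmitool sensor 2>/dev/null | awk -F'|' '
        $1 ~ /Temp|Vcore|VTT|12V|5V|3\.3V/ && $2 ~ /[0-9]/ {
            gsub(/ /, "", $2)                 # numeric reading
            gsub(/^ +| +$/, "", $1)           # sensor name
            gsub(/^ +| +$/, "", $3)           # units, e.g. "degrees C", "Volts"
            print $1 "|" $2 "|" $3
        }' |
    while IFS='|' read -r name value units; do
        gmetric --name="ipmi_${name// /_}" --value="$value" \
                --type=float --units="$units"
    done

Graphing these alongside the load average makes it fairly obvious whether a node sagged or cooked itself before it went away, or simply vanished as in this thread.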
Then this is _not_ like yanking the power cable, > in which case IPMI would not work anymore. > > I've seen this exact behaviour (computer is off, IPMI works and reports > that the computer is off) being triggered by computational loads on > SuperMicro H8QC8. I've had several nodes and I was able to swap power > supplies - the problem moved with the power supplies, so exchanging the > "faulty" ones made this behaviour disappear. There is no Fluent running > here, but other codes like Gromacs that are known to load the system > quite well. The power supplies are supposed to deliver a max. of 1KW for > a system with 4 Opteron 875, 8GB RAM and 2 internal disks. The "turning > off" behaviour was also quite random, sometimes appearing within an > hour, sometimes taking hours-days; it has started to appear about 5-6 > months after the nodes were purchased. I still have one node where this > occurs so rarely (about once a month) that it's not accepted as an > excuse for exchange ;-( Continuing on the thread of power-related issues, this is beginning to sound like a thermal-related mechanical problem. In the power industry it is common to assume that there is a finite life for circuit breakers based on the number of times they cycle (are tripped and reset). I'm extrapolating here, as I've not had time to track down my power supply guru and ask him... however, some time back there was a company that introduced the "polyfuse" which is a thermal-trip breaker that auto-resets after it trips, upon cooling down. I used a number of these years ago while at NASA, and saw some evidence of a phenomenon similar to the breaker limited life scenario described above. I'm wondering if there might be a single voltage that's over-taxed and that opening a breaker in that supply might cause the halt-to-quiescent while leaving IPMI alive... gerry -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From prentice at ias.edu Mon Dec 8 05:58:03 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <276455194.3143541228566095953.JavaMail.root@mail.vpac.org> References: <276455194.3143541228566095953.JavaMail.root@mail.vpac.org> Message-ID: <493D27EB.2060606@ias.edu> Chris Samuel wrote: > ----- "Prentice Bisbal" wrote: > >> Dell and others advertise systems that support up >> to 128 GB RAM, but I have yet to meet someone who >> can afford to put all 128 GB RAM in a single box. > > The geophysics people at Monash University who we do > a lot of work with got one back in mid 2007 for the > AuScope project. :-) > > http://www.xenon.com.au/press/releases/?i=5 Technically, my statement is still correct - I haven't met anyone from Monash University. Being that Monash U. is in Australia and I'm in the US, my statement may still be true for some time. ;) -- Prentice From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Dec 8 06:11:04 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Intro question In-Reply-To: References: Message-ID: On Fri, 5 Dec 2008, Lux, James P wrote: > Now consider using that nifty compchem box to go examine thousands > of possible drugs. I don't think that this is the main purpose of the DEShaw machine. 
Thousands of independent trials can be done today with thousands of machines: f.e. one trial per machine, much like Monte Carlo calculations. Machines like the DEShaw one are useful in 2 cases: - simulating a much larger system, getting to the point of simulating whole cells or at least mitochondria - simulating a much longer period of simulated time in the same real time, as processes of biological interest happen on timescales that are many orders of magnitude larger than what can be currently simulated. In each of these cases there is only one simulation running - a very fine grained one... -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Dec 8 06:20:31 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Intro question In-Reply-To: <49395E22.6090707@scalableinformatics.com> References: <4939541A.4000600@sicortex.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> <49395E22.6090707@scalableinformatics.com> Message-ID: On Fri, 5 Dec 2008, Joe Landman wrote: > The MDGrape guys might have a thing or three to say. They have been > demonstrating some pretty awesome performance for years. True, but my impression was that they were focused on getting the most performance from one unit, while the DEShaw approach factored in a high degree of parallelization from the beginning. I've heard the talk in Dresden earlier this year and I liked hearing about an idea that I've also had some time ago but not heard talking about on this list: interconnect hardware being able to DMA directly to/from CPU cache. I don't know how useful such a feature is for a general purpose interconnect (or MPI library) but it certainly fits well in the specialized frame of molecular dynamics (or rather, of how MD is implemented today). -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From r.rankin at qub.ac.uk Mon Dec 8 06:22:18 2008 From: r.rankin at qub.ac.uk (Richard Rankin) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] RE: moab In-Reply-To: <8E50F960A9F3F6448D39155B882544E817D4FE2E@EX2K7-VIRT-1.ads.qub.ac.uk> References: <8E50F960A9F3F6448D39155B882544E817D4FE2E@EX2K7-VIRT-1.ads.qub.ac.uk> Message-ID: <8E50F960A9F3F6448D39155B882544E818717F79@EX2K7-VIRT-1.ads.qub.ac.uk> I will have funding available to purchase some new clusters in the new year. I was hoping to be able to have a cluster with a mix of Linux and windows nodes so that the mix could be varied depending on the work load. I have been pointed to http://www.clusterresources.com/pages/products/moab-hybrid-cluster.php Has anyone any experience of this Ricky ______________________ Principal Analyst Information Services Queen's University Belfast tel: 02890 974824 fax: 02890 976586 email: r.rankin@qub.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081208/cab575ba/attachment.html From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Dec 8 07:00:40 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <493D2664.2030703@tamu.edu> References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> <493D2664.2030703@tamu.edu> Message-ID: On Mon, 8 Dec 2008, Gerry Creager wrote: > I'm wondering if there might be a single voltage that's over-taxed > and that opening a breaker in that supply might cause the > halt-to-quiescent while leaving IPMI alive... I don't quite understand the "opening a breaker in that supply might cause the halt-to-quiescent" part, but just to clear up some of the things I've written before: if the computer crashes, I expect IPMI to tell me that the computer is in "on" power state; I might be able to use the IPMI console redirection to see what (if any) is printed on the console, like OOM, kernel oops, etc. In the behaviour that I have described previously for these SuperMicro boards, IPMI reported power to be "off", similar to the result of running "/sbin/poweroff" from Linux or sending a "power off" IPMI command; at the end of these 2 commands there is no console output anymore as the CPU is powered off. So somehow the BMC was notified that the computer is not "on" anymore... or maybe it was the BMC which made the decision to turn off in the fisrt place. -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From larry.stewart at sicortex.com Mon Dec 8 07:19:17 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Intro question In-Reply-To: References: <4939541A.4000600@sicortex.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> <49395E22.6090707@scalableinformatics.com> Message-ID: <493D3AF5.2090806@sicortex.com> Bogdan Costescu wrote: > On Fri, 5 Dec 2008, Joe Landman wrote: > >> The MDGrape guys might have a thing or three to say. They have been >> demonstrating some pretty awesome performance for years. > > True, but my impression was that they were focused on getting the most > performance from one unit, while the DEShaw approach factored in a > high degree of parallelization from the beginning. > > I've heard the talk in Dresden earlier this year and I liked hearing > about an idea that I've also had some time ago but not heard talking > about on this list: interconnect hardware being able to DMA directly > to/from CPU cache. I don't know how useful such a feature is for a > general purpose interconnect (or MPI library) but it certainly fits > well in the specialized frame of molecular dynamics (or rather, of how > MD is implemented today). > Well the NIC should read from cache or update the cache if the data happens to be there. Don't all well designed I/O systems do that? A mathematician woke up one night to find his wastebasket on fire. He poured water into it and went back to sleep. The next night, he woke up again to find the desk lamp on fire, so he put it in the wastebasket, reducing the issue to a previously solved problem. Flushing caches for I/O is like that. 
-- -Larry / Sector IX
From patrick at myri.com Mon Dec 8 07:21:03 2008 From: patrick at myri.com (Patrick Geoffray) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Intro question In-Reply-To: References: <4939541A.4000600@sicortex.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> <49395E22.6090707@scalableinformatics.com> Message-ID: <493D3B5F.3030208@myri.com> Bogdan Costescu wrote: > about on this list: interconnect hardware being able to DMA directly > to/from CPU cache. I don't know how useful such a feature is for a You can do something similar today using Direct Cache Access (DCA) on (recent) Intel chips with IOAT. It's an indirect cache access: you tag a DMA to automatically prefetch the data into the L3 of a specific socket. It does nothing for latency, since polling will fetch the cache line just as fast, but it works well if there is a delay between the data being delivered and the data being used. The best example is a communication overlapped by computation: cache prefetching is overlapped as well, no more memory latency. Patrick
From herborn at usna.edu Mon Dec 8 08:12:15 2008 From: herborn at usna.edu (Steve Herborn) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> Message-ID: <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> Good day to the group. I would like to make a brief introduction of myself and raise my first question to the forum. My name is Steve Herborn and I am a new employee at the United States Naval Academy in the Advanced Research Computing group which supports the IT systems used for faculty research. Part of my responsibilities will be the care & feeding of our Beowulf Cluster, which is a commercially procured Cluster from Aspen Systems. It was purchased & installed about four or five years ago. As delivered, the system was originally configured with two Head nodes, each with 32 compute nodes. One head node was running SUSE 9.x and the other Head Node was running Scyld (version unknown), also with 32 compute nodes. While I don't know all of the history, apparently this system was not very actively maintained and had numerous hardware & software issues, including losing the array on which Scyld was installed. Prior to my arrival a decision was made to reconfigure the system from having two different head nodes running two different OS Distributions to one Head Node controlling all 64 Compute Nodes. In addition, SUSE Linux Enterprise Server (10SP2) (X86-64) was selected as the OS for all of the nodes. Now on to my question, which will more than likely be the first of many. In the collective group wisdom, what would be the most efficient & effective way to "push" the SLES OS out to all of the compute nodes once it is fully installed & configured on the Head Node? In my research I've read about various Cluster packages/distributions that have that capability built in, such as ROCKS & OSCAR, which appear to have the innate capability to do this as well as some additional tools that would be very nice to use in managing the system. However, from my current research it appears that they do not support SLES 10sp2 for the AMD 64-bit Architecture (although since I am so new at this I could be wrong). Are there any other "free" (money is always an issue) products or methodologies I should be looking at to push the OS out & help me manage the system?
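On the question above of pushing SLES out to 64 nodes: SLES can be installed unattended over PXE by handing the installer an AutoYaST profile on the kernel command line. A minimal sketch of the pxelinux side, assuming the head node already serves DHCP/TFTP/HTTP; the paths, the "headnode" hostname and the profile name are placeholders, not anything that ships with SLES:

    # /tftpboot/pxelinux.cfg/default  (fragment)
    # kernel and initrd copied from the SLES 10 SP2 media (typically boot/x86_64/loader/)
    DEFAULT sles10-install
    LABEL sles10-install
      KERNEL sles10sp2/linux
      APPEND initrd=sles10sp2/initrd install=http://headnode/install/sles10sp2 autoyast=http://headnode/autoyast/compute-node.xml textmode=1

Each node that PXE-boots this entry runs YaST non-interactively from the profile (partitioning, package selection, network setup), so the 64 consoles never need to be touched by hand; the same profile plus a post-install script can also pull in whatever batch system is chosen later.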
It appears that a commercial product Moab Cluster Builder will do everything I need & more, but I do not have the funds to purchase a solution. I also certainly do not want to perform a manual OS install on all 64 Compute Nodes. Thanks in advance for any & all help, advice, guidance, or pearls of wisdom that you can provide this Neophyte. Oh and please don't ask why SLES 10sp2, I've already been through that one with management. It is what I have been provided & will make work. Steven A. Herborn U.S. Naval Academy Advanced Research Computing 410-293-6480 (Desk) 757-418-0505 (Cell) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081208/abb20a3a/attachment.html From hearnsj at googlemail.com Mon Dec 8 09:14:20 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> Message-ID: <9f8092cc0812080914t27a2c87o606d6d02c7650c98@mail.gmail.com> 2008/12/8 Steve Herborn > * While I don't know all of the history, apparently this system was not > very actively maintain and had numerous hardware & software issues, to > include losing the array on which Scyld was installed. * > Cough. Splutter. Out of the mouths of babes etc... Seriously, have you thought of either: a) arranging a convenient fire which (sadly) gets oh-so-close to this system (*) b) contacting Aspen Systems and seeing to what extent they will still support this system c) as in my previous email, tell your bosses this system is too old/unreliable and get pricing for a new one (*) Not a good idea in Great Britain, where arson in the Queen's Dockyard is still a hanging offence, and I'd bet the judges would say that a Naval Academy was part of a Dockyard. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081208/b6b7568f/attachment.html From hearnsj at googlemail.com Mon Dec 8 09:40:29 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> Message-ID: <9f8092cc0812080940k47176cfpaf0fa551d660b567@mail.gmail.com> Aha! Are you referring to the OS Detect scripts in: /opt/oscar/lib/OSCAR/OCA/OS_Detect? On an SGI Tempo system there are the following: CentOS.pm Debian.pm Mandriva.pm RedHat.pm SLES.pm ScientificLinux.pm SuSE.pm A quick look reveals that SLES.pm looks for different strings in /etc/SuSE-release - which of course it should. Can anyone confirm/deny if SLES.pm is present in a 'plain vanilla' Oscar install? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081208/0094fd87/attachment.html From landman at scalableinformatics.com Mon Dec 8 10:34:44 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> Message-ID: <493D68C4.8060400@scalableinformatics.com> Steve Herborn wrote: > > > Good day to the group. I would like to make a brief introduction to > myself and raise my first question to the forum. > > > > My name is Steve Herborn and I am a new employee at the United States > Naval Academy in the Advanced Research Computing group which supports Greetings Steve > the IT systems used for faculty research. Part of my responsibilities > will be the care & feeding of our Beowulf Cluster which is a > commercially procured Cluster from Aspen Systems. It purchased & > installed about four or five years ago. As delivered the system was > originally configured with two Head nodes each with 32 compute nodes. > One head node was running SUSE 9.x and the other Head Node was running > // Scyld (version unknown) also with 32 compute nodes. While I don?t > know all of the history, apparently this system was not very actively > maintain and had numerous hardware & software issues, to include losing > the array on which Scyld was installed. //Prior to my arrival a Ouch ... if you call the good folks at Aspen, they could help with that (ping me if you need a contact) > decision was made to reconfigure the system from having two different > head nodes running two different OS Distributions to one Head Node > controlling all 64 Compute Nodes. In addition SUSE Linux Enterprise > Server (10SP2) (X86-64) was selected as the OS for all of the nodes. Ok. > Now on to my question which will more then likely be the first of many. > In the collective group wisdom what would be the most efficient & Danger Will Robinson ... for the N people who answer, you are likely to get N+2 answers, and N/2 arguments going ... not a bad thing, but to steal from the Perl motto "there is more than one way to do these things ..." > effective way to ?push? the SLES OS out to all of the compute nodes once > it is fully installed & configured on the Head Node. In my research First: Stateless (e.g. diskless) versus Stateful (e.g. local installation). Scyld is "stateless" though Don will likely correct me (as this is massively oversimpilfied). SuSE can be installed Stateless or Stateful. Its installation can be automated ... we have been doing this for years (one of the few vendors to have done this with SuSE). It can also be run diskless ... we have booted compute nodes with Infiniband to fully operational compute nodes visible in all aspects within the cluster in under 60 seconds. This is the case for 9.3, 10.x SuSE flavors. > I?ve read about various Cluster packages/distributions that have that > capability built in, such as ROCKS & OSCAR which appear to have the > innate capability to do this as well as some additional tools that would > be very nice to use in managing the system. However, from my current > research in appears that they do not support SLES 10sp2 for the AMD Rocks only supports Redhat and rebuilds, I wouldn't recommend it for the task as you have indicated. 
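On John's OS_Detect question a few messages back: those detection modules generally key on nothing more exotic than the strings in /etc/SuSE-release, so it is quick to check by hand how a given box would be classified. A hedged illustration; the exact wording of the first line varies between SLES and openSUSE releases:

    # What an OS_Detect-style script typically parses on a SLES 10 SP2 box:
    cat /etc/SuSE-release
    #   SUSE Linux Enterprise Server 10 (x86_64)
    #   VERSION = 10
    #   PATCHLEVEL = 2
    grep -q "SUSE Linux Enterprise Server" /etc/SuSE-release && echo "looks like SLES"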
Oscar might be able to handle this, though I haven't kept up on it, so I am not sure how active it is. You want to look at xCat v2 (open source), and Warewulf/Perceus (open source). Our package (Tiburon) is not ready to be released, and we will likely make it a meta package atop Perceus at some point soon. Though it is used in production at several large commercial companies specifically for SuSE clusters. > 64-bit Architecture (although since I am so new at this I could be > wrong). Are there any other ?free? (money is always an issue) products > or methodologies I should be looking at to push the OS out & help me > manage the system? It appears that a commercial product Moab Cluster See above. If you want a prepackaged system, likely you are going to need to spend money. Moab is a possibility, though for SuSE, I would recommend looking at Concurrent Thinking's appliance. It will cost money, but they solve pretty much all of the problems for you. > Builder will do everything I need & more, but I do not have the funds to > purchase a solution. I also certainly do not want to perform a manual > OS install on all 64 Compute Nodes. No... in all likelihood, you really don't want to do any installation to the nodes (stateless if possible). > > > > Thanks in advance for any & all help, advice, guidance, or pearls of > wisdom that you can provide this Neophyte. Oh and please don?t ask why > SLES 10sp2, I?ve already been through that one with management. It is > what I have been provided & will make work. It's not an issue, though we recommend better kernels/kernel updates. Compared to the RHEL kernels, it uses modern stuff. Joe > > > > > > ** Steven A. Herborn ** > > * * U.S. * * ** Naval Academy ** > > ** Advanced Research Computing ** > > ** 410-293-6480 (Desk) ** > > ** 757-418-0505 (Cell) **** ** > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From gus at ldeo.columbia.edu Mon Dec 8 10:45:23 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> Message-ID: <493D6B43.2090004@ldeo.columbia.edu> Hello Steve and list In the likely case that the original vendor will no longer support this 5-year old cluster, you can try installing the Rocks cluster suite, which is free from SDSC, and you already came across to: http://www.rocksclusters.org/wordpress/ This would be a path or least resistance, and may get your cluster up and running again with relatively small effort. Of course there are many other solutions, but they may require more effort from the system administrator. Rocks is well supported and documented. It is based on CentOS (free version of RHEL). There is no support for SLES on Rocks, so if you must keep the current OS distribution, it won't work for you. 
I read your last paragraph, but you may argue with your bosses that the age of this machine doesn't justify being picky about the particular OS flavor. Bringing it back to life, making it an useful asset, with a free software stack, would be a great benefit. You would spend money only in application software (e.g. Fortran compiler, Matlab, etc). Other solutions (e.g. Moab) will cost money, and may not work with this old hardware. Sticking to SLES may be a catch-22, a shot on the foot. Rocks has a relatively large user base, and an active mailing list for help. Moreover, for Rocks minimally you must have 1GB of RAM on every node, two Ethernet ports on the head node, and one Ethernet port on each compute node. Check the hardware you have. Although PXE boot capability is not strictly required, it makes installation much easier. Check your motherboard and BIOS. I have a small cluster made of five salvaged Dell Precision 410 (dual Pentium III) running Rocks 4.3, and it works well. For old hardware Rocks is a very good solution, requiring a modest investment of time, and virtually no money. (In my case I only had to buy cheap SOHO switches and Ethernet cables, but you probably already have switches.) If you are going to run parallel programs with MPI, the cheapest thing would be to have GigE ports and switches. I wouldn't invest on fancier interconnect on such an old machine. (Do you have any fancier interconnect already, say Myrinet?) However, you can buy cheap GigE NICs for $15-$20, and high end ones (say Intel Pro 1000) for $30 or less. This would be needed only if you don't have GigE ports on the nodes already. Probably your motherboards have dual GigE ports, I don't know. MPI over 100T Ethernet is a real pain, don't do it, unless you are a masochist. A 64-port GigE switch to support MPI traffic would also be a worthwhile investment. Keeping MPI on a separate network, distinct from the I/O and cluster control net, is a good thing. It avoids contention and improves performance. A natural precaution would be to backup all home directories before you start, and any precious data or filesystems. I suggest sorting out the hardware issues before anything else. It would be good to evaluate the status of your RAID, and perhaps use that particular node as a separate storage appliance. You can try just rebuilding the RAID, and see if it works, or perhaps replace the defective disk(s), if the RAID controller is still good. Another thing to look at is how functional your Ethernet (or GigE) switch or switches are, and if you have more than one switch how they are/can be connected to each other. (One for the whole cluster? Two or more separate? Some specific topology connecting many switches?) I hope this helps, Gus Correa -- --------------------------------------------------------------------- Gustavo J. Ponce Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA --------------------------------------------------------------------- Steve Herborn wrote: > Good day to the group. I would like to make a brief introduction to > myself and raise my first question to the forum. > > My name is Steve Herborn and I am a new employee at the United States > Naval Academy in the Advanced Research Computing group which supports > the IT systems used for faculty research. Part of my responsibilities > will be the care & feeding of our Beowulf Cluster which is a > commercially procured Cluster from Aspen Systems. 
It purchased & > installed about four or five years ago. As delivered the system was > originally configured with two Head nodes each with 32 compute nodes. > One head node was running SUSE 9.x and the other Head Node was running > //Scyld (version unknown) also with 32 compute nodes. While I don?t > know all of the history, apparently this system was not very actively > maintain and had numerous hardware & software issues, to include > losing the array on which Scyld was installed. //Prior to my arrival a > decision was made to reconfigure the system from having two different > head nodes running two different OS Distributions to one Head Node > controlling all 64 Compute Nodes. In addition SUSE Linux Enterprise > Server (10SP2) (X86-64) was selected as the OS for all of the nodes. > > Now on to my question which will more then likely be the first of > many. In the collective group wisdom what would be the most efficient > & effective way to ?push? the SLES OS out to all of the compute nodes > once it is fully installed & configured on the Head Node. In my > research I?ve read about various Cluster packages/distributions that > have that capability built in, such as ROCKS & OSCAR which appear to > have the innate capability to do this as well as some additional tools > that would be very nice to use in managing the system. However, from > my current research in appears that they do not support SLES 10sp2 for > the AMD 64-bit Architecture (although since I am so new at this I > could be wrong). Are there any other ?free? (money is always an issue) > products or methodologies I should be looking at to push the OS out & > help me manage the system? It appears that a commercial product Moab > Cluster Builder will do everything I need & more, but I do not have > the funds to purchase a solution. I also certainly do not want to > perform a manual OS install on all 64 Compute Nodes. > > Thanks in advance for any & all help, advice, guidance, or pearls of > wisdom that you can provide this Neophyte. Oh and please don?t ask why > SLES 10sp2, I?ve already been through that one with management. It is > what I have been provided & will make work. > > **Steven A. Herborn** > > **U.S.**** Naval Academy** > > **Advanced Research Computing** > > **410-293-6480 (Desk)** > > **757-418-0505 (Cell)****** > >------------------------------------------------------------------------ > >_______________________________________________ >Beowulf mailing list, Beowulf@beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > From djholm at fnal.gov Mon Dec 8 11:05:19 2008 From: djholm at fnal.gov (Don Holmgren) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> References: <1782429857.3242961228707237079.JavaMail.root@mail.vpac.org> Message-ID: Hi Chris - We've had similar problems on two different clusters using Barcelonas with two different motherboards. Our new cluster uses SuperMicro TwinU's (two H8DMT-INF+ motherboards in each) and was delivered in early November. Out of the roughly 590 motherboards, we had maybe 20 that powered down under load. Like yours, IPMI was still working, and so we could power these up remotely. For nearly all of these, swapping memory fixed the problem. For systems that multiple memory swaps did not fix the problem, the vendor swapped motherboards. I do not believe we've had to swap a power supply yet for this. 
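Given how often bad DIMMs turned out to be the culprit in Don's experience, a burn-in sweep before returning nodes to the queue is cheap insurance. A rough sketch using pdsh and memtester; the node list, the amount of memory to lock (in MB, leaving headroom for the OS) and the pass count are illustrative guesses, not tuned values:

    # Lock and test ~28 GB of RAM on each node for 3 passes, log locally,
    # and summarise the exit codes on the head node.
    pdsh -w node[01-64] 'memtester 28000 3 > /tmp/memtester.log 2>&1; echo memtester_exit=$?' | dshbak -c

A node that reports a non-zero exit code, or that simply powers itself off partway through (as in this thread), goes to the top of the DIMM-swap list; memtest86+ from the boot loader is still worth running, but as Stephen noted earlier it does not always catch what a hot, loaded system will.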
On an older, smaller cluster, which uses Asus KFSN4-DRE motherboards, the incidence rate has been much higher - 20% or so - and swapping memory has not fixed the problem. On some of the systems, slowing the memory clock fixes this, but of course this causes lower computational throughput. We are still working with the vendor to fix the problem nodes; for now, we are scheduling only 6 of 8 available cores. For the job mix on that cluster, this has been a temporary solution for most of the power off issues. Like you, many of the codes that our users run do not cause a problem. On the Asus-based cluster, a computational cosmology code will trigger the power shutdowns. The best torture code that we've found has been xhpl (linpack) built using a threaded version of libgoto; when this is executed on a single dual Barcelona node with "-np 8", each of the 8 MPI processes spawns 8 threads. This particular binary will cause our bad nodes to power off very quickly (you are welcome to a copy of the binary - just let me know). The power draw from our Barcelona systems is very strongly dependent on the code. The power draw difference between the xhpl binary mentioned above and the typical Lattice QCD codes we run is at least 25%. Because of this we've always suspected thermal or power issues, but the vendor of our Asus-based cluster has done the obvious things to check both (eg, using active coolers on the CPU's, using larger power supplies, and so forth) and hasn't had any luck. Also, the fact that swapping memory on our SuperMicro systems helps without affecting computational performance probably means that it is not a thermal issue on the CPU's. Don Holmgren Fermilab On Mon, 8 Dec 2008, Chris Samuel wrote: > Hi folks, > > We've been tearing our hair out over this for a little > while and so I'm wondering if anyone else has seen anything > like this before, or has any thoughts about what could be > happening ? > > Very occasionally we find one of our Barcelona nodes with > a SuperMicro H8DM8-2 motherboard powered off. IPMI reports > it as powered down too. > > No kernel panic, no crash, nothing in the system logs. > > Nothing in the IPMI logs either, it's just sitting there > as if someone has yanked the power cable (and we're pretty > sure that's not the cause!). > > There had not been any discernible pattern to the nodes > affected, and we've only a couple nodes where it's happened > twice, the rest only have had it happen once and scattered > over the 3 racks of the cluster. > > For the longest time we had no way to reproduce it, but then > we noticed that for 3 of the power off's there was a particular > user running Fluent on there. They've provided us with a copy > of their problem and we can (often) reproduce it now with that > problem. Sometimes it'll take 30 minutes or so, sometimes it'll > take 4-5 hours, sometimes it'll take 3 days or so and sometimes > it won't do it at all. > > It doesn't appear to be thermal issues as (a) there's nothing in > the IPMI logs about such problems and (b) we inject CPU and system > temperature into Ganglia and we don't see anything out of the > ordinary in those logs. :-( > > We've tried other codes, including HPL, and Advanced Clustering's > Breakin PXE version, but haven't managed to (yet) get one of the > nodes to fail with anything except Fluent. 
:-( > > The only oddity about Fluent is that it's the only code on > the system that uses HP-MPI, but we used the command line > switches to tell it to use the Intel MPI it ships with and > it did the same then too! > > I just cannot understand what is special about Fluent, > or even how a user code could cause a node to just turn > off without a trace in the logs. > > Obviously we're pursuing this through the local vendor > and (through them) SuperMicro, but to be honest we're > all pretty stumped by this. > > Does anyone have any bright ideas ? > > cheers, > Chris From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Dec 8 11:32:07 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <4939734C.9060604@ias.edu> References: <20081205124843.GM11544@leitl.org> <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> <4939734C.9060604@ias.edu> Message-ID: On Fri, 5 Dec 2008, Prentice Bisbal wrote: > Dell and others advertise systems that support up to 128 GB RAM, but > I have yet to meet someone who can afford to put all 128 GB RAM in a > single box. Rather than saying "we're doing this for a long time", I'll mention that we've had lots of problems with some AMD Opteron based systems. We've always filled up all possible memory slots with the highest capacity (but still payable ;-)) memory modules in mainboards with 4 or 8 sockets; this allowed f.e. reaching 64GB in 2006 and 128GB in 2007, but created lots of problems with instability under load. Although we've been given many assurances that the configurations were fully supported by CPU, mainboard and memory manufacturers, in practice random memory errors occured and they could only be eliminated by running the memory at a lower speed or halving the memory size - unacceptable as these computers were by contract required to run the full memory at the full speed. Some of the involved manufacturers denied any knowledge of problems on similar configurations, only to say 6 months later that such problems do exist in many cases; after having many memory modules, CPUs and mainboards exchanged, we could have arrived to the same conclusion by ourselves ;-| For the latest purchase of this type, we have chosen a Tier 1 vendor and also changed the memory architecture to Intel shared bus - but for a different reason - and so far the 128GB didn't show any errors. Hope they stay that way ;-) -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From mathog at caltech.edu Mon Dec 8 11:37:07 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] cloning issue, hidden module dependency Message-ID: Mark Hahn wrote: > > Any ideas where else the amd74xx module load command might be hidden away? > > in the initrd, I bet... That was it, unfortunately. Mandriva went from a very small, and very general sort of "init" script to a larger, very machine specific "init". See the thread cited in the original post for more details. Sort of a PITA for cloning purposes - much easier to shuffle around a couple of small files like modprobe.conf than to have to build a new initrd for each node with different hardware. FC, RedHat, CentOS etc. might have similar changes, since they are all closely related. 
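A quick way to confirm David's diagnosis on any particular node image is to list what actually ended up inside the initrd, and to force the missing IDE driver in when rebuilding it. A sketch for the RHEL/Mandriva-style mkinitrd discussed here; the kernel version string is a placeholder and the module list is just an example:

    # Which block-device drivers did the initrd get?
    zcat /boot/initrd-2.6.X.img | cpio -t | egrep 'amd74xx|piix|sd_mod|libata'

    # Rebuild it with the controller driver explicitly included:
    mkinitrd --with=amd74xx /boot/initrd-2.6.X.img 2.6.X

For cloning, the practical upshot is the one David describes: either regenerate the initrd per hardware type, or build the handful of disk drivers you care about into a common kernel so one image boots everywhere.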
Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From landman at scalableinformatics.com Mon Dec 8 13:05:20 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] cloning issue, hidden module dependency In-Reply-To: References: Message-ID: <493D8C10.5080007@scalableinformatics.com> David Mathog wrote: > Mark Hahn wrote: >>> Any ideas where else the amd74xx module load command might be hidden > away? >> in the initrd, I bet... > > That was it, unfortunately. Mandriva went from a very small, and very > general sort of "init" script to a larger, very machine specific "init". > See the thread cited in the original post for more details. Sort of a > PITA for cloning purposes - much easier to shuffle around a couple of > small files like modprobe.conf than to have to build a new initrd for > each node with different hardware. FC, RedHat, CentOS etc. might > have similar changes, since they are all closely related. Well RHEL is annoying in that if you decide to use a custom kernel and a software raid, you are, for lack of a better term, toast (if you stick with their tools/config). This is not to say that it is impossible, in fact it works very well in other distributions. We worked around this for some customers, but the surgery is neither easy nor pleasant. It involves upgrading nash, initrd-tools, and quite a few other things. This is because RHEL 5.x still uses dmraid for building software RAID while FCx (x>=8) have switched to mdadm (go figure). The latter works. Unfortunately, all of this is buried in initrd. Doing initrd surgery is not for the faint of heart. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From gus at ldeo.columbia.edu Mon Dec 8 13:17:47 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D6B43.2090004@ldeo.columbia.edu> Message-ID: <493D8EFB.5080004@ldeo.columbia.edu> Hello Steve and list Steve Herborn wrote: >The hardware suite is actually quite sweet, but has been mismanaged rather >badly. It has been left in a machine room that is too hot & on power that >is more then flaky with no line conditioners. One of the very first things >I had to do was replace almost two-dozen Power Supplies that were DOA. > > Yes, 24 power supplies may cost as much as the savings in UPS, plus the headache of replacing them, plus failing nodes. >I think I have most of the hardware issues squared away right now and need >to focus on getting here up & running, but even installing the OS on a >head-Node is proving to be troublesome. > > Besides my naive encouragement to use Rocks, I remember some recent discussions here on the Beowulf list about different techniques to setup a cluster. See this thread, and check the postings by Bogdan Cotescu, from the University of Heidelberg. 
He seems to administer a number of clusters, some of which have constraints comparable to yours, and to use a variety of tools for this: http://www.beowulf.org/archive/2008-October/023433.html http://www.iwr.uni-heidelberg.de/services/equipment/parallel/ >I really wish I could get away with using ROCKS as there would be such a >greater reach back for me over SUSE. Right now I am exploring AutoYast to >push the OS out to the compute nodes, > Long ago I looked into System Imager, which was then part of Oscar, but I don't know if it is current/maintained: http://wiki.systemimager.org/index.php/Main_Page >but that is still going to leave me >short on any management tools. > > > > That is true. Tell bosses they are asking you to reinvent the Rocks wheel. Good luck, Gus Correa -- --------------------------------------------------------------------- Gustavo J. Ponce Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA --------------------------------------------------------------------- >Steven A. Herborn >U.S. Naval Academy >Advanced Research Computing >410-293-6480 (Desk) >757-418-0505 (Cell) > > >-----Original Message----- >From: Gus Correa [mailto:gus@ldeo.columbia.edu] >Sent: Monday, December 08, 2008 1:45 PM >To: Beowulf >Cc: Steve Herborn >Subject: Re: [Beowulf] Personal Introduction & First Beowulf Cluster >Question > >Hello Steve and list > >In the likely case that the original vendor will no longer support this >5-year old cluster, >you can try installing the Rocks cluster suite, which is free from SDSC, >and you already came across to: > >http://www.rocksclusters.org/wordpress/ > >This would be a path or least resistance, and may get your cluster up and >running again with relatively small effort. >Of course there are many other solutions, but they may require more effort >from the system administrator. > >Rocks is well supported and documented. >It is based on CentOS (free version of RHEL). > >There is no support for SLES on Rocks, >so if you must keep the current OS distribution, it won't work for you. >I read your last paragraph, but you may argue with your bosses that the >age of this >machine doesn't justify being picky about the particular OS flavor. >Bringing it back to life, making it an useful asset, >with a free software stack, would be a great benefit. >You would spend money only in application software (e.g. Fortran >compiler, Matlab, etc). >Other solutions (e.g. Moab) will cost money, and may not work with >this old hardware. >Sticking to SLES may be a catch-22, a shot on the foot. > >Rocks has a relatively large user base, and an active mailing list for help. > >Moreover, for Rocks minimally you must have 1GB of RAM on every node, >two Ethernet ports on the head node, and one Ethernet port on each >compute node. >Check the hardware you have. >Although PXE boot capability is not strictly required, it makes >installation much easier. >Check your motherboard and BIOS. > >I have a small cluster made of five salvaged Dell Precision 410 (dual >Pentium III) >running Rocks 4.3, and it works well. >For old hardware Rocks is a very good solution, requiring a modest >investment of time, >and virtually no money. >(In my case I only had to buy cheap SOHO switches and Ethernet cables, >but you probably already have switches.) > >If you are going to run parallel programs with MPI, >the cheapest thing would be to have GigE ports and switches. 
>I wouldn't invest on fancier interconnect on such an old machine. >(Do you have any fancier interconnect already, say Myrinet?) >However, you can buy cheap GigE NICs for $15-$20, and high end ones (say >Intel Pro 1000) for $30 or less. >This would be needed only if you don't have GigE ports on the nodes already. >Probably your motherboards have dual GigE ports, I don't know. >MPI over 100T Ethernet is a real pain, don't do it, unless you are a >masochist. >A 64-port GigE switch to support MPI traffic would also be a worthwhile >investment. >Keeping MPI on a separate network, distinct from the I/O and cluster >control net, is a good thing. >It avoids contention and improves performance. > >A natural precaution would be to backup all home directories before you >start, >and any precious data or filesystems. > >I suggest sorting out the hardware issues before anything else. > >It would be good to evaluate the status of your RAID, >and perhaps use that particular node as a separate storage appliance. >You can try just rebuilding the RAID, and see if it works, or perhaps >replace the defective disk(s), >if the RAID controller is still good. > >Another thing to look at is how functional your Ethernet (or GigE) >switch or switches are, >and if you have more than one switch how they are/can be connected to >each other. >(One for the whole cluster? Two or more separate? Some specific topology >connecting many switches?) > >I hope this helps, >Gus Correa > > > From landman at scalableinformatics.com Mon Dec 8 13:31:12 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <493D8EFB.5080004@ldeo.columbia.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D6B43.2090004@ldeo.columbia.edu> <493D8EFB.5080004@ldeo.columbia.edu> Message-ID: <493D9220.3080200@scalableinformatics.com> Gus Correa wrote: >> I really wish I could get away with using ROCKS as there would be such a >> greater reach back for me over SUSE. Right now I am exploring >> AutoYast to >> push the OS out to the compute nodes, This is what we use for building SuSE clusters. It is actually not painful. We have operational autoyast.xml for 10.x and 9.3. Basically you boot it with a pointer to the autoyast file and get out of the way. [...] > That is true. > Tell bosses they are asking you to reinvent the Rocks wheel. Hmmm .... Rocks isn't everything, and there are a number of criticisms that certainly could be leveled at the system. I won't go there right now, but it is worth noting, just as with Linux != Redhat, Linux Cluster != Rocks clusters. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From mathog at caltech.edu Mon Dec 8 13:55:09 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] cloning issue, hidden module dependency Message-ID: Joe Landman wrote: > Well RHEL is annoying in that if you decide to use a custom kernel and a > software raid, you are, for lack of a better term, toast (if you stick > with their tools/config). My compute nodes aren't that complicated. 
The straw that broke this particular camel's back was a decision (presumably by Mandriva, maybe by RedHat) to change in the kernel config BLK_DEV_IDE and BLK_DEV_IDEDISK from y to m, similarly, DEV_AMD74XX (and etc.) also changed from y to m. As a consequence, they went from a system where a simple initrd would boot anywhere (as all the needed drivers were built into the kernel) to one where a much more complex initrd ended up being highly machine specific. In terms of having it "just work", having the disk drivers built into the kernel is a lot simpler. Ubuntu 8.04.1 also has these as modules, and it has an immense initrd. (Even so, it might not have worked on my S2466 systems because the initrd did not build an AMD74XX module.) On the plus side, I did finally figure out how to PXE boot the PLD rescue CD, which in the last week has helped me escape from a couple of tight spots. The tricky part was that the online instructions said to use this in the APPEND line: initrd=rescue_pld_201/rescue.cpi,rescue_pld_201/custom/custom.cpi and that doesn't work, at least not for me. The two cpio archives are treated as if they were one file name, which of course does not exist. For future reference, it is done this way:
1. download and mount the PLD rescue CD ISO, copy the file hierarchy into /tftpboot/rescue_pld_201 (or whatever)
2. Put this in /tftpboot/pxelinux.cfg/default
   LABEL PLD_X86_201
     KERNEL rescue_pld_201/boot/isolinux/vmlinuz
     APPEND initrd=rescue_pld_201/rescue.cpi root=/dev/ram0
3. Put this in /tftpboot/message.txt
   PLD_X86_201 : PLD rescue disk 2.01 (hdX disks)
PXE boot a node, choose PLD_X86_201 on the PXE menu on its console, and it comes up at the text prompt for the PLD rescue CD. This provides a lot more tools than boel, but it is still light enough at 58M to boot over a 100baseT network. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech
From gus at ldeo.columbia.edu Mon Dec 8 14:41:00 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <493D9220.3080200@scalableinformatics.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D6B43.2090004@ldeo.columbia.edu> <493D8EFB.5080004@ldeo.columbia.edu> <493D9220.3080200@scalableinformatics.com> Message-ID: <493DA27C.40707@ldeo.columbia.edu> Hi Joe, Steve, list Joe Landman wrote: > Gus Correa wrote: >> That is true. >> Tell bosses they are asking you to reinvent the Rocks wheel. > Hmmm .... Rocks isn't everything, There is no doubt about this ... > and there are a number of criticisms that certainly could be leveled > at the system. I won't go there right now, but it is worth noting, > just as with Linux != Redhat, Linux Cluster != Rocks clusters. ... or about that either. The argument here, which was lost in previous emails, is that for 5-year-old cluster hardware such as this, from which very high performance is not expected, Rocks is a quite convenient and cost-effective solution. It won't take much effort and time to install it out of the box and have the cluster up and running, and the basic tools to administer the cluster will also be available, with no major tweaking required. That is how I maintain a little Pentium III cluster, and my 1993 Honda. :) Would you take such a jewel to the dealership for an oil change? In any case, Steve has a requirement to use SUSE, and this rules Rocks out.
Cheers, Gus Correa -- --------------------------------------------------------------------- Gustavo J. Ponce Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA ---------------------------------------------------------------------
From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Dec 8 14:40:35 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] cloning issue, hidden module dependency In-Reply-To: References: Message-ID: On Mon, 8 Dec 2008, David Mathog wrote: > The straw that broke this particular camel's back was a decision > (presumably by Mandriva, maybe by RedHat) to change in the kernel > config BLK_DEV_IDE and BLK_DEV_IDEDISK from y to m, similarly, > DEV_AMD74XX (and etc.) also changed from y to m. You were just lucky previously that Red Hat engineers thought it a good idea to put those into the kernel. How would you have felt if you were booting an all-SCSI (to stay with old tech) system, where the IDE drivers present in the kernel would not have helped ? > As a consequence, they went from a system where a simple initrd > would boot anywhere (as all the needed drivers were built into the > kernel) to one where a much more complex initrd ended up being > highly machine specific. Sorry to disappoint you... the initrd was always machine specific. All Red Hat docs specify that after modifying /etc/modules.conf or /etc/modprobe.conf the initrd should be regenerated via mkinitrd so that the next boot will use the proper drivers/settings. As to the complexity of the initrd: my current method of choice for setting up compute nodes is to sync a root FS from the master server during the initrd, which means that I have to build an initrd. As I already know what hardware components are in the node (which is also the case f.e. when I run mkinitrd), it's easy to just add these modules to the initrd archive and insert a few 'insmod module.ko' in the proper order in the init script. Having a monolithic kernel that "just works" on a large variety of hardware means answering "y" to most drivers; the kernel itself would then grow as large as the "immense initrd" that you mention. How would that be better ? -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de
From alsimao at gmail.com Fri Dec 5 09:52:36 2008 From: alsimao at gmail.com (Alcides Simao) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 58, Issue 9 In-Reply-To: <200812051644.mB5GhqRt029376@bluewest.scyld.com> References: <200812051644.mB5GhqRt029376@bluewest.scyld.com> Message-ID: <7be8c36b0812050952h4225e5d3hd15bc9431906ead3@mail.gmail.com> Hello all! I was thinking of how to 'empower' a Beowulf cluster. I remember that a while ago an Intel Atom was successfully overclocked to 2.4 GHz. Would it be possible to build a cooling apparatus good enough to raise the clock speed of a Beowulf's CPUs? Cumpz Alcides -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081205/0f83ffb9/attachment.html From rssr at lncc.br Fri Dec 5 12:21:36 2008 From: rssr at lncc.br (rssr@lncc.br) Date: Wed Nov 25 01:08:02 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <20081205185304.GA27201@bx9> References: <20081205124843.GM11544@leitl.org> <5FB03D13-79AF-48B2-8B68-A60A2559E26B@xs4all.nl> <4939734C.9060604@ias.edu> <20081205185304.GA27201@bx9> Message-ID: <42238.128.83.67.198.1228508496.squirrel@webmail.lncc.br> Hi The problem is not only the size, the most important is? how we can access the menory, Bus or other topology.? Do not forget? the memory refresh time that is always bigger than the memory access time Renato > On Fri, Dec 05, 2008 at 01:30:36PM -0500, Prentice Bisbal wrote: > >> Not exactly. 2 GB DIMMs are cheap, but as soon as you go to larger DIMMs >> (4 GB, 8 GB, etc.), the price goes up exponentially. > > My last quote for 2GB and 4GB dimms was linear. > > -- greg > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081205/8c3bcd3b/attachment.html From spambox at emboss.co.nz Fri Dec 5 12:36:44 2008 From: spambox at emboss.co.nz (Michael Brown) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: References: <20081205124843.GM11544@leitl.org> Message-ID: <1280DBE407554B99A636961A75B1DDD1@Forethought> Mark Hahn wrote: >> (Well, duh). > > yeah - the point seems to be that we (still) need to scale memory > along with core count. not just memory bandwidth but also concurrency > (number of banks), though "ieee spectrum online for tech insiders" > doesn't get into that kind of depth :( I think this needs to be elaborated a little for those who don't know the layout of SDRAM ... A typical chip that may be used in a 4 GB DIMM would be a 2 Gbit SDRAM chip, of which there would be 16 (total 32 Gbits = 4 Gbytes). Each chip contributes 8 bits towards the 64-bit DIMM interface, so there's two "ranks", each comprised of 8 chips. Each rank operates independently from the other, but share (and are limited by) the bandwidth of the memory channel. From here I'm going to be using the Micron MT47H128M16 as the SDRAM chip, because I have the datasheet, though other chips are probably very similar. Each SDRAM chip internally is make up of 8 banks of 32 K * 8 Kbit memory arrays. Each bank can be controlled seperately but shares the DIMM bandwidth, much like each rank does. Before accessing a particular memory cell, the whole 8 Kbit "row" needs to be activated. Only one row can be active per bank at any point in time. Once the memory controller is done with a particular row, it needs to be "precharged", which basically equates to writing it back into the main array. Activating and precharging are relatively expensive operations - precharging one row and activating another takes at least 11 cycles (tRTP + tRP) and 7 cycles (tRCD) respectively at top speed (DDR2-1066) for the Micron chips mentioned, during which no data can be read from or written to the bank. Precharging takes another 4 cycles if you've just written to the bank. The second thing to know is that processors operate in cacheline sized blocks. Current x86 cache lines are 64 bytes, IIRC. 
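(Quick aside: the sizes and penalties above are easy to sanity-check with back-of-the-envelope arithmetic. Purely as an illustration - a throwaway bash sketch using the 8-bank / 32K-row / 8-Kbit-row organisation and the DDR2-1066 timings quoted above, not anything read back from real hardware:

   # one chip: 8 banks x 32K rows x 8 Kbit per row
   echo $(( 8 * 32768 * 8192 ))                   # 2147483648 bits = 2 Gbit
   # sixteen such chips on the DIMM, as two ranks of eight
   echo $(( 16 * 2147483648 / 8 / (1024 ** 3) ))  # 4 (GB)
   # worst-case row change: precharge (tRTP+tRP) then activate (tRCD)
   echo $(( 11 + 7 ))                             # 18 cycles, plus 4 more if the bank was just written

That 18-cycle precharge+activate figure is the one that matters below.)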
In a dual-channel system with channel interleaving, odd-numbered cachelines come from one channel, and even numbered cachelines from the other. So each cacheline fill requires 8 bytes read per chip (which fits in nicely with the standard burst length of 8, since each read is 8 bits), coming out to 128 cachelines per row. Like channel interleaving, bank interleaving is also used. So: [] Cacheline 0 comes from channel 0, bank 0 [] Cacheline 1 comes from channel 1, bank 0 [] Cacheline 2 comes from channel 0, bank 1 [] Cacheline 3 comes from channel 1, bank 1 : : [] Cacheline 14 comes from channel 0, bank 7 [] Cacheline 15 comes from channel 1, bank 7 So this pattern repeats every 1 KB, and every 128 KB a new row needs to be opened on each bank. IIRC, rank interleaving is done on AMD quad-core processors, but not the older dual-core processors nor Intel's discrete northbridges. I'm not sure about Nehalem. This is all fine and dandy on a single-core system. The bank interleaving allows the channel to be active by using another bank when one bank is being activated or precharged. With a good prefetcher, you can hit close to 100% utilization of the channel. However, it can cause problems on a multi-core system. Say if you have two cores, each scanning through separate 1 MB blocks of memory. Each core is demanding a different row from the same bank, so the memory controller has to keep on changing rows. This may not appear to be an issue at first glance - after all, we have 128 cycles between each CPU hitting a particular bank (8 bursts * 8 cycles per burst * 2 processors sharing bandwidth), so we've got 64 cycles between row changes. That's over twice what we need (unless we're using 1 GB or smaller DIMMS, which only have 4 pages so things become tight). The killer though is latency - instead of 4-ish cycles CAS delay per read, we're now looking at 22 for a precharge + activate + CAS. In a streaming situation, this doesn't hurt too much as a good prefetcher would already be indicating it needs the next cacheline. But if you've got access patterns that aren't extremely prefetcher-friendly, you're going to suffer. Simply cranking up the number of banks doesn't help this. You've still got thrashing, you're just thrashing more banks. Turning up the cacheline size can help, as you transfer more data per stall. The extreme solution is to turn off bank interleaving. Our memory layout now looks like: [] Cacheline 0 comes from channel 0, bank 0, row 0, offset 0 bits [] Cacheline 1 comes from channel 1, bank 0, row 0, offset 0 bits [] Cacheline 2 comes from channel 0, bank 0, row 0, offset 64 bits [] Cacheline 3 comes from channel 1, bank 0, row 0, offset 64 bits : : [] Cacheline 254 comes from channel 0, bank 0, row 0, offset 8 K - 64 bits [] Cacheline 255 comes from channel 1, bank 0, row 0, offset 8 K - 64 bits [] Cacheline 256 comes from channel 0, bank 0, row 1, offset 0 bits [] Cacheline 257 comes from channel 1, bank 0, row 1, offset 0 bits So a new row every 16 KB, and a new bank every 512 MB (and a new rank every 4 GB). For a single core, this generally doesn't have a big effect, since the 18 cycle precharge+activate delay can often be hidden by a good prefetcher, and in any case only comes around every 16 KB (as opposed to every 128 KB for bank interleaving, so it's a bit more frequent, though for large memory blocks it's a wash). However, this is a big killer for multicore - if you have two cores walking through the same 512 MB area, they'll be thrashing the same bank. 
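To make the two layouts concrete, here is a toy address decoder - only a bash sketch of the mapping as described above (64-byte cachelines, 2 channels, 8 banks, 8 KB rows, and 512 MB per bank in the non-interleaved case), not of any real memory controller, and the example address is made up:

   addr=$(( 0x00A12340 ))   # arbitrary physical address, just for illustration
   cl=$(( addr / 64 ))      # cacheline index
   # bank-interleaved layout: channel on bit 6, bank on bits 7-9, new row every 128 KB
   echo "interleaved:    chan $(( cl % 2 ))  bank $(( (cl / 2) % 8 ))  row $(( addr / (128 * 1024) ))"
   # bank interleaving off: channel still on bit 6, new row every 16 KB, new bank every 512 MB
   echo "no interleave:  chan $(( cl % 2 ))  bank $(( addr / (512 * 1024 * 1024) % 8 ))  row $(( addr / 16384 % 32768 ))"

Two cores streaming through blocks that map to the same bank keep forcing that 18-cycle precharge+activate on it, which is the thrashing just described.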
Not only does latency suffer, but bandwidth as well since the other 7 banks can't be used to cover up the wasted time. Every 8 cycles of reading will require 18 cycles of sitting around waiting for the bank, dropping bandwidth by about 70%. However, with proper OS support this can be a bit of a win. By associating banks (512 MB memory blocks) to cores in the standard NUMA way, each core can be operating out of its own bank. There's no bank thrashing at all, which allows much looser requirements on activation and precharge, which in turn can allow higher speeds. With channel interleaving, we can have up to 8 cores/threads operating in this way. With independent channels (ala Barcelona) we can do 16. Of course, this isn't ideal either. A row change will stall the associated CPU and can't be hidden, so ideally we want at least 2 banks per CPU, interleaved. Also, shared memory will be hurt under this scheme (bandwidth and latency) since it will experience bank thrashing and will only have 2 banks. To cover the activate and precharge times, we need at least 4 banks, so for a quad core CPU we need a total of 16 memory banks in the system, partly interleaved. 8 banks per core can improve performance further with certain access patterns. Also, to keep good single-core performance, we'll need to use both channels. In this case, 4-way bank interleaving per channel (so two sets of 4-way interleaves), with channel interleaving and no rank interleaving would work, though again 8-way bank interleaving would be better if there's enough to go around. This setup is electronically obtainable in current systems, if you use two dual-rank DIMMS per channel and no rank interleaving. In this case, you have 8-way bank interleaving, with channel interleaving and with the 4 ranks in contiguous memory blocks. With AMD's Barcelona, you can get away with a single dual-rank DIMM per channel if you run the two channels independently (though in this case single-threaded performance is compromised, because each core will tend to only access memory on a single controller). An 8-thread system like Nehalam + hyperthreading would ideally like 64 banks. Because of Nehalem's wonky memory controller (seriously, who was the guy in charge who settled on three channels? I can imagine the joy of the memory controller engineers when they found out they'd have to implement a divide-by-three in a critical path) it'd be a little more difficult to get working there, though there's still enough banks to go around (12 banks per thread). However, I'm not sure of any OSes that support this quasi-NUMA. I'm guessing it could be hacked into Linux without too much trouble, given that real NUMA support is already there. It's something I've been meaning to look into for a while, but I've never had the time to really get my hands dirty trying to figure out Linux's NUMA architecture. Cheers, Michael From crhea at mayo.edu Sat Dec 6 13:22:38 2008 From: crhea at mayo.edu (Cris Rhea) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Re: Multicore Is Bad News For Supercomputers In-Reply-To: <200812062000.mB6K07M9029867@bluewest.scyld.com> References: <200812062000.mB6K07M9029867@bluewest.scyld.com> Message-ID: <20081206212238.GA23215@kaizen.mayo.edu> ----- "Prentice Bisbal" wrote: > Dell and others advertise systems that support up > to 128 GB RAM, but I have yet to meet someone who > can afford to put all 128 GB RAM in a single box. They aren't *that* expensive these days... 
the key for these boxes is that they have 4 CPU sockets-- this allows one to use lower-density DIMMS than trying to put 128GB on a dual socket board. Without getting into discounts, a fairly decked-out Dell R905 (4 x quad-core 2.7GHz Opteron, 128GB memory) is under $35K (USD) (Assuming no Microsoft Licenses). If you have large memory apps (and users who don't want to break them down to run on cluster nodes), these are sweet machines for the money. --- Cris -- Cristopher J. Rhea Mayo Clinic - Research Computing Facility 200 First St SW, Rochester, MN 55905 crhea@Mayo.EDU (507) 284-0587 From james.p.lux at jpl.nasa.gov Mon Dec 8 15:14:52 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <493DA27C.40707@ldeo.columbia.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D6B43.2090004@ldeo.columbia.edu> <493D8EFB.5080004@ldeo.columbia.edu> <493D9220.3080200@scalableinformatics.com> <493DA27C.40707@ldeo.columbia.edu> Message-ID: > very high performance is not expected, Rocks is a quite > convenient and cost-effective solution. > > That is how I maintain a Pentium III little cluster, and my > 1993 Honda. :) Would you take such a jewel to the dealership > for an oil change? > You put rocks in the crankcase of your 93 Honda? Doesn't that make a lot of noise at high revs? Jim From mathog at caltech.edu Mon Dec 8 15:15:40 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] cloning issue, hidden module dependency Message-ID: Bogdan Costescu wrote: > Having a monolithic kernel that "just works" on a large variety of > hardware means answering "y" to most drivers; the kernel itself would > then grow as large as the "immense initrd" that you mention. I don't think so. It only has to work with a large variety of disks, and at that, not necessarily at optimal speeds. Basically it just has to function well enough to access the OS files on disk, where the rest of the modules are, so that those drivers can be loaded later. The boot kernel need not have every video, network, etc. driver in it. In any case, the sizes of the vmlinuz/initrd files discussed so far are: Distro Kernel vmlinuz initrd Kernel has IDE builtin Mandriva 2007.1 (2.6.19.3) 1607583 357892 Y Mandriva 2008.1 (2.6.24.7) 1787352 2214302 N Ubuntu 8.04.1 (2.6.24.16) 1903448 7906356 N Sure this is apples and oranges, but to me it looks like taking the IDE stuff (and maybe other drivers) out of the boot kernel is resulting in larger and larger initrd files, with the size of initrd going up faster than the size of vmlinuz, by a lot. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From gus at ldeo.columbia.edu Mon Dec 8 16:32:20 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D6B43.2090004@ldeo.columbia.edu> <493D8EFB.5080004@ldeo.columbia.edu> <493D9220.3080200@scalableinformatics.com> <493DA27C.40707@ldeo.columbia.edu> Message-ID: <493DBC94.9030201@ldeo.columbia.edu> Hello James, list Lux, James P wrote: >>very high performance is not expected, Rocks is a quite >>convenient and cost-effective solution. 
>> >>That is how I maintain a Pentium III little cluster, and my >>1993 Honda. :) Would you take such a jewel to the dealership >>for an oil change? >> >> >> > >You put rocks in the crankcase of your 93 Honda? Doesn't that make a lot of noise at high revs? > >Jim > > Had I figured how to break polymers of Silicon, rather than polymers of Carbon, to fuel my 1993 Honda (that lovely XX century relic), I might as well sell the technology to NASA, for the rovers. No more heavy batteries or big solar panels required. Get a sack of Martian pebbles, and move! No global warming either, environmentally friendly just like the Flintstones. Unfortunately the reactions are not exothermic, as Bowen discovered long ago: http://en.wikipedia.org/wiki/Bowen's_reaction_series To break pyroxene into olivine, you need to heat things up, and oxidation (weathering) takes a looong time ... Not like burning octane into CO2, where to make a big boom all you need is air and a spark. In any case, I use rocks in the trunk of my Honda, for winter ballast against skidding on the snow. Very effective, and high-tech. I recommended to BMW and other brands, for improved stability. Gus Correa -- --------------------------------------------------------------------- Gustavo J. Ponce Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA --------------------------------------------------------------------- From lindahl at pbm.com Mon Dec 8 17:08:08 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Intro question In-Reply-To: <493D3AF5.2090806@sicortex.com> References: <4939541A.4000600@sicortex.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BA8@quadbrsex1.quadrics.com> <49395E22.6090707@scalableinformatics.com> <493D3AF5.2090806@sicortex.com> Message-ID: <20081209010808.GC28677@bx9> On Mon, Dec 08, 2008 at 10:19:17AM -0500, Lawrence Stewart wrote: > Well the NIC should read from cache or update the cache if the > data happens to be there. Don't all well designed I/O systems do that? There are a small number of systems that don't. Needless to say, it's a bit confusing for library writers to get I/O right in that circumstances. BTW, any interconnect that sends messages using PIO sends from cache. I wish people would invent a real receive-to-cache, since it would be nice overhead reducer for small messages on InfiniPath -- a small latency benefit (~ 7%), a bigger overhead benefit (~ 15%). -- greg From csamuel at vpac.org Mon Dec 8 19:06:49 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Odd SuperMicro power off issues In-Reply-To: <1227275645.3297031228791704093.JavaMail.root@mail.vpac.org> Message-ID: <1100390970.3297111228792009220.JavaMail.root@mail.vpac.org> ----- "Chris Samuel" wrote: > Does anyone have any bright ideas ? Wow, thanks so much to everyone who responded on this both to the list and in private, very much appreciated! Given there were so many of these I thought I'd try and comment on the main points that people raised rather than reply individually. 1) Power (lots of people) The vendor swapped in a new PSU in one of these nodes this morning, so we are resuming attempts to reproduce this failure now. 
The odd thing that we've noticed is that this often seems to happen when the node is only partly loaded (though not exclusively); for instance at one point we saw a node fail with Fluent running on 4 cores and a home grown code on another core (3 spare). 2) HT lockups (Scott and potentially Don) We've seen the same "System Firmware Error" messages on some of our nodes, sometimes associated with a system lockup, so we're going to look into BIOS upgrades. 3) Fluent Well we had a node power off this morning that wasn't running Fluent, but instead had a 4 CPU Gaussian job, some NAMD processes from various jobs and some random user compiled code. I don't know whether to be glad that I Fluent isn't so special or worried that other code can kill nodes. :-/ 4) IPMI (Bogdan) We wondered if the IPMI/BMC module might have done the power off too, but we would hope that we would see something in the logs. Anyway, we'll carry on with this using the hints and tips that people have provided and when (if?) we solve this I'll certainly update the list with what we find! Once again thanks so much to all of you who took the time to reply. All the best, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Mon Dec 8 19:48:04 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <1395903999.3297661228794272014.JavaMail.root@mail.vpac.org> Message-ID: <484419194.3297711228794483965.JavaMail.root@mail.vpac.org> ----- "John Hearns" wrote: > (*) Not a good idea in Great Britain, where arson in the Queen's > Dockyard is still a hanging offence, and I'd bet the judges would > say that a Naval Academy was part of a Dockyard. No longer the case I'm afraid (well, actually quite glad!): http://www.capitalpunishmentuk.org/abolish.html # On the 10th of December 1999, International Human Rights Day, # the government ratified Second Optional Protocol to the # International Covenant on Civil and Political Rights thus # totally abolishing capital punishment in Britain. -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From iioleynik at gmail.com Mon Dec 8 21:54:31 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Cluster quote Message-ID: I know that many readers of this forum work for cluser vendors. Therefore, I am sending this email to get some responses from interested parties. I am going to purchase a computational cluster very soon (by the end of this year) and would like to get a quote for the configuration: 36 compute nodes (no dedicated master node), node config: 2x AMD Shanghai Opteron 2380, 2.5 GHz CPUs per node, 8 GB (2x4) DDR2 667 GHz memory, 250 GB HD, IPMI, Infiniband DDR card. Networking: 36 port Infiniband DDR switch (Melanox, not interested in expensive Qlogic), 48 port managed Gigabit switch Rack: standard size, 3 simple PDUs (no expensive network managed) I would appreciate receiving your quotes asap. Best wishes, Ivan Oleynik -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/95231efa/attachment.html From jan.heichler at gmx.net Mon Dec 8 22:03:39 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Cluster quote In-Reply-To: References: Message-ID: <152838081.20081209070339@gmx.net> Hallo Ivan, since that list is read by readers from many countries and vendors are normally active in certain geographical areas you should specify where the cluster will be located... Jan Dienstag, 9. Dezember 2008, meintest Du: I know that many readers of this forum work for cluser vendors. Therefore, I am sending this email to get some responses from interested parties. I am going to purchase a computational cluster very soon (by the end of this year) and would like to get a quote for the configuration: 36 compute nodes (no dedicated master node), node config: 2x AMD Shanghai Opteron 2380, 2.5 GHz CPUs per node, 8 GB (2x4) DDR2 667 GHz memory, 250 GB HD, IPMI, Infiniband DDR card. Networking: 36 port Infiniband DDR switch (Melanox, not interested in expensive Qlogic), 48 port managed Gigabit switch Rack: standard size, 3 simple PDUs (no expensive network managed) I would appreciate receiving your quotes asap. Best wishes, Ivan Oleynik Bye Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/e1993213/attachment.html From iioleynik at gmail.com Mon Dec 8 22:18:30 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Re: Cluster quote In-Reply-To: References: Message-ID: In my previous post I forgot to mention location of my cluster: Tampa, FL, USA. Thanks to Jan Heichler (from Germany?) who sent me this reminder. Ivan On Tue, Dec 9, 2008 at 12:54 AM, Ivan Oleynik wrote: > I know that many readers of this forum work for cluser vendors. Therefore, > I am sending this email to get some responses from interested parties. > > I am going to purchase a computational cluster very soon (by the end of > this year) and would like to get a quote for the configuration: > > 36 compute nodes (no dedicated master node), > > node config: 2x AMD Shanghai Opteron 2380, 2.5 GHz CPUs per node, 8 GB > (2x4) DDR2 667 GHz memory, 250 GB HD, IPMI, Infiniband DDR card. > > Networking: 36 port Infiniband DDR switch (Melanox, not interested in > expensive Qlogic), 48 port managed Gigabit switch > > Rack: standard size, 3 simple PDUs (no expensive network managed) > > I would appreciate receiving your quotes asap. > > Best wishes, > > Ivan Oleynik > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/ed91ac0f/attachment.html From eugen at leitl.org Mon Dec 8 23:51:40 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Multicore Is Bad News For Supercomputers In-Reply-To: <1280DBE407554B99A636961A75B1DDD1@Forethought> References: <20081205124843.GM11544@leitl.org> <1280DBE407554B99A636961A75B1DDD1@Forethought> Message-ID: <20081209075140.GZ11544@leitl.org> On Sat, Dec 06, 2008 at 07:36:44AM +1100, Michael Brown wrote: > I think this needs to be elaborated a little for those who don't know the > layout of SDRAM ... Thank you, most useful information. [SNIP] I don't think this is very applicable to custom DRAM stacked on top of core, or SRAM/eDRAM (eventually MRAM?) in the core (e.g. like the Cell does it). 
There the most natural way is structure it into very wide words, and access it a that way. Add an array of ALUs on top of it along with shifts, n-bit swaps and the like and you'll get a very beefy machine on each die. Add a router to each die, and you've got potential for wafer-scale integration, by routing around dead dies from production or dynamically remapping failed grains during operation. This might not look like commodity, but eventually graphics accelerators must go there due to memory bandwidth limitations, and eventually CPUs will converge. -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From tjrc at sanger.ac.uk Tue Dec 9 01:07:14 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] cloning issue, hidden module dependency In-Reply-To: References: Message-ID: <09FE4A3A-EB6B-4A10-A30F-7FA659806BD8@sanger.ac.uk> On 8 Dec 2008, at 11:15 pm, David Mathog wrote: > Bogdan Costescu wrote: > >> Having a monolithic kernel that "just works" on a large variety of >> hardware means answering "y" to most drivers; the kernel itself would >> then grow as large as the "immense initrd" that you mention. > > I don't think so. It only has to work with a large variety of disks, > and > at that, not necessarily at optimal speeds. Basically it just has to > function well enough to access the OS files on disk, where the rest of > the modules are, so that those drivers can be loaded later. The boot > kernel need not have every video, network, etc. driver in it. > > In any case, the sizes of the vmlinuz/initrd files discussed so far > are: > > Distro Kernel vmlinuz initrd Kernel has IDE builtin > Mandriva 2007.1 (2.6.19.3) 1607583 357892 Y > Mandriva 2008.1 (2.6.24.7) 1787352 2214302 N > Ubuntu 8.04.1 (2.6.24.16) 1903448 7906356 N > > Sure this is apples and oranges, but to me it looks like taking the > IDE > stuff (and maybe other drivers) out of the boot kernel is resulting > in larger and larger initrd files, with the size of initrd going up > faster than the size of vmlinuz, by a lot. Ubuntu put quite a lot of other stuff into the initrd which has nothing to do with device drivers. For example, the initrd includes casper and all its support scripts, which provide support for things like persistent USB storage when running as a Live CD. But they also do put the entire kitchen sink in there in terms of device drivers; Ubuntu does aim to cover as wide as possible a range of possible hardware. It's very easy to build a custom kernel package with just what you want using the 'make-kpkg' command, and then you can strip out all the extraneous cruft if you want (I never bother - it's only modules that don't get loaded, and the only performance issue would be if you're PXE booting) Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. 
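For reference, the make-kpkg route Tim mentions goes roughly like this - a sketch from memory, so treat the exact source directory, revision string and resulting package name as illustrative rather than gospel:

   cd /usr/src/linux-source-2.6.24    # assumes the linux-source tree is unpacked here
   make menuconfig                    # set the disk drivers to =y if you want a small (or no) initrd
   fakeroot make-kpkg --initrd --revision=custom.1.0 kernel_image
   dpkg -i ../linux-image-2.6.24*custom.1.0*.deb

make-kpkg itself comes from the kernel-package package on Debian/Ubuntu.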
From ajt at rri.sari.ac.uk Tue Dec 9 04:07:02 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] cloning issue, hidden module dependency In-Reply-To: <09FE4A3A-EB6B-4A10-A30F-7FA659806BD8@sanger.ac.uk> References: <09FE4A3A-EB6B-4A10-A30F-7FA659806BD8@sanger.ac.uk> Message-ID: <493E5F66.3050006@rri.sari.ac.uk> Tim Cutts wrote: > [...] > It's very easy to build a custom kernel package > with just what you want using the 'make-kpkg' command, and then you > can strip out all the extraneous cruft if you want (I never bother - > it's only modules that don't get loaded, and the only performance > issue would be if you're PXE booting) Hello, Tim. That's right, I PXE boot openMosix without an initrd, with the drivers needed to access the root filesystem built-in: Everything else is loaded as a module from /lib. Bye, Tony. -- Dr. A.J.Travis, University of Aberdeen, Rowett Institute of Nutrition and Health, Greenburn Road, Bucksburn, Aberdeen AB21 9SB, Scotland, UK tel +44(0)1224 712751, fax +44(0)1224 716687, http://www.rowett.ac.uk mailto:a.travis@abdn.ac.uk, http://bioinformatics.rri.sari.ac.uk/~ajt From herborn at usna.edu Tue Dec 9 05:45:30 2008 From: herborn at usna.edu (Steve Herborn) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <493D68C4.8060400@scalableinformatics.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D68C4.8060400@scalableinformatics.com> Message-ID: <85E4A12B2A64449A88808239EB84667C@dynamic.usna.edu> Joe; In relation to your Perl Motto; I'm more then appear that there is always more then one way to skin a cat and great debate will surround the subject. Sometimes the exercise can be useful, if not bloody. Unfortunately for me I'm not currently in a decision maker position on any of this and am being "directed" to do certain things in conjunction with a path that somebody already established, but it was in their mind not written down. The system's compute nodes were originally built to be "Stateful" and the current power player on my team wants it to remain that way. As things sit as of today I'm looking at either using AutoYast and am also evaluating Xcat to perform the task. The biggest issue with AutoYast is that it will assist me in getting the OS out to the Nodes; it really doesn't provide any of the Cluster Management Tools that I would like to get installed. Now you maybe asking yourself why "Stateful" Compute Nodes as I did. It appears to me at this time that along with occasionally using these nodes as part of a Cluster, they also use them as plain old Servers/Workstations as I've found User Accounts & home directories on some of the compute nodes. As I said in my first post I'm new to this position & organization and not quite sure with exactly how & for what the system is even used for. I was simply told to get'er up. Steven A. Herborn U.S. Naval Academy Advanced Research Computing 410-293-6480 (Desk) 757-418-0505 (Cell) -----Original Message----- From: Joe Landman [mailto:landman@scalableinformatics.com] Sent: Monday, December 08, 2008 1:35 PM To: Steve Herborn Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Personal Introduction & First Beowulf Cluster Question Steve Herborn wrote: > > > Good day to the group. I would like to make a brief introduction to > myself and raise my first question to the forum. 
> > > > My name is Steve Herborn and I am a new employee at the United States > Naval Academy in the Advanced Research Computing group which supports Greetings Steve > the IT systems used for faculty research. Part of my responsibilities > will be the care & feeding of our Beowulf Cluster which is a > commercially procured Cluster from Aspen Systems. It purchased & > installed about four or five years ago. As delivered the system was > originally configured with two Head nodes each with 32 compute nodes. > One head node was running SUSE 9.x and the other Head Node was running > // Scyld (version unknown) also with 32 compute nodes. While I don't > know all of the history, apparently this system was not very actively > maintain and had numerous hardware & software issues, to include losing > the array on which Scyld was installed. //Prior to my arrival a Ouch ... if you call the good folks at Aspen, they could help with that (ping me if you need a contact) > decision was made to reconfigure the system from having two different > head nodes running two different OS Distributions to one Head Node > controlling all 64 Compute Nodes. In addition SUSE Linux Enterprise > Server (10SP2) (X86-64) was selected as the OS for all of the nodes. Ok. > Now on to my question which will more then likely be the first of many. > In the collective group wisdom what would be the most efficient & Danger Will Robinson ... for the N people who answer, you are likely to get N+2 answers, and N/2 arguments going ... not a bad thing, but to steal from the Perl motto "there is more than one way to do these things ..." > effective way to "push" the SLES OS out to all of the compute nodes once > it is fully installed & configured on the Head Node. In my research First: Stateless (e.g. diskless) versus Stateful (e.g. local installation). Scyld is "stateless" though Don will likely correct me (as this is massively oversimpilfied). SuSE can be installed Stateless or Stateful. Its installation can be automated ... we have been doing this for years (one of the few vendors to have done this with SuSE). It can also be run diskless ... we have booted compute nodes with Infiniband to fully operational compute nodes visible in all aspects within the cluster in under 60 seconds. This is the case for 9.3, 10.x SuSE flavors. > I've read about various Cluster packages/distributions that have that > capability built in, such as ROCKS & OSCAR which appear to have the > innate capability to do this as well as some additional tools that would > be very nice to use in managing the system. However, from my current > research in appears that they do not support SLES 10sp2 for the AMD Rocks only supports Redhat and rebuilds, I wouldn't recommend it for the task as you have indicated. Oscar might be able to handle this, though I haven't kept up on it, so I am not sure how active it is. You want to look at xCat v2 (open source), and Warewulf/Perceus (open source). Our package (Tiburon) is not ready to be released, and we will likely make it a meta package atop Perceus at some point soon. Though it is used in production at several large commercial companies specifically for SuSE clusters. > 64-bit Architecture (although since I am so new at this I could be > wrong). Are there any other "free" (money is always an issue) products > or methodologies I should be looking at to push the OS out & help me > manage the system? It appears that a commercial product Moab Cluster See above. 
If you want a prepackaged system, likely you are going to need to spend money. Moab is a possibility, though for SuSE, I would recommend looking at Concurrent Thinking's appliance. It will cost money, but they solve pretty much all of the problems for you. > Builder will do everything I need & more, but I do not have the funds to > purchase a solution. I also certainly do not want to perform a manual > OS install on all 64 Compute Nodes. No... in all likelihood, you really don't want to do any installation to the nodes (stateless if possible). > > > > Thanks in advance for any & all help, advice, guidance, or pearls of > wisdom that you can provide this Neophyte. Oh and please don't ask why > SLES 10sp2, I've already been through that one with management. It is > what I have been provided & will make work. It's not an issue, though we recommend better kernels/kernel updates. Compared to the RHEL kernels, it uses modern stuff. Joe > > > > > > ** Steven A. Herborn ** > > * * U.S. * * ** Naval Academy ** > > ** Advanced Research Computing ** > > ** 410-293-6480 (Desk) ** > > ** 757-418-0505 (Cell) **** ** > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From herborn at usna.edu Tue Dec 9 05:47:28 2008 From: herborn at usna.edu (Steve Herborn) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com><381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu><493D6B43.2090004@ldeo.columbia.edu><493D8EFB.5080004@ldeo.columbia.edu><493D9220.3080200@scalableinformatics.com> <493DA27C.40707@ldeo.columbia.edu> Message-ID: <7ED78016F3BB4B1CB147EB8C212E608A@dynamic.usna.edu> -----Original Message----- From: Lux, James P [mailto:james.p.lux@jpl.nasa.gov] Sent: Monday, December 08, 2008 6:15 PM To: Gus Correa; Beowulf; Steve Herborn Subject: RE: [Beowulf] Personal Introduction & First Beowulf Cluster Question > very high performance is not expected, Rocks is a quite > convenient and cost-effective solution. > > That is how I maintain a Pentium III little cluster, and my > 1993 Honda. :) Would you take such a jewel to the dealership > for an oil change? > You put rocks in the crankcase of your 93 Honda? Doesn't that make a lot of noise at high revs? Jim And off we go on a rocky side-road. I was wondering how long that would take. 
:) From landman at scalableinformatics.com Tue Dec 9 06:12:34 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <85E4A12B2A64449A88808239EB84667C@dynamic.usna.edu> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D68C4.8060400@scalableinformatics.com> <85E4A12B2A64449A88808239EB84667C@dynamic.usna.edu> Message-ID: <493E7CD2.2090702@scalableinformatics.com> Steve Herborn wrote: > The system's compute nodes were originally built to be "Stateful" and the > current power player on my team wants it to remain that way. As things sit Ok, not a problem. > as of today I'm looking at either using AutoYast and am also evaluating Xcat > to perform the task. The biggest issue with AutoYast is that it will assist > me in getting the OS out to the Nodes; it really doesn't provide any of the > Cluster Management Tools that I would like to get installed. Which tools do you have in mind? The Autoyast package that we have set up for our customers installs the OS locally, as well as pdsh, ganglia, and several other tools. Then in our finishing scripts which the autoyast.xml file links to, we set up SGE, adjust NIS/mounts, ... As I indicated, we get operational compute nodes shortly after turning them on. The current version of autoyast.xml + finishing scripts we have constructed also builds a RAID0 for local scratch, uses xfs file systems for root and scratch, installs OFED RPMs (on SuSE), updates the kernel to a late model (2.6.23.14 or so) and does some sysctl tuning. > Now you maybe asking yourself why "Stateful" Compute Nodes as I did. It Not really ... end users and customers have preferences. Our job is to help them understand the good and bad elements of each. Once they understand, if they prefer to make the decision, then we have them decide and go from there. If they leave it up to us, we try to help them make the best choice. > appears to me at this time that along with occasionally using these nodes as > part of a Cluster, they also use them as plain old Servers/Workstations as > I've found User Accounts & home directories on some of the compute nodes. Ow. A central "enterprise" disk is definitely needed. > As I said in my first post I'm new to this position & organization and not > quite sure with exactly how & for what the system is even used for. I was > simply told to get'er up. :) Bug me offline if you want our autoyast.xml, and access to our finishing scripts (parts of our Tiburon package). Check out xCat as well. > > Steven A. Herborn > U.S. Naval Academy > Advanced Research Computing > 410-293-6480 (Desk) > 757-418-0505 (Cell) > > -----Original Message----- > From: Joe Landman [mailto:landman@scalableinformatics.com] > Sent: Monday, December 08, 2008 1:35 PM > To: Steve Herborn > Cc: beowulf@beowulf.org > Subject: Re: [Beowulf] Personal Introduction & First Beowulf Cluster > Question > > Steve Herborn wrote: >> >> Good day to the group. I would like to make a brief introduction to >> myself and raise my first question to the forum. >> >> >> >> My name is Steve Herborn and I am a new employee at the United States >> Naval Academy in the Advanced Research Computing group which supports > > Greetings Steve > >> the IT systems used for faculty research. 
Part of my responsibilities >> will be the care & feeding of our Beowulf Cluster which is a >> commercially procured Cluster from Aspen Systems. It purchased & >> installed about four or five years ago. As delivered the system was >> originally configured with two Head nodes each with 32 compute nodes. >> One head node was running SUSE 9.x and the other Head Node was running >> // Scyld (version unknown) also with 32 compute nodes. While I don't >> know all of the history, apparently this system was not very actively >> maintain and had numerous hardware & software issues, to include losing >> the array on which Scyld was installed. //Prior to my arrival a > > Ouch ... if you call the good folks at Aspen, they could help with that > (ping me if you need a contact) > >> decision was made to reconfigure the system from having two different >> head nodes running two different OS Distributions to one Head Node >> controlling all 64 Compute Nodes. In addition SUSE Linux Enterprise >> Server (10SP2) (X86-64) was selected as the OS for all of the nodes. > > Ok. > >> Now on to my question which will more then likely be the first of many. >> In the collective group wisdom what would be the most efficient & > > Danger Will Robinson ... for the N people who answer, you are likely to > get N+2 answers, and N/2 arguments going ... not a bad thing, but to > steal from the Perl motto "there is more than one way to do these things > ..." > >> effective way to "push" the SLES OS out to all of the compute nodes once >> it is fully installed & configured on the Head Node. In my research > > First: Stateless (e.g. diskless) versus Stateful (e.g. local > installation). Scyld is "stateless" though Don will likely correct me > (as this is massively oversimpilfied). SuSE can be installed Stateless > or Stateful. Its installation can be automated ... we have been doing > this for years (one of the few vendors to have done this with SuSE). It > can also be run diskless ... we have booted compute nodes with > Infiniband to fully operational compute nodes visible in all aspects > within the cluster in under 60 seconds. This is the case for 9.3, 10.x > SuSE flavors. > >> I've read about various Cluster packages/distributions that have that >> capability built in, such as ROCKS & OSCAR which appear to have the >> innate capability to do this as well as some additional tools that would >> be very nice to use in managing the system. However, from my current >> research in appears that they do not support SLES 10sp2 for the AMD > > Rocks only supports Redhat and rebuilds, I wouldn't recommend it for the > task as you have indicated. > > Oscar might be able to handle this, though I haven't kept up on it, so I > am not sure how active it is. > > You want to look at xCat v2 (open source), and Warewulf/Perceus (open > source). Our package (Tiburon) is not ready to be released, and we will > likely make it a meta package atop Perceus at some point soon. Though > it is used in production at several large commercial companies > specifically for SuSE clusters. > >> 64-bit Architecture (although since I am so new at this I could be >> wrong). Are there any other "free" (money is always an issue) products >> or methodologies I should be looking at to push the OS out & help me >> manage the system? It appears that a commercial product Moab Cluster > > See above. If you want a prepackaged system, likely you are going to > need to spend money. 
Moab is a possibility, though for SuSE, I would > recommend looking at Concurrent Thinking's appliance. It will cost > money, but they solve pretty much all of the problems for you. > >> Builder will do everything I need & more, but I do not have the funds to >> purchase a solution. I also certainly do not want to perform a manual >> OS install on all 64 Compute Nodes. > > No... in all likelihood, you really don't want to do any installation to > the nodes (stateless if possible). > >> >> >> Thanks in advance for any & all help, advice, guidance, or pearls of >> wisdom that you can provide this Neophyte. Oh and please don't ask why >> SLES 10sp2, I've already been through that one with management. It is >> what I have been provided & will make work. > > It's not an issue, though we recommend better kernels/kernel updates. > Compared to the RHEL kernels, it uses modern stuff. > > Joe > >> >> >> >> >> ** Steven A. Herborn ** >> >> * * U.S. * * ** Naval Academy ** >> >> ** Advanced Research Computing ** >> >> ** 410-293-6480 (Desk) ** >> >> ** 757-418-0505 (Cell) **** ** >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From kilian.cavalotti.work at gmail.com Tue Dec 9 06:35:03 2008 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation Message-ID: <200812091535.03920.kilian.cavalotti.work@gmail.com> Hi all, I'd be curious to know if some of you use or have some real-life experience with rear-door heat exchangers, such as those from SGI [1] or IBM [2]. I'm especially interested in feedback about condensation, and operational water temperature. [1]http://www.sgi.fr/synergie/EpisodeXI/articles/3g.shtml [2]http://www.ibm.com/servers/eserver/xseries/storage/pdf/IBM_RDHx_Spec_Sheet.pdf Thanks a lot! Cheers, -- Kilian From hearnsj at googlemail.com Tue Dec 9 06:52:03 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812091535.03920.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <9f8092cc0812090652t505a0d85y35c6a103cf17bf42@mail.gmail.com> 2008/12/9 Kilian CAVALOTTI > Hi all, > > I'd be curious to know if some of you use or have some real-life experience > with rear-door heat exchangers, such as those from SGI [1] or IBM [2]. > Killian, yes indeed. I manage both an SGI Altix with the rear-door heat exchangers, and an ICE cluster. We are lucky enough to have our own lake for a cooling pond. Grin. I think these are the cat's pyjamas - the SGI ones come in four horizontal 'stable doors' so you can swing one open for an extended amount of time to work on the rear of systems without overheating the whole rack. They enable us to run machines in some reasonably small spaces, and have been reliable. Contact me off-list please for temperature notes. John Hearns -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/6f1657b5/attachment.html From hearnsj at googlemail.com Tue Dec 9 07:04:57 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Personal Introduction & First Beowulf Cluster Question In-Reply-To: <493E7CD2.2090702@scalableinformatics.com> References: <386fa5610812020305n764d006dg606b2bf6461278a9@mail.gmail.com> <381BF20CBD854583BF28F0ED39E71E5A@dynamic.usna.edu> <493D68C4.8060400@scalableinformatics.com> <85E4A12B2A64449A88808239EB84667C@dynamic.usna.edu> <493E7CD2.2090702@scalableinformatics.com> Message-ID: <9f8092cc0812090704h50aae5b5v37366f3c3ca32f4a@mail.gmail.com> 2008/12/9 Joe Landman > > > Which tools do you have in mind? The Autoyast package that we have set up > for our customers installs the OS locally, as well as pdsh, ganglia, and > several other tools. Then in our finishing scripts which the autoyast.xml > file links to, we set up SGE, adjust NIS/mounts, ... > > As I indicated, we get operational compute nodes shortly after turning them > on. The current version of autoyast.xml + finishing scripts we have > constructed also builds a RAID0 for local scratch, uses xfs file systems for > root and scratch, installs OFED RPMs (on SuSE), updates the kernel to a late > model (2.6.23.14 or so) and does some sysctl tuning. > > That's how Streamline originally installed their clusters. Works fine - you do a generic SuSE install, and let the Autoyast tools do all the 'heavy lifting' then run a post-install script which integrates your nodes into the cluster, as Joe says by enabling NSI binding, copying across the batch startup script (yada yada...). I agree with Joe this would probably be a good way forward for you. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/40511412/attachment.html From hearnsj at googlemail.com Tue Dec 9 07:00:55 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812091535.03920.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <9f8092cc0812090700s3a19af52xa5917ec85ea5c03e@mail.gmail.com> 2008/12/9 Kilian CAVALOTTI > > [1]http://www.sgi.fr/synergie/EpisodeXI/articles/3g.shtml > If you look on SGI Techpubs you can find their site install guide http://techpubs.sgi.com/library/tpl/cgi-bin/summary.cgi?coll=hdwr&db=bks&docnumber=007-5021-001 Chapter 4 has the specs for the water cooled racks. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/b7fbe045/attachment.html From iioleynik at gmail.com Tue Dec 9 07:09:51 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <9f8092cc0812090652t505a0d85y35c6a103cf17bf42@mail.gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <9f8092cc0812090652t505a0d85y35c6a103cf17bf42@mail.gmail.com> Message-ID: John, What is the water rate requirement? Can it be fitted to any standard 42 rack, not only SGI made? How much did it cost (rough estimate would suffice)? 
Thanks, Ivan On Tue, Dec 9, 2008 at 9:52 AM, John Hearns wrote: > > > 2008/12/9 Kilian CAVALOTTI > >> Hi all, >> >> I'd be curious to know if some of you use or have some real-life >> experience >> with rear-door heat exchangers, such as those from SGI [1] or IBM [2]. >> > Killian, yes indeed. I manage both an SGI Altix with the rear-door heat > exchangers, and an ICE cluster. > We are lucky enough to have our own lake for a cooling pond. Grin. > > I think these are the cat's pyjamas - the SGI ones come in four horizontal > 'stable doors' so you can swing one open for an extended amount of time to > work on the rear of systems without overheating the whole rack. > They enable us to run machines in some reasonably small spaces, and have > been reliable. > > Contact me off-list please for temperature notes. > > John Hearns > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/ec93782b/attachment.html From lynesh at cardiff.ac.uk Tue Dec 9 07:14:46 2008 From: lynesh at cardiff.ac.uk (Huw Lynes) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <9f8092cc0812090652t505a0d85y35c6a103cf17bf42@mail.gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <9f8092cc0812090652t505a0d85y35c6a103cf17bf42@mail.gmail.com> Message-ID: <1228835686.21024.22.camel@w1199.insrv.cf.ac.uk> On Tue, 2008-12-09 at 14:52 +0000, John Hearns wrote: > > > 2008/12/9 Kilian CAVALOTTI > Hi all, > > I'd be curious to know if some of you use or have some > real-life experience > with rear-door heat exchangers, such as those from SGI [1] or > IBM [2]. > Killian, yes indeed. I manage both an SGI Altix with the rear-door > heat exchangers, and an ICE cluster. > We are lucky enough to have our own lake for a cooling pond. Grin. > > I think these are the cat's pyjamas - the SGI ones come in four > horizontal 'stable doors' so you can swing one open for an extended > amount of time to work on the rear of systems without overheating the > whole rack. How much cooling do you lose when opening the rack to do work on it? Thanks, Huw -- Huw Lynes | Advanced Research Computing HEC Sysadmin | Cardiff University | Redwood Building, Tel: +44 (0) 29208 70626 | King Edward VII Avenue, CF10 3NB From gerry.creager at tamu.edu Tue Dec 9 07:26:29 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812091535.03920.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <493E8E25.1060507@tamu.edu> Our p575 has cool doors. Our campus chill water temp is spec'd at 42F but ranges up as high as 48F. We are seeing no condensation I'm aware of, but I'll ask the operations guys. gerry Kilian CAVALOTTI wrote: > Hi all, > > I'd be curious to know if some of you use or have some real-life experience > with rear-door heat exchangers, such as those from SGI [1] or IBM [2]. > > I'm especially interested in feedback about condensation, and operational > water temperature. 
> > [1]http://www.sgi.fr/synergie/EpisodeXI/articles/3g.shtml > [2]http://www.ibm.com/servers/eserver/xseries/storage/pdf/IBM_RDHx_Spec_Sheet.pdf > > Thanks a lot! > Cheers, -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From hearnsj at googlemail.com Tue Dec 9 07:33:12 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <1228835686.21024.22.camel@w1199.insrv.cf.ac.uk> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <9f8092cc0812090652t505a0d85y35c6a103cf17bf42@mail.gmail.com> <1228835686.21024.22.camel@w1199.insrv.cf.ac.uk> Message-ID: <9f8092cc0812090733y4ece1f45i28247e9411044977@mail.gmail.com> 2008/12/9 Huw Lynes > > > How much cooling do you lose when opening the rack to do work on it? > > Good question! It must be about a quarter! Joking aside, if there's any interest I could take some IPMI temperature data before and after opening a door for (say) half an hour. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/69303f62/attachment.html From hearnsj at googlemail.com Tue Dec 9 08:06:44 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 58, Issue 9 In-Reply-To: <7be8c36b0812050952h4225e5d3hd15bc9431906ead3@mail.gmail.com> References: <200812051644.mB5GhqRt029376@bluewest.scyld.com> <7be8c36b0812050952h4225e5d3hd15bc9431906ead3@mail.gmail.com> Message-ID: <9f8092cc0812090806q64c6b2a7ub14505cca19339f6@mail.gmail.com> 2008/12/5 Alcides Simao > Hello all! > > I was thinking of how to 'enpower' a Beowulf cluster. I remember back a > while ago that a Intel Atom was overclocked sucessfully to 2.4 GHz > Could it be possible to build a cooling apparatus sufficient to upgrade the > velocity of the beowulf cpu? > I don't see why you could not run a cluster with overclocked CPUs and (say) some heatpipe coolers. Just don't ask your vendor for a warranty! Seriously though, regarding Intel Atom and funky cooling schemes, have a look at: http://www.theregister.co.uk/2008/11/20/sgi_molecule_concept/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081209/248dc4ec/attachment.html From rgb at phy.duke.edu Tue Dec 9 09:02:21 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India Message-ID: Daily Dec 09 2008 TOP NEWS from www.siliconindia.com: 8 Indian supercomputers enter global top 500 list With India making a mark in every sector of the technology field, the country has shown its importance in the supercomputing race too. Eight of the top 500 supercomputers are of India with Tata Group's Eka, a HP based system leading the race. Go India! You rock! (The crowd goes wild...:-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From lindahl at pbm.com Tue Dec 9 10:06:33 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] ntpd wonky? 
Message-ID: <20081209180633.GA21193@bx9> Ever since the US daylight savings time change, I've been seeing a lot of jitter in the ntp servers I'm synched to... I'm using the redhat pool. Has anyone else noticed this? On 200 machines I get several complaints per day of >100 ms jitter from my hourly check-ntp cronjob. -- greg From diep at xs4all.nl Tue Dec 9 10:45:08 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: References: Message-ID: Maybe some decades from now all power per flop wasting supercomputers will be located in India. In the long run, they're the only ones on the planet who can afford the energy real cheap, and supercomputers usually burn a lot more power per gflop than they should, power6 up to factor 10. So even existing government rules (within EU that is) already would forbid building supercomputers as they waste too much power per double precision gflop as compared to the objective norm. Vincent On Dec 9, 2008, at 6:02 PM, Robert G. Brown wrote: > > Daily Dec 09 2008 TOP NEWS from www.siliconindia.com: > > 8 Indian supercomputers enter global top 500 list > > With India making a mark in every sector of the technology field, > the > country has shown its importance in the supercomputing race too. > Eight > of the top 500 supercomputers are of India with Tata Group's Eka, > a HP > based system leading the race. > > Go India! You rock! > > (The crowd goes wild...:-) > > rgb > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From steffen.grunewald at aei.mpg.de Tue Dec 9 01:53:29 2008 From: steffen.grunewald at aei.mpg.de (Steffen Grunewald) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Tesla systems in Germany? Message-ID: <20081209095329.GY16423@casco.aei.mpg.de> Hi, I'm looking for someone in Germany who already has access to a Tesla system. I have received a request by a scientist for "a very powerful machine", and would like him to run some tests before spending and possibly wasting money. (To me it isn't clear whether his code would be suited at all, and he wasn't able to convince me...) Anyone? Cheers, Steffen -- Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/ * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298} No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html From diep at xs4all.nl Tue Dec 9 16:47:25 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:03 2009 Subject: Fwd: [Beowulf] Tesla systems in Germany? References: <20081209095329.GY16423@casco.aei.mpg.de> Message-ID: Nominated for "what i want to have for christmas" posting of the year 2008, from the beowulf mailing list: "a very powerful machine": Begin forwarded message: > From: Steffen Grunewald > Date: December 9, 2008 10:53:29 AM GMT+01:00 > To: Beowulf mailing list > Subject: [Beowulf] Tesla systems in Germany? > > Hi, > > I'm looking for someone in Germany who already has access to a > Tesla system. 
> I have received a request by a scientist for "a very powerful > machine", and > would like him to run some tests before spending and possibly > wasting money. > (To me it isn't clear whether his code would be suited at all, and > he wasn't > able to convince me...) > > Anyone? > > Cheers, > Steffen > > -- > Steffen Grunewald * MPI Grav.Phys.(AEI) * Am M?hlenberg 1, D-14476 > Potsdam > Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http:// > www.aei.mpg.de/ > * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon: > 7233,fax:7298} > No Word/PPT mails - http://www.gnu.org/philosophy/no-word- > attachments.html > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From diep at xs4all.nl Tue Dec 9 16:52:43 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Tesla systems in Germany? In-Reply-To: <20081209095329.GY16423@casco.aei.mpg.de> References: <20081209095329.GY16423@casco.aei.mpg.de> Message-ID: <893D66AE-D468-47AA-B83D-1EEC6D070DF5@xs4all.nl> heh Steffen, On a more serious note. What does your friend want to run at the machine for type of code? Have the algorithm in some sort of stripped format showing the working set size where you read from? Figuring out Tesla type devices is not so stupid right now. It has 240 cores @ 32 bits (either integer of floating point) clocked at say 1.2+ Ghz or so (1 instruction a cycle, forget the BS they quote online). Very powerful. Some algorithms can get rewritten. Would be fun to practice with some physicist code to rewrite it from memory intensive to instruction intensive code. As i have nothing to do with christmas i wanted to write some CUDA code anyway. Of course i have to rehearse dry as i have no CUDA set up devices here let alone budget to buy a 8800 card, let alone a Tesla. Vincent On Dec 9, 2008, at 10:53 AM, Steffen Grunewald wrote: > Hi, > > I'm looking for someone in Germany who already has access to a > Tesla system. > I have received a request by a scientist for "a very powerful > machine", and > would like him to run some tests before spending and possibly > wasting money. > (To me it isn't clear whether his code would be suited at all, and > he wasn't > able to convince me...) > > Anyone? > > Cheers, > Steffen > > -- > Steffen Grunewald * MPI Grav.Phys.(AEI) * Am M?hlenberg 1, D-14476 > Potsdam > Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http:// > www.aei.mpg.de/ > * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon: > 7233,fax:7298} > No Word/PPT mails - http://www.gnu.org/philosophy/no-word- > attachments.html > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From smulcahy at aplpi.com Wed Dec 10 00:21:13 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: References: Message-ID: <493F7BF9.70306@aplpi.com> Vincent Diepeveen wrote: > Maybe some decades from now all power per flop wasting supercomputers > will be located in India. > > In the long run, they're the only ones on the planet who can afford the > energy real cheap, > and supercomputers usually burn a lot more power per gflop than they > should, power6 up to factor 10. 
Iceland have energy literally pumping out of the ground - if they can sort out their connectivity to the US and Europe I think they'll quickly become the data centre to the world. Okay, they have some minor issues with seismic activity to deal with but you can't win em all. -stephen -- Stephen Mulcahy Applepie Solutions Ltd. http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) From hearnsj at googlemail.com Wed Dec 10 01:11:13 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Tesla systems in Germany? In-Reply-To: <20081209095329.GY16423@casco.aei.mpg.de> References: <20081209095329.GY16423@casco.aei.mpg.de> Message-ID: <9f8092cc0812100111k6ba52f0ev1ae3571bb1b0aa32@mail.gmail.com> 2008/12/9 Steffen Grunewald > Hi, > > I'm looking for someone in Germany who already has access to a Tesla > system. > I have received a request by a scientist for "a very powerful machine", and > would like him to run some tests before spending and possibly wasting > money. > In that case, why not just buy a standard Nvidia graphics card? They run the same CUDA code. You can run your tests and get an idea of possible speedups, or indeed if the code will run under CUDA, before committing to buy Tesla. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081210/6c56e56c/attachment.html From hearnsj at googlemail.com Wed Dec 10 01:18:02 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Tesla systems in Germany? In-Reply-To: <20081209095329.GY16423@casco.aei.mpg.de> References: <20081209095329.GY16423@casco.aei.mpg.de> Message-ID: <9f8092cc0812100118m23d2fdffp433e6a25addac202@mail.gmail.com> 2008/12/9 Steffen Grunewald > Hi, > > I'm looking for someone in Germany who already has access to a Tesla > system. > I have received a request by a scientist for "a very powerful machine", and > A "very powerful machine" could mean a lot of things - a cluster with a high core count. A large SMP machine with a huge amount of memory. A dedicated machine like the QCD calculators. As Vincent says, you need to look at what the code is before hitting the "I need Cuda" button. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081210/e17becc2/attachment.html From kilian.cavalotti.work at gmail.com Wed Dec 10 01:21:42 2008 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <493E8E25.1060507@tamu.edu> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <493E8E25.1060507@tamu.edu> Message-ID: <200812101021.42824.kilian.cavalotti.work@gmail.com> Hi Gerry, On Tuesday 09 December 2008 16:26:29 Gerry Creager wrote: > Our p575 has cool doors. Our campus chill water temp is spec'd at 42F > but ranges up as high as 48F. We are seeing no condensation I'm aware > of, but I'll ask the operations guys. Thanks, that's helpful. I was afraid that a low temp for chilled water would generate condensation on the pipes, or even on the doors themselves. 
Cheers, -- Kilian From diep at xs4all.nl Wed Dec 10 03:08:37 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: <493F7BF9.70306@aplpi.com> References: <493F7BF9.70306@aplpi.com> Message-ID: On Dec 10, 2008, at 9:21 AM, stephen mulcahy wrote: > Vincent Diepeveen wrote: >> Maybe some decades from now all power per flop wasting >> supercomputers will be located in India. >> In the long run, they're the only ones on the planet who can >> afford the energy real cheap, >> and supercomputers usually burn a lot more power per gflop than >> they should, power6 up to factor 10. > > Iceland have energy literally pumping out of the ground - if they > can sort out their connectivity to the US and Europe I think > they'll quickly become the data centre to the world. Okay, they > have some minor issues with seismic activity to deal with but you > can't win em all. > Now in iceland, i was there not so long ago, i won't say i saw the credit crisis come when i was there. That would be a bit overoptimistic. But realize it's just a fishers society with some sheep on the rocky (vulcano) ground and people living at high american standards driving around in jeeps with huge wheels as there are no roads. There is in total around 300k inhabitants there. Buying a hamburger there always was a big ripoff for tourists (i paid for a simple meal 15 euro or so), as everything gets imported to the island. So the industry that eats 90+% of all energy isn't there. Forget the idea of energy centrals the use heat from the underground. That's just to keep happy the environmental lobby which is bad in doing math. See below. These energy centrals are very expensive and the biggest and most expensive one produces a factor 40 less than what a normal nuclear reactor produces for a cheap price. Additionally a nuclear reactor for sure can produce coming 25 years whereas digging in the underground always is complicated and unsure business. Additionally there is going to be a new treaty within EU about CO2 reduction. Idea is to reduce 20% CO2 or so the coming years. Industry that can compete gets exempted from the treaty. Basically that's all industry, on paper that's 96% of all industry now, and i do not know why the other 4% were so stupid to not ask for an exception, maybe there application is still 4 layers down some office desk. Anyway the energy centrals are not excepted from this treaty, so CO2 reduction it will be for them. So these nuclear reactors will get built massively coming years and India can build most nuclear reactors of us all at a cheap price and they keep producing cheap there forever. Unlike Europe they probably do not have a '25 year limit' in which an energy central must pay itself back after which it has to get destroyed; practical there is a big need for energy so it still keeps producing for the coming 100 years. If each scientific commission of each nation is on its own deciding what type of nuclear reactor gets built, it's gonna be a watercooled reactor (very safe and cheap), which burns up the worldwide stockpile of easily extractable pile of uranium quickly, as the amount of reactors that's gonna get built in Europe is gonna be for sure more than tesla has stream processors. It is obvious that replacing these centrals by nuclear centrals is the only manner for energy industry to reduce CO2. 
Meanwhile when economy is going to boom again in Europe, of course Germany is going to use even more coals for industry (the 96% that falls in the exemption), so a year or 10 from now of course from the original plans to reduce CO2 output will be a joke of course. Be happy if it didn't double by 2018. In either case, it's good news for Australian export. Vincent > -stephen > > -- > Stephen Mulcahy Applepie Solutions Ltd. http:// > www.aplpi.com > Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, > Galway) > From hearnsj at googlemail.com Wed Dec 10 03:11:08 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Tesla systems in Germany? In-Reply-To: <20081210092947.GK16423@casco.aei.mpg.de> References: <20081209095329.GY16423@casco.aei.mpg.de> <9f8092cc0812100118m23d2fdffp433e6a25addac202@mail.gmail.com> <9f8092cc0812100111k6ba52f0ev1ae3571bb1b0aa32@mail.gmail.com> <20081210092947.GK16423@casco.aei.mpg.de> Message-ID: <9f8092cc0812100311s319dc3ebg6f34c3f464ada564@mail.gmail.com> 2008/12/10 Steffen Grunewald > > Cluster with high core count: this would give the opportunity to do stupid > things on the "several hundreds" scale, but not speed up the single stupid > thing. > Steffen, if I'm not wrong you have just restated Amdahl's Law. > > > As Vincent says, you need to look at what the code is before hitting the > "I > > need Cuda" button. > > Sometimes the approach to "throw enough money at a problem, and it will > resolve itself" is the easier one, compared with the need to power-up your > brains :( > > Thanks for your patience, No problem. Sounds to me actually like you need to encourage some code profiling, before saying that any particular machine is the answer to this one. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081210/f4c7a904/attachment.html From Dan.Kidger at quadrics.com Wed Dec 10 03:15:15 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: <493F7BF9.70306@aplpi.com> References: <493F7BF9.70306@aplpi.com> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> And I am sure Iceland would find it much easier to do the machine room cooling than say Spain or the Southern USA Daniel -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of stephen mulcahy Sent: 10 December 2008 08:21 To: Vincent Diepeveen Cc: Beowulf Mailing List; Robert G. Brown Subject: Re: [Beowulf] For grins...India Vincent Diepeveen wrote: > Maybe some decades from now all power per flop wasting supercomputers > will be located in India. > > In the long run, they're the only ones on the planet who can afford the > energy real cheap, > and supercomputers usually burn a lot more power per gflop than they > should, power6 up to factor 10. Iceland have energy literally pumping out of the ground - if they can sort out their connectivity to the US and Europe I think they'll quickly become the data centre to the world. Okay, they have some minor issues with seismic activity to deal with but you can't win em all. -stephen -- Stephen Mulcahy Applepie Solutions Ltd. http://www.aplpi.com Registered in Ireland, no. 
289353 (5 Woodlands Avenue, Renmore, Galway) _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Dec 10 05:37:54 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> Message-ID: On Wed, 10 Dec 2008, Dan.Kidger@quadrics.com wrote: > And I am sure Iceland would find it much easier to do the machine room > cooling than say Spain or the Southern USA Or the same people that are bringing you e-paper in e-book readers and superphones will figure out how to spray a processor core, a GB of sram, and a GB of nvram onto a piece of vinyl the size of a postage stamp that is powered by the spray-on-solar cell that is sprayed on top of it, and your kilonode supercomputer will become your desktop, literally, as long as you don't cover it all with papers so ambient light can't power it or spill your coffee on it. In the meantime, the advent of the overdue ice age will a) put a whole lot of climatologists out of business, but don't worry, the ones that aren't actually lynched will move on to get important work in public relations or cleaning roadsides wearing lovely orange jumpers or working in the coal-from-ground extraction industry in the forlorn hope that somehow getting enough CO_2 into the air will actually delay the inevitable progress of planetary orbits in interaction with the solar cycle; b) make the idea of a nice, warm desktop computer very attractive once again. DEC/Compaq/HP (which by then will have been take over by Toshiba) will trot out a new release of the Alpha and we will once again have a small computer that is entirely capable of heating a standard office. Iceland will become distinctly unfavorable real estate as it is once again covered with glaciers -- DEEP glaciers. Of course, so will Europe, most of North Asia and Canada down to roughly Ohio. The world will wistfully discover that global warming was actually rather a lovely dream, and that being warm, wet and fertile is GOOD even at the expense of some coastline where having 1/3 of the planet's surface, including most of its wheat growing regions, covered in permafrost is really, really bad. Bad. Did I mention that it won't be good? North Carolina, of course, will thrive, with a climate roughly like that of Nova Scotia today, and we'll do our best to accomodate all of you yankees reading this to tend our farms and bring us our mint juleps to sip. Just remember, you heard it here first. I estimate a roughly one in a hundred chance of the current low in the solar cycle triggering the next (expected) Maunder minimum perhaps by altering the thermohaline circulation that is some five or size orders of magnitude more important a global climate determiner than any greenhouse gas (with water a similar number of orders of magnitude more important than mere CO_2) into deep freeze mode. In any event, out here at 10,000+ years of interglacial (the second longest in the last ten) our ass is definitely hanging over the abyss and we may even live to see the fall. So cooling clusters may not be that much of a problem very, very soon. 
rgb > Daniel > > > -----Original Message----- > From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of stephen mulcahy > Sent: 10 December 2008 08:21 > To: Vincent Diepeveen > Cc: Beowulf Mailing List; Robert G. Brown > Subject: Re: [Beowulf] For grins...India > > Vincent Diepeveen wrote: >> Maybe some decades from now all power per flop wasting supercomputers >> will be located in India. >> >> In the long run, they're the only ones on the planet who can afford the >> energy real cheap, >> and supercomputers usually burn a lot more power per gflop than they >> should, power6 up to factor 10. > > Iceland have energy literally pumping out of the ground - if they can > sort out their connectivity to the US and Europe I think they'll quickly > become the data centre to the world. Okay, they have some minor issues > with seismic activity to deal with but you can't win em all. > > -stephen > > -- > Stephen Mulcahy Applepie Solutions Ltd. http://www.aplpi.com > Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From deadline at eadline.org Wed Dec 10 05:57:52 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Small request Message-ID: <33558.192.168.1.213.1228917472.squirrel@mail.eadline.org> Fellow geeks and/or other assorted HPC riff-raff: I have posted a one question survey as part of my weekly Linux Magazine column. It is about how the economy is affecting your HPC plans for next year. I also invite comments if you have any ... http://linux-mag.com/id/7198 There is a free registration required to post comments. I'll be summarizing the results of the poll and comments next week. Thanks -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Dec 10 06:50:26 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Small request In-Reply-To: <33558.192.168.1.213.1228917472.squirrel@mail.eadline.org> References: <33558.192.168.1.213.1228917472.squirrel@mail.eadline.org> Message-ID: <493FD732.2070002@ias.edu> Douglas Eadline wrote: > Fellow geeks and/or other assorted HPC riff-raff: > > I have posted a one question survey as part of my weekly > Linux Magazine column. It is about how the economy > is affecting your HPC plans for next year. I also > invite comments if you have any ... > > http://linux-mag.com/id/7198 > > There is a free registration required to post comments. > I'll be summarizing the results of the poll and comments > next week. > > Thanks > > -- > Doug > Doug, I was going to post a comment, but it wouldn't be anonymous. I thought your article mentioned that the comments would be anonymous.
-- Prentice From deadline at eadline.org Wed Dec 10 07:09:14 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Small request In-Reply-To: <493FD732.2070002@ias.edu> References: <33558.192.168.1.213.1228917472.squirrel@mail.eadline.org> <493FD732.2070002@ias.edu> Message-ID: <53464.192.168.1.213.1228921754.squirrel@mail.eadline.org> Well, you may be correct as it depends on how you registered with the site. Many people register for site with names like "clusterbunny@gmail.com" so they are somewhat anonymous. If f anyone has any comments they want kept anonymous, send them directly to me. And, because I suffer from CRS, your name will probably get dropped from my memory like a bad packet. -- Doug * Can't Remember Shit > Douglas Eadline wrote: >> Fellow geeks and/or other assorted HPC riff-raff: >> >> I have posted a one uestion survey as part of my weekly >> Linux Magazine column. It is about how the economy >> is effecting your HPC plans for next year. I also >> invite comments if you have any ... >> >> http://linux-mag.com/id/7198 >> >> There is a free registration required to post comments. >> I'll be summarizing the results of the poll and comments >> next week. >> >> Thanks >> >> -- >> Doug >> > > > Doug, > > I was going to post a comment, but it wouldn't be anonymous. I thought > your article mentioned that the comments would be anonymous. > > -- > Prentice > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From niftyompi at niftyegg.com Wed Dec 10 12:08:12 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812101021.42824.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <493E8E25.1060507@tamu.edu> <200812101021.42824.kilian.cavalotti.work@gmail.com> Message-ID: <20081210200812.GA3449@compegg.wr.niftyegg.com> On Wed, Dec 10, 2008 at 10:21:42AM +0100, Kilian CAVALOTTI wrote: > > Hi Gerry, > > On Tuesday 09 December 2008 16:26:29 Gerry Creager wrote: > > Our p575 has cool doors. Our campus chill water temp is spec'd at 42F > > but ranges up as high as 48F. We are seeing no condensation I'm aware > > of, but I'll ask the operations guys. > > Thanks, that's helpful. I was afraid that a low temp for chilled water would > generate condensation on the pipes, or even on the doors themselves. > Watch dew point numbers in the room. Dew point is dominantly a function of humidity... http://en.wikipedia.org/wiki/Dew_point If the dew point is higher than the chilled water temp condensation is possible if the heat exchanger surface cools that much. Condensation on normal cold water pipes and chillers in the large and small construction like home or office is common so the correct insulation materials are easy to find and install. Many frost free home refrigerators solve this problem by running the heated exhaust air over the catch pan so any frost/ condensation is promptly evaporated. 
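As a quick sanity check on that rule of thumb (condensation is possible once a surface sits below the room's dew point), the dew point can be estimated from dry-bulb temperature and relative humidity with the Magnus approximation. The short C sketch below is only a back-of-envelope illustration: the Magnus coefficients (a = 17.62, b = 243.12 C) are the usual textbook values, and the 22 C / 50% RH room conditions and 42 F chilled-water supply are assumed example numbers, not measurements from anyone's machine room.

  /* dewpoint.c -- rough dew point estimate via the Magnus approximation.
   * Assumed coefficients: a = 17.62, b = 243.12 C (valid roughly 0..60 C).
   * Build: cc -o dewpoint dewpoint.c -lm
   */
  #include <stdio.h>
  #include <math.h>

  static double dew_point_c(double temp_c, double rel_humidity_pct)
  {
      const double a = 17.62, b = 243.12;      /* Magnus coefficients */
      double gamma = (a * temp_c) / (b + temp_c) + log(rel_humidity_pct / 100.0);
      return (b * gamma) / (a - gamma);
  }

  int main(void)
  {
      double room_c = 22.0, rh = 50.0;                      /* assumed room conditions */
      double chilled_water_c = (42.0 - 32.0) * 5.0 / 9.0;   /* 42 F supply, as an example */
      double dp = dew_point_c(room_c, rh);

      printf("dew point at %.1f C / %.0f%% RH: %.1f C\n", room_c, rh, dp);
      printf("chilled water at %.1f C %s condense on bare surfaces\n",
             chilled_water_c, chilled_water_c < dp ? "may" : "should not");
      return 0;
  }

For those assumed conditions the estimate comes out near 11 C, well above a 42 F (about 5.6 C) supply temperature, which is exactly why uninsulated pipes are the usual worry and why monitoring the room's dew point is worth the trouble.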
With clever airflow management drains may not be needed but water rots wood, breeds bacteria and attracts bugs and may be problematic. The bacteria issue is important.... see Legionella pneumophila. Right now the outside air dew point in Bryan, Texas is about 19F and historically gets as high as 69F in December. So yes condensation from 42F cooling pipes is possible and should be part of the management/ monitoring process. I suspect that the campus AC manages the dew point to the high end of a comfort range that might be about 50 - 54°F in the US keeping things all OK. i.e. If the building AC manages humidity you may not have to if they have the capacity to control it at the building air inlets. Of course the weather in France is not the same as Texas... looks nice ;-) something like. 52 °F / 11 °C Light Rain Humidity: 82% Dew Point: 46 °F / 8 °C -- T o m M i t c h e l l Found me a new hat, now what? From niftyompi at niftyegg.com Wed Dec 10 13:39:29 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> Message-ID: <20081210213929.GB3449@compegg.wr.niftyegg.com> On Wed, Dec 10, 2008 at 08:37:54AM -0500, Robert G. Brown wrote: > On Wed, 10 Dec 2008, Dan.Kidger@quadrics.com wrote: > >> And I am sure Iceland would find it much easier to do the machine room >> cooling than say Spain or the Southern USA > ..... > > In the meantime, the advent of the overdue ice age will... ---- And in many of the 'global warming' research groups are those that are looking at 'anoxic' ocean regions in the ocean as bad side effects of global warming. In a geologic perspective it is exactly the environment that sequestered so much carbon as coal. These regions and processes may be critical in keeping the lid on CO2 in the atmosphere. As for the north polar cap it would be interesting to model the warm water flow of the Japan Current as it encounters the Bering Strait. Only 53 Miles wide the warm water flow change into the arctic with less than a meter rise in the sea level would be large (%age) and have a butterfly effect on the arctic. On the converse, a project to place a meter+ thick gravel flow barrier would be an engineering project akin to a railroad ballast 53 miles long (easy). With GPS locators dredge/ fill/ rock could be placed with precision to this end and PERHAPS reverse the shrinking of the arctic ice sheet and increase the albedo of the earth and perhaps restoring the status quo in this regard. OK grossly simplified but there are not many environmental pinch points with as much global leverage. http://en.wikipedia.org/wiki/Kuroshio Others are thinking about this. But are they able to model it? http://psc.apl.washington.edu/HLD/Bstrait/bstrait.html -- T o m M i t c h e l l Found me a new hat, now what? PS: the critical point that the Bering Strait might play here was first expressed to me by Ed McCullough then dean of Geology at the University of Arizona c. 1969. From hahn at mcmaster.ca Wed Dec 10 12:34:24 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812101021.42824.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <493E8E25.1060507@tamu.edu> <200812101021.42824.kilian.cavalotti.work@gmail.com> Message-ID: > Thanks, that's helpful.
I was afraid that a low temp for chilled water would > generate condensation on the pipes, or even on the doors themselves. condensation happens when air passes over a surface which is below the air's dew point. it's more likely that the main chiller's cold output will be at a lower temperature than the door coil, so any condensation will happen there. that's assuming you don't have oddities like major moisture sources, and that you avoid undesirable airflow. From diep at xs4all.nl Wed Dec 10 15:01:23 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: <20081210213929.GB3449@compegg.wr.niftyegg.com> References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> Message-ID: What is most interesting from supercomputer viewpoint seen is the comments i got from some scientists when speaking about climate calculations. At a presentation at SARA at 11 september 2008 with some bobo's there (minister bla bla), there was a few sheets from the North-Atlantic. It was done in rectangles from 40x40KM. The real question raised by the other scientists who weren't there at the presentation is: "why such an ugly resolution the commercial software we use is far superior to this and capable of calculating more". So my question basically here to the climatologists here would be: "what does it take to accurately calculate the effects of "global warming" and the fact that the ocean will react, triggering according to some predictions a new iceage. It should be really possible to do a lot of calculations there also presenting what errors there can be in the calculations to hold true. Especially the resolutions at which things got calculated so far in climate change area, most scientists who are more busy with airwings (some others of those North Atlantic Software Association type guys post also in this group), influence of moon and so on, onto all kind of models. They do not really understand why all this hasn't been calculated before very well. Maybe the format used to calculate is too generic and therefore not storing enough information? Would GPU's help speeding up calculations here? So far most models were to say polite, total laymen models. A good guess from a scientist so far always has been better than any calculation. You realize that this meter rise calculation, i checked out that source code myself back in 2003 which ran on Earth machine and SARA's 1024 processor Origin3800. I wasn't impressed to read that their conclusion was the rise would be 1 meter and in some sort of file that i would call now bugfix.log there was the comments: "oops we fixed a bug, the meter was initialized a meter too high". I could be wrong of course reading that, as it might be it was just the first half million CPU node hours that got wasted... Is the software too generic to be accurate? How low level has it been optimized? Not seldom if some low level programmers go busy with such software it suddenly speeds up factor 1000. Vincent On Dec 10, 2008, at 10:39 PM, Nifty Tom Mitchell wrote: > On Wed, Dec 10, 2008 at 08:37:54AM -0500, Robert G. Brown wrote: >> On Wed, 10 Dec 2008, Dan.Kidger@quadrics.com wrote: >> >>> And I am sure Iceland would find it much easier to do the machine >>> room >>> cooling than say Spain or the Southern USA >> > ..... >> >> In the meantime, the advent of the overdue ice age will... 
> ---- > > And in many of the 'global warming' reserch groups are those that are > looking at 'anoxic' ocean regons in the ocean as bad side effects of > global warming. In a geologic perspective it is exactly the > environment > that sequestered so much carbon as coal. These regions and processes > may be critical in keeping the lid on CO2 in the atmosphere. > > As for the north polar cap it would be interesting to model the > warm water > flow of the Japan Current as it encounters the Bering Strait. Only > 53 Miles > wide the warm water flow change into the artic with less than a > meter rise > in the sea level would be large (%age) and have a butterfly effect > on the artic. > On the converse, a probject to place a meter+ thick gravel flow > barrier would > be an engineering project akin to a railroad ballast 53 miles long > (easy). > With GPS locators dredge/ fill/ rock could be placed with precision > to this end and PERHAPS > reverse the shrinking of the artic ice sheet and increase the > albedo of > the earth and perhaps restoring the status quo in this regard. > > OK grosly simplified but there are not many environmental pinch points > with as much global leverage. > > http://en.wikipedia.org/wiki/Kuroshio > > Others are thinking about this. But are they able to modeling it? > > http://psc.apl.washington.edu/HLD/Bstrait/bstrait.html > > > -- > T o m M i t c h e l l > Found me a new hat, now what? > > PS: the critical point that the Bering Strait might play here was > first expressed to me by Ed McCullough then dean of Geology at the > University > of Arizona c. 1969. > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Wed Dec 10 15:07:37 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:03 2009 Subject: [Beowulf] For grins...India In-Reply-To: <20081210213929.GB3449@compegg.wr.niftyegg.com> References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> Message-ID: On Dec 10, 2008, at 10:39 PM, Nifty Tom Mitchell wrote: > On Wed, Dec 10, 2008 at 08:37:54AM -0500, Robert G. Brown wrote: >> On Wed, 10 Dec 2008, Dan.Kidger@quadrics.com wrote: >> >>> And I am sure Iceland would find it much easier to do the machine >>> room >>> cooling than say Spain or the Southern USA >> > ..... >> >> In the meantime, the advent of the overdue ice age will... > ---- > > And in many of the 'global warming' reserch groups are those that are > looking at 'anoxic' ocean regons in the ocean as bad side effects of > global warming. In a geologic perspective it is exactly the > environment > that sequestered so much carbon as coal. These regions and processes > may be critical in keeping the lid on CO2 in the atmosphere. > > As for the north polar cap it would be interesting to model the > warm water > flow of the Japan Current as it encounters the Bering Strait. Only > 53 Miles > wide the warm water flow change into the artic with less than a > meter rise > in the sea level would be large (%age) and have a butterfly effect > on the artic. > On the converse, a probject to place a meter+ thick gravel flow > barrier would > be an engineering project akin to a railroad ballast 53 miles long > (easy). 
> With GPS locators dredge/ fill/ rock could be placed with precision > to this end and PERHAPS > reverse the shrinking of the artic ice sheet and increase the > albedo of > the earth and perhaps restoring the status quo in this regard. > Of course as usual such a barrier has some political implications. Putin's building 5 new aircraft carriers not even days after oil was found underneath the northpole with a russian flag on the bottom already. If there is ice once again over there how is he gonna get out the oil out of there? > OK grosly simplified but there are not many environmental pinch points > with as much global leverage. > > http://en.wikipedia.org/wiki/Kuroshio > > Others are thinking about this. But are they able to modeling it? > What we need is accurate calculations. All the accuracy goes to military currently not to climate modelling it seems. Why is that? Politicians just 4 years to power each one of 'em? > http://psc.apl.washington.edu/HLD/Bstrait/bstrait.html > > > -- > T o m M i t c h e l l > Found me a new hat, now what? > > PS: the critical point that the Bering Strait might play here was > first expressed to me by Ed McCullough then dean of Geology at the > University > of Arizona c. 1969. > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Wed Dec 10 15:14:21 2008 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812091535.03920.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <49404D4D.6060203@cse.ucdavis.edu> Since you mentioned the rear door exchangers, I figured I'd mention a related solution for machine rooms that can't handle the heat density of today's 1U/blade racks. Liebert makes a rack top cooler that blows air in front of the rack (the cold isle) and sucks in hot air from the rear, and dumps the heat into a water source. Seems like a pretty reasonable design and seems to work well. It doesn't make it any harder to work on the rack/nodes, although I do recommend a wide brimmed hat if you don't like high volumes of cold air blowing on your forehead when you are working on the console. One complications I saw of a design with the retrofitted rear rack cooler was the maximum flow rate they were designed for and how changes in that rate would effect the resulting cooling. Vendors I talked to didn't immediately have CFM numbers for nodes, nor did the rear door vendor have any graphs for cooling delivered vs air temperature and pressure. Nor the the 1U vendors have graphs of airflow delivered relative to backpressure (potentially caused by the rear door). It wasn't at all clear to me if a rear door would work similarly with a 10kw rack with zero room cooling as it would with a 20kw rack with 10kw of room cooling. Not to mention blades/1Us designed to dissipate 20 kw per rack would likely have significantly higher airflow. With all that said I've seen installations that were pretty happy with the rear door solutions as well. 
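When vendors cannot supply per-node CFM figures, a crude first-order airflow number can still be worked backwards from the heat load and the allowed front-to-back air temperature rise (power = mass flow x specific heat x delta-T). The C sketch below assumes sea-level air properties (density about 1.2 kg/m^3, cp about 1005 J/kg K) and an arbitrary example of a 10 kW rack with a 20 C rise; it is only a ballpark estimate and says nothing about backpressure, which is exactly the number nobody seems to publish.

  /* rack_airflow.c -- first-order airflow estimate for a given rack heat load.
   * Assumes sea-level air: density ~1.2 kg/m^3, cp ~1005 J/(kg K).
   * Build: cc -o rack_airflow rack_airflow.c
   */
  #include <stdio.h>

  int main(void)
  {
      const double rho = 1.2;        /* kg/m^3, air density (assumed) */
      const double cp  = 1005.0;     /* J/(kg K), specific heat of air */

      double rack_watts = 10000.0;   /* 10 kW rack, example value */
      double delta_t_c  = 20.0;      /* allowed front-to-back rise, example value */

      /* P = rho * V * cp * dT  =>  V = P / (rho * cp * dT) */
      double m3_per_s = rack_watts / (rho * cp * delta_t_c);
      double cfm      = m3_per_s * 2118.88;   /* 1 m^3/s is ~2118.88 CFM */

      printf("%.0f W at %.0f C rise needs ~%.2f m^3/s (~%.0f CFM)\n",
             rack_watts, delta_t_c, m3_per_s, cfm);
      return 0;
  }

With those assumptions a 10 kW rack needs roughly 880 CFM, and a 20 kW rack at the same rise roughly twice that, which is the kind of figure worth comparing against whatever a rear-door coil or rack-top unit is actually rated to move.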
From lindahl at pbm.com Wed Dec 10 15:29:43 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] For grins...India In-Reply-To: References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> Message-ID: <20081210232943.GC21119@bx9> On Thu, Dec 11, 2008 at 12:01:23AM +0100, Vincent Diepeveen wrote: > Not seldom if some low level programmers go busy with such software it > suddenly speeds up factor 1000. Vincent, the people I know who do climate compare notes on how many model years per cpu year they can compute at a given resolution and algorithm and machine. If someone made a mistake and was 1000X slower than the competition, they'd be aware of it. Your wild claims can be funny at times, but really, you should start a blog instead of posting them here. -- greg From diep at xs4all.nl Wed Dec 10 16:22:23 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] For grins...India In-Reply-To: <20081210232943.GC21119@bx9> References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> <20081210232943.GC21119@bx9> Message-ID: <4CC1AA71-F8A4-4D30-A7BB-9D26580AD7E6@xs4all.nl> Heh Greg, nice to hear something from you. But... ...since when are you expert on garbage dumps as well? For someone who said he would shredder all emails i shipped, you seem to have the remarkable quality to recover stuff from the garbage dump. Vincent On Dec 11, 2008, at 12:29 AM, Greg Lindahl wrote: > On Thu, Dec 11, 2008 at 12:01:23AM +0100, Vincent Diepeveen wrote: > >> Not seldom if some low level programmers go busy with such >> software it >> suddenly speeds up factor 1000. > > Vincent, the people I know who do climate compare notes on how many > model years per cpu year they can compute at a given resolution and > algorithm and machine. If someone made a mistake and was 1000X slower > than the competition, they'd be aware of it. > > Your wild claims can be funny at times, but really, you should start a > blog instead of posting them here. > > -- greg > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From lindahl at pbm.com Wed Dec 10 17:17:01 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] For grins...India In-Reply-To: <4CC1AA71-F8A4-4D30-A7BB-9D26580AD7E6@xs4all.nl> References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> <20081210232943.GC21119@bx9> <4CC1AA71-F8A4-4D30-A7BB-9D26580AD7E6@xs4all.nl> Message-ID: <20081211011701.GA3780@bx9> On Thu, Dec 11, 2008 at 01:22:23AM +0100, Vincent Diepeveen wrote: > For someone who said he would shredder all emails i shipped, > you seem to have the remarkable quality to recover stuff from the > garbage dump. Vincent, Stop making stuff up. I don't believe I've ever said I was "shredder"ing your emails. I did say (Oct 15th 2008), in a personal email: | Vincent, I don't read most of your blather on the Beowulf | list. Emailing me personally is a waste of your time and mine. I can only hope that your memory for HPC is better than your memory of past disagreements. 
-- greg From csamuel at vpac.org Wed Dec 10 19:58:54 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] cloning issue, hidden module dependency In-Reply-To: <1617901603.60651228967269477.JavaMail.root@mail.vpac.org> Message-ID: <117871361.61381228967934181.JavaMail.root@mail.vpac.org> ----- "Tim Cutts" wrote: > It's very easy to build a custom kernel package with just what > you want using the 'make-kpkg' command, and then you can strip > out all the extraneous cruft if you want Be warned that with 2.6.27 and later you will most likely need to patch the kernel-package scripts to avoid putting firmware directly into /lib/firmware as otherwise you'll get conflicts with other 2.6.27+ packages. The Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/kernel-package/+bug/256983 has a link to the upstream fix that I applied manually to my system to fix this. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Wed Dec 10 20:09:27 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] RE: moab In-Reply-To: <8E50F960A9F3F6448D39155B882544E818717F79@EX2K7-VIRT-1.ads.qub.ac.uk> Message-ID: <900957656.61771228968567397.JavaMail.root@mail.vpac.org> ----- "Richard Rankin" wrote: > I will have funding available to purchase some new clusters in the new > year. > > I was hoping to be able to have a cluster with a mix of Linux and > windows nodes so that the mix could be varied depending on the work > load. > > I have been pointed to > http://www.clusterresources.com/pages/products/moab-hybrid-cluster.php > > Has anyone any experience of this Not in that scenario (no Windows here) but I do know a University here who do power up/down nodes based on demand using Moab. We're not doing that as we always have a backlog of jobs waiting to run! Hope that's of some use to you ? cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From algomantra at gmail.com Tue Dec 9 16:47:33 2008 From: algomantra at gmail.com (AlgoMantra) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] For grins...India In-Reply-To: References: Message-ID: <6171110d0812091647h730be9f7sd4323c34c5cda077@mail.gmail.com> >Maybe some decades from now all power per flop wasting supercomputers will be located in India. >In the long run, they're the only ones on the planet who can afford the energy real cheap.... (Disclaimer: I'm located in Jaipur, India). Vincent, I'm curious why you think we will be able to afford this energy cheaply and where will it come from! ------- -.- 1/f ))) --. ------- ... http://www.algomantra.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081210/e9861e7a/attachment.html From steffen.grunewald at aei.mpg.de Wed Dec 10 01:29:47 2008 From: steffen.grunewald at aei.mpg.de (Steffen Grunewald) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Tesla systems in Germany? 
In-Reply-To: <9f8092cc0812100118m23d2fdffp433e6a25addac202@mail.gmail.com> <9f8092cc0812100111k6ba52f0ev1ae3571bb1b0aa32@mail.gmail.com> References: <20081209095329.GY16423@casco.aei.mpg.de> <9f8092cc0812100118m23d2fdffp433e6a25addac202@mail.gmail.com> <20081209095329.GY16423@casco.aei.mpg.de> <9f8092cc0812100111k6ba52f0ev1ae3571bb1b0aa32@mail.gmail.com> Message-ID: <20081210092947.GK16423@casco.aei.mpg.de> Thanks John for your thoughts (accidentally they match with mine) > > I'm looking for someone in Germany who already has access to a Tesla > > system. > > I have received a request by a scientist for "a very powerful machine", and > > would like him to run some tests before spending and possibly wasting > > money. > > > In that case, why not just buy a standard Nvidia graphics card? They run the > same CUDA code. You can run your tests and get an idea of possible speedups, > or indeed if the code will run under CUDA, before committing to buy Tesla. > A "very powerful machine" could mean a lot of things - a cluster with a high > core count. A large SMP machine with a huge amount of memory. A dedicated > machine like the QCD calculators. Since he would have been able to use MPI (at least locally to use the available multi-core architecture), and didn't do that, it's still a lot of linear code. Large memory (but not excessive) - yes. SMP - not really. Cluster with high core count: this would give the opportunity to do stupid things on the "several hundreds" scale, but not speed up the single stupid thing. > As Vincent says, you need to look at what the code is before hitting the "I > need Cuda" button. Sometimes the approach to "throw enough money at a problem, and it will resolve itself" is the easier one, compared with the need to power-up your brains :( Actually, I was facing an outcome of "5% speedup if nothing is done about code efficiency", and I wouldn't like wasting money for that result - that's why I was asking for a way to confront that guy with the need to re-work his code, you understand? Thanks for your patience, Steffen From drcoolsanta at gmail.com Wed Dec 10 02:32:06 2008 From: drcoolsanta at gmail.com (Dr Cool Santa) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Developing software for MPICH clusters? Message-ID: <86b56470812100232n5307706cm9aad031c38c4381d@mail.gmail.com> Till now I have only created Beowulf clusters for my mother who is a theoretical chemist. She needed clusters to run applications and study chemical aspects of various substances to help in her research. I being a programmer found it quite exciting. I sometimes have had programmed software to automate my work, however sometimes that work is too much for my computer to handle that it takes hours and days. I was thinking if someone could tell me how I could convert it into MPI based code. Basically my aim is to divide the work among the computers that are on the cluster. The cluster is comprised of 4 dual core machines so you can understand how much powerful they would be compare to my computer. Also I generally program in C or C++ but I have a vast range of languages to program in. I can explain the main features of such programs with an example. They would compute results of consecutive numbers and store them in some file or database so it doesn't have to compute them again later or something similar. This is just an example my work is more complicates. Basically what I wanted to tell was that the work in itself isn't difficult but the quantity is a lot and so it needs to be divided. 
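For the "many independent work units" pattern described above, the usual starting point is a static split of the input range across MPI ranks, with each rank computing its own chunk and rank 0 gathering the results. The fragment below is only a minimal sketch under assumed names (compute_result() is a placeholder for whatever per-item calculation is really being done, and N is an example size); real codes often move to a master/worker queue once the work items stop being equal-sized.

  /* split_work.c -- minimal MPI sketch: divide N independent work items
   * across ranks and gather the results on rank 0.
   * compute_result() is a placeholder for the real per-item calculation.
   * Build: mpicc -o split_work split_work.c   Run: mpirun -np 8 ./split_work
   */
  #include <stdio.h>
  #include <stdlib.h>
  #include <mpi.h>

  static double compute_result(long i)      /* placeholder workload */
  {
      return (double)i * i;
  }

  int main(int argc, char **argv)
  {
      const long N = 1000;                  /* total work items, example value */
      int rank, size;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      /* static block decomposition: rank r handles items [lo, hi) */
      long chunk = (N + size - 1) / size;
      long lo = rank * chunk;
      long hi = (lo + chunk < N) ? lo + chunk : N;

      double *mine = malloc(chunk * sizeof *mine);
      for (long i = lo; i < hi; i++)
          mine[i - lo] = compute_result(i);
      for (long i = hi; i < lo + chunk; i++)
          mine[i - lo] = 0.0;               /* pad the last rank's block */

      double *all = NULL;
      if (rank == 0)
          all = malloc((size_t)chunk * size * sizeof *all);

      MPI_Gather(mine, (int)chunk, MPI_DOUBLE,
                 all,  (int)chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

      if (rank == 0) {
          printf("gathered %ld results from %d ranks (e.g. item 42 -> %g)\n",
                 N, size, all[42]);
          free(all);
      }
      free(mine);
      MPI_Finalize();
      return 0;
  }

On four dual-core boxes this would typically be launched with eight ranks; the same structure works whether the per-item result is a number written back to rank 0, as here, or a file written locally by each rank.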
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081210/32242bde/attachment.html From drcoolsanta at gmail.com Wed Dec 10 05:51:43 2008 From: drcoolsanta at gmail.com (Dr Cool Santa) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Parallel software for chemists Message-ID: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> Currently in the lab we use Schrodinger and we are looking into NWchem. We'd be interested in knowing about software that a chemist could use that makes use of a parallel supercomputer. And better if it is linux. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081210/f4c78c0c/attachment.html From oneal at dbi.udel.edu Wed Dec 10 06:00:24 2008 From: oneal at dbi.udel.edu (Doug ONeal) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: <200812091535.03920.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: The IBM doors look great - will they fit any 19" rack? I have APC Netshelter VX racks and the powered ventilation rear doors are not sufficient any more. Doug On 12/09/2008 09:35 AM, Kilian CAVALOTTI wrote: > Hi all, > > I'd be curious to know if some of you use or have some real-life experience > with rear-door heat exchangers, such as those from SGI [1] or IBM [2]. > > I'm especially interested in feedback about condensation, and operational > water temperature. > > [1]http://www.sgi.fr/synergie/EpisodeXI/articles/3g.shtml > [2]http://www.ibm.com/servers/eserver/xseries/storage/pdf/IBM_RDHx_Spec_Sheet.pdf > > Thanks a lot! > Cheers, From andrew.robbie at gmail.com Wed Dec 10 07:27:27 2008 From: andrew.robbie at gmail.com (Andrew Robbie (GMail)) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Tesla systems in Germany? In-Reply-To: <20081209095329.GY16423@casco.aei.mpg.de> References: <20081209095329.GY16423@casco.aei.mpg.de> Message-ID: On Tue, Dec 9, 2008 at 8:53 PM, Steffen Grunewald < steffen.grunewald@aei.mpg.de> wrote: > Hi, > > I'm looking for someone in Germany who already has access to a Tesla > system. > I have received a request by a scientist for "a very powerful machine", and > would like him to run some tests before spending and possibly wasting > money. Use the dilbert principle: http://pics.livejournal.com/allah_sulu/pic/0002f3h8/g13 i.e. he won't know the difference between a Tesla and something else... An nVidia GTX 280 graphics card isn't much different from a Quadro 5800. The Telsa is basically Quadro 5800s without a graphics port. In my experience, by the time your researcher has ported code to work on the new system, a much faster new iteration will have been released. Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/d22bdf11/attachment.html From brahmaforces at gmail.com Wed Dec 10 21:32:13 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Setting up First Beowulf System: Recommendations re racking, linux flavour, and up to date books Message-ID: Dear Beowulfers, After dipping my toes in the pool of Beowulfery (doing research here and there) I am about to sail my ship by creating my beowulf system. I have four PCS that were cutting edge in their time over the past 5 years. 
I am thinking of mounting them on a rack, connecting them with ethernet cables. I would summon your wide and deep experiences on the following: 1) Rack ideas, materials and warnings 2) Upto date classic Beowulfery books for 4 to 16 nodes 3) The right uptodate books on parrallel programming 4) Which flavour of linux is well adapted for beowulfery and has all the required tools standardly? Any online resources on getting the hardware aspect of it going, ie from the box to the rack... Thanks in advance... -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/222f31f8/attachment.html From brahmaforces at gmail.com Wed Dec 10 21:51:57 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware Message-ID: Hello all again: I thought I would add a little more background about myself and the intended cluster. I am an artist and a computer programmer and am planning on using this cluster as a starting point to do research on building an ideal cluster for Animation for my own personal/entrepreneurial work. It would reside in my art studio. As an artist the idea of rack mounting the commodity PCS is much more fun that piling up the PCS. I was thinking of working with a local hardware friend and figuring out how to screw on motherboards onto hardware type racks. Im sure there are better tried and tested racks out there that are not expensive. Any suggestions on the actual physical hardware for constructing racks for upto 16PCs. Also any thoughts on racks versus piles of PCS. A lot of the posts on the internet are old and out of date. I am wondering what the upto date trends are in racking commodity computers to create beowulf clusters. What should i be reading? -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/c3ba31f9/attachment.html From hearnsj at googlemail.com Thu Dec 11 01:33:33 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Parallel software for chemists In-Reply-To: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> References: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> Message-ID: <9f8092cc0812110133q269f58f0o2685ec95a590b727@mail.gmail.com> 2008/12/10 Dr Cool Santa > Currently in the lab we use Schrodinger and we are looking into NWchem. > We'd be interested in knowing about software that a chemist could use that > makes use of a parallel supercomputer. And better if it is linux. > > Its probably worth it for you to join the Computational Chemistry list: http://www.ccl.net/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/efc41906/attachment.html From rgb at phy.duke.edu Thu Dec 11 05:33:45 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Developing software for MPICH clusters? In-Reply-To: <86b56470812100232n5307706cm9aad031c38c4381d@mail.gmail.com> References: <86b56470812100232n5307706cm9aad031c38c4381d@mail.gmail.com> Message-ID: On Wed, 10 Dec 2008, Dr Cool Santa wrote: > Till now I have only created Beowulf clusters for my mother who is a > theoretical chemist. 
She needed clusters to run applications and study > chemical aspects of various substances to help in her research. > > I being a programmer found it quite exciting. I sometimes have had > programmed software to automate my work, however sometimes that work is too > much for my computer to handle that it takes hours and days. I was thinking > if someone could tell me how I could convert it into MPI based code. > Basically my aim is to divide the work among the computers that are on the > cluster. The cluster is comprised of 4 dual core machines so you can > understand how much powerful they would be compare to my computer. Also I > generally program in C or C++ but I have a vast range of languages to > program in. > I can explain the main features of such programs with an example. They would > compute results of consecutive numbers and store them in some file or > database so it doesn't have to compute them again later or something > similar. This is just an example my work is more complicates. > Basically what I wanted to tell was that the work in itself isn't difficult > but the quantity is a lot and so it needs to be divided. It sounds like you already have a collection of systems running linux in a "beowulf" (cluster) configuration, so I'll focus on just the MPI aspect. Pick an MPI. There are at least three or four to choose from, and I have no particular religious bias towards any of them and expect that all of them would work for your problem. For example, lam is often a "yum install" or "apt get" away, as is openmpi. IIRC mpich(2) has to be built, but it is EASY to build with e.g. src rpms ready to fire up. Look in the following places for mpi examples, in order: * Online, e.g. in articles on www.clustermonkey.net. I think you could very likely find a complete set of tutorials there alone that would take you through your first few programs and out to where you could write/run YOUR code in MPI. * In the source or documentation trees. There are almost always simple example programs there that serve as templates for more complicated parallel programs (they e.g. compute pi or evaluate chunks of the mandelbrot set). * Books. There are some decent books on MPI programming available that should suffice to at least get you started, as before. I think C will do just fine. Good luck. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From landman at scalableinformatics.com Thu Dec 11 05:40:21 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Parallel software for chemists In-Reply-To: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> References: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> Message-ID: <49411845.2010803@scalableinformatics.com> Dr Cool Santa wrote: > Currently in the lab we use Schrodinger and we are looking into NWchem. > We'd be interested in knowing about software that a chemist could use > that makes use of a parallel supercomputer. And better if it is linux. Depends upon the calculations you wish to do. GAMESS for electronic structure is a very nice parallel code, though setting up the parallel system can be a little challenging for the un-initiated. There are quite a few others (Amber, Charmm, ...) What types of calculation do you want to do? 
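To make the MPI suggestion above concrete, here is a minimal sketch of the work-splitting pattern the original poster describes: the consecutive numbers 1..N are block-partitioned across ranks, each rank computes its share, and rank 0 collects the total with MPI_Reduce. The problem size N and the toy work() function are placeholders for the real calculation, not anything from the thread.

#include <mpi.h>
#include <stdio.h>

/* toy per-item computation; stands in for the real work */
static double work(long i) { return (double)i * (double)i; }

int main(int argc, char **argv)
{
    const long N = 10000000;    /* illustrative problem size */
    long i, chunk, lo, hi;
    int rank, size;
    double local = 0.0, total = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* block-partition the consecutive numbers 1..N across ranks;
       the last rank also takes the remainder */
    chunk = N / size;
    lo = (long)rank * chunk + 1;
    hi = (rank == size - 1) ? N : lo + chunk - 1;

    for (i = lo; i <= hi; i++)
        local += work(i);

    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks: %g\n", size, total);

    MPI_Finalize();
    return 0;
}

Any of the MPI implementations mentioned above should handle it; build and launch with something like mpicc -O2 split.c -o split and mpirun -np 8 ./split (file name and process count are arbitrary).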
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From rgb at phy.duke.edu Thu Dec 11 06:00:32 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: Message-ID: On Thu, 11 Dec 2008, arjuna wrote: > Hello all again: > > I thought I would add a little more background about myself and the intended > cluster. I am an artist and a computer programmer and am planning on using > this cluster as a starting point to do research on building an ideal cluster > for Animation for my own personal/entrepreneurial work. It would reside in > my art studio. As an artist the idea of rack mounting the commodity PCS is > much more fun that piling up the PCS. > > I was thinking of working with a local hardware friend and figuring out how > to screw on motherboards onto hardware type racks. Im sure there are better > tried and tested racks out there that are not expensive. Any suggestions on > the actual physical hardware for constructing racks for upto 16PCs. > > Also any thoughts on racks versus piles of PCS. > > A lot of the posts on the internet are old and out of date. I am wondering > what the upto date trends are in racking commodity computers to create > beowulf clusters. What should i be reading? Look in the online archives -- they aren't old or out of date at all. We just had a brief discussion of rack cases vs tower cases last week, for example. The consensus view from that was (that rack cases and racks tend to be more expensive than tower cases and cheap shelving, that 2U cases were likely to be quieter and less fussy than 1U cases, that there some very "nifty" relatively new micro- form factor cases that work quite well and attractively in shelved machine room environments (suggesting that racks aren't the only way to get an "artistically clean" looking cluster:-), and that either rack or micro is likely to produce a smaller footprint cluster than the old/classic shelf full of towers model. Bladed systems were (as always) mentioned as an alternative and (as always) it was pointed out that bladed systems are an alternative for the truly deep pocketed as they are even more expensive (if more compact) than racked systems. There are enormous clusters built on all of these models. It sounds like you are interested in building a rendering farm. The original render farms for the original rendered cartoon movies were IIRC shelves full of towers, as is IIRC Google, but I'm sure that a lot of them now are racked up. As for "trends" -- I doubt that there are any. Beowulfery is all about designing a cluster to meet your specific needs given your specific application space and budget. A rackmount cluster has certain advantages, but they cost a certain amount extra. At some point you have to face the question of whether you are better off in the long run spending the extra for rackmount boxes or would prefer to get cheaper form factors and get more systems. I hesitate to make pronouncements on what SHOULD differentiate these choices as no matter what I say there will be somebody on list who chose differently, quite probably for good reasons. 
So with a LARGE grain of salt, I'd say that very very loosely, if you are building your first cluster, a hobby cluster, a low-budget cluster, a small cluster (say less than 32 nodes total), or a production cluster in an environment with lots of physical room and AC/power resources, one or more shelf units of towers is either optimal or perfectly reasonable. If you are a professional with experience building a commercial-grade production cluster, especially one expected to have >=32 nodes in a real machine-room environment, and you aren't horribly constrained in your budget, you're more likely to go with rackmount FF nodes, or in the richest and most space-constrained environments, even blades. But these lines and differentiators are FAR from sharp. I'm sure there are people on list with 100's of shelved nodes (some of them just posted in last week's discussion). There are also bound to be people with racks containing just four or five nodes (somewhat more likely if their four or five nodes are just SOME of the systems in the preexisting rack). At home thus far I've tended to go with towers, at Duke I started with towers years ago but now would only get rackmounts, and I've thought pretty seriously about getting e.g. a half-height rolling rack for home and starting to populate it a few U a year with my very limited budget. The obstacle is that racks are expensive enough that it will cost me AT LEAST one node just to get set up with a rack and a single rackmount system, compared to just buying two equivalent power towers and popping them into my existing $60 steel shelving. OTOH, I could get at least four cores even in a single rackmount chassis, for cheap. OTOOH, I could probably get eight cores one way or another in towers. And while it isn't exactly "my money" I'm spending, the particular pocket of OPM I'm using is quite finite. So ultimately, your decision here will come down to what you want to spend and what you expect to get for it. Beauty? Ease of maintenance and access? A professional look to attract investors? All of these things bear, which is why the choice is not simple. As far as books are concerned, I'll let others answer. My online book is free (so it costs you nothing to start there) but I'm the first to admit that it is dated at this point, especially in its (lack of) treatment of the more advanced networks. Clustermonkey resources are arguably more up to date, also free. Many of the print books on the subject are either similarly out of date or are written by people I've never heard of, which basically means that they don't frequent this list and participate in it, which means that I am skeptical about their value (of the books). The ones I've picked up in the store to thumb through have mostly been pretty forgettable. > Best regards, > arjuna > http://www.brahmaforces.com Two names near and dear to my heart, given that I love the Mahabharata and named my first cluster "Brahma" as well...;-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From prentice at ias.edu Thu Dec 11 06:14:54 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] For grins...India In-Reply-To: <20081210232943.GC21119@bx9> References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> <20081210232943.GC21119@bx9> Message-ID: <4941205E.10008@ias.edu> Greg Lindahl wrote: > On Thu, Dec 11, 2008 at 12:01:23AM +0100, Vincent Diepeveen wrote: > >> Not seldom if some low level programmers go busy with such software it >> suddenly speeds up factor 1000. > > Vincent, the people I know who do climate compare notes on how many > model years per cpu year they can compute at a given resolution and > algorithm and machine. If someone made a mistake and was 1000X slower > than the competition, they'd be aware of it. > > Your wild claims can be funny at times, but really, you should start a > blog instead of posting them here. > > -- greg I agree. Vincent constantly makes ridiculous political statements that simply have no place on this list. -- Prentice From prentice at ias.edu Thu Dec 11 06:35:15 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Parallel software for chemists In-Reply-To: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> References: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> Message-ID: <49412523.9030307@ias.edu> Dr Cool Santa wrote: > Currently in the lab we use Schrodinger and we are looking into NWchem. > We'd be interested in knowing about software that a chemist could use > that makes use of a parallel supercomputer. And better if it is linux. > Clarification: I supported Schodinger up until a year ago. Unless things have changed, Jaguar is the only Schrodinger application that can truly make use of a "parallel" supercomputer. The other Schrodinger programs perform calculations that are embarassingly parallel and require no inter-process communications during calculations. In these cases, the data to be analyzed is broken up into smaller pieces that are analyzed individually by the computers with no communication between them. When they are done, the main program reassembles their output to a final result. This is parallel computing, but doesn't require a "parallel supercomputer". It works great, BTW. OpenEye provides some commercial computational chemistry software (conformer generation, docking, etc.), that uses parallel code. Their code uses PVM instead of MPI, which makes OpenEye kind of an odd duck. Last I spoke to OpenEye, they were planning on porting their code to MPI, but don't know if that's been done yet, since I no longer support their software. If you're doing molecular simulations, there's LAMMPS ( Large-scale Atomic/Molecular Massively Parallel Simulator), which is open-source. I actually submitted a *very* small patch to when I ported it to IRIX 6.5 a few years ago. NAMD is also parallel, but I don't know much about it. I compiled it, installed it, but then I don't think the comp chemists ever used it (don't you hate that?). 
-- Prentice From kilian.cavalotti.work at gmail.com Thu Dec 11 06:47:45 2008 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <20081210200812.GA3449@compegg.wr.niftyegg.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <200812101021.42824.kilian.cavalotti.work@gmail.com> <20081210200812.GA3449@compegg.wr.niftyegg.com> Message-ID: <200812111547.45559.kilian.cavalotti.work@gmail.com> Hi Tom, On Wednesday 10 December 2008 21:08:12 Nifty Tom Mitchell wrote: > Watch dew point numbers in the room. > Dew point is dominantly a function of humidity... > http://en.wikipedia.org/wiki/Dew_point Oh right, that's interesting: """ The dew point is associated with relative humidity. A high relative humidity indicates that the dew point is closer to the current air temperature. Relative humidity of 100% indicates that the dew point is equal to the current temperature (and the air is maximally saturated with water). When the dew point stays constant and temperature increases, relative humidity will decrease. """ I guess that controlling the relative humidity level (most CRAC units can do that, can't they?) and keeping it below say 60% is a pretty simple way to avoid condensation, then. > Many frost free home refrigerators solve this problem by running the > heated exhaust air over the catch pan so any frost/ condensation is > promptly evaporated. With clever airflow management drains may not be > needed but water rots wood, breeds bacteria and attracts bugs and may be > problematic. The bacteria issue is important.... see Legionella > pneumophila. I was only thinking about the hassle of having to mop down your racks every morning, but the point about bacteria is very relevant. That should legitimate a bonus, working in hazardous areas. :) > Right now the outside air dew point in Bryan, Texas is about 19F and > historically gets as high as 69F in December. So yes condensation from > 42F cooling pipes is possible and should be part of the management/ > monitoring process. I suspect that the campus AC manages the dew point > to the high end of a comfort range that might be about 50 - 54?F in the > US keeping things all OK. i.e. If the building AC manages humidity you > may not have to if they have the capacity to control it at the building > air inlets. There's no such thing as "building AC" where I am, unless you call opening a window "managing the dew point". I guess we won't avoid local equipment in the server room to control relative humidity. > Of course the weather in France is not the same as Texas... looks nice ;-) > something like. > 52 ?F / 11 ?C Light Rain Humidity: 82% Dew Point: 46 ?F / 8 ?C That's pretty much it, even on the south coast: gray, rainy and cold. Man, I so miss California... :) Thanks for the insight, Cheers, -- Kilian From kilian.cavalotti.work at gmail.com Thu Dec 11 06:57:37 2008 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <200812111557.37555.kilian.cavalotti.work@gmail.com> Hi Doug, On Wednesday 10 December 2008 15:00:24 Doug ONeal wrote: > The IBM doors look great - will they fit any 19" rack? > I have APC Netshelter VX racks and the powered ventilation rear doors are > not sufficient any more. 
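As a numeric footnote to the dew-point discussion above: the Magnus approximation gives a quick estimate of the dew point from air temperature and relative humidity, which is enough to sanity-check whether, say, 42 F (about 5.6 C) chilled-water pipes will sweat in a given room. The constants are the usual Magnus coefficients; the room conditions in main() are only an example.

#include <stdio.h>
#include <math.h>

/* Magnus approximation: dew point (deg C) from air temperature (deg C)
   and relative humidity (percent); good to a few tenths of a degree
   over ordinary machine-room conditions */
static double dew_point_c(double temp_c, double rh_percent)
{
    const double a = 17.27, b = 237.7;          /* Magnus coefficients */
    double gamma = a * temp_c / (b + temp_c) + log(rh_percent / 100.0);
    return b * gamma / (a - gamma);
}

int main(void)
{
    double t = 25.0, rh = 45.0;                 /* example room: 25 C, 45% RH */
    double td = dew_point_c(t, rh);
    double pipe_c = (42.0 - 32.0) * 5.0 / 9.0;  /* 42 F chilled water */

    printf("dew point at %.0f C / %.0f%% RH: %.1f C\n", t, rh, td);
    printf("%.1f C pipes %s condense under these conditions\n",
           pipe_c, pipe_c < td ? "will" : "will not");
    return 0;
}

Compile with -lm; at 25 C and 45% RH the dew point comes out near 12 C, so 42 F pipes would indeed sweat unless the room is kept drier.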
If I'm not mistaken, the IBM RDHx doors are manufactured by Vette Corp, and according to their spec sheet, it looks like they can be installed on a variety of standard racks, including Netshelter VXes, with the help of a "transition frame". See : http://www.vettecorp.com/information_center/LiquiCoolRDHx_DataSheet.pdf Cheers, -- Kilian From prentice at ias.edu Thu Dec 11 07:00:45 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Setting up First Beowulf System: Recommendations re racking, linux flavour, and up to date books In-Reply-To: References: Message-ID: <49412B1D.1040205@ias.edu> arjuna wrote: > I have four PCS that were cutting edge in their time over the past 5 > years. I am thinking of mounting them on a rack, connecting them with > ethernet cables. > > I would summon your wide and deep experiences on the following: > > 1) Rack ideas, materials and warnings > 2) Upto date classic Beowulfery books for 4 to 16 nodes > 3) The right uptodate books on parrallel programming > 4) Which flavour of linux is well adapted for beowulfery and has all the > required tools standardly? > > Any online resources on getting the hardware aspect of it going, ie from > the box to the rack... 3) There are two aspects to parallel programming: The concepts of parallel programming, and using an actual programming language. Personally, I recommend starting with the "why" and learning the theory of parallel programming. It will make designing effective parallel programs easier. I have these two parallel computing texbooks on my bookshelf: Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers (2nd Edition) by Barry Wilkinson and Michael Allen http://www.amazon.com/Parallel-Programming-Techniques-Applications-Workstations/dp/0131405632 Introduction to Parallel Computing (2nd Edition) (Hardcover) by Ananth Grama, George Karypis, Vipin Kumar, Anshul Gupta http://www.amazon.com/Introduction-Parallel-Computing-Ananth-Grama/dp/0201648652 I haven't read either one cover to cover, but I have read portions, an both are relatively easy to read. Most parallel programming is done using MPI, so you might want to start there for actually writing parallel programs. For that, this is a good book: Parallel Programming With MPI (Paperback) by Peter Pacheco http://www.amazon.com/Parallel-Programming-MPI-Peter-Pacheco/dp/1558603395/ Again, I haven't read this one in it's entirety, more of a reference for me, since I hardly actually do MPI programming as an admin. It's looks very easy to read. I'd go so far as to say it's the "gold standard" on this topic, since I've seen it recommended over and over again. 4) Any major Linux distro (Red Hat, SUSE, Debian, Ubuntu) will work well. I use a rebuild of RHEL. Not sure which distros have all you need right out of the box. -- Prentice From rgb at phy.duke.edu Thu Dec 11 07:40:28 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Setting up First Beowulf System: Recommendations re racking, linux flavour, and up to date books In-Reply-To: <49412B1D.1040205@ias.edu> References: <49412B1D.1040205@ias.edu> Message-ID: On Thu, 11 Dec 2008, Prentice Bisbal wrote: > Personally, I recommend starting with the "why" and learning the theory > of parallel programming. It will make designing effective parallel > programs easier. I have these two parallel computing texbooks on my > bookshelf: Excellent point. 
Don't forget Ian Foster's book: http://www-unix.mcs.anl.gov/dbpp/ This has the advantage of being available for free online as well as in hardcover if you prefer it that way. So you can read it NOW and see if it meets your needs, and explore the other books below (where I haven't read Wilkinson and Allen but have looked through GKKG and agree that it's a lovely book) as you can obtain a copy. rgb > Parallel Programming: Techniques and Applications Using Networked > Workstations and Parallel Computers (2nd Edition) > by Barry Wilkinson and Michael Allen > http://www.amazon.com/Parallel-Programming-Techniques-Applications-Workstations/dp/0131405632 > > Introduction to Parallel Computing (2nd Edition) (Hardcover) > by Ananth Grama, George Karypis, Vipin Kumar, Anshul Gupta > http://www.amazon.com/Introduction-Parallel-Computing-Ananth-Grama/dp/0201648652 > > I haven't read either one cover to cover, but I have read portions, an > both are relatively easy to read. Most parallel programming is done > using MPI, so you might want to start there for actually writing > parallel programs. For that, this is a good book: > > Parallel Programming With MPI (Paperback) > by Peter Pacheco > http://www.amazon.com/Parallel-Programming-MPI-Peter-Pacheco/dp/1558603395/ > > Again, I haven't read this one in it's entirety, more of a reference for > me, since I hardly actually do MPI programming as an admin. It's looks > very easy to read. I'd go so far as to say it's the "gold standard" on > this topic, since I've seen it recommended over and over again. > > 4) Any major Linux distro (Red Hat, SUSE, Debian, Ubuntu) will work > well. I use a rebuild of RHEL. Not sure which distros have all you need > right out of the box. > > > > -- > Prentice > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From hearnsj at googlemail.com Thu Dec 11 08:59:07 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Computing in the clouds Message-ID: <9f8092cc0812110859y4fde842tdcda0ae56dddce50@mail.gmail.com> I'm sure Joe is too self-effacing to trumpet this excellent article on the relevance of Cloud Computing to HPC: http://www.linux-mag.com/id/7196/1/ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/a597e91e/attachment.html From landman at scalableinformatics.com Thu Dec 11 10:26:11 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Computing in the clouds In-Reply-To: <9f8092cc0812110859y4fde842tdcda0ae56dddce50@mail.gmail.com> References: <9f8092cc0812110859y4fde842tdcda0ae56dddce50@mail.gmail.com> Message-ID: <49415B43.3030708@scalableinformatics.com> John Hearns wrote: > I'm sure Joe is too self-effacing to trumpet this excellent article on > the relevance of Cloud Computing to HPC: > > http://www.linux-mag.com/id/7196/1/ Thanks for the pointer :) I hope the (feeble) attempt at humor up front (clouds ... vapor ... hot air) doesn't put anyone off ... 
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From mathog at caltech.edu Thu Dec 11 12:10:58 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Re: cloning issue, hidden module dependency Message-ID: Finally wrote a working image on the one node with a different mobo, it was a minor mess doing so. In order to figure out the values for modprobe.conf: alias eth0 8139too install usb-interface /sbin/modprobe uhci_hcd; /bin/true install ide-controller /sbin/modprobe via82cxxx; /sbin/modprobe ide_generic; /bin/true alias pci:v000010ECd00008139sv000010ECsd00008139bc02sc00i00 8139too I still had to to do a basic install on that node The first 3 lines I could probably have figured out eventually from the modprobe.conf from the previous release, but that last line, no way. For future reference: 1. write / and /boot from an image made from an S2466 system using a boel3 script (this also wrote all known node specific files) 2. ^C to break to boel shell 3. mkdir /a mount /dev/hda3 /a mount /dev/hda1 /a/boot chroot /a # file name will change for each release!!! rm -f /boot/initrd-2.6.24.7-desktop-2mnb.img # expect and ignore warnings mkinitrd /boot/initrd-2.6.24.7-desktop-2mnb.img 2.6.24.7-desktop-2mnb # the preceding stomped our custom inittab, put it back cp -f /etc/inittab.saf /etc/inittab exit reboot 4. Need to run lilo now or at reboot it STILL comes up using the wrong modules. You can guess how I found that out. Lilo won't work reliably chroot from boel3 with this Mandriva distro, and boel3 has no lilo of its own. So pxe boot the node with PLD 2.01, then at the prompt: mkdir /a mount /dev/hda3 /a mount /dev/hda1 /boot lilo -C /a/etc/lilo.conf #reset dhcpd on the master so this node will boot from internal disk reboot At least reinstalling on this node would now be less painful, since the new initrd has been stored on the SI server, and could be put in place by just copying it in the installation script. All and all though, this isn't a very elegant way to install a system image on a different type of machine. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From lindahl at pbm.com Thu Dec 11 13:38:57 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices Message-ID: <20081211213857.GA7355@bx9> I was recently surprised to learn that SSD prices are down in the $2-$3 per gbyte range. I did a survey of one brand (OCZ) at NexTag and it was: 256 gigs = $700 128 gigs = 300 64 gigs = 180 32 gigs = 70 Also, Micron is saying that they're going to get into the business of PCIe-attached flash, which will give us a second source for what Fusion-io is shipping today. If you're on the "I like a real system disk" side of the diskless/diskfull fence, these SSDs ought to be a lot more reliable than tradtional disks. And I'd like to get rid of the mirrored disks in our developer desktops... -- greg From rgb at phy.duke.edu Thu Dec 11 14:28:15 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081211213857.GA7355@bx9> References: <20081211213857.GA7355@bx9> Message-ID: On Thu, 11 Dec 2008, Greg Lindahl wrote: > I was recently surprised to learn that SSD prices are down in the > $2-$3 per gbyte range. I did a survey of one brand (OCZ) at NexTag > and it was: > > 256 gigs = $700 > 128 gigs = 300 > 64 gigs = 180 > 32 gigs = 70 > > Also, Micron is saying that they're going to get into the business of > PCIe-attached flash, which will give us a second source for what > Fusion-io is shipping today. > > If you're on the "I like a real system disk" side of the > diskless/diskfull fence, these SSDs ought to be a lot more reliable > than tradtional disks. And I'd like to get rid of the mirrored > disks in our developer desktops... Very useful information, as I'm on the "real system disk" side. Lagging real hard disk by what, a decade? But catching up, and 32 GB is really plenty anyway, whether for a node or for a workstation, right up to where you start putting your entire music/movie collection on it. I'll have to get a 32 GB chip and see if I can boot my laptop from it. I definitely can boot from USB flash (and carry linux in my pocket routinely these days:-) and it is down to something like 16 GB for $30. 16 GB is actually the size of / on my current laptop, and a fairly fat Fedora 9 (lots of games and other stuff to play with and try out) still leaves 5 GB. Thanks! rgb > > -- greg > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From hahn at mcmaster.ca Thu Dec 11 14:57:30 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: > Netshelter VX racks and the powered ventilation rear doors are not > sufficient any more. why do you have doors on your rack? normally, a rack is filled with servers with fans that generate the standard front-to-back airflow. that means that you want no doors, or at least only high-perf mesh ones. I've seen some wacky things in machinerooms - closed racks with just a small fan in the top, for instance. or racks with 1U servers each carefully separated by 2-3U of open space. here's the way I think of it: try to make your airflow cycle as close to a simple cycle as possible. get all the air coming out of the chiller to impinge (only and as naturally as possible) on the front of the rack(s). get all the hot air from the back to to the chiller intake as naturally as possible. no mixing, no bypass, no counter-rotation, minimizing total air path as well as changes in the airflow vectors. ideally, machine room should be divided into hot and cold halves, with no free flow between them. there are, of course some sweet spots (as well as "sour" ones): very close to the chiller, cold air velocity may be high enough to under-supply hot racks. far away from the chillers, the problem is both supply and the hot-air/return path. ducting and plenums are invaluable, but IMO the main goal is partitioning hot from cold. 
once airflow is relatively sane and controllable, it's more doable to measure dissipation in a rack, as well as its airflow and delta-t to see whether an auxiliary chiller is necessary. I would be very reluctant to add "spot fixes" to a particular rack without having a very good handle on the full flow/circulation/temp/power picture... >> I'm especially interested in feedback about condensation, and operational >> water temperature. incidentally, before committing to chilled water, make sure that it runs year-round with reasonable temperature and flow. our first machineroom was stable only during summer, because the campus chilled water loop was warmer and poorly chilled during the winter when (office) load was low. From lindahl at pbm.com Thu Dec 11 15:14:52 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <20081211231452.GC29359@bx9> On Thu, Dec 11, 2008 at 05:57:30PM -0500, Mark Hahn wrote: > why do you have doors on your rack? normally, a rack is filled with > servers with fans that generate the standard front-to-back airflow. > that means that you want no doors, or at least only high-perf mesh ones. Sometimes a server which is off is too hot to turn on, thanks to its neighbors. One way to solve that is to have fans in the rack back door. But these days it's hard to get enough airflow unless the entire door is perforated, which makes fans in the door useless. -- greg From csamuel at vpac.org Thu Dec 11 15:44:29 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Developing software for MPICH clusters? In-Reply-To: Message-ID: <1939446205.112551229039069647.JavaMail.root@mail.vpac.org> /* Second attempt, this time with caffeine.. */ ----- "Robert G. Brown" wrote: > For example, lam is often a "yum install" or "apt get" > away, as is openmpi. I would suggest that if someone is starting out and interested in LAM-MPI then they try OpenMPI first as LAM-MPI is now just in maintenance mode with all development work switched to OpenMPI. According to the LAM web page the developers are trying to encourage "all users to try migrating to Open MPI" cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From dnlombar at ichips.intel.com Thu Dec 11 15:36:04 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081211213857.GA7355@bx9> References: <20081211213857.GA7355@bx9> Message-ID: <20081211233604.GA12826@nlxdcldnl2.cl.intel.com> On Thu, Dec 11, 2008 at 01:38:57PM -0800, Greg Lindahl wrote: > I was recently surprised to learn that SSD prices are down in the > $2-$3 per gbyte range. I did a survey of one brand (OCZ) at NexTag > and it was: > > 256 gigs = $700 > 128 gigs = 300 > 64 gigs = 180 > 32 gigs = 70 FWIW, newegg shows a $20 rebate on the 32g. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. 
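Picking up Mark Hahn's point above about checking a rack's dissipation against its airflow and delta-T: the heat carried away by the air stream is roughly Q = rho * cp * flow * deltaT (air density about 1.2 kg/m^3, specific heat about 1005 J/kg.K), so a rough sketch like the one below tells you whether the measured flow and temperature rise are consistent with the power the rack draws. The example numbers (about 1000 CFM and a 12 K rise) are purely illustrative.

#include <stdio.h>

/* heat removed by an air stream: Q[W] = rho * cp * flow * deltaT,
   with rho ~ 1.2 kg/m^3 and cp ~ 1005 J/(kg K) for air near room temperature */
static double air_heat_watts(double flow_m3_per_s, double delta_t_k)
{
    const double rho = 1.2, cp = 1005.0;
    return rho * cp * flow_m3_per_s * delta_t_k;
}

int main(void)
{
    double cfm  = 1000.0;              /* measured rack airflow, example value */
    double flow = cfm * 0.000471947;   /* CFM -> m^3/s */
    double dt   = 12.0;                /* rear-minus-front temperature rise, K */

    printf("%.0f CFM with a %.0f K rise carries about %.1f kW\n",
           cfm, dt, air_heat_watts(flow, dt) / 1000.0);
    return 0;
}

With these example figures the air carries away roughly 6.8 kW; if the rack is drawing noticeably more than that, the excess heat is going somewhere it shouldn't.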
From csamuel at vpac.org Thu Dec 11 15:54:57 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Parallel software for chemists In-Reply-To: <1951474958.112651229039486564.JavaMail.root@mail.vpac.org> Message-ID: <2119423501.112691229039697517.JavaMail.root@mail.vpac.org> ----- "Prentice Bisbal" wrote: > NAMD is also parallel, but I don't know much about it. I compiled it, > installed it, but then I don't think the comp chemists ever used it > (don't you hate that?). We've got NAMD here (it's a molecular dynamics program), it's not what you'd call a trivial application to build. ;-) Building it as an MPI application (i.e. getting Charm++ to use MPI rather than its own custom framework) makes it a lot easier to use though, especially if you use PBS and have a TM aware MPI launcher installed (like OpenMPI, LAM or Pete Wyckoffs excellent mpiexec replacement). cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Thu Dec 11 16:04:47 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> Message-ID: <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> ----- "Greg Lindahl" wrote: > If you're on the "I like a real system disk" side of the > diskless/diskfull fence, these SSDs ought to be a lot more reliable > than tradtional disks. And I'd like to get rid of the mirrored > disks in our developer desktops... Hmm, I was thinking that until I read this blog post by one of the kernel filesystem developers (Val Henson from Intel) who had some (possibly Apple specific) concerns about data corruption & reliability and why she still chooses spinning disks over SSD. http://valhenson.livejournal.com/25228.html This is one of the reasons I'm *really* interested to get btrfs going on my Dell E4200 which has a 128GB SSD, data checksums (and duplicate copies of data) are good.. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From lindahl at pbm.com Thu Dec 11 16:18:59 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> References: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> Message-ID: <20081212001859.GA7929@bx9> On Fri, Dec 12, 2008 at 11:04:47AM +1100, Chris Samuel wrote: > Hmm, I was thinking that until I read this blog post by > one of the kernel filesystem developers (Val Henson from > Intel) who had some (possibly Apple specific) concerns > about data corruption & reliability and why she still > chooses spinning disks over SSD. Nothing new in that blog post. We'll find out the actual reliability of this generation of flash SSD when they've been around for a while, and not an anecdote sooner. 
-- greg From james.p.lux at jpl.nasa.gov Thu Dec 11 17:01:23 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> References: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> Message-ID: > > Hmm, I was thinking that until I read this blog post by one > of the kernel filesystem developers (Val Henson from > Intel) who had some (possibly Apple specific) concerns about > data corruption & reliability and why she still chooses > spinning disks over SSD. > > http://valhenson.livejournal.com/25228.html > > This is one of the reasons I'm *really* interested to get > btrfs going on my Dell E4200 which has a 128GB SSD, data > checksums (and duplicate copies of data) are good.. > She raises some interesting points, the most significant of which is that "real data" is very hard to come by. We are starting to use Flash memory for storage on spacecraft, and, of course, wear-out is a big deal. OTOH, it's something we know how to deal with (since spacecraft like Galileo used magnetic tape for storage), even when the medium gets old and decrepit. There's also some traps for the unwary for flash that don't apply to mechanical storage: What if a software bug hammers on one location accidentally, and whips through all 100,000 cycles of its life in a day? Of course, in the space biz, we don't have the issue she identifies of different suppliers. Hah.. We have "traceability to sand", and if someone finds out that a contaminated cigarette butt was dropped on that sand back in 1953, we'll get an alert for all parts potentially made from that sand. (Real fun when the part is something like a 2N2222 NPN transistor or a 51 ohm resistor) From lindahl at pbm.com Thu Dec 11 19:22:54 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: References: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> Message-ID: <20081212032254.GA29300@bx9> On Thu, Dec 11, 2008 at 05:01:23PM -0800, Lux, James P wrote: > We are starting to use Flash memory for storage on spacecraft, and, > of course, wear-out is a big deal. Yeah, but you have the huge advantage that you get to write your own wear-leveling software. -- greg From rgb at phy.duke.edu Thu Dec 11 20:38:58 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081212001859.GA7929@bx9> References: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> <20081212001859.GA7929@bx9> Message-ID: On Thu, 11 Dec 2008, Greg Lindahl wrote: > On Fri, Dec 12, 2008 at 11:04:47AM +1100, Chris Samuel wrote: > >> Hmm, I was thinking that until I read this blog post by >> one of the kernel filesystem developers (Val Henson from >> Intel) who had some (possibly Apple specific) concerns >> about data corruption & reliability and why she still >> chooses spinning disks over SSD. > > Nothing new in that blog post. We'll find out the actual reliability > of this generation of flash SSD when they've been around for a while, > and not an anecdote sooner. It does look worth noting that one should get SLC and not MLC SSD for any "disk like" application. It's faster (10x faster) and they argue much more reliable. 
More expensive, too, of course. I think somebody mentioned Transcend -- they apparently are concentrating on SLC only (and claim ECC) and their prices still look pretty competitive. I'm not sure SSD is perfect for userspace hard storage, but for basic operating system images it seems reasonable. How many times does one write to the read-mostly stuff in /, /usr, /lib, /etc? Surely nothing like the thousands of times minimum one is SUPPOSED to be able to rewrite. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From james.p.lux at jpl.nasa.gov Thu Dec 11 20:46:30 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081212032254.GA29300@bx9> Message-ID: On 12/11/08 7:22 PM, "Greg Lindahl" wrote: > On Thu, Dec 11, 2008 at 05:01:23PM -0800, Lux, James P wrote: > >> We are starting to use Flash memory for storage on spacecraft, and, >> of course, wear-out is a big deal. > > Yeah, but you have the huge advantage that you get to write your own > wear-leveling software. > > -- greg > True enough. But just as the article pointed out with respect to the SSD, we often have some sort of hardware flash controller between the flash and the CPU. Or, you're integrating some subsystem designed and built by someone else (or a spare from a previous mission). But a bigger deal is that the whole wear out thing is not super well understood (in a predictability sense). For instance, it's sensitive to the temperature. From lindahl at pbm.com Thu Dec 11 21:18:06 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] SSD prices In-Reply-To: References: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> <20081212001859.GA7929@bx9> Message-ID: <20081212051806.GA7589@bx9> On Thu, Dec 11, 2008 at 11:38:58PM -0500, Robert G. Brown wrote: > It does look worth noting that one should get SLC and not MLC SSD for > any "disk like" application. It's faster (10x faster) and they argue > much more reliable. More expensive, too, of course. Transcend has both MLC and SLC, and they charge 3X as much for SLC, if NexTag is finding low prices properly. (Incidentally, NexTag claims the lowest price for a 32G USB stick is about the same as a 32gb OCZ drive. Hm.) But Fusion-io's specsheet says that their MLC board is only a little slower than their SLC board. That's a high-end controller with more channels, but you'll see that in low-end drives in the next generation. Perhaps most of the problems people report in the low-end drives might be crappy firmware on that Jmicron controller everyone hates. Certainly Apple doesn't seem to have a problem making flash devices reasonably reliable. But they control all the firmware. > I'm not sure SSD is perfect for userspace hard storage, but for basic > operating system images it seems reasonable. How many times does one > write to the read-mostly stuff in /, /usr, /lib, /etc? Not very often, but there's always /var and /tmp and swap to worry about. The other guys at Blekko had a very bad experience with flash 5 years ago on Linux network appliances at their previous startup. It is unclear to me if that was a bad batch of flash, or dumb software, or what. -- greg p.s. 
While I'm at it, I think that these SATA-to-USB gizmos are pretty cool: http://www.thinkgeek.com/computing/drives/a7ea/?cpg=ab Lots of people seem to sell 'em. From hunting at ix.netcom.com Thu Dec 11 22:09:10 2008 From: hunting at ix.netcom.com (Michael Huntingdon) Date: Wed Nov 25 01:08:04 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: <20081211231452.GC29359@bx9> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <20081211231452.GC29359@bx9> Message-ID: <7493914765164F50BFEB0183BBA5995C@MichaelPC> ...or rather than worry about door perforation, you make sure you've looked at solutions that take advantage of the best hw/sw currently available. Today it really is not just about cooling a group of 1u systems or a single hptc cabinet. Today maybe you know you need two densely populated cabinets. But you also know your requirements will double or maybe triple a year from now. You're in math, and you know the same applies to CSE and chemistry. Let's not forget physics, and oh, by the way, campus central computing wants to bring it all together, manage and maintain it for you. Sound familiar? Are they offering to manage your fans or your rear door heat exchangers? If so, they and/or you are in big trouble. Seems as though it's about how you most effectively/efficiently cool each cabinet in concert with the rest of your cabs (systems, storage, networking) and technology in the data center. There are engineering groups that do little more than eat/sleep/drink this stuff, so let me know if you are seriously interested in talking about how to bring all the technologies together necessary to manage the environmental requirements of your densely architected stand alone cluster, or clusters of clusters within a data center. cheers...michael ----- Original Message ----- From: "Greg Lindahl" To: "Mark Hahn" Cc: "Beowulf Mailing List" Sent: Thursday, December 11, 2008 3:14 PM Subject: Re: [Beowulf] Re: Rear-door heat exchangers and condensation > On Thu, Dec 11, 2008 at 05:57:30PM -0500, Mark Hahn wrote: > >> why do you have doors on your rack? normally, a rack is filled with >> servers with fans that generate the standard front-to-back airflow. >> that means that you want no doors, or at least only high-perf mesh ones. > > Sometimes a server which is off is too hot to turn on, thanks to its > neighbors. One way to solve that is to have fans in the rack back > door. But these days it's hard to get enough airflow unless the entire > door is perforated, which makes fans in the door useless. > > -- greg > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From bernard at vanhpc.org Thu Dec 11 23:01:31 2008 From: bernard at vanhpc.org (Bernard Li) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] ntpd wonky? In-Reply-To: <20081209180633.GA21193@bx9> References: <20081209180633.GA21193@bx9> Message-ID: Hi Greg: On Tue, Dec 9, 2008 at 10:06 AM, Greg Lindahl wrote: > Ever since the US daylight savings time change, I've been seeing a lot > of jitter in the ntp servers I'm synched to... I'm using the redhat > pool. Has anyone else noticed this? On 200 machines I get several > complaints per day of >100 ms jitter from my hourly check-ntp cronjob. Have you tried other pools eg. pool.ntp.org? 
Cheers, Bernard From eugen at leitl.org Thu Dec 11 23:56:55 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer Message-ID: <20081212075655.GK11544@leitl.org> http://www.goodgearguide.com.au/article/270416/inside_tsubame_-_nvidia_gpu_supercomputer?fp=&fpid=&pf=1 Inside Tsubame - the Nvidia GPU supercomputer Tokyo Tech University's Tsubame supercomputer attained 29th ranking in the new Top 500, thanks in part to hundreds of Nvidia Tesla graphics cards. Martyn Williams (IDG News Service) 10/12/2008 12:20:00 When you enter the computer room on the second floor of Tokyo Institute of Technology's computer building, you're not immediately struck by the size of Japan's second-fastest supercomputer. You can't see the Tsubame computer for the industrial air conditioning units that are standing in your way, but this in itself is telling. With more than 30,000 processing cores buzzing away, the machine consumes a megawatt of power and needs to be kept cool. Tsubame was ranked 29th-fastest supercomputer in the world in the latest Top 500 ranking with a speed of 77.48T Flops (floating point operations per second) on the industry-standard Linpack benchmark. While its position is relatively good, that's not what makes it so special. The interesting thing about Tsubame is that it doesn't rely on the raw processing power of CPUs (central processing units) alone to get its work done. Tsubame includes hundreds of graphics processors of the same type used in consumer PCs, working alongside CPUs in a mixed environment that some say is a model for future supercomputers serving disciplines like material chemistry. Graphics processors (GPUs) are very good at quickly performing the same computation on large amounts of data, so they can make short work of some problems in areas such as molecular dynamics, physics simulations and image processing. "I think in the vast majority of the interesting problems in the future, the problems that affect humanity where the impact comes from nature ... requires the ability to manipulate and compute on a very large data set," said Jen-Hsun Huang, CEO of Nvidia, who spoke at the university this week. Tsubame uses 680 of Nvidia's Tesla graphics cards. Just how much of a difference do the GPUs make? Takayuki Aoki, a professor of material chemistry at the university, said that simulations that used to take three months now take 10 hours on Tsubame. Tsubame itself - once you move past the air-conditioners - is split across several rooms in two floors of the building and is largely made up of rack-mounted Sun x4600 systems. There are 655 of these in all, each of which has 16 AMD Opteron CPU cores inside it, and Clearspeed CSX600 accelerator boards. The graphics chips are contained in 170 Nvidia Tesla S1070 rack-mount units that have been slotted in between the Sun systems. Each of the 1U Nvidia systems has four GPUs inside, each of which has 240 processing cores for a total of 960 cores per system. The Tesla systems were added to Tsubame over the course of about a week while the computer was operating. "People thought we were crazy," said Satoshi Matsuoka, director of the Global Scientific Information and Computing Center at the university. "This is a ?1 billion (US$11 million) supercomputer consuming a megawatt of power, but we proved technically that it was possible." The result is what university staff call version 1.2 of the Tsubame supercomputer. 
"I think we should have been able to achieve 85 [T Flops], but we ran out of time so it was 77 [T Flops]," said Matsuoka of the benchmarks performed on the system. At 85T Flops it would have risen a couple of places in the Top 500 and been ranked fastest in Japan. There's always next time: A new Top 500 list is due out in June 2009, and Tokyo Institute of Technology is also looking further ahead. "This is not the end of Tsubame, it's just the beginning of GPU acceleration becoming mainstream," said Matsuoka. "We believe that in the world there will be supercomputers registering several petaflops in the years to come, and we would like to follow suit." Tsubame 2.0, as he dubbed the next upgrade, should be here within the next two years and will boast a sustained performance of at least a petaflop (a petaflop is 1,000 teraflops), he said. The basic design for the machine is still not finalized but it will continue the heterogeneous computing base of mixing CPUs and GPUs, he said. From award at uda.ad Fri Dec 12 02:02:03 2008 From: award at uda.ad (Alan Ward) Date: Wed Nov 25 01:08:05 2009 Subject: RS: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer References: <20081212075655.GK11544@leitl.org> Message-ID: Very interesting, but perhaps a bit of an overkill. How many TFlop/Watt does that figure out as? :-( Cheers, -Alan -----Missatge original----- De: beowulf-bounces@beowulf.org en nom de Eugen Leitl Enviat el: dv. 12/12/2008 08:56 Per a: info@postbiota.org; Beowulf@beowulf.org Tema: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer http://www.goodgearguide.com.au/article/270416/inside_tsubame_-_nvidia_gpu_supercomputer?fp=&fpid=&pf=1 Inside Tsubame - the Nvidia GPU supercomputer Tokyo Tech University's Tsubame supercomputer attained 29th ranking in the new Top 500, thanks in part to hundreds of Nvidia Tesla graphics cards. Martyn Williams (IDG News Service) 10/12/2008 12:20:00 When you enter the computer room on the second floor of Tokyo Institute of Technology's computer building, you're not immediately struck by the size of Japan's second-fastest supercomputer. You can't see the Tsubame computer for the industrial air conditioning units that are standing in your way, but this in itself is telling. With more than 30,000 processing cores buzzing away, the machine consumes a megawatt of power and needs to be kept cool. Tsubame was ranked 29th-fastest supercomputer in the world in the latest Top 500 ranking with a speed of 77.48T Flops (floating point operations per second) on the industry-standard Linpack benchmark. While its position is relatively good, that's not what makes it so special. The interesting thing about Tsubame is that it doesn't rely on the raw processing power of CPUs (central processing units) alone to get its work done. Tsubame includes hundreds of graphics processors of the same type used in consumer PCs, working alongside CPUs in a mixed environment that some say is a model for future supercomputers serving disciplines like material chemistry. Graphics processors (GPUs) are very good at quickly performing the same computation on large amounts of data, so they can make short work of some problems in areas such as molecular dynamics, physics simulations and image processing. "I think in the vast majority of the interesting problems in the future, the problems that affect humanity where the impact comes from nature ... 
requires the ability to manipulate and compute on a very large data set," said Jen-Hsun Huang, CEO of Nvidia, who spoke at the university this week. Tsubame uses 680 of Nvidia's Tesla graphics cards. Just how much of a difference do the GPUs make? Takayuki Aoki, a professor of material chemistry at the university, said that simulations that used to take three months now take 10 hours on Tsubame. Tsubame itself - once you move past the air-conditioners - is split across several rooms in two floors of the building and is largely made up of rack-mounted Sun x4600 systems. There are 655 of these in all, each of which has 16 AMD Opteron CPU cores inside it, and Clearspeed CSX600 accelerator boards. The graphics chips are contained in 170 Nvidia Tesla S1070 rack-mount units that have been slotted in between the Sun systems. Each of the 1U Nvidia systems has four GPUs inside, each of which has 240 processing cores for a total of 960 cores per system. The Tesla systems were added to Tsubame over the course of about a week while the computer was operating. "People thought we were crazy," said Satoshi Matsuoka, director of the Global Scientific Information and Computing Center at the university. "This is a ?1 billion (US$11 million) supercomputer consuming a megawatt of power, but we proved technically that it was possible." The result is what university staff call version 1.2 of the Tsubame supercomputer. "I think we should have been able to achieve 85 [T Flops], but we ran out of time so it was 77 [T Flops]," said Matsuoka of the benchmarks performed on the system. At 85T Flops it would have risen a couple of places in the Top 500 and been ranked fastest in Japan. There's always next time: A new Top 500 list is due out in June 2009, and Tokyo Institute of Technology is also looking further ahead. "This is not the end of Tsubame, it's just the beginning of GPU acceleration becoming mainstream," said Matsuoka. "We believe that in the world there will be supercomputers registering several petaflops in the years to come, and we would like to follow suit." Tsubame 2.0, as he dubbed the next upgrade, should be here within the next two years and will boast a sustained performance of at least a petaflop (a petaflop is 1,000 teraflops), he said. The basic design for the machine is still not finalized but it will continue the heterogeneous computing base of mixing CPUs and GPUs, he said. _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/e979e129/attachment.html From diep at xs4all.nl Fri Dec 12 02:50:51 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer In-Reply-To: <20081212075655.GK11544@leitl.org> References: <20081212075655.GK11544@leitl.org> Message-ID: <34568AAB-6A20-44B7-B80B-FA8BB92AC1F6@xs4all.nl> On Dec 12, 2008, at 8:56 AM, Eugen Leitl wrote: > > http://www.goodgearguide.com.au/article/270416/inside_tsubame_- > _nvidia_gpu_supercomputer?fp=&fpid=&pf=1 > > Inside Tsubame - the Nvidia GPU supercomputer > > Tokyo Tech University's Tsubame supercomputer attained 29th ranking > in the > new Top 500, thanks in part to hundreds of Nvidia Tesla graphics > cards. 
> > Martyn Williams (IDG News Service) 10/12/2008 12:20:00 > > When you enter the computer room on the second floor of Tokyo > Institute of > Technology's computer building, you're not immediately struck by > the size of > Japan's second-fastest supercomputer. You can't see the Tsubame > computer for > the industrial air conditioning units that are standing in your > way, but this > in itself is telling. With more than 30,000 processing cores > buzzing away, > the machine consumes a megawatt of power and needs to be kept cool. > 1000000 watt / 77480 gflop = 12.9 watt per gflop. If you run double precision codes on this box it is a big energy waster IMHO. (of course it's very well equipped for all kind of crypto codes using that google library). Vincent > Tsubame was ranked 29th-fastest supercomputer in the world in the > latest Top > 500 ranking with a speed of 77.48T Flops (floating point operations > per > second) on the industry-standard Linpack benchmark. > > While its position is relatively good, that's not what makes it so > special. > The interesting thing about Tsubame is that it doesn't rely on the raw > processing power of CPUs (central processing units) alone to get > its work > done. Tsubame includes hundreds of graphics processors of the same > type used > in consumer PCs, working alongside CPUs in a mixed environment that > some say > is a model for future supercomputers serving disciplines like material > chemistry. > > Graphics processors (GPUs) are very good at quickly performing the > same > computation on large amounts of data, so they can make short work > of some > problems in areas such as molecular dynamics, physics simulations > and image > processing. > > "I think in the vast majority of the interesting problems in the > future, the > problems that affect humanity where the impact comes from > nature ... requires > the ability to manipulate and compute on a very large data set," said > Jen-Hsun Huang, CEO of Nvidia, who spoke at the university this > week. Tsubame > uses 680 of Nvidia's Tesla graphics cards. > > Just how much of a difference do the GPUs make? Takayuki Aoki, a > professor of > material chemistry at the university, said that simulations that > used to take > three months now take 10 hours on Tsubame. > > Tsubame itself - once you move past the air-conditioners - is split > across > several rooms in two floors of the building and is largely made up of > rack-mounted Sun x4600 systems. There are 655 of these in all, each > of which > has 16 AMD Opteron CPU cores inside it, and Clearspeed CSX600 > accelerator > boards. > > The graphics chips are contained in 170 Nvidia Tesla S1070 rack- > mount units > that have been slotted in between the Sun systems. Each of the 1U > Nvidia > systems has four GPUs inside, each of which has 240 processing > cores for a > total of 960 cores per system. > > The Tesla systems were added to Tsubame over the course of about a > week while > the computer was operating. > > "People thought we were crazy," said Satoshi Matsuoka, director of > the Global > Scientific Information and Computing Center at the university. > "This is a ?1 > billion (US$11 million) supercomputer consuming a megawatt of > power, but we > proved technically that it was possible." > > The result is what university staff call version 1.2 of the Tsubame > supercomputer. > > "I think we should have been able to achieve 85 [T Flops], but we > ran out of > time so it was 77 [T Flops]," said Matsuoka of the benchmarks > performed on > the system. 
At 85T Flops it would have risen a couple of places in > the Top > 500 and been ranked fastest in Japan. > > There's always next time: A new Top 500 list is due out in June > 2009, and > Tokyo Institute of Technology is also looking further ahead. > > "This is not the end of Tsubame, it's just the beginning of GPU > acceleration > becoming mainstream," said Matsuoka. "We believe that in the world > there will > be supercomputers registering several petaflops in the years to > come, and we > would like to follow suit." > > Tsubame 2.0, as he dubbed the next upgrade, should be here within > the next > two years and will boast a sustained performance of at least a > petaflop (a > petaflop is 1,000 teraflops), he said. The basic design for the > machine is > still not finalized but it will continue the heterogeneous > computing base of > mixing CPUs and GPUs, he said. > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From Florent.Calvayrac at univ-lemans.fr Fri Dec 12 03:05:56 2008 From: Florent.Calvayrac at univ-lemans.fr (Florent Calvayrac-Castaing) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <20081212075655.GK11544@leitl.org> References: <20081212075655.GK11544@leitl.org> Message-ID: <49424594.1010709@univ-lemans.fr> Eugen Leitl wrote: > http://www.goodgearguide.com.au/article/270416/inside_tsubame_-_nvidia_gpu_supercomputer?fp=&fpid=&pf=1 > > Inside Tsubame - the Nvidia GPU supercomputer > > Tokyo Tech University's Tsubame supercomputer attained 29th ranking in the > new Top 500, thanks in part to hundreds of Nvidia Tesla graphics cards. > > Interesting. I understand why, when I submitted a joint exploratory project about GPU computing two years ago with a Japanese colleague we were ranked first in Japan and last in France ; the idea seems more popular in Japan if they can fork millions on an architecture it is not very quick to program for (at least maybe not as fast as Moore's law is increasing power). By the way, has anyone on the list any idea on the prospects of Apple's OpenCL ? We just have started working seriously on CUDA but maybe it is time to change for something more open and maybe easier to program with. PS : I hope this message gets approved to the list ; I had a few rejected messages in the past but when I read the recent drivel I can't help wonder on some mysteries of life. From Bogdan.Costescu at iwr.uni-heidelberg.de Fri Dec 12 03:09:20 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Re: cloning issue, hidden module dependency In-Reply-To: References: Message-ID: On Thu, 11 Dec 2008, David Mathog wrote: > install usb-interface /sbin/modprobe uhci_hcd; /bin/true > install ide-controller /sbin/modprobe via82cxxx; /sbin/modprobe > ide_generic; /bin/true I don't know why you need these. On all the distributions that I've worked with recently such issues are taken care of by running 'depmod', the resulting files are taken into consideration when running 'mkinitrd'. > alias pci:v000010ECd00008139sv000010ECsd00008139bc02sc00i00 8139too And why do you need this ? Didn't the module detect this hardware ? > 4. Need to run lilo now or at reboot it STILL comes up using the wrong > modules. You can guess how I found that out. 
Lilo won't work reliably > chroot from boel3 with this Mandriva distro, and boel3 has no lilo > of its own. So pxe boot the node with PLD 2.01, then at the prompt: I think that you've just created more problems by mixing all these different distributions then mixing booting from local disk with PXE... I would not have had such problems with CentOS or Fedora, but I have worked with these distributions for a long time (and Red Hat Linux before that), so I know them pretty well. Maybe it's time for you to learn some more about the tools that your preferred distribution makes available and how all the pieces are put together ? -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8240, Fax: +49 6221 54 8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From tortay at cc.in2p3.fr Fri Dec 12 04:01:56 2008 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <49424594.1010709@univ-lemans.fr> References: <20081212075655.GK11544@leitl.org> <49424594.1010709@univ-lemans.fr> Message-ID: <494252B4.7040106@cc.in2p3.fr> Florent Calvayrac-Castaing wrote: [...] > > Interesting. > > I understand why, when I submitted a joint exploratory project > about GPU computing two years ago with a Japanese > colleague we were ranked first in Japan and last in France ; the > idea seems more popular in Japan if they can fork millions > on an architecture it is not very quick to program for (at least > maybe not as fast as Moore's law is increasing power). > They may be willing to spend millions because they already have programs able to use the GPUs. If I'm not mistaken, the "Tsubame" cluster was initially using Clearspeed accelerators (in Sun X4600 "fat" nodes). Therefore, they probably have appropriate programs that need little adaptation (or less than many) to work on the GPUs. Loïc. From csamuel at vpac.org Fri Dec 12 04:32:29 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <49424594.1010709@univ-lemans.fr> Message-ID: <623914342.124881229085149551.JavaMail.root@mail.vpac.org> ----- "Florent Calvayrac-Castaing" wrote: > By the way, has anyone on the list any idea on > the prospects of Apple's OpenCL ? I think we need something that is hardware independent. If OpenCL can deliver that (and Apple obviously believe it can otherwise they'd not have it in Snow Leopard) then I think that's going to be great. The big question for me is where are the implementations going to come from ? My understanding is that Snow Leopard will use the LLVM compiler for it [1], and nVidia will ship support for it in their CUDA SDK. AMD have already nailed their colours to the mast and based on past behaviour it might be reasonable to expect that they'll use GCC as their base (which would be nice!). As for Intel, well I guess it'll be in their compiler, though I asked about Larrabee and OpenCL on the Intel stand at SC and was told "we don't have anyone here who knows about it, we'll get someone to call you" (nothing yet). cheers, Chris [1] - Given that LLVM is BSD licensed it is unclear whether Apple's modifications to implement it will be public or not. -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O.
Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From laytonjb at att.net Fri Dec 12 05:09:07 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081212001859.GA7929@bx9> References: <885368590.112911229040025006.JavaMail.root@mail.vpac.org> <1073016340.113081229040287297.JavaMail.root@mail.vpac.org> <20081212001859.GA7929@bx9> Message-ID: <49426273.7030803@att.net> Greg Lindahl wrote: > On Fri, Dec 12, 2008 at 11:04:47AM +1100, Chris Samuel wrote: > > >> Hmm, I was thinking that until I read this blog post by >> one of the kernel filesystem developers (Val Henson from >> Intel) who had some (possibly Apple specific) concerns >> about data corruption & reliability and why she still >> chooses spinning disks over SSD. >> > > Nothing new in that blog post. We'll find out the actual reliability > of this generation of flash SSD when they've been around for a while, > and not an anecdote sooner. > This is one of the ugly secrets about SSD's that haven't gotten out to the masses very well. They have data corruption issues. JEDEC has certain requirements for data retention basically as a function of capacity. For SSD's at 10% of the rewrite capacity, you need to retain the data for 10 years. Current crops of SSD's with MLC barely meet this goal (actually it's a function of the NAND's). Near 100% of the rewrite life of the SSD, they need to hold the data for only 1 year! So if you are near the end of life of the SSD, get your data off of there and pronto! Jeff From laytonjb at att.net Fri Dec 12 05:14:47 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081211213857.GA7355@bx9> References: <20081211213857.GA7355@bx9> Message-ID: <494263C7.6030009@att.net> Greg Lindahl wrote: > I was recently surprised to learn that SSD prices are down in the > $2-$3 per gbyte range. I did a survey of one brand (OCZ) at NexTag > and it was: > > 256 gigs = $700 > 128 gigs = 300 > 64 gigs = 180 > 32 gigs = 70 > > Also, Micron is saying that they're going to get into the business of > PCIe-attached flash, which will give us a second source for what > Fusion-io is shipping today. > > If you're on the "I like a real system disk" side of the > diskless/diskfull fence, these SSDs ought to be a lot more reliable > than traditional disks. And I'd like to get rid of the mirrored > disks in our developer desktops... > Remember that OCZ does not equal Fusion-IO :) There are many factors that go into an SSD that determine performance. So the performance of OCZ is not nearly that of Fusion-IO's product. For example, I've been tracking some performance testing of a wide variety of SSD's and spinning disks in my day job. Some of the SSD's are fairly inexpensive, but the performance is pretty pathetic. For example, if your read/write mix includes more than about 10% writes, then the performance of the SSD's is worse than a spinning disk (this is in terms of IOPS). If you want to move up the food chain and buy some unbelievably fast SSD's you can get the performance above spinning disks but the price is several orders of magnitude greater than spinning disks. Reliability is another question and I posted a quick response to this list in a different email.
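To make that concrete, here is a toy Python calculation of blended IOPS for a mixed read/write workload. Every IOPS figure in it is an invented round number for illustration, not a measurement of any particular drive, but it shows how a drive with miserable random write IOPS falls below a spinning disk once writes reach roughly the 10% mark:

    # Blended IOPS for a mixed read/write workload (illustrative numbers only).
    # Average time per operation is the fraction-weighted sum of per-op times,
    # so the blended rate is its reciprocal.
    def mixed_iops(read_iops, write_iops, write_fraction):
        per_op_time = write_fraction / write_iops + (1.0 - write_fraction) / read_iops
        return 1.0 / per_op_time

    ssd_read, ssd_write = 4000.0, 10.0   # assumed: fast reads, terrible random writes
    disk_iops = 120.0                    # assumed: a decent 7200 rpm disk, reads or writes

    for f in (0.0, 0.05, 0.10, 0.20):
        print("%2d%% writes: SSD ~%4.0f IOPS, disk ~%3.0f IOPS"
              % (f * 100, mixed_iops(ssd_read, ssd_write, f), disk_iops))

With those made-up numbers the SSD wins easily at 0-5% writes and drops below the disk somewhere around 8-10% writes, which is the same shape of result as the testing described above.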
Jeff From gdjacobs at gmail.com Fri Dec 12 06:17:19 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices In-Reply-To: <494263C7.6030009@att.net> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> Message-ID: <4942726F.1050705@gmail.com> Jeff Layton wrote: > Greg Lindahl wrote: >> I was recently surprised to learn that SSD prices are down in the >> $2-$3 per gbyte range. I did a survey of one brand (OCZ) at NexTag >> and it was: >> >> 256 gigs = $700 >> 128 gigs = 300 >> 64 gigs = 180 >> 32 gigs = 70 >> >> Also, Micron is saying that they're going to get into the business of >> PCIe-attached flash, which will give us a second source for what >> Fusion-io is shipping today. >> >> If you're on the "I like a real system disk" side of the >> diskless/diskfull fence, these SSDs ought to be a lot more reliable >> than tradtional disks. And I'd like to get rid of the mirrored >> disks in our developer desktops... >> > > Remember that OCZ does not equal Fusion-IO :) There are many > factors that go into an SSD that determine performance. So the > performance of OCZ is not nearly that of Fusion-IO's product. > > For example, I've been tracking some performance testing of a > wide variety of SSD's and spinning disks in my day job. Some of > the SSD's are fairly inexpensive, but the performance is pretty > pathetic. For example, if your read/write mix includes more than > about 10% writes, then the performance of the SSD's is worse > than a spinning disk (this is in terms of IOPS). > > If you want to move up the food chain and buy some unbelievably > fast SSD's you get can get the performance above spinning disks > but the price is several orders of magnitude greater than spinning > disks. Yeah, there's a few vendors out there selling battery backed dram solutions. Basically maxing out the interface, but stupidly expensive. >From the benches I've seen, though, it could be a useful accelerator for workloads akin to databases. > Reliability is another question and I posted a quick response to > this list in a different email. This being my big concern with flash. > > Jeff -- Geoffrey D. Jacobs From hearnsj at googlemail.com Fri Dec 12 06:45:56 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <494252B4.7040106@cc.in2p3.fr> References: <20081212075655.GK11544@leitl.org> <49424594.1010709@univ-lemans.fr> <494252B4.7040106@cc.in2p3.fr> Message-ID: <9f8092cc0812120645x77e3257cq7f477e963bdbf41b@mail.gmail.com> 2008/12/12 Loic Tortay > > If I'm not mistaken, the "Tsubame" cluster was initially using > Clearspeed accelerators (in Sun X4600 "fat" nodes). > > Therefore, they probably have appropriate programs that need little > adaptation (or less than many) to work on the GPUs. > > Emmmm.... I'm no expert on Clearspeed, but AFAIK Clearspeeds selling point is that the cards run standard maths library functions - ie. you just 'drop in' a compatible maths library and the card gets given the computations to do. This is not the same model as GPU programming. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/86c0ea14/attachment.html From laytonjb at att.net Fri Dec 12 07:33:11 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <9f8092cc0812120645x77e3257cq7f477e963bdbf41b@mail.gmail.com> References: <20081212075655.GK11544@leitl.org> <49424594.1010709@univ-lemans.fr> <494252B4.7040106@cc.in2p3.fr> <9f8092cc0812120645x77e3257cq7f477e963bdbf41b@mail.gmail.com> Message-ID: <49428437.5050704@att.net> John Hearns wrote: > > > 2008/12/12 Loic Tortay > > > > If I'm not mistaken, the "Tsubame" cluster was initially using > Clearspeed accelerators (in Sun X4600 "fat" nodes). > > Therefore, they probably have appropriate programs that need little > adaptation (or less than many) to work on the GPUs. > > Emmmm.... I'm no expert on Clearspeed, but AFAIK Clearspeeds selling > point is that the cards run standard maths library functions - ie. you > just 'drop in' a compatible maths library and the card gets given the > computations to do. > This is not the same model as GPU programming. Yes and No. There are libraries for CUDA for BLAS and FFT's. Clearspeed has this as well. Jeff From laytonjb at att.net Fri Dec 12 07:41:02 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices In-Reply-To: <4942726F.1050705@gmail.com> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> <4942726F.1050705@gmail.com> Message-ID: <4942860E.70300@att.net> Geoff Jacobs wrote: > Jeff Layton wrote: > >> Remember that OCZ does not equal Fusion-IO :) There are many >> factors that go into an SSD that determine performance. So the >> performance of OCZ is not nearly that of Fusion-IO's product. >> >> For example, I've been tracking some performance testing of a >> wide variety of SSD's and spinning disks in my day job. Some of >> the SSD's are fairly inexpensive, but the performance is pretty >> pathetic. For example, if your read/write mix includes more than >> about 10% writes, then the performance of the SSD's is worse >> than a spinning disk (this is in terms of IOPS). >> >> If you want to move up the food chain and buy some unbelievably >> fast SSD's you get can get the performance above spinning disks >> but the price is several orders of magnitude greater than spinning >> disks. >> > > Yeah, there's a few vendors out there selling battery backed dram > solutions. Basically maxing out the interface, but stupidly expensive. > >From the benches I've seen, though, it could be a useful accelerator for > workloads akin to databases. > It gets more involved than just adding dram. The controllers can have a huge impact on performance. You will find some high-end drives that have great NAND's but really crappy controllers. This limits performance (I don't know if I've seen any with good controllers and bad NAND's though, but I think there are some out there). Then you have some amazing drives with great controllers and great NAND's - but you will pay dearly for them :) And as others have pointed out, the details of the firmware can also have a big impact on performance. Also, the interface can impact performance as well. Jeff From laytonjb at att.net Fri Dec 12 07:59:29 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? 
In-Reply-To: <20081212153526.GA31871@anuurn.compact> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> <4942726F.1050705@gmail.com> <20081212153526.GA31871@anuurn.compact> Message-ID: <49428A61.3070208@att.net> Peter Jakobi wrote: > On Fri, Dec 12, 2008 at 08:17:19AM -0600, Geoff Jacobs wrote: > Rehi, > >>> Reliability is another question and I posted a quick response to >>> this list in a different email. >>> >> This being my big concern with flash. >> > > related is this topic on SSD / flashes: > > what's the life time when changing the same file frequently? > aka "mapping block writes to cell erases" > aka "how many erases are possible?" > This is somewhat a complicated question. It depends upon a few factors if you are looking at things from the perspective of the drive. In general the cells have a re-write limit that is a function of what kind of cell it is. I don't remember exact numbers, but I think MLC's are something like 10,000 rewrites and SLC's have like 100,000 rewrites. But the wear-leveling algorithms do a reasonable job of moving data to different cells rather than rewrite. This "levels" out the number of rewrites to the cells. What some people are doing to also help SSD's is to reserve a portion of the drive as "backups" for cells that have reached their limit. For example, you take a 64GB drive and make it appear as a 50GB drive. Then the extra 14GB is used by the drive to replace bad cells when needed (think of it as the SSD approach that SATA drives have with spare blocks that are used by the drive). While you lose space on the drive, overall the drive can last longer because of the spare cells. I think this is a good idea for MLC in particular because of the low rewrite limit. There should be some stuff floating around the web on the topic of SSD's. Just treat some of the more "popular" stuff from sites like Tom's Hardware, etc. with skepticism. :) Jeff From brahmaforces at gmail.com Thu Dec 11 02:04:38 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: I am in NewDelhi India. However I would prefer to put the cluster together myself, because 1) I am a good python programmer and like programming and playing with computers 2) I will be using the cluster for animation (art + computers) and may have to bend it and tinker with it...therefore it makes sense for me to know it inside out. 3) If I set it up then I can grow it, and i envision it growing, outsourcing the whole thing would be expensive 4) I have been using linux for several years and am comfortable in the environment 5) I have a bunch of old computers lying about which are not so old and run basic versions of linux fast. What is 1u? What is a blade system? I would be putting it in a room with air conditioning. At this time I am trying to figure out the racks. Am meeting the hardware guy on Saturday and we were thinking of opening up the PCS i have lying around and taking measurements of how the mother boards fit into the cases,with the intention of creating a rack from scratch. 
Any ideas of what goes into a good rack in terms of size and material (assuming it has to be insulated)? Also again, what might be some up-to-date books on the subject and any experiences regarding the actual creation of the rack and the physical hardware. I am starting with 3 nodes to be expanded to n nodes....The 3 nodes will allow me to keep complexity down while learning and then i can expand to n nodes once i have it down to increase speed. Am planning to run animation software (like blender) on it. Since animation software requires large processing power i am assuming they have already worked on parallelizing the code... Anyone using clusters for animation on this list? > > Two pieces of advice > > a) let us know where you are physically. Talk to a clustering company in > your country, or area. > You will be surprised - they will put the whole thing together for you as a > 'turnkey' cluster AND what's more important support it. OK, you don't get > the learning experience which you are after. > > b) if this thing is to sit in your office, think about noise, cooling and > how many amps you can draw from a wall socket. > 1U servers have lots of little high speed fans and the noise gets very, > very annoying. > Think of putting this thing in a separate room, with some air conditioning. > Even a small room with a portable wheeled unit, venting to the outside may > be adequate for you. > > > Have you thought about a blade system for your particular situation? Might > be the ideal solution. > > > > > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/cd4dbefc/attachment.html From drcoolsanta at gmail.com Thu Dec 11 04:17:04 2008 From: drcoolsanta at gmail.com (Dr Cool Santa) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Parallel software for chemists In-Reply-To: <9f8092cc0812110133q269f58f0o2685ec95a590b727@mail.gmail.com> References: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> <9f8092cc0812110133q269f58f0o2685ec95a590b727@mail.gmail.com> Message-ID: <86b56470812110417p3cf97018na6676ed44213f328@mail.gmail.com> Thanks, seems like a good website. Actually it is my mother who is the chemist. On Thu, Dec 11, 2008 at 3:03 PM, John Hearns wrote: > > > 2008/12/10 Dr Cool Santa > >> Currently in the lab we use Schrodinger and we are looking into NWchem. >> We'd be interested in knowing about software that a chemist could use that >> makes use of a parallel supercomputer. And better if it is linux. >> >> It's probably worth it for you to join the Computational Chemistry list: > > http://www.ccl.net/ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/575e3450/attachment.html From alsimao at gmail.com Thu Dec 11 06:54:03 2008 From: alsimao at gmail.com (Alcides Simao) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 58, Issue 28 In-Reply-To: <200812111416.mBBEFmHa001826@bluewest.scyld.com> References: <200812111416.mBBEFmHa001826@bluewest.scyld.com> Message-ID: <7be8c36b0812110654h532fdef4ta8c4259c0922a934@mail.gmail.com> Hello 'wulfers! Any news on GPGPU stuff from ATI?
Best, Alcides -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081211/f555763e/attachment.html From spambox at emboss.co.nz Thu Dec 11 16:56:13 2008 From: spambox at emboss.co.nz (Michael Brown) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices In-Reply-To: <20081211213857.GA7355@bx9> References: <20081211213857.GA7355@bx9> Message-ID: <95046224B0E6475293D97FF9B5819BC4@Forethought> Greg Lindahl wrote: >I was recently surprised to learn that SSD prices are down in the > $2-$3 per gbyte range. I did a survey of one brand (OCZ) at NexTag > and it was: > > 256 gigs = $700 > 128 gigs = 300 > 64 gigs = 180 > 32 gigs = 70 Alas, these drives have lousy random write performance. As in 4 IOps lousy. Read speed is pretty good, but since it appears to take 250 ms for an erase + write cycle on the flash (during which other reads are blocked as well), it's got really rather limited usefulness. People have reported that Vista won't install on the drives, due to timeouts. This is also why the prices are so low - they're basically dumping them to get rid of them. Note that OCZ aren't alone in this issue - all of the "low cost" SSDs have the same issue since they're all just rebadges of the same OEM drive. For good performance, you're AFAIK limited to the Intel X25's and similar. The 80 GB X25-M hits you for $528 according to NexTag, other good 64 GB SSDs are around the $450 - $500 mark, depending on the drive (I can't get NexTag to list them, it only shows a very high price for the MTRON 64 GB drive). They're still not a whole lot faster than spinning rust once you start to have some randomness in your writing. Reliability should be fine in laptops, though I'd be less keen to deploy a rack full of them - they're a lot more sensitive to electrical noise than traditional HDDs when both reading and writing, so their reliability in these situations depends on how good the ADC and DAC converters are in the chips, and how much space they burn for ECC. The fact that the manufacturers don't spec the uncorrected/miscorrected error rate under any circumstances makes me a tad worried. Also, the lifespan question is still unanswered - any particular page of MLC flash is still limited to about 10000 writes, so you've got to hope that your workload doesn't tickle the wear levelling algorithm the wrong way. Cheers, Michael From jakobi at acm.org Fri Dec 12 07:35:26 2008 From: jakobi at acm.org (Peter Jakobi) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? In-Reply-To: <4942726F.1050705@gmail.com> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> <4942726F.1050705@gmail.com> Message-ID: <20081212153526.GA31871@anuurn.compact> On Fri, Dec 12, 2008 at 08:17:19AM -0600, Geoff Jacobs wrote: Rehi, > > Reliability is another question and I posted a quick response to > > this list in a different email. > > This being my big concern with flash. related is this topic on SSD / flashes: what's the life time when changing the same file frequently? aka "mapping block writes to cell erases" aka "how many erases are possible?" In the days of yore, that was the limitation on using flash, as writing a block to the same physical location on the flash (for some to be defined sense of physical location :)) requires a whole slew of blocks (let's call it a 'cell', maybe containing a few dozens or thousands of blocks?) to be erased and a subset of them to be written. 
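To put rough numbers on why that mapping matters, here is a crude endurance estimate in Python; every figure in it is an invented round number, purely for illustration:

    # Crude flash endurance estimate -- all constants below are assumptions.
    ERASE_CYCLES = 10000      # assumed erase/program cycle limit per cell (MLC-ish)
    DEVICE_GB    = 64         # assumed device size
    BLOCK_KB     = 512        # assumed erase-block ("cell") size
    REWRITES_PER_SEC = 1.0    # e.g. a small file rewritten/fsync'd once a second

    # Worst case: no remapping at all, the same physical block is erased every time.
    hours = ERASE_CYCLES / REWRITES_PER_SEC / 3600.0
    print("no wear levelling: that block is worn out after ~%.1f hours" % hours)

    # Ideal wear levelling spreads the erases over every block in the device.
    blocks = DEVICE_GB * 1024 * 1024 // BLOCK_KB
    years = ERASE_CYCLES * blocks / REWRITES_PER_SEC / (3600.0 * 24 * 365)
    print("perfect levelling over %d blocks: ~%.0f years" % (blocks, years))

With those assumptions the same once-a-second rewrite takes the device from a few hours of life to several decades, which is why how (and whether) the erases get redistributed is really the whole question.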
Does anyone have current and uptodate info or researched this issue already? if so thanx!! Peter === Some of the questions I see before checking recent kernel sources would be: - is there some remapping in the hardware of the ide emulation chip space of say compactflash or usb sticks? - is part of this possible in the ide-emulation in the kernel? - or is part of this in the filesystem, that is suddenly after a decade or more, the fs has to cope again with frequent bad blocks, like the old bad blocks lists of the SCSI days 2 decades past? [basically: is there some 'newish' balancing to limit / redistribute the number of erases over all cells? Is there a way to relocate cells that resist erasing, ...?] - can I place a filesystem containing some files that are always rewritten on flash and use say ordinary ext2 or vfat for this? - might I even be able to SWAP on flash nowadays? - Or do I still have to do voodoo with FUSE overlays or other tricks to reduce the number of writes leading to cell erases? Maybe check if there's a real log-structured filesystem available, that has seen production use outside of labs (and doesn't fail by keeping its some of its frequently changing metadata in always in the same location). -- cu Peter jakobi@acm.org From james.p.lux at jpl.nasa.gov Fri Dec 12 08:58:50 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: ________________________________ From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of arjuna Sent: Thursday, December 11, 2008 2:05 AM To: beowulf@beowulf.org Subject: Re: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware I am in NewDelhi India. However I would prefer to put the cluster together myself, because 1) I am a good python programmer and like programming and playing with computers 2) I will be using the cluster for animation (art + computers) and may have to bend it and tinker with it...therefore it makes sense for me to know it inside out. 3) If I set it up then I can grow it, and i envision it growing, outsourcing the whole thing would be expensive 4) I have been using linux for several years and am comfortable in the environment 5) I have a bunch of old computers lying about which are not so old and run basic versions of linux fast. All decent reasons to put together a cluster. > What is 1u? Standard rack mount systems (with 19" wide RETMA/EIA panels) have certain vertical heights for each component (as well as standard hole patterns). 1 U = 1 Unit = 1 7/8" (4U = 7", 2U= 3.5") As a practical matter, 1 U high systems are quite tight inside, and tough to cool, because the fans can only be an inch or so high (maybe 40mm) and to move any amount of air, they have to spin fast. Fast small fan = low efficiency, lots of noise.. >What is a blade system? Where there's a "card cage" into which one slides cards (or "blades") which are a whole PC. They all share a common power supply and they're denser because you don't have extra sheetmetal between PCs. OTOH, denser means more heat in a small volume, which aggravates the cooling problem. Why "blades".. -> it sounds cooler (no, really... It's because of marketing. Cards in a card cage is so 1950s.. Why, PDP-8s and IBM 1401s use cards in a card cage..) 
>I would be putting it in a room with air conditioning. There's "air conditioning" and "AIR CONDITIONING".. Throw a few kilowatts worth of computers in a room, and you'll find out which one you have. >At this time I am trying to figure out the racks. Am meeting the hardware guy on Saturday and we were thinking of opening up the PCS i have lying around and taking measurements of how the mother boards fit into the cases,with the intention of creating a rack from scratch. Any ideas of what goes into a good rack in terms of size and matieral (assuming it has to be insulated) My favorite "field expedient" scheme is to use half or full size aluminum baking sheets with raised edges (aka jelly roll pans or sheet pans) and double stick foam tape. You can slide them into a standard baker's rack. All readily available or improvised, although I'd look for the real rack (they're cheap and sturdy). The pans can just be sheets of aluminum sheared to size, if you like. Search the archives of the list for some website addresses of kitchen supply places where you can see a picture of this kind of thing. There is CERTAINLY some local source where you live for this stuff (ask at any commercial kitchen or bakery) > Also again, what might be some upto date books on the subject and any experiences regarding the actual creation of the rack and the physical hardware. Catalogs are your friend, as far as packaging goes. >I am starting with 3 nodes to be expanded to n nodes....The 3 nodes will allow me to keep complexity down while learning and then i can expand to n nodes once i have it down to increase speed. Good luck.. At 3 nodes, you can just throw the PCs under the table and hook em up with a ethernet switch. But, if you find a pallet load of surplus PCs, and want to repackage them a bit more densely, then the cookie sheet approach is easy. From hearnsj at googlemail.com Fri Dec 12 09:13:54 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: <9f8092cc0812120913g573f511fq2b5f63282517d96e@mail.gmail.com> 2008/12/11 arjuna > > What is 1u? > Easy question! a "U" is short for a "rack unit". Rack mounted equipment always comes in multiples of a vertical height unit, which is 1.75 inches. I gather this is actually an old Russian unit of measurement (you can check on Wikipedia). So when you put equipment into standard 19inch wide racks, you ask "how many U high is that equipment). It means that you can mix and match different types of equipment in the same rack. Specifically for this discussion, a 1U computer (server) is a server which takes up 1U of vertical space. They are generally very deep to compensate for the lack of space in height, and use lots of small, fast fans rather than the big one you have in a desktop. Air comes in through slots in the front, past the disk drives, over the motherboard and out of the rear. These systems generally pack more compute power per piece of floorspace, and help make the cabling neater as everything is in a standard place to the rear of the rack. > > What is a blade system? > This is where servers are packaged into standard units, generally a bit smaller than the 1U servers above. The blades plug into a chassis, which in turn is mounted in the rack. 
The chassis provides power to each blade, plus networking connections across a "backplane". In the case of the 1U servers you generally have to connect mains power to each one, and run separate cables for ethernet / Myrinet / Infiniband. This cabling is all wrapped up inside the chassis. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/d49e4b61/attachment.html From james.p.lux at jpl.nasa.gov Fri Dec 12 09:15:22 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] SSD prices In-Reply-To: <95046224B0E6475293D97FF9B5819BC4@Forethought> References: <20081211213857.GA7355@bx9> <95046224B0E6475293D97FF9B5819BC4@Forethought> Message-ID: > Reliability should be fine in laptops, though I'd be less > keen to deploy a rack full of them - they're a lot more > sensitive to electrical noise than traditional HDDs when both > reading and writing, so their reliability in these situations > depends on how good the ADC and DAC converters are in the > chips, and how much space they burn for ECC. The fact that > the manufacturers don't spec the uncorrected/miscorrected > error rate under any circumstances makes me a tad worried. That's not a specification that is easily "measurable" or tested, so it doesn't get published. Heck, I'd be happy to run across a "flash memory error simulator" to test our EDAC implementations here. Sure, you can cobble a "flash emulator" up in an FPGA, but who's to say if it's realistic. From hearnsj at googlemail.com Fri Dec 12 09:35:34 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <200812091535.03920.kilian.cavalotti.work@gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID: <9f8092cc0812120935t7a001409oe77ec688965d1cae@mail.gmail.com> I was down in our server room with the ICE this afternoon. It's worth describing how they are put together for the purposes of this thread. Each rack has four blade chassis in it. These are called Independent Rack Units in SGI speak. An IRU has sixteen compute blades, plus the mains PSUs and Infiniband blades. Each IRU has an L1 chassis controller. At the rear of each IRU there is a bank of big fans. Each IRU couples up to a 1/4 sized rear rack door, using a foam gasket. Each of these 1/4 sized doors is a swing out heat exchanger. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/b4ff9671/attachment.html From prentice at ias.edu Fri Dec 12 09:42:02 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Can't unload OpenIB kernel modules during a reboot Message-ID: <4942A26A.2080507@ias.edu> When I reboot the nodes in my cluster, the openibd script hangs when shutting down. If I wait long enough (5-10 minutes, probably closer to 10), it eventually completes, or at least fails so the system can continue shutting down. If I do 'service openibd stop' before doing the reboot, the openibd script does its thing in only a few seconds, as expected: /etc/init.d/openibd stop Unloading OpenIB kernel modules: [ OK ] I'm using a RHEL-rebuild distro (PU_IAS 5.2), the openibd script is part of the openib package that comes with the distro: rpm -qf /etc/init.d/openibd openib-1.3-3.el5 Any ideas why this script would behave differently during a shutdown?
-- Prentice From lynesh at cardiff.ac.uk Fri Dec 12 10:05:00 2008 From: lynesh at cardiff.ac.uk (Huw Lynes) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <9f8092cc0812120935t7a001409oe77ec688965d1cae@mail.gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <9f8092cc0812120935t7a001409oe77ec688965d1cae@mail.gmail.com> Message-ID: <1229105100.5611.1.camel@desktop> On Fri, 2008-12-12 at 17:35 +0000, John Hearns wrote: > At the rear of each IRU there is a bank of big fans. Each IRU couples > up to a 1/4 sized rear rack door, using a foam gasket. Each of these > 1/4 sized doors is a swing out heat exchanger. That's the bit of information I was missing. I'd assumed the entire door swung out as one losing all cooling when you work on the rack. The stable-door approach makes more sense. I still like our APC contained hot-aisle system though. Cheers, Huw -- Huw Lynes | Advanced Research Computing HEC Sysadmin | Cardiff University | Redwood Building, Tel: +44 (0) 29208 70626 | King Edward VII Avenue, CF10 3NB From landman at scalableinformatics.com Fri Dec 12 10:06:45 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Can't unload OpenIB kernel modules during a reboot In-Reply-To: <4942A26A.2080507@ias.edu> References: <4942A26A.2080507@ias.edu> Message-ID: <4942A835.3060307@scalableinformatics.com> Prentice Bisbal wrote: > > Any ideas why this script would behave differently during a shutdown? Hi Prentice Sounds like a race situation. Do you have an NFS mount over IPoIB? Joe > > -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From hearnsj at googlemail.com Fri Dec 12 10:21:09 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <1229105100.5611.1.camel@desktop> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <9f8092cc0812120935t7a001409oe77ec688965d1cae@mail.gmail.com> <1229105100.5611.1.camel@desktop> Message-ID: <9f8092cc0812121021x4cd4a5b7y354007afd897571d@mail.gmail.com> 2008/12/12 Huw Lynes > > > That's the bit of information I was missing. I'd assumed the entire door > swung out as one losing all cooling when you work on the rack. The > stable-door approach makes more sense. > > I still like our APC contained hot-aisle system though. > > Horses for course, Huw. (*) SGI did an install in Ireland where they have the IRU chassis mounted vertically, in those same APC racks. Seemingly it works quite well - the drawback is that you get three IRUs per rack rather than four. But I guess with the APC racks being narrower you do not lose out that much as you get more racks per aisle. I must measure this up actually. (*) Come on. Its Friday. We have a "race condition" in another thread. Let the horsey puns flow. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/79e25238/attachment.html From kus at free.net Fri Dec 12 10:28:19 2008 From: kus at free.net (Mikhail Kuzminsky) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Parallel software for chemists In-Reply-To: <86b56470812100551g277917dag95b93d2dbcaf346@mail.gmail.com> Message-ID: In message from "Dr Cool Santa" (Wed, 10 Dec 2008 19:21:43 +0530): >Currently in the lab we use Schrodinger and we are looking into >NWchem. We'd >be interested in knowing about software that a chemist could use that >makes >use of a parallel supercomputer. And better if it is linux. To say shortly, practically all the modern software for molecular modelling calculations can run "in parallel" on Linux clusters. Mikhail Kuzminsky Computer Assistance to Chemical Research Center Zelinsky Institute of Organic Chemistry RAS Moscow > >-- >This message has been scanned for viruses and >dangerous content by MailScanner, and is >believed to be clean. > From james.p.lux at jpl.nasa.gov Fri Dec 12 10:35:33 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:05 2009 Subject: 1U for racks? RE: [Beowulf] Newbie Question: Racks... In-Reply-To: <9f8092cc0812120913g573f511fq2b5f63282517d96e@mail.gmail.com> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <9f8092cc0812120913g573f511fq2b5f63282517d96e@mail.gmail.com> Message-ID: From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of John Hearns 2008/12/11 arjuna What is 1u? Easy question! a "U" is short for a "rack unit". Rack mounted equipment always comes in multiples of a vertical height unit, which is 1.75 inches. I gather this is actually an old Russian unit of measurement (you can check on Wikipedia). So when you put equipment into standard 19inch wide racks, you ask "how many U high is that equipment). It means that you can mix and match different types of equipment in the same rack. -- The Wikipedia entry just says it's coincidence: "Coincidentally, a rack unit is equal to a vershok, an obsolete Russian length unit." I used to think that the rack hole spacing is almost certainly from Western Electric or something like this (back when they were called "relay racks"). The rack dimensions are an old RETMA standard ("Radio Electron Television Manufacturing Association) (RETMA changed to EIA in the late 50s) which I'm pretty sure derives from some older standard, which in turn probably derives from an early telegraphy standard, so it, could, like the stories about railroad gauge, be derived from the dimensions of Roman donkeys. A usenet post from Larry Lippman in 1990 gives what sounds like a fairly authoritative description: Ma Bell -> 23", 2" spacing (starting in 1917 or thereabouts) Other older -> 19", 1.75" spacing (he thought RCA, perhaps, in the early 20s) The web is a wonderful place to waste time.. Now I can amaze folks at work with a truly arcane piece of information. From lindahl at pbm.com Fri Dec 12 10:52:41 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: <7493914765164F50BFEB0183BBA5995C@MichaelPC> References: <20081211231452.GC29359@bx9> <7493914765164F50BFEB0183BBA5995C@MichaelPC> Message-ID: <20081212185241.GA11559@bx9> On Thu, Dec 11, 2008 at 10:09:10PM -0800, Michael Huntingdon wrote: > Today it really is not just about cooling a group of 1u systems or a single > hptc cabinet. 
Today maybe you know you need two densely populated cabinets. Like most people, I can't use very dense systems, due to the power/cubic foot limitation of my colo. I just stack up 2U systems, and it's basically idiot-proof. -- greg From prentice at ias.edu Fri Dec 12 11:32:37 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Can't unload OpenIB kernel modules during a reboot In-Reply-To: <4942A835.3060307@scalableinformatics.com> References: <4942A26A.2080507@ias.edu> <4942A835.3060307@scalableinformatics.com> Message-ID: <4942BC55.8010501@ias.edu> Joe Landman wrote: > Prentice Bisbal wrote: > >> >> Any ideas why this script would behave differently during a shutdown? > > Hi Prentice > > > Sounds like a race situation. Do you have an NFS mount over IPoIB? > > Joe I do have NFS mounts, but *NOT* through IPoIB. At least they *shouldn't* be. I don't think that's the problem. If I had NFS mounts over IB, I should get errors when I shut down IB by itself with the 'service openibd stop' command. I have no need for IPoIB at the moment. Is there any way to confirm it's not being used, or explicitly disable it? -- Prentice From mathog at caltech.edu Fri Dec 12 11:38:56 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Re: cloning issue, hidden module dependency Message-ID: Bogdan Costescu wrote: > On Thu, 11 Dec 2008, David Mathog wrote: > > > install usb-interface /sbin/modprobe uhci_hcd; /bin/true > > install ide-controller /sbin/modprobe via82cxxx; /sbin/modprobe > > ide_generic; /bin/true > > I don't know why you need these. On all the distributions that I've > worked with recently such issues are taken care of by running > 'depmod', the resulting files are taken into consideration when > running 'mkinitrd'. The kernel doesn't have either the via82cxxx or USB_UHCI_HCD modules built into it. These lines apparently tell mkinitrd to include the needed modules in the initrd, and also to add some corresponding modprobe lines in the init file which it contains. > > > alias pci:v000010ECd00008139sv000010ECsd00008139bc02sc00i00 8139too > > And why do you need this ? Didn't the module detect this hardware ? This one is mysterious to me too. Those sorts of lines only appeared with Mandriva 2008.1 (kernel 2.6.24.7). The 8139too is built as a module, but it isn't included in the initrd, so that isn't it. All I can tell you is that every Mandriva 2008.1 system I have installed created a similar line for its ethernet driver (whatever that happened to be), and when there were two such interfaces which were identical, there was only one such line. It seems to be related to this line in /lib/modules/*/modules.alias alias pci:v000010ECd00008139sv*sd*bc*sc*i* 8139too There are many other similar lines in modules.alias, differing in the leading numeric field, but only this one pattern matches with the one from modprobe.conf. Why this one and not the others - I have no idea. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From rgb at phy.duke.edu Fri Dec 12 11:51:07 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: On Thu, 11 Dec 2008, arjuna wrote: > I am in NewDelhi India.
However I would prefer to put the cluster together > myself, because Ya, that's where I lived for seven years growing up. > 1) I am a good python programmer and like programming and playing with > computers > 2) I will be using the cluster for animation (art + computers) and may have > to bend it and tinker with it...therefore it makes sense for me to know it > inside out. > 3) If I set it up then I can grow it, and i envision it growing, outsourcing > the whole thing would be expensive > 4) I have been using linux for several years and am comfortable in the > environment > 5) I have a bunch of old computers lying about which are not so old and run > basic versions of linux fast. All excellent and traditional reasons, although you'll want to learn a compiler, either C, C++ or Fortran. Which one is most appropriate depends a little bit on the application space you want to work in, a little bit on your personality. None are terribly like python. > What is 1u? A rack comes in "U"nits of height with prespecified/standard layouts for screws and so on. A "1U" rack chassis occupies 1.75" of vertical rack space in a rack that is typically anywhere from 20U to 42U in height (the latter is basically the height of a person; MUCH higher and you start having difficult working on the upper slots and can interfere with the ceiling or overhead cable trays, etc). See: http://www.webopedia.com/TERM/1/1U.html While we're on this question, remember Google Is Your Friend (GIYF). These days, so are the various online references, especially e.g. wikipedia. So while we're always HAPPY to answer questions, you should (as a good student:-) always try to answer them yourself first, especially the easy ones. > What is a blade system? Here's another example. I google a second, pick the wikipedia article and: http://en.wikipedia.org/wiki/Blade_server complete with pictures. The Google return also has dozens of links to Tier 1 builders of bladed systems and vendors that resell them. From the latter you can actually look over specific blade servers and maybe even get prices. > I would be putting it in a room with air conditioning. Sure, but ESPECIALLY in New Delhi, with post-monsoon summertime temperatures in the 40C range outdoors (and with monsoon humidity AND heat before that) you will need to take special care with your environment. The AC will needed to dehumidify and keep the room cooled down to (ideally) 20C or lower all year long, summer and winter. To help you estimate the cooling capacity: AC is usually sold in "tons". 1 ton of AC can remove 3500 joules of heat per second (3500 watts). It needs some of this capacity to maintain a temperature DIFFERENTIAL between inside and outside; a 20+ C differential will use an easy 10% of the capacity, maybe more. So you can look at your AC unit and figure out how many systems you can put into the space before it starts to get too warm -- most systems draw between 100 and 300 watts loaded (sorry about the large variation, but there is everything from single core UP to dual quad core out there with lots of combinations of memory and accessory hardware). If you have a half-ton of AC (say), your body and the electric lights are probably 200W, heat infiltration through the walls another 100W or more depending on where the room is, so you can run as many as 10 systems or as few as three, depending. Note that you'll pay for energy twice -- once for the power coming in, again for the power used by the AC to remove it. 
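If you want to play with those numbers, here is the same arithmetic as a trivial Python sketch; the wattages are just the rough assumptions above, not measurements of any real room or node:

    # Back-of-the-envelope cooling budget, using the rough figures above.
    TON_OF_AC_W  = 3500.0          # one ton of AC removes roughly 3500 W of heat
    AC_TONS      = 0.5             # assumed: a half-ton room unit
    DELTA_LOSS   = 0.15            # 10% or more of capacity spent holding the indoor/outdoor differential
    ROOM_LOAD_W  = 200.0 + 100.0   # body and lights, plus heat infiltration through the walls

    budget_w = AC_TONS * TON_OF_AC_W * (1.0 - DELTA_LOSS) - ROOM_LOAD_W

    for node_w in (100, 200, 300):  # loaded draw per node varies a lot with the hardware
        print("%3d W per node -> room for about %d nodes" % (node_w, int(budget_w // node_w)))

With a half-ton unit that comes out to roughly the same spread as the estimate above: a few fat nodes or ten or so lean ones, and doubling the AC roughly doubles the node count.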
Oh, and New Delhi has one other unique-ish environmental constraint, unless things have changed a lot since I lived there. Post-monsoon, when it dries out again you have dust storms. I don't think most list members can really imagine them, but I can (I used to climb a tree outside of our house and feel the dust stinging my cheeks and erasing the buildings all from sight). You will need to be able to keep the dust that infiltrates EVERYWHERE in the houses at that time out of the computer room, as computers (especially the cooling fans) don't like dust. After a big one, you may need to shut down and vaccuum out the insides of your systems. > At this time I am trying to figure out the racks. Am meeting the hardware > guy on Saturday and we were thinking of opening up the PCS i have lying > around and taking measurements of how the mother boards fit into the > cases,with the intention of creating a rack from scratch. Any ideas of what > goes into a good rack in terms of size and matieral (assuming it has to be > insulated) Let's talk terminology. What you are calling "a rack" we call "shelving". A rack is the thing described in the article up above -- a completely standardized computer/telecom equipment holding arrangement. When somebody talks about "rackmount equipment" they refer to stuff boxed up to "slide into a rack" -- made a precise size and with screws and/or rails in just the right places to accomplish this. What you're talking about is a form of interesting homebrew cluster, I think. Periodically people talk about this sort of racking up of motherboards in a homemade (cheap but still effective) way on list. Search back through the archives and you'll find some great discussions. "Recipes" that I can recall include: a) Mounting motherboards on cookie sheets and using a baking rack for a cluster. b) Mounting motherboards on cookie sheets and using heavy duty steel shelving with wooden shelves to make a sort of "vertically bladed" cluster, sliding the cookie sheets in and out of slots cut into the wood. c) Clusters built into standard file cabinets. and several others. In the discussions were some suggestions concerning safety (fire and otherwise) and electomagnetic isolation and noise. Links to pictures, as well, let's see: http://www.beowulf.org/archive/2006-March/015209.html and as you can see, nearly everything is in the beowulf archives somewhere if you search for it cleverly. I think Andrew is still around and may be listening in case the links have been moved in the meantime. > Also again, what might be some upto date books on the subject and any > experiences regarding the actual creation of the rack and the physical > hardware. People don't build racks. People buy racks. However, if you have a machine shop and access to steel and know how to bend and tap it and weld it, you could probably, from the link up above and perhaps some more stuff like it gleaned from the web, build a simple four poster that would "work" to hold standard rackmount chassis. Heck, even building rackmount cases has been discussed on list. Sheet aluminum or steel, cut to spec, fold and weld, and so on. Here it isn't worth the time any more -- rackmount boxes and racks aren't THAT expensive compared to the time needed to DIY -- but I suppose it is possible. > I am starting with 3 nodes to be expanded to n nodes....The 3 nodes will > allow me to keep complexity down while learning and then i can expand to n > nodes once i have it down to increase speed. Sure. Good plan. 
Get yourself an 8 port (or better) gigabit ethernet switch to use as your first network, too. > Am planning to run animation software (like blender) on it. Since animation > software requires large processing power i am assuming they have already > worked on parrallelizing the code... Assume nothing, unfortunately. However, even if they haven't, if you can partition up the tasks and just run it N times in a batch mode on N systems, that's pretty good parallel speed up right there, and likely doable for a task that is basically embarrassingly parallel. > Anyone using clusters for animation on this list? Don't know. I doubt it. Not QUITE HPC, although I do know physicists who have e.g. animated simulations and so on on clusters. However, the animation itself wasn't done in parallel, only the generation of data to animate. rgb > > ? > ? > Two pieces of advice > ? > a) let us know where you are physically. Talk to a clustering company > in your country, or area. > You will be surprised - they will put the whole thing together for you > as a 'turnkey' cluster AND what's more important support it. OK, you > don't get the learning experience which you are after. > ? > b) if this thing is to sit in your office, think about noise, cooling > and how many amps you can draw from a wall socket. > 1U servers have lots of little high speed fans and the noise gets > very, very annoying. > Think of putting this thing in a separate room, with some air > conditioning. Even a small room with a portable wheeled unit, venting > to the outside may be adequate for you. > ? > ? > Have you thought about a blade system for your particular situation? > Might be the ideal solution. > ? > ? > ? > ? > > > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From i.n.kozin at googlemail.com Fri Dec 12 11:58:58 2008 From: i.n.kozin at googlemail.com (Igor Kozin) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer In-Reply-To: <34568AAB-6A20-44B7-B80B-FA8BB92AC1F6@xs4all.nl> References: <20081212075655.GK11544@leitl.org> <34568AAB-6A20-44B7-B80B-FA8BB92AC1F6@xs4all.nl> Message-ID: 23.55 Mflops/W according to green500 estimates (#488 in thier list) 2008/12/12 Vincent Diepeveen > > On Dec 12, 2008, at 8:56 AM, Eugen Leitl wrote: > > >> http://www.goodgearguide.com.au/article/270416/inside_tsubame_- >> _nvidia_gpu_supercomputer?fp=&fpid=&pf=1 >> >> Inside Tsubame - the Nvidia GPU supercomputer >> >> Tokyo Tech University's Tsubame supercomputer attained 29th ranking in the >> new Top 500, thanks in part to hundreds of Nvidia Tesla graphics cards. >> >> Martyn Williams (IDG News Service) 10/12/2008 12:20:00 >> >> When you enter the computer room on the second floor of Tokyo Institute of >> Technology's computer building, you're not immediately struck by the size >> of >> Japan's second-fastest supercomputer. You can't see the Tsubame computer >> for >> the industrial air conditioning units that are standing in your way, but >> this >> in itself is telling. With more than 30,000 processing cores buzzing away, >> the machine consumes a megawatt of power and needs to be kept cool. >> >> > 1000000 watt / 77480 gflop = 12.9 watt per gflop. > > If you run double precision codes on this box it is a big energy waster > IMHO. 
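The two efficiency figures quoted in this thread, Igor's 23.55 Mflops/W from the green500 list and Vincent's 12.9 watts per gflop, are just different ways of dividing the same kind of numbers. A quick Python check using only the values quoted above; the back-calculated power figure is simple arithmetic, not a claim about how green500 actually measured Tsubame.

# Reconcile the two efficiency figures quoted in this thread.
# Only numbers already given above are used here.

LINPACK_GFLOPS = 77480.0   # Tsubame's reported Linpack result
ARTICLE_WATTS  = 1.0e6     # "a megawatt of power" from the article

watts_per_gflop = ARTICLE_WATTS / LINPACK_GFLOPS
mflops_per_watt = LINPACK_GFLOPS * 1000.0 / ARTICLE_WATTS
print(f"{watts_per_gflop:.1f} W/Gflop is the same thing as {mflops_per_watt:.1f} Mflops/W")

# green500's 23.55 Mflops/W implies a larger power figure than 1 MW:
implied_watts = LINPACK_GFLOPS * 1000.0 / 23.55
print(f"23.55 Mflops/W implies roughly {implied_watts / 1e6:.1f} MW for the same Linpack number")

So the two posts are not contradictory; they simply start from different power numbers.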
> (of course it's very well equipped for all kind of crypto codes using that > google library). > > Vincent > > > Tsubame was ranked 29th-fastest supercomputer in the world in the latest >> Top >> 500 ranking with a speed of 77.48T Flops (floating point operations per >> second) on the industry-standard Linpack benchmark. >> >> While its position is relatively good, that's not what makes it so >> special. >> The interesting thing about Tsubame is that it doesn't rely on the raw >> processing power of CPUs (central processing units) alone to get its work >> done. Tsubame includes hundreds of graphics processors of the same type >> used >> in consumer PCs, working alongside CPUs in a mixed environment that some >> say >> is a model for future supercomputers serving disciplines like material >> chemistry. >> >> Graphics processors (GPUs) are very good at quickly performing the same >> computation on large amounts of data, so they can make short work of some >> problems in areas such as molecular dynamics, physics simulations and >> image >> processing. >> >> "I think in the vast majority of the interesting problems in the future, >> the >> problems that affect humanity where the impact comes from nature ... >> requires >> the ability to manipulate and compute on a very large data set," said >> Jen-Hsun Huang, CEO of Nvidia, who spoke at the university this week. >> Tsubame >> uses 680 of Nvidia's Tesla graphics cards. >> >> Just how much of a difference do the GPUs make? Takayuki Aoki, a professor >> of >> material chemistry at the university, said that simulations that used to >> take >> three months now take 10 hours on Tsubame. >> >> Tsubame itself - once you move past the air-conditioners - is split across >> several rooms in two floors of the building and is largely made up of >> rack-mounted Sun x4600 systems. There are 655 of these in all, each of >> which >> has 16 AMD Opteron CPU cores inside it, and Clearspeed CSX600 accelerator >> boards. >> >> The graphics chips are contained in 170 Nvidia Tesla S1070 rack-mount >> units >> that have been slotted in between the Sun systems. Each of the 1U Nvidia >> systems has four GPUs inside, each of which has 240 processing cores for a >> total of 960 cores per system. >> >> The Tesla systems were added to Tsubame over the course of about a week >> while >> the computer was operating. >> >> "People thought we were crazy," said Satoshi Matsuoka, director of the >> Global >> Scientific Information and Computing Center at the university. "This is a >> ?1 >> billion (US$11 million) supercomputer consuming a megawatt of power, but >> we >> proved technically that it was possible." >> >> The result is what university staff call version 1.2 of the Tsubame >> supercomputer. >> >> "I think we should have been able to achieve 85 [T Flops], but we ran out >> of >> time so it was 77 [T Flops]," said Matsuoka of the benchmarks performed on >> the system. At 85T Flops it would have risen a couple of places in the Top >> 500 and been ranked fastest in Japan. >> >> There's always next time: A new Top 500 list is due out in June 2009, and >> Tokyo Institute of Technology is also looking further ahead. >> >> "This is not the end of Tsubame, it's just the beginning of GPU >> acceleration >> becoming mainstream," said Matsuoka. "We believe that in the world there >> will >> be supercomputers registering several petaflops in the years to come, and >> we >> would like to follow suit." 
>> >> Tsubame 2.0, as he dubbed the next upgrade, should be here within the next >> two years and will boast a sustained performance of at least a petaflop (a >> petaflop is 1,000 teraflops), he said. The basic design for the machine is >> still not finalized but it will continue the heterogeneous computing base >> of >> mixing CPUs and GPUs, he said. >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/504a9591/attachment.html From niftyompi at niftyegg.com Fri Dec 12 13:24:47 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] For grins...India In-Reply-To: References: <493F7BF9.70306@aplpi.com> <0D49B15ACFDF2F46BF90B6E08C90048A064D922BD5@quadbrsex1.quadrics.com> <20081210213929.GB3449@compegg.wr.niftyegg.com> Message-ID: <20081212212447.GA3146@compegg.wr.niftyegg.com> On Thu, Dec 11, 2008 at 12:01:23AM +0100, Vincent Diepeveen wrote: > > What is most interesting from supercomputer viewpoint seen is the > comments i > got from some scientists when speaking about climate calculations. > > At a presentation at SARA at 11 september 2008 with some bobo's there > (minister bla bla),i > there was a few sheets from the North-Atlantic. > > It was done in rectangles from 40x40KM. Interesting... the north Atlantic, If the width of the straits on the other side of the Arctic is about 60 miles the straits might be represented by three data points in the model. Three data points to ask a global question... seems almost silly from my seat here in the peanut gallery. -- T o m M i t c h e l l Found me a new hat, now what? From csamuel at vpac.org Fri Dec 12 14:01:31 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <9f8092cc0812120645x77e3257cq7f477e963bdbf41b@mail.gmail.com> Message-ID: <894439029.167191229119291931.JavaMail.root@mail.vpac.org> ----- "John Hearns" wrote: > Emmmm.... I'm no expert on Clearspeed, but AFAIK > Clearspeeds selling point is that the cards run > standard maths library functions - ie. you just > 'drop in' a compatible maths library and the card > gets given the computations to do. AMD are working on a version of their Core Maths Library (ACML) that can offload to a compatible ATI GPU if it's installed. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From hahn at mcmaster.ca Fri Dec 12 19:20:57 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: > What is 1u? rack-mounted hardware is measured in units called "units" ;) 1U means 1 rack unit: roughly 19" wide and 1.75" high. 
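Mark's definitions reduce to simple arithmetic, sketched here in Python before his description continues; the 2U set aside for a switch or PDU is an assumed allowance for illustration, not something specified in the thread.

# Rack-unit arithmetic: 1U is 1.75" of vertical rack space.
U_INCHES = 1.75

def rack_height(units=42):
    """Height of a rack in inches and roughly in cm."""
    inches = units * U_INCHES
    return inches, inches * 2.54

def one_u_nodes_that_fit(rack_units=42, reserved_units=2):
    # reserve a couple of U for a switch/PDU -- an assumption, not from the thread
    return rack_units - reserved_units

inches, cm = rack_height(42)
print(f"42U rack: {inches:.1f} in (~{cm:.0f} cm) tall")
print(f"1U nodes that fit with 2U spare: {one_u_nodes_that_fit()}")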
racks are all the same width, and a rackmount unit consumes some number of units in height. (rack depth is moderately variable.) (a full rack is generally 42U).

a 1U server is a basic cluster building block - pretty well suited, since it's not much taller than a disk, and fits a motherboard pretty nicely (clearance for dimms if designed properly, a couple optional cards, passive CPU heatsinks.)

> What is a blade system?

it is a computer design that emphasizes an enclosure and fastening mechanism that firmly locks buyers into a particular vendor's high-margin line ;)

in theory, the idea is to factor a traditional server into separate components, such as shared power supply, unified management, and often some semi-integrated network/san infrastructure. one of the main original selling points was power management: that a blade enclosure would have fewer, more fully loaded, more efficient PSUs. and/or more reliable. blades are often claimed to have superior manageability. both of these factors are very, very arguable, since it's now routine for 1U servers to have nearly the same PSU efficiency, for instance. and in reality, simple manageability interfaces like IPMI are far better (scalably scriptable) than a too-smart gui per enclosure, especially if you have 100 enclosures...

> goes into a good rack in terms of size and matieral (assuming it has to be
> insulated)

ignoring proprietary crap, MB sizes are quite standardized. and since 10 million random computer shops put them together, they're incredibly forgiving when it comes to mounting, etc. I'd recommend just glue-gunning stuff into place, and not worrying too much.

> Anyone using clusters for animation on this list?

not much, I think. this list is mainly "using commodity clusters to do stuff fairly reminiscent of traditional scientific supercomputing".

animation is, in HPC terms, embarrassingly parallel and often quite IO-intensive. both those are somewhat derogatory. all you need to do an animation farm is some storage, a network, nodes and probably a scheduler or at least task queue-er.

From oneal at dbi.udel.edu Fri Dec 12 08:30:31 2008 From: oneal at dbi.udel.edu (Doug ONeal) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Re: Rear-door heat exchangers and condensation In-Reply-To: References: <200812091535.03920.kilian.cavalotti.work@gmail.com> Message-ID:

On 12/11/2008 05:57 PM, Mark Hahn wrote:
>> Netshelter VX racks and the powered ventilation rear doors are not
>> sufficient any more.
>
> why do you have doors on your rack? normally, a rack is filled with
> servers with fans that generate the standard front-to-back airflow.
> that means that you want no doors, or at least only high-perf mesh ones.
>
The rear doors are APC air removal units with three 8" fans that vent the hot air out the top of the unit. It is possible to attach ducts to the units to vent the air out of the server room completely but my physical setup does not allow for that. There are no front doors on the racks.

From vlad at geociencias.unam.mx Fri Dec 12 09:05:39 2008 From: vlad at geociencias.unam.mx (Vlad Manea) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] File server on ROCKS cluster Message-ID: <494299E3.8020603@geociencias.unam.mx>

Hi, I need to add a fileserver to my new ROCKS 5.1 cluster; the /export partition of the first disk on the frontend might not be enough. First question: is there any documentation on how ROCKS does this? Second: is there anyone out there with experience on Dell MD3000(i) with ROCKS? I will probably buy one...
Thanks, Vlad From dmitri.chubarov at gmail.com Fri Dec 12 09:47:23 2008 From: dmitri.chubarov at gmail.com (Dmitri Chubarov) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: Hello, my first reply missed the list by mistake so I will repeat a few points that I mentioned there. > What is 1u? > > What is a blade system? > Compute clusters are often built of rack-server hardware meaning boxes different from desktop boxes and chipset that have features not necessary for desktop PCs like ECC memory, redundant power supply units, integrated management processors, RAID controllers, that altogether provide better reliability, since the failure rate for a cluster of a 100 nodes is 100 times higher than for a single node. You may not need any of it for a 16 node rendering farm. Anyone using clusters for animation on this list? > We are just writing up on a research project on distributed rendering. Rendering is the part in the animation process that requires the most processing power. We used 3dStudio Max (3DS) for modelling and V-Ray for rendering. 3DS has its own utility, called Backburner, for distributing frames among a number of cluster nodes. We observed that V-Ray failed on some certain frames thus stopping the whole rendering queue, therefore the process was not completely automated. I would also repeat that a storage subsystem that uses an array of disks is essential for performance. > > At this time I am trying to figure out the racks. Am meeting the hardware > guy on Saturday and we were thinking of opening up the PCS i have lying > around and taking measurements of how the mother boards fit into the > cases,with the intention of creating a rack from scratch. Any ideas of what > goes into a good rack in terms of size and matieral (assuming it has to be > insulated) > This sort of rack is more of a research project. On the contrary, the usual kind of rack is an IEC standard server rack, http://en.wikipedia.org/wiki/19_inch_rack > Am planning to run animation software (like blender) on it. Since animation > software requires large processing power i am assuming they have already > worked on parrallelizing the code... > Blender does not seem to have a driver to distribute rendering (I might be wrong) but it can generate PovRay scripts and povray can make use of parallel processing in a number of ways. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081212/7d31f14d/attachment.html From william.a.sellers at nasa.gov Fri Dec 12 11:53:31 2008 From: william.a.sellers at nasa.gov (Bill Sellers) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <9f8092cc0812121021x4cd4a5b7y354007afd897571d@mail.gmail.com> References: <200812091535.03920.kilian.cavalotti.work@gmail.com><9f8092cc0812120935t7a001409oe77ec688965d1cae@mail.gmail.com><1229105100.5611.1.camel@desktop> <9f8092cc0812121021x4cd4a5b7y354007afd897571d@mail.gmail.com> Message-ID: <4942C13B.8010309@nasa.gov> John Hearns wrote: > > > 2008/12/12 Huw Lynes > > > > > That's the bit of information I was missing. I'd assumed the > entire door > swung out as one losing all cooling when you work on the rack. The > stable-door approach makes more sense. > > I still like our APC contained hot-aisle system though. 
> > Horses for course, Huw. (*) > > > SGI did an install in Ireland where they have the IRU chassis mounted > vertically, in those same APC racks. > Seemingly it works quite well - the drawback is that you get three > IRUs per rack rather than four. > But I guess with the APC racks being narrower you do not lose out that > much as you get more racks > per aisle. I must measure this up actually. > > > (*) Come on. Its Friday. We have a "race condition" in another thread. > Let the horsey puns flow. > > > We have had the SGI water cooled doors here for some time now. They are very effective. Early models had issues with condensation pooling under the rack and general pipe sweating, but newer models have drains. Our facility has plenty of chilled water, so this solution made sense. I wouldn't recommend such a system for a single 19" rack. There is quite a bit of plumbing involved and without an economy of scale, it wouldn't make sense to me. http://www.sgi.com/company_info/newsroom/media_coverage/downloads/hpcwire_datacenterchill.pdf Bill -- Bill Sellers, CISSP Team Lead/Systems Administrator, ConITS Sr Systems Analyst, NCI Inc. From hearnsj at googlemail.com Sat Dec 13 01:08:59 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Rear-door heat exchangers and condensation In-Reply-To: <4942C13B.8010309@nasa.gov> References: <200812091535.03920.kilian.cavalotti.work@gmail.com> <9f8092cc0812120935t7a001409oe77ec688965d1cae@mail.gmail.com> <1229105100.5611.1.camel@desktop> <9f8092cc0812121021x4cd4a5b7y354007afd897571d@mail.gmail.com> <4942C13B.8010309@nasa.gov> Message-ID: <9f8092cc0812130108u4cc250e4v705d54c3908eebff@mail.gmail.com> 2008/12/12 Bill Sellers > J I wouldn't recommend such a system for a single 19" rack. There is > quite a bit of plumbing involved and without an economy of scale, it > wouldn't make sense to me. > > Bill, it can also make sense for a small number of racks if you a) have a small room available with either none or a small amount of A/C b) an existing supply of chilled water into the building. In my situation, as I've said in this thread we have our own cooling lake, complete with resident fish and a heron! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081213/8bab85f9/attachment.html From landman at scalableinformatics.com Sat Dec 13 20:10:43 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] GPU-HMMer for interested people Message-ID: <49448743.7000605@scalableinformatics.com> Hi folks GPU-HMMer (part of the MPI-HMMer effort) has just been announced/released at http://www.mpihmmer.org MPI-HMMer has itself been improved with parallel-IO and better scalability features. JP has measured some large number (about 180x) over single cores on a cluster for the MPI run. Enjoy! 
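A quick way to put figures like that "about 180x" in context is parallel efficiency, i.e. speedup divided by the number of cores used. The core counts in the sketch below are purely hypothetical, since the announcement does not say how many cores the MPI run used; this only illustrates the arithmetic.

# Sanity-check a cluster speedup claim by converting it to parallel efficiency.
# The core counts here are hypothetical; the announcement gives only "about 180x".

def parallel_efficiency(speedup, cores):
    return speedup / cores

for cores in (192, 256):
    print(f"180x on {cores} cores -> {parallel_efficiency(180, cores):.0%} efficiency")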
Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From brahmaforces at gmail.com Sat Dec 13 03:48:04 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: Hello All, Thank you for your detailed responses. Following your line of thought, advice and web links, it seems that it is not difficult to build a small cluster to get started. I explored the photos of the various clusters that have been posted and it seems quite straightforward. It seems I have been siezed by a mad inspiration to do this...The line of thought is t make a 19 inch rack with aluminum plates on which the mother boards are mounted. The plan is first to simply create one using the old computers i have...This can be an experimental one to get going...Thereafter it would make sense to research the right mother boards, cooling and so on... It seems that I am going to take the plunge next week and wire these three computers on a home grown rack... A simple question though...Aluminum plates are used because aluminum is does not conduct electricity. Is this correct? Also for future reference, I saw a reference to dc-dc converters for power supply. Is it possible to use motherboards that do not guzzle electricity and generate a lot of heat and are yet powerful. It seems that not much more is needed that motherboards, CPUs, memory, harddrives and an ethernet card. For a low energy system, has any one explored ultra low energy consuming and heat generating power solutions that maybe use low wattage DC? On Sat, Dec 13, 2008 at 8:50 AM, Mark Hahn wrote: > What is 1u? >> > > rack-mounted hardware is measured in units called "units" ;) > 1U means 1 rack unit: roughly 19" wide and 1.75" high. racks are all > the same width, and rackmount unit consumes some number of units in height. > (rack depth is moderately variable.) (a full rack is generally 42"). > > a 1U server is a basic cluster building block - pretty well suited, > since it's not much taller than a disk, and fits a motherboard pretty > nicely (clearance for dimms if designed properly, a couple optional cards, > passive CPU heatsinks.) > > What is a blade system? >> > > it is a computer design that emphasizes an enclosure and fastening > mechanism > that firmly locks buyers into a particular vendor's high-margin line ;) > > in theory, the idea is to factor a traditional server into separate > components, such as shared power supply, unified management, and often > some semi-integrated network/san infrastructure. one of the main original > selling points was power management: that a blade enclosure would have > fewer, more fully loaded, more efficnet PSUs. and/or more reliable. blades > are often claimed to have superior managability. both of these factors are > very, very arguable, since it's now routine for 1U servers to have nearly > the same PSU efficiency, for instance. and in reality, simple managability > interfaces like IPMI are far better (scalably scriptable) > than a too-smart gui per enclosure, especially if you have 100 > enclosures... 
> > goes into a good rack in terms of size and matieral (assuming it has to be >> insulated) >> > > ignoring proprietary crap, MB sizes are quite standardized. and since 10 > million random computer shops put them together, they're incredibly > forgiving when it comes to mounting, etc. I'd recommend just glue-gunning > stuff into place, and not worring too much. > > Anyone using clusters for animation on this list? >> > > not much, I think. this list is mainly "using commodity clusters to do > stuff fairly reminiscent of traditional scientific supercomputing". > > animation is, in HPC terms, embarassingly parallel and often quite > IO-intensive. both those are somewhat derogatory. all you need to do > an animation farm is some storage, a network, nodes and probably a > scheduler or at least task queue-er. > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081213/10253b36/attachment.html From james.p.lux at jpl.nasa.gov Sun Dec 14 08:24:03 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:05 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: Message-ID: On 12/13/08 3:48 AM, "arjuna" wrote: > Hello All, > > Thank you for your detailed responses. Following your line of thought, advice > and web links, it seems that it is not difficult to build a small cluster to > get started. I explored the photos of the various clusters that have been > posted and it seems quite straightforward. > > It seems I have been siezed by a mad inspiration to do this...The line of > thought is t make a 19 inch rack with aluminum plates on which the mother > boards are mounted. > > The plan is first to simply create one using the old computers i have...This > can be an experimental one to get going...Thereafter it would make sense to > research the right mother boards, cooling and so on.. > > It seems that I am going to take the plunge next week and wire these three > computers on a home grown rack... > > A simple question though...Aluminum plates are used because aluminum is does > not conduct electricity. Is this correct? No.. Aluminum is a good conductor. Aluminum is used because it's cheap and easy to work with and doesn't rust. Steel is even cheaper, but harder to work with handtools, heavier, and it needs to be painted. > > Also for future reference, I saw a reference to dc-dc converters for power > supply. Is it possible to use motherboards that do not guzzle electricity and > generate a lot of heat and are yet powerful. It seems that not much more is > needed that motherboards, CPUs, memory, harddrives and an ethernet card. For a > low energy system, has any one explored ultra low energy consuming and heat > generating power solutions that maybe use low wattage DC? In general, the efficiency of line voltage AC to DC power supplies is higher than DC to DC converters, especially once you factor in the need to get the DC that the DC/DC converter starts with. It's a matter of IR losses on the primary side, mostly. For beowulfery, especially for novices, you're looking for inexpensive commodity consumer gear, and that's the standard PC power supplies. As far as the overall power consumption goes, total up the consumption of all the pieces, and it adds up fairly fast. One can use low power devices (e.g. 
Like those used in battery powered applications such as notebook computers), but typically, you also take a performance hit. Since the vast majority of clusters are not battery powered, and you're interested in computational speed, there's no advantage in replacing 5 standard PCs with 10 lowpower, low speed PCs. From amacater at galactic.demon.co.uk Sun Dec 14 08:40:43 2008 From: amacater at galactic.demon.co.uk (Andrew M.A. Cater) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: <20081214164043.GA2668@galactic.demon.co.uk> On Sat, Dec 13, 2008 at 05:18:04PM +0530, arjuna wrote: > Hello All, > > Thank you for your detailed responses. Following your line of thought, > advice and web links, it seems that it is not difficult to build a small > cluster to get started. I explored the photos of the various clusters that > have been posted and it seems quite straightforward. > The easiest/most straightforward way, if you have PC's in mini-tower / tower cases. Get some strong wood / steel shelving (English trade name Dexion for steel shelving. Place your PCs four to a shelf. Route cables etc. down the back of the shelf. Allow plenty of space for circulating air. Add an Ethernet switch or two if needed. [Andy - who has four computers at his feet connected to a cheap KVM (Keyboard/video/mouse) switch and one Ethernet switch. > > The plan is first to simply create one using the old computers i have...This > can be an experimental one to get going...Thereafter it would make sense to > research the right mother boards, cooling and so on... > > It seems that I am going to take the plunge next week and wire these three > computers on a home grown rack... See above. > > A simple question though...Aluminum plates are used because aluminum is does > not conduct electricity. Is this correct? > No - for God's sake, if you don't know _this_ much, DON'T try and wire your own solution but leave your PCs in their cases. Jim Lux's solution uses baking tray-size aluminium sheets in a commercial kitchen trolley. Air cooled - but you need to be extremely careful about how you mount the motherboards on standoffs / insulate etc. and how you mount PSUs. > Also for future reference, I saw a reference to dc-dc converters for power > supply. Is it possible to use motherboards that do not guzzle electricity > and generate a lot of heat and are yet powerful. It seems that not much more > is needed that motherboards, CPUs, memory, harddrives and an ethernet card. > For a low energy system, has any one explored ultra low energy consuming and > heat generating power solutions that maybe use low wattage DC? > A lot of telecoms racks are wired for 48V DC - but by the time you've gone from AC - 48V DC + DC voltage drop + conversion the other way for anything that requires AC it's massively inefficient :( Car / lorry mobile equipment runs on 12 or 24V - but anything larger than a laptop usually needs a DC -> AC inverter and 110/240V AC out. Something like the Intel Atom dual core would work well - but it's limited in memory and I/O. The "Beowulf in a lunch box" used 12 Via mini-ITX boards - but it was designed as a fun project. > On Sat, Dec 13, 2008 at 8:50 AM, Mark Hahn wrote: > > > What is 1u? 
> >> > > > -- > Best regards, > arjuna > http://www.brahmaforces.com Best regards, AndyC > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Sun Dec 14 08:56:49 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: Message-ID: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> On Dec 11, 2008, at 6:51 AM, arjuna wrote: > Hello all again: > > I thought I would add a little more background about myself and the > intended cluster. I am an artist and a computer programmer and am > planning on using this cluster as a starting point to do research > on building an ideal cluster for Animation for my own personal/ > entrepreneurial work. It would reside in my art studio. As an > artist the idea of rack mounting the commodity PCS is much more fun > that piling up the PCS. > > I was thinking of working with a local hardware friend and figuring > out how to screw on motherboards onto hardware type racks. Im sure > there are better tried and tested racks out there that are not > expensive. Any suggestions on the actual physical hardware for > constructing racks for upto 16PCs. > Hello Arjuna, I'm a bit interested you mention this. When i negotiate about animations, which sometimes already were edits from major movies, the hardware used is nothing more than a simple PC to produce it. Animation Designers work for small amounts of money, so paying for a lot of hardware, more than some fast PC or fast macintosh, with 2 very good big TFT's (apple offers some fantastic dual set of TFT's - though quite expensive for a design studio maybe). It sure takes a couple of minutes to render animations in high resolutions, yet i'm quite amazed you need more hardware for this, yes even a cluster. Isn't some 16 core Shanghai box with a lot of RAM already total overpower for this? Can you explain where you need all that cpu power for? Best Regards, Vincent > Also any thoughts on racks versus piles of PCS. > > A lot of the posts on the internet are old and out of date. I am > wondering what the upto date trends are in racking commodity > computers to create beowulf clusters. What should i be reading? > > -- > Best regards, > arjuna > http://www.brahmaforces.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Dec 14 09:10:47 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: On Sat, 13 Dec 2008, arjuna wrote: > A simple question though...Aluminum plates are used because aluminum is does > not conduct electricity. Is this correct? Aluminum is an EXCELLENT conductor of electricity, one of the best! Basically all metals conduct electricity. When you mount the motherboards you MUST take care to use spacers in the right places (under the holes for mounting screws on the motherboards, usually) to keep the solder traces of the motherboard from shorting out! 
Your question makes me very worried on your behalf. Electricity is quite dangerous, and in general messing with it should be avoided by anyone that does not already know things like this. In India, with 240 VAC as standard power, this is especially true. True, the power supplied to the motherboards is in several voltages 12V and under, but believe it or not you can kill yourself with 12V, and starting a fire with 12V is even easier. I would >>strongly<< suggest that you find a friend with some electrical engineering experience, or read extensively on electricity and electrical safety before attempting any sort of motherboard mount. Mark's suggestion of hot melt glue, for example, is predicated on your PRESUMED knowledge that cookie sheets or aluminum sheets are conductors, that the motherboard has many traces carrying current, and that when you mount the motherboard you must take great care to ensure that current-carrying traces CANNOT come in contact with metal. The reasons aluminum plates are suggested are a) it's cheap; b) it's easily drilled/tapped for screws; c) it's fireproof AS LONG AS YOU DON'T GET IT TOO HOT (heaven help you if you ever do start it on fire, as it then burns like thermite -- oh wait, thermite IS aluminum plus iron oxide); d) it reflects/traps EM radiation. Wood would be just as good except for the fireproof bit (a big one, though -- don't use wood) and the EM reflecting part. The aluminum plates should probably all be grounded back to a common ground. The common ground should NOT be a current carrying neutral -- I'm not an expert on 240 VAC as distributed in India and hesitate to advise you on where/how to safely ground them. You should probably read about "ground loops" before you mess with any of this. Seriously, this is dangerous and you can hurt yourself or others if you don't know what you are doing. You need to take the time to learn to the point where you KNOW how electricity works and what a conductor is vs an insulator and what electrical codes are and WHY they are what they are before you attempt to work with bare motherboards and power supplies. It is possible to kill yourself with a nine volt transistor radio battery (believe it or not) although you have to work a bit to do so. It is a lot easier with 12V, and even if you don't start a fire, you will almost certainly blow your motherboard/CPU/memory and power supply if you short out 12V in the wrong place. > Also for future reference, I saw a reference to dc-dc converters for power > supply. Is it possible to use motherboards that do not guzzle electricity > and generate a lot of heat and are yet powerful. It seems that not much more > is needed that motherboards, CPUs, memory, harddrives and an ethernet card. > For a low energy system, has any one explored ultra low energy consuming and > heat generating power solutions that maybe use low wattage DC? The minimum power requirements are dictated by your choice of motherboard, CPU, memory, and peripherals. Period. They require several voltages to be delivered into standardized connectors from a supply capable of providing sufficient power at those voltages. Again, it is clear from your question that you don't understand what power is or the thermodynamics of supplying it, and you should work on learning this (where GIYF). As I noted in a previous reply, typical motherboard draws are going to be in the 100W to 300+W loaded, and either you provide this or the system fails to work. 
To provide 100W to the motherboard, your power supply will need to draw 20-40% more than this, lost in the conversion from 120 VAC or 240 VAC to the power provided to the motherboard and peripherals. Again, you have no choice here. The places you do have a choice are: a) Buying motherboards etc with lower power requirements. If you are using recycled systems, you use what you've got, but when you buy in the future you have some choice here. However, you need to be aware of what you are optimizing! One way to save power is to run at lower clock, for example -- there is a tradeoff between power drawn and speed. But slower systems just mean you draw lower power for longer, and you may well pay about the same for the net energy required for a computation! You need to optimize average draw under load times the time required to complete a computation, not just "power", weighted with how fast you want your computations to complete and your budget. b) You have a LIMITED amount of choice in power supplies. That's the 20-40% indicated above. A cheap power supply or one that is incorrectly sized relative to the load is more likely to waste a lot of power as heat operating at baseline and be on the high end of the power draw required to operate a motherboard (relatively inefficient). A more expensive one (correctly sized for the application) will waste less energy as heat providing the NECESSARY power for your system. That is, you don't have a lot of choice when getting started -- you're probably best off just taking the power supplies out of the tower cases of your existing systems and using them (or better, just using a small stack of towers without remounting them until you see how clustering works for you, which is safe AND effective). When you have done some more research and learned about electricity, power supplies, and so on using a mix of Google/web, books, and maybe a friend who works with electricity and is familiar with power distribution and code requirements (if any) in New Delhi, THEN on your SECOND pass you can move on to a racked cluster with custom power supplies matched to specific "efficient" motherboards. rgb > > On Sat, Dec 13, 2008 at 8:50 AM, Mark Hahn wrote: > What is 1u? > > > rack-mounted hardware is measured in units called "units" ;) > 1U means 1 rack unit: roughly 19" wide and 1.75" high. ?racks > are all > the same width, and rackmount unit consumes some number of units > in height. > (rack depth is moderately variable.) ?(a full rack is generally > 42"). > > a 1U server is a basic cluster building block - pretty well > suited, > since it's not much taller than a disk, and fits a motherboard > pretty nicely (clearance for dimms if designed properly, a > couple optional cards, passive CPU heatsinks.) > > What is a blade system? > > > it is a computer design that emphasizes an enclosure and fastening > mechanism > that firmly locks buyers into a particular vendor's high-margin line > ;) > > in theory, the idea is to factor a traditional server into separate > components, such as shared power supply, unified management, and often > some semi-integrated network/san infrastructure. ?one of the main > original > selling points was power management: that a blade enclosure would have > fewer, more fully loaded, more efficnet PSUs. ?and/or more reliable. > blades are often claimed to have superior managability. ?both of these > factors are very, very arguable, since it's now routine for 1U servers > to have nearly the same PSU efficiency, for instance. 
?and in reality, > simple managability interfaces like IPMI are far better (scalably > scriptable) > than a too-smart gui per enclosure, especially if you have 100 > enclosures... > > goes into a good rack in terms of size and matieral > (assuming it has to be > insulated) > > > ignoring proprietary crap, MB sizes are quite standardized. ?and since > 10 million random computer shops put them together, they're incredibly > forgiving when it comes to mounting, etc. ?I'd recommend just > glue-gunning > stuff into place, and not worring too much. > > Anyone using clusters for animation on this list? > > > not much, I think. ?this list is mainly "using commodity clusters to > do stuff fairly reminiscent of traditional scientific supercomputing". > > animation is, in HPC terms, embarassingly parallel and often quite > IO-intensive. ?both those are somewhat derogatory. ?all you need to do > an animation farm is some storage, a network, nodes and probably a > scheduler or at least task queue-er. > > > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From lindahl at pbm.com Sun Dec 14 16:37:33 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] ntpd wonky? In-Reply-To: References: <20081209180633.GA21193@bx9> Message-ID: <20081215003733.GA2394@bx9> On Thu, Dec 11, 2008 at 11:01:31PM -0800, Bernard Li wrote: > Have you tried other pools eg. pool.ntp.org? That is stable for me. So it's not me, it's Red Hat's pool that's wonky. I see that CentOS switched to using ntp.org in 5.2, which I didn't automagically get thanks to rpm creating ntpservers.rpmnew, even though I hadn't modified the ntpservers file. Mmf. -- greg From gdjacobs at gmail.com Sun Dec 14 18:16:09 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: <4945BDE9.3070806@gmail.com> Robert G. Brown wrote: > On Sat, 13 Dec 2008, arjuna wrote: > >> A simple question though...Aluminum plates are used because aluminum >> is does >> not conduct electricity. Is this correct? > > Aluminum is an EXCELLENT conductor of electricity, one of the best! > Basically all metals conduct electricity. When you mount the > motherboards you MUST take care to use spacers in the right places > (under the holes for mounting screws on the motherboards, usually) to > keep the solder traces of the motherboard from shorting out! > > Your question makes me very worried on your behalf. Electricity is > quite dangerous, and in general messing with it should be avoided by > anyone that does not already know things like this. In India, with 240 > VAC as standard power, this is especially true. True, the power > supplied to the motherboards is in several voltages 12V and under, but > believe it or not you can kill yourself with 12V, and starting a fire > with 12V is even easier. You can actually do a good job of TIG welding with 12V. > I would >>strongly<< suggest that you find a friend with some electrical > engineering experience, or read extensively on electricity and > electrical safety before attempting any sort of motherboard mount. 
> Mark's suggestion of hot melt glue, for example, is predicated on your > PRESUMED knowledge that cookie sheets or aluminum sheets are > conductors, that the motherboard has many traces carrying current, and > that when you mount the motherboard you must take great care to ensure > that current-carrying traces CANNOT come in contact with metal. > > The reasons aluminum plates are suggested are a) it's cheap; b) it's > easily drilled/tapped for screws; c) it's fireproof AS LONG AS YOU DON'T > GET IT TOO HOT (heaven help you if you ever do start it on fire, as it > then burns like thermite -- oh wait, thermite IS aluminum plus iron > oxide); d) it reflects/traps EM radiation. Ask the Royal Navy about fireproof aluminum. > Wood would be just as good except for the fireproof bit (a big one, > though -- don't use wood) and the EM reflecting part. > > The aluminum plates should probably all be grounded back to a common > ground. The common ground should NOT be a current carrying neutral -- > I'm not an expert on 240 VAC as distributed in India and hesitate to > advise you on where/how to safely ground them. You should probably read > about "ground loops" before you mess with any of this. Commodity ATX power supplies will have a grounded frame. Mounting the power supply to the pan will work quite well. > Seriously, this is dangerous and you can hurt yourself or others if you > don't know what you are doing. You need to take the time to learn to > the point where you KNOW how electricity works and what a conductor is > vs an insulator and what electrical codes are and WHY they are what they > are before you attempt to work with bare motherboards and power > supplies. It is possible to kill yourself with a nine volt transistor > radio battery (believe it or not) although you have to work a bit to do > so. It is a lot easier with 12V, and even if you don't start a fire, > you will almost certainly blow your motherboard/CPU/memory and power > supply if you short out 12V in the wrong place. Yes, it is possible to kill yourself with low voltage. You have to really work at it and/or be unlucky, but it can be done. A DC resistance from leg to arm of 100 ohms or so is hard to achieve. Stabbing oneself with electrified needles, for starters. >> Also for future reference, I saw a reference to dc-dc converters for >> power >> supply. Is it possible to use motherboards that do not guzzle electricity >> and generate a lot of heat and are yet powerful. It seems that not >> much more >> is needed that motherboards, CPUs, memory, harddrives and an ethernet >> card. >> For a low energy system, has any one explored ultra low energy >> consuming and >> heat generating power solutions that maybe use low wattage DC? > > The minimum power requirements are dictated by your choice of > motherboard, CPU, memory, and peripherals. Period. They require > several voltages to be delivered into standardized connectors from a > supply capable of providing sufficient power at those voltages. Again, > it is clear from your question that you don't understand what power is > or the thermodynamics of supplying it, and you should work on learning > this (where GIYF). As I noted in a previous reply, typical motherboard > draws are going to be in the 100W to 300+W loaded, and either you > provide this or the system fails to work. To provide 100W to the > motherboard, your power supply will need to draw 20-40% more than this, > lost in the conversion from 120 VAC or 240 VAC to the power provided to > the motherboard and peripherals. 
Again, you have no choice here. > > The places you do have a choice are: > > a) Buying motherboards etc with lower power requirements. If you are > using recycled systems, you use what you've got, but when you buy in the > future you have some choice here. However, you need to be aware of what > you are optimizing! One way to save power is to run at lower clock, for > example -- there is a tradeoff between power drawn and speed. But > slower systems just mean you draw lower power for longer, and you may > well pay about the same for the net energy required for a computation! > You need to optimize average draw under load times the time required to > complete a computation, not just "power", weighted with how fast you > want your computations to complete and your budget. > > b) You have a LIMITED amount of choice in power supplies. That's the > 20-40% indicated above. A cheap power supply or one that is incorrectly > sized relative to the load is more likely to waste a lot of power as > heat operating at baseline and be on the high end of the power draw > required to operate a motherboard (relatively inefficient). A more > expensive one (correctly sized for the application) will waste less > energy as heat providing the NECESSARY power for your system. > > That is, you don't have a lot of choice when getting started -- you're > probably best off just taking the power supplies out of the tower cases > of your existing systems and using them (or better, just using a small > stack of towers without remounting them until you see how clustering > works for you, which is safe AND effective). When you have done some > more research and learned about electricity, power supplies, and so on > using a mix of Google/web, books, and maybe a friend who works with > electricity and is familiar with power distribution and code > requirements (if any) in New Delhi, THEN on your SECOND pass you can > move on to a racked cluster with custom power supplies matched to > specific "efficient" motherboards. > > rgb Although this list is quite liberal, we should be fair to Donald Becker and Co. in pointing out that most of the questions you have, here, are general computer hardware/electrical questions. They are best dealt with aside from this list. However, we'll make sure you have a good place to start. Buff up on basic electronics, both theoretical and practical. Start with a good textbook like Grob, "Basic Electronics" and something like an electronics projects kit. We used to be able to buy them at Radio Shack. If you just want to cut to the chase and assemble some computers, GIYF. Look for "build your own computer". For example, I have included one of the first links available. It appears to be fairly thorough. http://www.pcmech.com/byopc/ Once you're at the stage where you are comfortable with computer hardware, at least, and perhaps electronics in general, then you will be prepared to build a Beowulf. We will be happy to help with any questions at this stage, buy please check google first always, as sometimes the answer is already out there. -- Geoffrey D. Jacobs From rgb at phy.duke.edu Sun Dec 14 20:30:19 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <4945BDE9.3070806@gmail.com> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <4945BDE9.3070806@gmail.com> Message-ID: On Sun, 14 Dec 2008, Geoff Jacobs wrote: > Yes, it is possible to kill yourself with low voltage. You have to > really work at it and/or be unlucky, but it can be done. A DC resistance > from leg to arm of 100 ohms or so is hard to achieve. Stabbing oneself > with electrified needles, for starters. As in: http://www.darwinawards.com/darwin/darwin1999-50.html Improving the human race one incident at a time... There was also the joy of touching a 9V battery across the tip of your tongue (safe enough, but more than enough to convince you that 9V can make you go "Ow"). Or the dangers of 12V batteries in a saltwater environment or even when it is raining and your hands are wet and your skin is maybe a bit split when you grab one by the posts. As I said, do NOT mess with even "low" voltage electricity unless you know things like what a conductor is, or you too might qualify for a Darwin! And we'd hate that...:-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb at phy.duke.edu Sun Dec 14 22:05:01 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: On Mon, 15 Dec 2008, arjuna wrote: > The only reason I mention "Alumninum" was because I noticed that the > motherboards in the tower cases were stuck onto some metal and my hardware > person told me that this is a special material that does not conduct > electricity. And since everyone was taking about aluminum boards here,? i > put 2 and 2 together, and obviously instead of 4 I got 10... I think this is most unlikely (that they are on some non-conductive metal). At least I've never encountered one mounted like that. Also, it isn't terribly easy to make a "non-conductive metal" -- it is pretty much an oxymoron, in fact. One can INSULATE a metal by e.g. spraying it with epoxy or enameling it, but the "metallic bond" in metals forms a "nearly free electron" gas. > Given that we are working with a stepped down voltage the risk is lower, > however since we are creating a system, it would? make sense to make it > entirely safe. Wood like you said is a fire hazard,? aluminum conducts > electricity. > > Then in your experience, what would be the right material to use to avoid > electrical and fire hazards, assuming its not a space ship kevelar or other > impossible to find substance or one that is prohibitively expensive. As several people have said, aluminum or steel sheeting is fine, but learn how to mount motherboards on it safely with risers. If you actually look at a motherboard mounted in most cases, you'll see that it is sitting on somewhere between four and eight small metal pedestal/standoffs that the screws actually screw into. The pedestals are locked into the case mount, which is usually steel or a composite metal in commercial cases. As always, using Google I easily found: http://www.youtube.com/watch?v=7YXro0Gs6Vc and you can actually WATCH people mount motherboards into cases. 
This will show you pretty much what you have to do to mount motherboards onto e.g. cookie sheets, including where you have to drill holes and mount standoffs. > I realize i have a learning curve with: > > 1) Building computer hardware > 2) Electronics > > I will now attempt to read up and practise both of the above by assembling a > computer and finding some basic electronic books and materials to assemble. > > In the mean time I do want to make a 1 U system that i dreamed of last > night, where 3 mother boards go on 1 plate (assuming the right material and > precautions to make it safe) This plate goes in some kind of casing for > safety for now, later to be removed and put into a rack... Lots of Youtube out there. rgb > > Then its time to play with the parrallel processing software.... > > On Sun, Dec 14, 2008 at 10:40 PM, Robert G. Brown wrote: > On Sat, 13 Dec 2008, arjuna wrote: > > A simple question though...Aluminum plates are used > because aluminum is does > not conduct electricity. Is this correct? > > > Aluminum is an EXCELLENT conductor of electricity, one of the best! > Basically all metals conduct electricity. ?When you mount the > motherboards you MUST take care to use spacers in the right places > (under the holes for mounting screws on the motherboards, usually) to > keep the solder traces of the motherboard from shorting out! > > Your question makes me very worried on your behalf. ?Electricity is > quite dangerous, and in general messing with it should be avoided by > anyone that does not already know things like this. ?In India, with > 240 > VAC as standard power, this is especially true. ?True, the power > supplied to the motherboards is in several voltages 12V and under, but > believe it or not you can kill yourself with 12V, and starting a fire > with 12V is even easier. > > I would >>strongly<< suggest that you find a friend with some > electrical > engineering experience, or read extensively on electricity and > electrical safety before attempting any sort of motherboard mount. > Mark's suggestion of hot melt glue, for example, is predicated on your > PRESUMED knowledge that cookie sheets or aluminum sheets are > conductors, that the motherboard has many traces carrying current, and > that when you mount the motherboard you must take great care to ensure > that current-carrying traces CANNOT come in contact with metal. > > The reasons aluminum plates are suggested are a) it's cheap; b) it's > easily drilled/tapped for screws; c) it's fireproof AS LONG AS YOU > DON'T > GET IT TOO HOT (heaven help you if you ever do start it on fire, as it > then burns like thermite -- oh wait, thermite IS aluminum plus iron > oxide); d) it reflects/traps EM radiation. > > Wood would be just as good except for the fireproof bit (a big one, > though -- don't use wood) and the EM reflecting part. > > The aluminum plates should probably all be grounded back to a common > ground. ?The common ground should NOT be a current carrying neutral -- > I'm not an expert on 240 VAC as distributed in India and hesitate to > advise you on where/how to safely ground them. ?You should probably > read > about "ground loops" before you mess with any of this. > > Seriously, this is dangerous and you can hurt yourself or others if > you > don't know what you are doing. 
You need to take the time to learn to > the point where you KNOW how electricity works and what a conductor is > vs an insulator and what electrical codes are and WHY they are what > they > are before you attempt to work with bare motherboards and power > supplies. It is possible to kill yourself with a nine volt transistor > radio battery (believe it or not) although you have to work a bit to > do > so. It is a lot easier with 12V, and even if you don't start a fire, > you will almost certainly blow your motherboard/CPU/memory and power > supply if you short out 12V in the wrong place. > > Also for future reference, I saw a reference to dc-dc > converters for power > supply. Is it possible to use motherboards that do not > guzzle electricity > and generate a lot of heat and are yet powerful. It seems > that not much more > is needed that motherboards, CPUs, memory, harddrives and > an ethernet card. > For a low energy system, has any one explored ultra low > energy consuming and > heat generating power solutions that maybe use low wattage > DC? > > > The minimum power requirements are dictated by your choice of > motherboard, CPU, memory, and peripherals. Period. They require > several voltages to be delivered into standardized connectors from a > supply capable of providing sufficient power at those voltages. > Again, > it is clear from your question that you don't understand what power is > or the thermodynamics of supplying it, and you should work on learning > this (where GIYF). As I noted in a previous reply, typical > motherboard > draws are going to be in the 100W to 300+W loaded, and either you > provide this or the system fails to work. To provide 100W to the > motherboard, your power supply will need to draw 20-40% more than > this, > lost in the conversion from 120 VAC or 240 VAC to the power provided > to > the motherboard and peripherals. Again, you have no choice here. > > The places you do have a choice are: > > a) Buying motherboards etc with lower power requirements. If you are > using recycled systems, you use what you've got, but when you buy in > the > future you have some choice here. However, you need to be aware of > what > you are optimizing! One way to save power is to run at lower clock, > for > example -- there is a tradeoff between power drawn and speed. But > slower systems just mean you draw lower power for longer, and you may > well pay about the same for the net energy required for a computation! > You need to optimize average draw under load times the time required > to > complete a computation, not just "power", weighted with how fast you > want your computations to complete and your budget. > > b) You have a LIMITED amount of choice in power supplies. That's the > 20-40% indicated above. A cheap power supply or one that is > incorrectly > sized relative to the load is more likely to waste a lot of power as > heat operating at baseline and be on the high end of the power draw > required to operate a motherboard (relatively inefficient). A more > expensive one (correctly sized for the application) will waste less > energy as heat providing the NECESSARY power for your system. > > That is, you don't have a lot of choice when getting started -- you're > probably best off just taking the power supplies out of the tower > cases > of your existing systems and using them (or better, just using a small > stack of towers without remounting them until you see how clustering > works for you, which is safe AND effective).
When you have done some > more research and learned about electricity, power supplies, and so on > using a mix of Google/web, books, and maybe a friend who works with > electricity and is familiar with power distribution and code > requirements (if any) in New Delhi, THEN on your SECOND pass you can > move on to a racked cluster with custom power supplies matched to > specific "efficient" motherboards. > > rgb > > > > On Sat, Dec 13, 2008 at 8:50 AM, Mark Hahn > wrote: > What is 1u? > > > rack-mounted hardware is measured in units called > "units" ;) > 1U means 1 rack unit: roughly 19" wide and 1.75" > high. racks > are all > the same width, and rackmount unit consumes some > number of units > in height. > (rack depth is moderately variable.) (a full rack is > generally > 42"). > > a 1U server is a basic cluster building block - > pretty well > suited, > since it's not much taller than a disk, and fits a > motherboard > pretty nicely (clearance for dimms if designed > properly, a > couple optional cards, passive CPU heatsinks.) > > What is a blade system? > > > it is a computer design that emphasizes an enclosure and > fastening > mechanism > that firmly locks buyers into a particular vendor's > high-margin line > ;) > > in theory, the idea is to factor a traditional server into > separate > components, such as shared power supply, unified > management, and often > some semi-integrated network/san infrastructure. one of > the main > original > selling points was power management: that a blade > enclosure would have > fewer, more fully loaded, more efficnet PSUs. and/or more > reliable. > blades are often claimed to have superior managability. > both of these > factors are very, very arguable, since it's now routine > for 1U servers > to have nearly the same PSU efficiency, for instance. > and > in reality, > simple managability interfaces like IPMI are far better > (scalably > scriptable) > than a too-smart gui per enclosure, especially if you have > 100 > enclosures... > > goes into a good rack in terms of size and matieral > (assuming it has to be > insulated) > > > ignoring proprietary crap, MB sizes are quite > standardized. and since > 10 million random computer shops put them together, > they're incredibly > forgiving when it comes to mounting, etc. I'd recommend > just > glue-gunning > stuff into place, and not worring too much. > > Anyone using clusters for animation on this list? > > > not much, I think. this list is mainly "using commodity > clusters to > do stuff fairly reminiscent of traditional scientific > supercomputing". > > animation is, in HPC terms, embarassingly parallel and > often quite > IO-intensive. both those are somewhat derogatory. all > you need to do > an animation farm is some storage, a network, nodes and > probably a > scheduler or at least task queue-er. > > > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu > > > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C.
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From nixon at nsc.liu.se Sun Dec 14 23:36:43 2008 From: nixon at nsc.liu.se (Leif Nixon) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] For grins...India In-Reply-To: <493F7BF9.70306@aplpi.com> (stephen mulcahy's message of "Wed\, 10 Dec 2008 08\:21\:13 +0000") References: <493F7BF9.70306@aplpi.com> Message-ID: stephen mulcahy writes: > Iceland have energy literally pumping out of the ground - if they can > sort out their connectivity to the US and Europe I think they'll > quickly become the data centre to the world. I understand there are some decent sized data centres popping up around Keflavik, and I seem to recall something about new fibres to mainland Europe being put in place. -- Leif Nixon - Systems expert ------------------------------------------------------------ National Supercomputer Centre - Linkoping University ------------------------------------------------------------ From amacater at galactic.demon.co.uk Mon Dec 15 13:04:42 2008 From: amacater at galactic.demon.co.uk (Andrew M.A. Cater) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <4945BDE9.3070806@gmail.com> Message-ID: <20081215210442.GB3929@galactic.demon.co.uk> On Sun, Dec 14, 2008 at 11:30:19PM -0500, Robert G. Brown wrote: > On Sun, 14 Dec 2008, Geoff Jacobs wrote: > >> Yes, it is possible to kill yourself with low voltage. You have to >> really work at it and/or be unlucky, but it can be done. A DC resistance >> from leg to arm of 100 ohms or so is hard to achieve. Stabbing oneself >> with electrified needles, for starters. > > As in: > > http://www.darwinawards.com/darwin/darwin1999-50.html > > Improving the human race one incident at a time... > > There was also the joy of touching a 9V battery across the tip of your > tongue (safe enough, but more than enough to convince you that 9V can > make you go "Ow"). Or the dangers of 12V batteries in a saltwater > environment or even when it is raining and your hands are wet and your > skin is maybe a bit split when you grab one by the posts. > A small 9V lamp battery, a motor on one side of the winding to provide "pulsed AC" and an audio step up transformer made a superb electric shock machine and largish sparks :) Teenage radio hams: Shorting out 12V batteries with thin copper wire to see which ones still held charge and were worth paying the scrap dealer for :( [We're all still here - just :) ] 12V will weld a wedding ring to pretty much anything - or produce an effective amputation with ready made diathermy to seal up the wound :( > As I said, do NOT mess with even "low" voltage electricity unless you > know things like what a conductor is, or you too might qualify for a > Darwin! And we'd hate that...:-) > > rgb > AndyC From saville at comcast.net Sun Dec 14 12:14:42 2008 From: saville at comcast.net (Gregg Germain) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] MPICH upgrade question Message-ID: <49456932.1020208@comcast.net> Hi all, This is a little more in the way of a systems question: I'm presently running a cluster with mpich2-1.0.6p1 installed in /usr/local. I want to upgrade to mpich2-1.0.8. Should I run the 1.0.6 Make uninstall first? Or just install 1.0.8 on top of 1.0.6? Or is there some other cleanup method I should use to eliminate 1.0.6 first? thanks! 
Gregg From brahmaforces at gmail.com Sun Dec 14 21:07:53 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> References: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> Message-ID: Hi Vincent: I am a believer in beginners mind. Some of the things that i am doing like indian classical music, classical drawing, python programming are all things i got inspired to do and dived right into. And the inspiration is usually right on and after several years of doing it, one achieve a masterful level. This does entail a lot of passion, inspiration and the courage to go ahead and do it. I have been bitten by the animation bug, and this has something to do with the fact that i am currently deeply immersed in a several year training in the classical drawing for classical realist painters. Animators study drawing, but probably not in this depth. There is a huge synergy as you would know between drawing and animation which requires good visual mental faculties that are developed by drawing. So having become inspired by the animation, i have been researching it avidly. As an artist I am interested not in making short fun clips but eventually entire movies. I realize this is a large undertaking, but thats what i do...So is vocal indian classical music, kung fu and classical drawing. In researching Pixar which makes these kind of movies, i found they use farms of machines for rendering. Therefore the idea was planted in my brain, and being a good computer programmer I decided to ride the learning curve(and i know how to ride these wild horses given the amount of learning required in the classical arts and the time and patience it takes) and make me a beowulf which is the cheapest fastest machine. Made from commodity components it would be cheaper than high end machines (hence the relevance to this list in case you are wondering) it would eventually give me the power (having ridden the curve over time, not tomorow) to do the type of rendering that the big boys are doing. If Pixar is using render farms then i assume that is a good well thought out way to go for eventually creating full length animated movies. Also i am interested in special effects. The whole logic of the beowulf is to use cheap commodity machines to make super computers(to put is a bit simplistically) So why would one use an expensive stand alone machine with eventually limited capacity? If it were possible to make full length, cutting edge animated movies using stand alone machines then the boys who actually make them would be using them for their productions right? Hello Arjuna, > > I'm a bit interested you mention this. When i negotiate about animations, > which sometimes already were edits from major movies, > the hardware used is nothing more than a simple PC to produce it. Animation > Designers work for small amounts of money, > so paying for a lot of hardware, more than some fast PC or fast macintosh, > with 2 very good big TFT's (apple offers some fantastic > dual set of TFT's - though quite expensive for a design studio maybe). > > It sure takes a couple of minutes to render animations in high resolutions, > yet i'm quite amazed you need more hardware for this, > yes even a cluster. > > Isn't some 16 core Shanghai box with a lot of RAM already total overpower > for this? > > Can you explain where you need all that cpu power for? 
> > Best Regards, > Vincent > > Also any thoughts on racks versus piles of PCS. >> >> A lot of the posts on the internet are old and out of date. I am wondering >> what the upto date trends are in racking commodity computers to create >> beowulf clusters. What should i be reading? >> >> -- >> Best regards, >> arjuna >> http://www.brahmaforces.com >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/28784852/attachment.html From brahmaforces at gmail.com Sun Dec 14 21:29:46 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: Robert: Out of curiosity how come you were in Delhi? > > > I am in NewDelhi India. However I would prefer to put the cluster together >> myself, because >> > > Ya, that's where I lived for seven years growing up. > > > All excellent and traditional reasons, although you'll want to learn a > compiler, either C, C++ or Fortran. Which one is most appropriate > depends a little bit on the application space you want to work in, a > little bit on your personality. None are terribly like python. > Off c, c++ I would go for C++ to avoid the lower level stuff, but why would i need C++ or fortran, why can i not accomplish the same or more in python as it is a programming space of choice for me? > Oh, and New Delhi has one other unique-ish environmental constraint, > unless things have changed a lot since I lived there. Post-monsoon, > when it dries out again you have dust storms. I don't think most list > members can really imagine them, but I can (I used to climb a tree > outside of our house and feel the dust stinging my cheeks and erasing > the buildings all from sight). You will need to be able to keep the > dust that infiltrates EVERYWHERE in the houses at that time out of the > computer room, as computers (especially the cooling fans) don't like > dust. After a big one, you may need to shut down and vaccuum out the > insides of your systems. > Its not so bad, i have a bunch of computers in my art studio and they have been running for a while now, no problem...Clustering them would just mean putting the already running computers in a different config. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/89714d9f/attachment.html From brahmaforces at gmail.com Sun Dec 14 21:30:51 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <3b6093ef0812130810m69c16e89t5fce87f06e0fb7f4@mail.gmail.com> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <3b6093ef0812130810m69c16e89t5fce87f06e0fb7f4@mail.gmail.com> Message-ID: Thanks Peter, am checking out the link, looks very interesting On Sat, Dec 13, 2008 at 9:40 PM, Peter Brady wrote: > Hello, > > If you're interested in do it yourself clustering for rendering > purposes you may want to check this out: http://helmer.sfe.se/. > > On Fri, Dec 12, 2008 at 10:47 AM, Dmitri Chubarov > wrote: > > Hello, > > > > my first reply missed the list by mistake so I will repeat a few points > that > > I mentioned there. > > > >> > >> What is 1u? > >> > >> What is a blade system? > > > > Compute clusters are often built of rack-server hardware meaning boxes > > different from desktop boxes and chipset that have features not necessary > > for desktop PCs like ECC memory, redundant power supply units, integrated > > management processors, RAID controllers, that altogether provide better > > reliability, since the failure rate for a cluster of a 100 nodes is 100 > > times higher than for a single node. You may not need any of it for a 16 > > node rendering farm. > > > >> Anyone using clusters for animation on this list? > > > > We are just writing up on a research project on distributed rendering. > > Rendering is the part in the animation process that requires the most > > processing power. We used 3dStudio Max (3DS) for modelling and V-Ray for > > rendering. 3DS has its own utility, called Backburner, for distributing > > frames among a number of cluster nodes. > > > > We observed that V-Ray failed on some certain frames thus stopping the > whole > > rendering queue, therefore the process was not completely automated. > > > > I would also repeat that a storage subsystem that uses an array of disks > is > > essential for performance. > >> > >> At this time I am trying to figure out the racks. Am meeting the > hardware > >> guy on Saturday and we were thinking of opening up the PCS i have lying > >> around and taking measurements of how the mother boards fit into the > >> cases,with the intention of creating a rack from scratch. Any ideas of > what > >> goes into a good rack in terms of size and matieral (assuming it has to > be > >> insulated) > > > > This sort of rack is more of a research project. On the contrary, the > usual > > kind of rack is an IEC standard server rack, > > http://en.wikipedia.org/wiki/19_inch_rack > > > >> > >> Am planning to run animation software (like blender) on it. Since > >> animation software requires large processing power i am assuming they > have > >> already worked on parrallelizing the code... > > > > > > Blender does not seem to have a driver to distribute rendering (I might > be > > wrong) but it can generate PovRay scripts and povray can make use of > > parallel processing in a number of ways. 
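As a concrete illustration of the frame-distribution idea Dmitri describes above (Backburner handing frame ranges to nodes), here is a minimal fire-and-forget farm sketch. Everything in it is an assumption for illustration only: the hostnames node01-node04, the shared path /shared/scene.blend, passwordless ssh to every node, and Blender's command-line options (-b render without a GUI, -s/-e set the frame range, -a render that range), which should be checked against the Blender version actually installed.

    #!/bin/sh
    # Toy render farm: split an animation's frames across four nodes.
    # Assumes a shared filesystem mounted at the same path everywhere,
    # passwordless ssh, and Blender installed on every node.
    SCENE=/shared/scene.blend            # hypothetical path
    NODES="node01 node02 node03 node04"  # hypothetical hostnames
    START=1
    END=100
    CHUNK=$(( (END - START + 4) / 4 ))   # ceiling of frames per node, 4 nodes
    i=0
    for n in $NODES; do
        s=$(( START + i * CHUNK ))
        e=$(( s + CHUNK - 1 ))
        [ "$e" -gt "$END" ] && e=$END
        # -b: no GUI, -s/-e: frame range, -a: render that range
        ssh "$n" "blender -b $SCENE -s $s -e $e -a" &
        i=$(( i + 1 ))
    done
    wait   # crude barrier: return only when every node's chunk is done

A real farm would add exactly what Dmitri points out is missing here: a queue or scheduler and retries for frames whose render fails, rather than a single fire-and-forget loop.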
> > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/6fce7ef7/attachment.html From brahmaforces at gmail.com Sun Dec 14 21:45:35 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> Message-ID: Robert: Thanks for the practical and good advice... I see what you are saying...When I was a child i did play with electrical circuits and tried to build a radio out a PCBs components and a soldering iron, needless to say, without proper guidance or google at that time, after months of inspired tinkering, i never got the radios to recieve anything.However it did give me a basic awareness and handling of electrical components, soldering irons and so on.. I understand insulators and conductors. Yes i should do some basic electronic study and that would be very interesting. Any groups online for the same or landmark books or hobby kits that you may have encountered growing up or earlier in your careers would be appreciated... The only reason I mention "Alumninum" was because I noticed that the motherboards in the tower cases were stuck onto some metal and my hardware person told me that this is a special material that does not conduct electricity. And since everyone was taking about aluminum boards here, i put 2 and 2 together, and obviously instead of 4 I got 10... Given that we are working with a stepped down voltage the risk is lower, however since we are creating a system, it would make sense to make it entirely safe. Wood like you said is a fire hazard, aluminum conducts electricity. Then in your experience, what would be the right material to use to avoid electrical and fire hazards, assuming its not a space ship kevelar or other impossible to find substance or one that is prohibitively expensive. I realize i have a learning curve with: 1) Building computer hardware 2) Electronics I will now attempt to read up and practise both of the above by assembling a computer and finding some basic electronic books and materials to assemble. In the mean time I do want to make a 1 U system that i dreamed of last night, where 3 mother boards go on 1 plate (assuming the right material and precautions to make it safe) This plate goes in some kind of casing for safety for now, later to be removed and put into a rack... Then its time to play with the parrallel processing software.... On Sun, Dec 14, 2008 at 10:40 PM, Robert G. Brown wrote: > On Sat, 13 Dec 2008, arjuna wrote: > > A simple question though...Aluminum plates are used because aluminum is >> does >> not conduct electricity. Is this correct? >> > > Aluminum is an EXCELLENT conductor of electricity, one of the best! > Basically all metals conduct electricity. When you mount the > motherboards you MUST take care to use spacers in the right places > (under the holes for mounting screws on the motherboards, usually) to > keep the solder traces of the motherboard from shorting out! > > Your question makes me very worried on your behalf. 
Electricity is > quite dangerous, and in general messing with it should be avoided by > anyone that does not already know things like this. In India, with 240 > VAC as standard power, this is especially true. True, the power > supplied to the motherboards is in several voltages 12V and under, but > believe it or not you can kill yourself with 12V, and starting a fire > with 12V is even easier. > > I would >>strongly<< suggest that you find a friend with some electrical > engineering experience, or read extensively on electricity and > electrical safety before attempting any sort of motherboard mount. > Mark's suggestion of hot melt glue, for example, is predicated on your > PRESUMED knowledge that cookie sheets or aluminum sheets are > conductors, that the motherboard has many traces carrying current, and > that when you mount the motherboard you must take great care to ensure > that current-carrying traces CANNOT come in contact with metal. > > The reasons aluminum plates are suggested are a) it's cheap; b) it's > easily drilled/tapped for screws; c) it's fireproof AS LONG AS YOU DON'T > GET IT TOO HOT (heaven help you if you ever do start it on fire, as it > then burns like thermite -- oh wait, thermite IS aluminum plus iron > oxide); d) it reflects/traps EM radiation. > > Wood would be just as good except for the fireproof bit (a big one, > though -- don't use wood) and the EM reflecting part. > > The aluminum plates should probably all be grounded back to a common > ground. The common ground should NOT be a current carrying neutral -- > I'm not an expert on 240 VAC as distributed in India and hesitate to > advise you on where/how to safely ground them. You should probably read > about "ground loops" before you mess with any of this. > > Seriously, this is dangerous and you can hurt yourself or others if you > don't know what you are doing. You need to take the time to learn to > the point where you KNOW how electricity works and what a conductor is > vs an insulator and what electrical codes are and WHY they are what they > are before you attempt to work with bare motherboards and power > supplies. It is possible to kill yourself with a nine volt transistor > radio battery (believe it or not) although you have to work a bit to do > so. It is a lot easier with 12V, and even if you don't start a fire, > you will almost certainly blow your motherboard/CPU/memory and power > supply if you short out 12V in the wrong place. > > Also for future reference, I saw a reference to dc-dc converters for power >> supply. Is it possible to use motherboards that do not guzzle electricity >> and generate a lot of heat and are yet powerful. It seems that not much >> more >> is needed that motherboards, CPUs, memory, harddrives and an ethernet >> card. >> For a low energy system, has any one explored ultra low energy consuming >> and >> heat generating power solutions that maybe use low wattage DC? >> > > The minimum power requirements are dictated by your choice of > motherboard, CPU, memory, and peripherals. Period. They require > several voltages to be delivered into standardized connectors from a > supply capable of providing sufficient power at those voltages. Again, > it is clear from your question that you don't understand what power is > or the thermodynamics of supplying it, and you should work on learning > this (where GIYF). As I noted in a previous reply, typical motherboard > draws are going to be in the 100W to 300+W loaded, and either you > provide this or the system fails to work. 
To provide 100W to the > motherboard, your power supply will need to draw 20-40% more than this, > lost in the conversion from 120 VAC or 240 VAC to the power provided to > the motherboard and peripherals. Again, you have no choice here. > > The places you do have a choice are: > > a) Buying motherboards etc with lower power requirements. If you are > using recycled systems, you use what you've got, but when you buy in the > future you have some choice here. However, you need to be aware of what > you are optimizing! One way to save power is to run at lower clock, for > example -- there is a tradeoff between power drawn and speed. But > slower systems just mean you draw lower power for longer, and you may > well pay about the same for the net energy required for a computation! > You need to optimize average draw under load times the time required to > complete a computation, not just "power", weighted with how fast you > want your computations to complete and your budget. > > b) You have a LIMITED amount of choice in power supplies. That's the > 20-40% indicated above. A cheap power supply or one that is incorrectly > sized relative to the load is more likely to waste a lot of power as > heat operating at baseline and be on the high end of the power draw > required to operate a motherboard (relatively inefficient). A more > expensive one (correctly sized for the application) will waste less > energy as heat providing the NECESSARY power for your system. > > That is, you don't have a lot of choice when getting started -- you're > probably best off just taking the power supplies out of the tower cases > of your existing systems and using them (or better, just using a small > stack of towers without remounting them until you see how clustering > works for you, which is safe AND effective). When you have done some > more research and learned about electricity, power supplies, and so on > using a mix of Google/web, books, and maybe a friend who works with > electricity and is familiar with power distribution and code > requirements (if any) in New Delhi, THEN on your SECOND pass you can > move on to a racked cluster with custom power supplies matched to > specific "efficient" motherboards. > > rgb > > > >> On Sat, Dec 13, 2008 at 8:50 AM, Mark Hahn wrote: >> What is 1u? >> >> >> rack-mounted hardware is measured in units called "units" ;) >> 1U means 1 rack unit: roughly 19" wide and 1.75" high. racks >> are all >> the same width, and rackmount unit consumes some number of units >> in height. >> (rack depth is moderately variable.) (a full rack is generally >> 42"). >> >> a 1U server is a basic cluster building block - pretty well >> suited, >> since it's not much taller than a disk, and fits a motherboard >> pretty nicely (clearance for dimms if designed properly, a >> couple optional cards, passive CPU heatsinks.) >> >> What is a blade system? >> >> >> it is a computer design that emphasizes an enclosure and fastening >> mechanism >> that firmly locks buyers into a particular vendor's high-margin line >> ;) >> >> in theory, the idea is to factor a traditional server into separate >> components, such as shared power supply, unified management, and often >> some semi-integrated network/san infrastructure. one of the main >> original >> selling points was power management: that a blade enclosure would have >> fewer, more fully loaded, more efficnet PSUs. and/or more reliable. >> blades are often claimed to have superior managability. 
both of these >> factors are very, very arguable, since it's now routine for 1U servers >> to have nearly the same PSU efficiency, for instance. and in reality, >> simple managability interfaces like IPMI are far better (scalably >> scriptable) >> than a too-smart gui per enclosure, especially if you have 100 >> enclosures... >> >> goes into a good rack in terms of size and matieral >> (assuming it has to be >> insulated) >> >> >> ignoring proprietary crap, MB sizes are quite standardized. and since >> 10 million random computer shops put them together, they're incredibly >> forgiving when it comes to mounting, etc. I'd recommend just >> glue-gunning >> stuff into place, and not worring too much. >> >> Anyone using clusters for animation on this list? >> >> >> not much, I think. this list is mainly "using commodity clusters to >> do stuff fairly reminiscent of traditional scientific supercomputing". >> >> animation is, in HPC terms, embarassingly parallel and often quite >> IO-intensive. both those are somewhat derogatory. all you need to do >> an animation farm is some storage, a network, nodes and probably a >> scheduler or at least task queue-er. >> >> >> >> >> -- >> Best regards, >> arjuna >> http://www.brahmaforces.com >> >> >> > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu > > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/6cdb6910/attachment.html From brahmaforces at gmail.com Sun Dec 14 21:49:13 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <4945BDE9.3070806@gmail.com> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <4945BDE9.3070806@gmail.com> Message-ID: Geoff: Point noted, am getting the Grob Basic electronics and also the mathematics for basic electronics book.... Any recommendations for some good electronic kits that anyone may have encountered or any reference to the top of the line electronics groups if any exist? On Mon, Dec 15, 2008 at 7:46 AM, Geoff Jacobs wrote: > Robert G. Brown wrote: > > On Sat, 13 Dec 2008, arjuna wrote: > > > >> A simple question though...Aluminum plates are used because aluminum > >> is does > >> not conduct electricity. Is this correct? > > > > Aluminum is an EXCELLENT conductor of electricity, one of the best! > > Basically all metals conduct electricity. When you mount the > > motherboards you MUST take care to use spacers in the right places > > (under the holes for mounting screws on the motherboards, usually) to > > keep the solder traces of the motherboard from shorting out! > > > > Your question makes me very worried on your behalf. Electricity is > > quite dangerous, and in general messing with it should be avoided by > > anyone that does not already know things like this. In India, with 240 > > VAC as standard power, this is especially true. True, the power > > supplied to the motherboards is in several voltages 12V and under, but > > believe it or not you can kill yourself with 12V, and starting a fire > > with 12V is even easier. > > You can actually do a good job of TIG welding with 12V. 
> > > I would >>strongly<< suggest that you find a friend with some electrical > > engineering experience, or read extensively on electricity and > > electrical safety before attempting any sort of motherboard mount. > > Mark's suggestion of hot melt glue, for example, is predicated on your > > PRESUMED knowledge that cookie sheets or aluminum sheets are > > conductors, that the motherboard has many traces carrying current, and > > that when you mount the motherboard you must take great care to ensure > > that current-carrying traces CANNOT come in contact with metal. > > > > The reasons aluminum plates are suggested are a) it's cheap; b) it's > > easily drilled/tapped for screws; c) it's fireproof AS LONG AS YOU DON'T > > GET IT TOO HOT (heaven help you if you ever do start it on fire, as it > > then burns like thermite -- oh wait, thermite IS aluminum plus iron > > oxide); d) it reflects/traps EM radiation. > > Ask the Royal Navy about fireproof aluminum. > > > Wood would be just as good except for the fireproof bit (a big one, > > though -- don't use wood) and the EM reflecting part. > > > > The aluminum plates should probably all be grounded back to a common > > ground. The common ground should NOT be a current carrying neutral -- > > I'm not an expert on 240 VAC as distributed in India and hesitate to > > advise you on where/how to safely ground them. You should probably read > > about "ground loops" before you mess with any of this. > > Commodity ATX power supplies will have a grounded frame. Mounting the > power supply to the pan will work quite well. > > > Seriously, this is dangerous and you can hurt yourself or others if you > > don't know what you are doing. You need to take the time to learn to > > the point where you KNOW how electricity works and what a conductor is > > vs an insulator and what electrical codes are and WHY they are what they > > are before you attempt to work with bare motherboards and power > > supplies. It is possible to kill yourself with a nine volt transistor > > radio battery (believe it or not) although you have to work a bit to do > > so. It is a lot easier with 12V, and even if you don't start a fire, > > you will almost certainly blow your motherboard/CPU/memory and power > > supply if you short out 12V in the wrong place. > > Yes, it is possible to kill yourself with low voltage. You have to > really work at it and/or be unlucky, but it can be done. A DC resistance > from leg to arm of 100 ohms or so is hard to achieve. Stabbing oneself > with electrified needles, for starters. > > >> Also for future reference, I saw a reference to dc-dc converters for > >> power > >> supply. Is it possible to use motherboards that do not guzzle > electricity > >> and generate a lot of heat and are yet powerful. It seems that not > >> much more > >> is needed that motherboards, CPUs, memory, harddrives and an ethernet > >> card. > >> For a low energy system, has any one explored ultra low energy > >> consuming and > >> heat generating power solutions that maybe use low wattage DC? > > > > The minimum power requirements are dictated by your choice of > > motherboard, CPU, memory, and peripherals. Period. They require > > several voltages to be delivered into standardized connectors from a > > supply capable of providing sufficient power at those voltages. Again, > > it is clear from your question that you don't understand what power is > > or the thermodynamics of supplying it, and you should work on learning > > this (where GIYF). 
As I noted in a previous reply, typical motherboard > > draws are going to be in the 100W to 300+W loaded, and either you > > provide this or the system fails to work. To provide 100W to the > > motherboard, your power supply will need to draw 20-40% more than this, > > lost in the conversion from 120 VAC or 240 VAC to the power provided to > > the motherboard and peripherals. Again, you have no choice here. > > > > The places you do have a choice are: > > > > a) Buying motherboards etc with lower power requirements. If you are > > using recycled systems, you use what you've got, but when you buy in the > > future you have some choice here. However, you need to be aware of what > > you are optimizing! One way to save power is to run at lower clock, for > > example -- there is a tradeoff between power drawn and speed. But > > slower systems just mean you draw lower power for longer, and you may > > well pay about the same for the net energy required for a computation! > > You need to optimize average draw under load times the time required to > > complete a computation, not just "power", weighted with how fast you > > want your computations to complete and your budget. > > > > b) You have a LIMITED amount of choice in power supplies. That's the > > 20-40% indicated above. A cheap power supply or one that is incorrectly > > sized relative to the load is more likely to waste a lot of power as > > heat operating at baseline and be on the high end of the power draw > > required to operate a motherboard (relatively inefficient). A more > > expensive one (correctly sized for the application) will waste less > > energy as heat providing the NECESSARY power for your system. > > > > That is, you don't have a lot of choice when getting started -- you're > > probably best off just taking the power supplies out of the tower cases > > of your existing systems and using them (or better, just using a small > > stack of towers without remounting them until you see how clustering > > works for you, which is safe AND effective). When you have done some > > more research and learned about electricity, power supplies, and so on > > using a mix of Google/web, books, and maybe a friend who works with > > electricity and is familiar with power distribution and code > > requirements (if any) in New Delhi, THEN on your SECOND pass you can > > move on to a racked cluster with custom power supplies matched to > > specific "efficient" motherboards. > > > > rgb > > Although this list is quite liberal, we should be fair to Donald Becker > and Co. in pointing out that most of the questions you have, here, are > general computer hardware/electrical questions. They are best dealt with > aside from this list. However, we'll make sure you have a good place to > start. > > Buff up on basic electronics, both theoretical and practical. Start with > a good textbook like Grob, "Basic Electronics" and something like an > electronics projects kit. We used to be able to buy them at Radio Shack. > > If you just want to cut to the chase and assemble some computers, GIYF. > Look for "build your own computer". For example, I have included one of > the first links available. It appears to be fairly thorough. > http://www.pcmech.com/byopc/ > > Once you're at the stage where you are comfortable with computer > hardware, at least, and perhaps electronics in general, then you will be > prepared to build a Beowulf. 
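To make rgb's figures quoted above concrete (the 20-40% conversion loss, and "average draw under load times the time required to complete a computation"), here is a back-of-the-envelope check; the wattages are invented purely for illustration.

    load presented by motherboard + peripherals:  200 W   (assumed)
    power supply efficiency:                      ~75%    (about a third extra at the
                                                           wall, inside the 20-40% range)
    draw at the wall      =  200 W / 0.75        ~= 267 W
    energy for a 10 h job =  267 W x 10 h        ~= 2.7 kWh

    a "low power" node pulling 150 W at the wall but needing 18 h for the
    same job uses 150 W x 18 h = 2.7 kWh -- the same energy bill, so the
    quantity to minimize is draw-under-load times time-to-solution, not
    the nameplate wattage by itself.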
We will be happy to help with any questions > at this stage, but please check google first always, as sometimes the > answer is already out there. > > -- > Geoffrey D. Jacobs > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/06b71877/attachment.html From brahmaforces at gmail.com Sun Dec 14 21:51:10 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <4945BDE9.3070806@gmail.com> Message-ID: Robert: I value life too much to want to go that way just yet! However, I do know what a conductor is; I was just trying to figure out what that metallic-looking plate in the tower case was that housed the motherboards etc. and was not a conductor...? On Mon, Dec 15, 2008 at 10:00 AM, Robert G. Brown wrote: > On Sun, 14 Dec 2008, Geoff Jacobs wrote: > > Yes, it is possible to kill yourself with low voltage. You have to >> really work at it and/or be unlucky, but it can be done. A DC resistance >> from leg to arm of 100 ohms or so is hard to achieve. Stabbing oneself >> with electrified needles, for starters. >> > > As in: > > http://www.darwinawards.com/darwin/darwin1999-50.html > > Improving the human race one incident at a time... > > There was also the joy of touching a 9V battery across the tip of your > tongue (safe enough, but more than enough to convince you that 9V can > make you go "Ow"). Or the dangers of 12V batteries in a saltwater > environment or even when it is raining and your hands are wet and your > skin is maybe a bit split when you grab one by the posts. > > As I said, do NOT mess with even "low" voltage electricity unless you > know things like what a conductor is, or you too might qualify for a > Darwin! And we'd hate that...:-) > > rgb > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu > > > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/bb3c3648/attachment.html From brahmaforces at gmail.com Sun Dec 14 21:57:02 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <4945BDE9.3070806@gmail.com> Message-ID: Hello all: Am following Robert's advice, and to make a long road shorter before jumping onto the learning curve of spilling the computers out of their cases (which I will now do in a second go-around), I am getting ready to install the operating systems on each of the 3 computers and 1 laptop (should I even include the laptop?). Currently I am using the latest version of Mandriva Linux. However, this is a full end-user install and comes with a lot of stuff. I assume I need a simpler install, minus all the extraneous stuff, on the nodes. Any recommendations, or should I just install the latest version of Mandriva on all of them? Am researching an 8-port switch. Any experience regarding which brands or models might be best?
On Mon, Dec 15, 2008 at 11:21 AM, arjuna wrote: > Robert: > > I value life too much to want to go that way just yet! However I do know > what a conductor is, i was just trying to figure out what that metallic > looking plate in the tower case was that housed the motherboards etc and was > not a conductor...? > > > On Mon, Dec 15, 2008 at 10:00 AM, Robert G. Brown wrote: > >> On Sun, 14 Dec 2008, Geoff Jacobs wrote: >> >> Yes, it is possible to kill yourself with low voltage. You have to >>> really work at it and/or be unlucky, but it can be done. A DC resistance >>> from leg to arm of 100 ohms or so is hard to achieve. Stabbing oneself >>> with electrified needles, for starters. >>> >> >> As in: >> >> http://www.darwinawards.com/darwin/darwin1999-50.html >> >> Improving the human race one incident at a time... >> >> There was also the joy of touching a 9V battery across the tip of your >> tongue (safe enough, but more than enough to convince you that 9V can >> make you go "Ow"). Or the dangers of 12V batteries in a saltwater >> environment or even when it is raining and your hands are wet and your >> skin is maybe a bit split when you grab one by the posts. >> >> As I said, do NOT mess with even "low" voltage electricity unless you >> know things like what a conductor is, or you too might qualify for a >> Darwin! And we'd hate that...:-) >> >> rgb >> >> >> Robert G. Brown http://www.phy.duke.edu/~rgb/ >> Duke University Dept. of Physics, Box 90305 >> Durham, N.C. 27708-0305 >> Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu >> >> >> > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/113085ee/attachment.html From brahmaforces at gmail.com Sun Dec 14 22:59:11 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <3b6093ef0812130810m69c16e89t5fce87f06e0fb7f4@mail.gmail.com> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <3b6093ef0812130810m69c16e89t5fce87f06e0fb7f4@mail.gmail.com> Message-ID: Peter: Thats amazing and insprational: Vincent, this would address your concern about beowulf for animation rendering and also cost... On Sat, Dec 13, 2008 at 9:40 PM, Peter Brady wrote: > Hello, > > If you're interested in do it yourself clustering for rendering > purposes you may want to check this out: http://helmer.sfe.se/. > > On Fri, Dec 12, 2008 at 10:47 AM, Dmitri Chubarov > wrote: > > Hello, > > > > my first reply missed the list by mistake so I will repeat a few points > that > > I mentioned there. > > > >> > >> What is 1u? > >> > >> What is a blade system? > > > > Compute clusters are often built of rack-server hardware meaning boxes > > different from desktop boxes and chipset that have features not necessary > > for desktop PCs like ECC memory, redundant power supply units, integrated > > management processors, RAID controllers, that altogether provide better > > reliability, since the failure rate for a cluster of a 100 nodes is 100 > > times higher than for a single node. You may not need any of it for a 16 > > node rendering farm. > > > >> Anyone using clusters for animation on this list? > > > > We are just writing up on a research project on distributed rendering. 
> > Rendering is the part in the animation process that requires the most > > processing power. We used 3dStudio Max (3DS) for modelling and V-Ray for > > rendering. 3DS has its own utility, called Backburner, for distributing > > frames among a number of cluster nodes. > > > > We observed that V-Ray failed on some certain frames thus stopping the > whole > > rendering queue, therefore the process was not completely automated. > > > > I would also repeat that a storage subsystem that uses an array of disks > is > > essential for performance. > >> > >> At this time I am trying to figure out the racks. Am meeting the > hardware > >> guy on Saturday and we were thinking of opening up the PCS i have lying > >> around and taking measurements of how the mother boards fit into the > >> cases,with the intention of creating a rack from scratch. Any ideas of > what > >> goes into a good rack in terms of size and matieral (assuming it has to > be > >> insulated) > > > > This sort of rack is more of a research project. On the contrary, the > usual > > kind of rack is an IEC standard server rack, > > http://en.wikipedia.org/wiki/19_inch_rack > > > >> > >> Am planning to run animation software (like blender) on it. Since > >> animation software requires large processing power i am assuming they > have > >> already worked on parrallelizing the code... > > > > > > Blender does not seem to have a driver to distribute rendering (I might > be > > wrong) but it can generate PovRay scripts and povray can make use of > > parallel processing in a number of ways. > > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/72875fc6/attachment.html From brahmaforces at gmail.com Sun Dec 14 23:22:38 2008 From: brahmaforces at gmail.com (arjuna) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <3b6093ef0812130810m69c16e89t5fce87f06e0fb7f4@mail.gmail.com> Message-ID: The Helmer used Plexi glass rather than aluminum to sit the mother boards on. Plexiglass is both cheap, and i would think, insulated and firesafe... On Mon, Dec 15, 2008 at 12:29 PM, arjuna wrote: > Peter: > > Thats amazing and insprational: > > Vincent, this would address your concern about beowulf for animation > rendering and also cost... > > On Sat, Dec 13, 2008 at 9:40 PM, Peter Brady wrote: > >> Hello, >> >> If you're interested in do it yourself clustering for rendering >> purposes you may want to check this out: http://helmer.sfe.se/. >> >> On Fri, Dec 12, 2008 at 10:47 AM, Dmitri Chubarov >> wrote: >> > Hello, >> > >> > my first reply missed the list by mistake so I will repeat a few points >> that >> > I mentioned there. >> > >> >> >> >> What is 1u? >> >> >> >> What is a blade system? 
>> > >> > Compute clusters are often built of rack-server hardware meaning boxes >> > different from desktop boxes and chipset that have features not >> necessary >> > for desktop PCs like ECC memory, redundant power supply units, >> integrated >> > management processors, RAID controllers, that altogether provide better >> > reliability, since the failure rate for a cluster of a 100 nodes is 100 >> > times higher than for a single node. You may not need any of it for a 16 >> > node rendering farm. >> > >> >> Anyone using clusters for animation on this list? >> > >> > We are just writing up on a research project on distributed rendering. >> > Rendering is the part in the animation process that requires the most >> > processing power. We used 3dStudio Max (3DS) for modelling and V-Ray for >> > rendering. 3DS has its own utility, called Backburner, for distributing >> > frames among a number of cluster nodes. >> > >> > We observed that V-Ray failed on some certain frames thus stopping the >> whole >> > rendering queue, therefore the process was not completely automated. >> > >> > I would also repeat that a storage subsystem that uses an array of disks >> is >> > essential for performance. >> >> >> >> At this time I am trying to figure out the racks. Am meeting the >> hardware >> >> guy on Saturday and we were thinking of opening up the PCS i have lying >> >> around and taking measurements of how the mother boards fit into the >> >> cases,with the intention of creating a rack from scratch. Any ideas of >> what >> >> goes into a good rack in terms of size and matieral (assuming it has to >> be >> >> insulated) >> > >> > This sort of rack is more of a research project. On the contrary, the >> usual >> > kind of rack is an IEC standard server rack, >> > http://en.wikipedia.org/wiki/19_inch_rack >> > >> >> >> >> Am planning to run animation software (like blender) on it. Since >> >> animation software requires large processing power i am assuming they >> have >> >> already worked on parrallelizing the code... >> > >> > >> > Blender does not seem to have a driver to distribute rendering (I might >> be >> > wrong) but it can generate PovRay scripts and povray can make use of >> > parallel processing in a number of ways. >> > >> > >> > >> > _______________________________________________ >> > Beowulf mailing list, Beowulf@beowulf.org >> > To change your subscription (digest mode or unsubscribe) visit >> > http://www.beowulf.org/mailman/listinfo/beowulf >> > >> > >> > > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > -- Best regards, arjuna http://www.brahmaforces.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/5fe1430d/attachment.html From rgb at phy.duke.edu Mon Dec 15 13:19:29 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <20081215210442.GB3929@galactic.demon.co.uk> References: <9f8092cc0812110131g28d1103cta38d9aa5fe880326@mail.gmail.com> <4945BDE9.3070806@gmail.com> <20081215210442.GB3929@galactic.demon.co.uk> Message-ID: On Mon, 15 Dec 2008, Andrew M.A. Cater wrote: > 12V will weld a wedding ring to pretty much anything - or produce an > effective amputation with ready made diathermy to seal up the wound :( Yeah, but it's wussy. I learned about electricity at age 2 by putting a bobby pin into a 120VAC socket with my bare fingers. 
120V will turn a hairpin white hot and vaporize/burn it faster than a fuse can blow! Oooooowwwwwooooooch. rgb (Who has narrowly dodged, lessee, one, two, three, maybe four Darwins of his own over the years... until I learned to be properly scared:-) Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From dkondo at lri.fr Sun Dec 7 12:09:45 2008 From: dkondo at lri.fr (Derrick Kondo) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] CFP for special issue of Journal of Grid Computing on desktop grids and volunteer computing In-Reply-To: <60ec14620812040027s39571a37yab10f02ed1f8aba7@mail.gmail.com> References: <60ec14620812020432o420fdab1nc0b0f9bd58b43d85@mail.gmail.com> <60ec14620812020447p490e2c2dy1614008ae9ceb4c2@mail.gmail.com> <60ec14620812040027s39571a37yab10f02ed1f8aba7@mail.gmail.com> Message-ID: <60ec14620812071209r7974c41dk74a5e85dc48894d1@mail.gmail.com> CALL FOR PAPERS Journal of Grid Computing Special issue on desktop grids and volunteer computing Submission deadline: January 31, 2009 Web site: http://mescal.imag.fr/membres/derrick.kondo/cfp_jogc.htm ------------------------------------------------------------------------------ The Journal of Grid Computing will publish a special issue on Volunteer Computing and Desktop Grids. Desktop grids and volunteer computing systems (DGVCS's) utilize the free resources available in Intranet or Internet environments for supporting large-scale computation and storage. For over a decade, DGVCS's have been one of the largest and most powerful distributed computing systems in the world, offering a high return on investment for applications from a wide range of scientific domains (including computational biology, climate prediction, and high-energy physics). Recently, the FOLDING@home project broke the PetaFLOPS barrier with 41,145 Sony PS3 participants. While DGVCS's sustain up to PetaFLOPS of computing power from hundreds of thousands to millions of resources, fully leveraging the platform's computational power is still a major challenge because of the immense scale, high volatility, and extreme heterogeneity of such systems. The purpose of this special issue is to focus on recent advances in the development of scalable, fault-tolerant, and secure DGVCS's. As such, we invite submissions on DGVCS topics including (but not limited to) the following: * DGVCS middleware and software infrastructure (including management), with emphasis on virtual machines * incorporation of DGVCS's with Grid infrastructures * DGVCS programming environments and models * modeling, simulation, and emulation of large-scale, volatile environments * resource management and scheduling * resource measurement and characterization * novel DGVCS applications * data management (strategies, protocols, storage) * security on DGVCS's (reputation systems, result verification) * fault-tolerance on shared, volatile resources Submission Deadline 31 January 2009 Early submission encouraged Special Issue Editors Derrick Kondo INRIA, France derrick.kondo :: inria.fr Ad Emmen The Netherlands emmen :: genias.nl Editors-in-Chief Peter Kacsuk Ian Foster Submission Details Manuscripts formatted along the guidelines for authors of the Journal of Grid Computing must be submitted online by 31 January 2009. 
From gdjacobs at gmail.com Mon Dec 15 13:33:23 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: Message-ID: <4946CD23.6030302@gmail.com> Lux, James P wrote: > \ True, the power >>> supplied to the motherboards is in several voltages 12V and under, but >>> believe it or not you can kill yourself with 12V, and starting a fire >>> with 12V is even easier. >> You can actually do a good job of TIG welding with 12V. > > Hmm.. I think not. The voltage across the arc while welding might be 12V, > but the open circuit voltage will be higher, and almost all TIG rigs use a > HF/HV circuit to stabilize/start the arc (basically a small tesla coil).. You can do scratch start TIG with just DC. Pipeline welders do this all the time for the root pass. Good quality TIG supplies can output a square wave, so you've got continuous HF, but this is mostly used for aluminum (which is a common application, as you know). > I used to work in a special effects shop (physical effects, not computer > generated), and we had lots of welders around. The classic challenge for a > good TIG welder is to weld aluminum foil (or gum wrappers).. You graduate to foil by doing lots of coke cans. > The owner used to maintain that for offroading, one needed two batteries, > not as a spare, but because you'd need 24V to do decent stick welding (e.g. > To fix a broken frame or suspension component in the field) with a > coathanger and jumper cables. We called him on this stunt.. Got two car > batteries and jumper cables out and handed him the coat hangers.. Many off roaders have a second, modified alternator in their Jeep for this purpose. Holding an arc with no flux is a testament to the guy's skill, but I still wouldn't trust the welds. They're likely to be very, very porous. They didn't create all those rod formulations for fun. > You can do it, but it isn't pretty, nor what one might be proud of. But > with 12V.. No way.. > > Granted, TIG has a shielding gas, which the coathangers and batteries do > not, but I'm pretty sure you'd have a tough time striking the arc and > holding it stable with just 12V. Once you get a puddle going and the > electrode is hot, thermionic emission helps, but.... Never said it was ideal, or even easy. TIG makes it much easier to maintain the arc gap, and the current required is much, much lower. Try to do SMAW welding with the power levels a TIG supply uses using 1/8" rod and you'll get nothing but bubble gum. > On the other hand, 12V is just fine (if not overkill) for resistance or spot > welding. A microwave oven transformer with a single turn secondary of > copper tubing works great for that. > >> Commodity ATX power supplies will have a grounded frame. Mounting the >> power supply to the pan will work quite well. > > Except I use foam double stick tape, which is an insulator, so you need > another wire to ground it. I guess you could screw the PS to the baking > sheet, but that takes more time to drill holes, etc. One sheet metal screw > for grounding vs 4 screws in just the right pattern.. One is easy hackery vs > the other is precision machining. Just gang drill a single hole for grounding. That way, precision isn't really an issue, but you've got not-bad continuity. 
One site that I found a couple of years ago when I was debating building my own TIG unit was this: http://www3.telus.net/public/a5a26316/TIG_Welder.html Convert a standard 225 Amp AC "buzz box" to a good quality DC or square wave output, suitable for any metal which can be TIG welded. Very neat project. How does this relate to Beowulf? How else are you going to build your own telco rack? -- Geoffrey D. Jacobs From hearnsj at googlemail.com Mon Dec 15 13:39:41 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] MPICH upgrade question In-Reply-To: <49456932.1020208@comcast.net> References: <49456932.1020208@comcast.net> Message-ID: <9f8092cc0812151339t34aadab8j7c8427cad74482c3@mail.gmail.com> 2008/12/14 Gregg Germain > Hi all, > > This is a little more in the way of a systems question: > > I'm presently running a cluster with mpich2-1.0.6p1 installed in > /usr/local. > > I want to upgrade to mpich2-1.0.8. > > Should I run the 1.0.6 Make uninstall first? > mv /usr/local/mpich2 /usr/local/mpich2-1.0.6p1 Install version 1.0.8 A smart thing to do is to install version 1.0.8 to /usr/local/mpich2-1.0.8 then create a link from /usr/local/mpich2 to point to this. Then, if you put /usr/local/mpich2 {bin/lib} into your path as a user you will always be up to date. At this point, the clustering gods are laughing at me. An even smarter thing to do is to use modules modules.sourceforge.net There will be a modules RPM available for your distribution Short answer - no need to uninstall. Just move it to one side. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081215/897e413d/attachment.html From gdjacobs at gmail.com Mon Dec 15 13:42:40 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] MPICH upgrade question In-Reply-To: <49456932.1020208@comcast.net> References: <49456932.1020208@comcast.net> Message-ID: <4946CF50.6050002@gmail.com> Gregg Germain wrote: > Hi all, > > This is a little more in the way of a systems question: > > I'm presently running a cluster with mpich2-1.0.6p1 installed in > /usr/local. > > I want to upgrade to mpich2-1.0.8. > > Should I run the 1.0.6 Make uninstall first? > > Or just install 1.0.8 on top of 1.0.6? > > Or is there some other cleanup method I should use to eliminate 1.0.6 > first? > > thanks! > > Gregg I always deploy MPICH in a different directory for each library version/compiler combination. I then use a script so one can switch to the appropriate build wrapper easily. When replacing the old version outright, you should remove the old and install the new. -- Geoffrey D. Jacobs From gdjacobs at gmail.com Mon Dec 15 13:48:53 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <4946CD23.6030302@gmail.com> References: <4946CD23.6030302@gmail.com> Message-ID: <4946D0C5.8070705@gmail.com> > One site that I found a couple of years ago when I was debating building > my own TIG unit was this: > http://www3.telus.net/public/a5a26316/TIG_Welder.html > > Convert a standard 225 Amp AC "buzz box" to a good quality DC or square > wave output, suitable for any metal which can be TIG welded. Very neat > project. > > How does this relate to Beowulf? How else are you going to build your > own telco rack? Alas, my recollection was flawed. 
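To make the move-aside / versioned-install advice in the MPICH upgrade thread above concrete, here is a minimal sketch of the approach John and Geoff describe. All paths, version numbers and the module name are examples only, and the modulefile is assumed to have been written separately:

    # keep the old tree around -- nothing needs to be uninstalled
    mv /usr/local/mpich2 /usr/local/mpich2-1.0.6p1

    # build and install the new release into its own versioned prefix
    cd mpich2-1.0.8
    ./configure --prefix=/usr/local/mpich2-1.0.8 && make && make install

    # point the generic path at the new version; rolling back later is
    # just a matter of re-pointing the link (ln -sfn)
    ln -s /usr/local/mpich2-1.0.8 /usr/local/mpich2

    # users who reference the generic path stay "always up to date"
    export PATH=/usr/local/mpich2/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/mpich2/lib:$LD_LIBRARY_PATH

    # or, with environment modules (modules.sourceforge.net) and a
    # suitable modulefile in place, users pick a version explicitly:
    module avail mpich2
    module load mpich2/1.0.8

Whether the symlinked default or an explicit opt-in per user is preferable is a site policy question, as the rest of this thread discusses.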
This particular box simply has a built in HF start function (similar to what used to be optional with the classic Dialarc 250). It cannot generate a square wave output. -- Geoffrey D. Jacobs From csamuel at vpac.org Mon Dec 15 14:10:10 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] MPICH upgrade question In-Reply-To: <4946CF50.6050002@gmail.com> Message-ID: <369269424.328151229379010380.JavaMail.root@mail.vpac.org> ----- "Geoff Jacobs" wrote: > I always deploy MPICH in a different directory for each library > version/compiler combination. I then use a script so one can switch > to the appropriate build wrapper easily. When replacing the old version > outright, you should remove the old and install the new. We do this, modulo using Modules to control it rather than a home grown script. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From gdjacobs at gmail.com Mon Dec 15 14:26:11 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] MPICH upgrade question In-Reply-To: <369269424.328151229379010380.JavaMail.root@mail.vpac.org> References: <369269424.328151229379010380.JavaMail.root@mail.vpac.org> Message-ID: <4946D983.2090905@gmail.com> Chris Samuel wrote: > ----- "Geoff Jacobs" wrote: > >> I always deploy MPICH in a different directory for each library >> version/compiler combination. I then use a script so one can switch >> to the appropriate build wrapper easily. When replacing the old version >> outright, you should remove the old and install the new. > > We do this, modulo using Modules to control it rather than > a home grown script. Yes, I saw the previous post. If/when I'm to do things over... Well, actually, that would have me diving into Perceus, etc. -- Geoffrey D. Jacobs From diep at xs4all.nl Mon Dec 15 16:56:53 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> Message-ID: <5D1C0A92-A5F7-42F7-9265-283C70831D56@xs4all.nl> On Dec 15, 2008, at 6:07 AM, arjuna wrote: > Hi Vincent: > > I am a believer in beginners mind. Some of the things that i am > doing like indian classical music, classical drawing, python > programming are all things i got inspired to do and dived right > into. And the inspiration is usually right on and after several > years of doing it, one achieve a masterful level. This does entail > a lot of passion, inspiration and the courage to go ahead and do it. > "My education and training at art academy was one big course in psychology" Maud Veraar (marketing manager highq.nl) After having negotiated with a couple of hundreds of graphics designers past 10 years, i definitely recognize in the start paragraph you write here that you also have strong feelings in that direction that you admire people who have that social gift that these creative professionals all have in such abundance. Basically that means that a social part of the brain has developed itself very well where most on this list need more than a paid mission to Mars to discover a part of it. Wanna bet they gonna find courageous volunteers for that single ticket to Mars? 
> I have been bitten by the animation bug, and this has something to > do with the fact that i am currently deeply immersed in a several > year training in the classical drawing for classical realist > painters. Animators study drawing, but probably not in this > depth. There is a huge synergy as you would know between drawing > and animation which requires good visual mental faculties that are > developed by drawing. Already visited Amsterdam? The best painters from what we call 'the golden age', Netherlands was world power number 1 for a year or 100 a year or 300 ago and the huge money that this produced the dutch used especially to let the best painters on the planet make paintings of a quality that todays artists do not have (as in todays society a lot of additional factors determine the success of artists rather than just their work - nowadays you also need to be verbal strong and sell your product to the audience, maybe that reduced the level of todays art quite a bit. The best museums on the planet to see this brilliant type of realist painting is in the National Museum (and next to it at 100 meters away is the Van Gogh Museum) located in Amsterdam. This is worlds best and a really LARGE collection of paintings. In fact the huge museum just shows a very tiny part of all the paintings they posses. Because of the beauty of those paintings donors buy slowly at auctions paintings to get them back where they were made. Some paintings get sold for tens of millions of euro's. A price really high if you consider the tens of thousands of paintings made in those centuries that survived until today. > > So having become inspired by the animation, i have been researching > it avidly. As an artist I am interested not in making short fun > clips but eventually entire movies. I realize this is a large > undertaking, but thats what i do...So is vocal indian classical > music, kung fu and classical drawing. > > In researching Pixar which makes these kind of movies, i found they > use farms of machines for rendering. Therefore the idea Yeah all those art academy guys are worlds best marketing managers, specialized in selfpromotion. I lack that gift, as Greg Lindahl will happily confirm for you here, though he never commented on this list what he finds from president-elect Obama's energy speech that is at his website already for a while. A small problem for artists is of course that if 1 out of a 10k artists is one of the better art artists on planet earth, which regrettably means that each year 9999 ex-students need a job as they lack the talents of a Rembrandt van Rijn, or a Vincent van Gogh. I'm quite sure Pixar bought that hardware to provide that huge base of designers some computing power to produce what they need. Where would Pixar get the money to pay for so many people who basically would be jobless based upon skillset? Making animations is not so easy as most people guess. Very few are good at it. Most complicated is designing the characters. I remember that a very experienced designer who had made a female character, which took him quite long to make, as it had like tens of thousands of rectangles, that i asked him the question: "What is the biggest difference between a human being and that ape that you drew?". After i commented that the forehead of a human being is further forward than from an ape, he agreed with me and corrected that. Then it looked like a human. The GUI programmer then toyed for a week with that character to get the boobs bigger ;) After that it looked like something. 
I quote this because this designer already is one of the better designers. Designing human beings, especially the head, it takes a designer up to a month to create. All he (amazingly very few 'she's are designing 3d graphics) needs during all that time is a real cheapo PC. There is plenty of system time available to create the animations. Designing the graphics is what eats all time and most money. Note i hear the word '3d studio max' just too much. Yeah sure, me as big layman designer also toyed a year or 10 ago in 3d studio max and i still would do it. For big movies and most big game companies other (a bit older) products get used (lightwave, maya etc). 3d studio max is basically popular at art academies (maybe because of licensing). That will change slowly of course. You cannot convert animations easily from 1 product to another. Really a big problem that is, as a designer trained in product A will just not manage (in 99% of the cases) to work with another software product. > was planted in my brain, and being a good computer programmer I > decided to ride the learning curve(and i know how to ride these > wild horses given the amount of learning required in the classical > arts and the time and patience it takes) and make me a beowulf > which is the cheapest fastest machine. > In this mailing list the word 'good computer programmer' means basically that you don't need a cluster at all. All you need in that case is a Tesla card which has a price of say 1300 dollar or so at newegg and render with your own 3d engine the animations. Yeah that engine is better than what lightwave has (not surprising). Especially for light effects these are tricky to get right. That's what we did do (not in cuda, simply at the pc). A single core P4 can render a second or 10 of animations of an entire scene within a few minutes. That's a 3d engine in C/C++ code, not even using SSE2 assembler. Let alone using 240 cores (as everything is single precision float anyway) of a graphics card. That last would really kick butt. A programmer can port such code to such hardware platform. A single GPU is so so much faster for this. Also it has enough ram. When a few parts of animation rendered, you can stream that from card to the PC where the pc processor can compress it to some sort of lossy high quality mp4 format and put it on the disk. Not a single cluster with quadcores can compete against this. Especially not against its price. > Made from commodity components it would be cheaper than high end > machines (hence the relevance to this list in case you are > wondering) it would eventually give me the power (having ridden the > curve over time, not tomorow) to do the type of rendering that the > big boys are doing. > If Pixar is using render farms then i assume that is a good well > thought out way to go for eventually creating full length animated > movies. Also i am interested in special effects. > > The whole logic of the beowulf is to use cheap commodity machines > to make super computers(to put is a bit simplistically) So > why would one use an expensive stand alone machine with eventually > limited capacity? If it were possible to make full length, cutting > edge animated movies using stand alone machines then the boys who > actually make them would be using them for their productions right? > With a simple box you can already produce an entire movie handsdown within a few hours. 
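The compress-to-MP4 step Vincent mentions (streaming rendered frames back to the host and encoding them there) can be done with stock tools. A minimal sketch using ffmpeg, assuming it was built with x264 support; the directory, filename pattern and frame rate below are placeholders only:

    # encode a numbered sequence of rendered frames into an H.264 MP4 at 25 fps
    ffmpeg -f image2 -r 25 -i frames/frame_%04d.png -vcodec libx264 rendered.mp4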
If you want to do more than being sysadmin of the system, namely be a programmer, you might want to do the rendering at some sort of Tesla type hardware. That's a tad of programming of course, but it goes completely realtime in such case. If it is ok for pilots in training that manner, it must be ok for you as well. A big cluster with expensive network for the i/o is going to be very expensive compared to the box that can do the job. > > Hello Arjuna, > > I'm a bit interested you mention this. When i negotiate about > animations, which sometimes already were edits from major movies, > the hardware used is nothing more than a simple PC to produce it. > Animation Designers work for small amounts of money, > so paying for a lot of hardware, more than some fast PC or fast > macintosh, with 2 very good big TFT's (apple offers some fantastic > dual set of TFT's - though quite expensive for a design studio maybe). > > It sure takes a couple of minutes to render animations in high > resolutions, yet i'm quite amazed you need more hardware for this, > yes even a cluster. > > Isn't some 16 core Shanghai box with a lot of RAM already total > overpower for this? > > Can you explain where you need all that cpu power for? > > Best Regards, > Vincent > > Also any thoughts on racks versus piles of PCS. > > A lot of the posts on the internet are old and out of date. I am > wondering what the upto date trends are in racking commodity > computers to create beowulf clusters. What should i be reading? > > -- > Best regards, > arjuna > http://www.brahmaforces.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > -- > Best regards, > arjuna > http://www.brahmaforces.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From kilian.cavalotti.work at gmail.com Tue Dec 16 01:36:44 2008 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] MPICH upgrade question In-Reply-To: <9f8092cc0812151339t34aadab8j7c8427cad74482c3@mail.gmail.com> References: <49456932.1020208@comcast.net> <9f8092cc0812151339t34aadab8j7c8427cad74482c3@mail.gmail.com> Message-ID: <200812161036.45504.kilian.cavalotti.work@gmail.com> Hi John, On Monday 15 December 2008 22:39:41 John Hearns wrote: > A smart thing to do is to install version 1.0.8 to /usr/local/mpich2-1.0.8 > then create a link from /usr/local/mpich2 to point to this. > Then, if you put /usr/local/mpich2 {bin/lib} into your path as a user you > will always be up to date. I've been doing this for some time, but at some point, it raised some issues. Modifying the default version of a MPI lib is not necessarily something average users really like. From my experience (insert the mandatory YMMV here), most users don't really care about the MPI implementation/version their program uses, as long as it runs and performs reasonably. I too used to install libraries in their own dirs and link the default path to the latest version. But I had my share of "Did you change anything? My job doesn't run anymore, and it was working perfectly fine last week." 
So now, I've adopted the über-conservative approach, which is to still install new versions of libraries in their own dirs, but let the user decide if she wants to use them or not: I don't link the default path anymore, so that people having used MPICH 1.0.6 for months will still continue to use it (assuming they know they use MPICH 1.0.6...) without having to modify anything in their scripts. And at the same time, people wanting to use a more recent version are free to do so. But that has to be a voluntary move. Once again, this is probably pretty dependent on your audience. > At this point, the clustering gods are laughing at me. > An even smarter thing to do is to use modules Yep, that's definitely easier to manage from the user standpoint. For each new version, a new module file, the appropriate announcement in the media channel used to communicate with users, and everybody's happy. :) Cheers, -- Kilian From hahn at mcmaster.ca Tue Dec 16 14:03:38 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: References: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> Message-ID: > super computers(to put is a bit simplistically) So why would one use an > expensive stand alone machine with eventually limited capacity? If it were right - you can't scale a single box much, or cheaply. but there are scalability issues for clusters, too. even ignoring that, there are lots of TCO-related reasons to prefer a single machine over a cluster, if you can. From landman at scalableinformatics.com Wed Dec 17 07:51:00 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] GPU-HMMer for interested people In-Reply-To: <49491D8C.60809@csc.fi> References: <49448743.7000605@scalableinformatics.com> <49491D8C.60809@csc.fi> Message-ID: <49491FE4.4050402@scalableinformatics.com> Hi Olli-Pekka Olli-Pekka Lehto wrote: > Joe Landman wrote: >> Hi folks >> >> GPU-HMMer (part of the MPI-HMMer effort) has just been >> announced/released at http://www.mpihmmer.org >> >> MPI-HMMer has itself been improved with parallel-IO and better >> scalability features. JP has measured some large number (about 180x) >> over single cores on a cluster for the MPI run. >> >> Enjoy! >> >> Joe >> > > Hi Joe, > > Looks quite promising. Here are results from a simple real-world test case: > > GPU: Dual GTX280, each with 1GB RAM > CPU: Single Intel Core2 quad Q9550 2.83GHz > > hmmsearch 4 threads sorted: 274.49s > hmmsearch 4 threads unsorted: 254.23s > cuda_hmmsearch unsorted 407.85s > cuda_hmmsearch sorted: 62.69s > cuda_hmmsearch sorted 2 simultaneous runs: 78.23s 80.79s > > Remarks: > > -Running hmmsort to sort the sequence database is critical to obtain > reasonable performance from cuda_hmmsearch. However, the regular > hmmsearch is slightly slower with the sorted database. > > -Running two simultaneous runs assigned to different GPUs on a dual-GPU > quad-core system yields some performance penalty, but is still quite > feasible. > > -I used the parameters THREADSIZE=320 BLOCKSIZE=64. I'm not completely > sure if these are the optimum values for GTX280. Any better suggestions? I'd suggest subscribing/posting to the mpihmmer list (http://lists.scalableinformatics.com/mailman/listinfo/mpihmmer) so we don't clutter beowulf with this. JP could likely tell you more about the appropriate sizing for GTX280.
We have a pair of GTX260's in the lab that some runs were done on. Could you describe your run a bit more (sequence queries, hmm size, database and size)? On the mpihmmer list though ... Thanks! Joe > > Regards, > Olli-Pekka -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From daniel.kidger at quadrics.com Wed Dec 17 06:08:24 2008 From: daniel.kidger at quadrics.com (Daniel Kidger) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Inside Tsubame - the Nvidia GPU supercomputer - OpenCL In-Reply-To: <494252B4.7040106@cc.in2p3.fr> References: <20081212075655.GK11544@leitl.org> <49424594.1010709@univ-lemans.fr> <494252B4.7040106@cc.in2p3.fr> Message-ID: <1229522904.9580.29.camel@quadl007> Tsubame isn't just about delivering flops to production codes. It is trying to spearhead coprocessing. The people there have been porting codes to ClearSpeed / GPUs for a while - and hence been publishing their experiences. Daniel On Fri, 2008-12-12 at 12:01 +0000, Loic Tortay wrote: > Florent Calvayrac-Castaing wrote: > [...] > > > > Interesting. > > > > I understand why, when I submitted a joint exploratory project > > about GPU computing two years ago with a Japanese > > colleague we were ranked first in Japan and last in France ; the > > idea seems more popular in Japan if they can fork millions > > on an architecture it is not very quick to program for (at least > > maybe not as fast as Moore's law is increasing power). > > > They may be willing to spend millions because they already have programs > able to use the GPUs. > > If I'm not mistaken, the "Tsubame" cluster was initially using > Clearspeed accelerators (in Sun X4600 "fat" nodes). > > Therefore, they probably have appropriate programs that need little > adaptation (or less than many) to work on the GPUs. > > > Lo?c. > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From oplehto at csc.fi Wed Dec 17 07:41:00 2008 From: oplehto at csc.fi (Olli-Pekka Lehto) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] GPU-HMMer for interested people In-Reply-To: <49448743.7000605@scalableinformatics.com> References: <49448743.7000605@scalableinformatics.com> Message-ID: <49491D8C.60809@csc.fi> Joe Landman wrote: > Hi folks > > GPU-HMMer (part of the MPI-HMMer effort) has just been > announced/released at http://www.mpihmmer.org > > MPI-HMMer has itself been improved with parallel-IO and better > scalability features. JP has measured some large number (about 180x) > over single cores on a cluster for the MPI run. > > Enjoy! > > Joe > Hi Joe, Looks quite promising. Here are results from a simple real-world test case: GPU: Dual GTX280, each with 1GB RAM CPU: Single Intel Core2 quad Q9550 2.83GHz hmmsearch 4 threads sorted: 274.49s hmmsearch 4 threads unsorted: 254.23s cuda_hmmsearch unsorted 407.85s cuda_hmmsearch sorted: 62.69s cuda_hmmsearch sorted 2 simultaneous runs: 78.23s 80.79s Remarks: -Running hmmsort to sort the sequence database is critical to obtain reasonable performance from cuda_hmmsearch. However, the regular hmmsearch is slightly slower with the sorted database. 
-Running two simultaneous runs assigned to different GPUs on a dual-GPU quad-core system yields some performance penalty, but is still quite feasible. -I used the parameters THREADSIZE=320 BLOCKSIZE=64. I'm not completely sure if these are the optimum values for GTX280. Any better suggestions? Regards, Olli-Pekka -- Olli-Pekka Lehto, Systems Specialist, Special computing, CSC PO Box 405 02101 Espoo, Finland; tel +358 9 4572215, fax +358 9 4572302 CSC is the Finnish IT Center for Science, www.csc.fi, e-mail: Olli-Pekka.Lehto@csc.fi From niftyompi at niftyegg.com Wed Dec 17 16:31:24 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? In-Reply-To: <20081212153526.GA31871@anuurn.compact> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> <4942726F.1050705@gmail.com> <20081212153526.GA31871@anuurn.compact> Message-ID: <20081218003124.GA3292@compegg.wr.niftyegg.com> On Fri, Dec 12, 2008 at 04:35:26PM +0100, Peter Jakobi wrote: > > > > Reliability is another question and I posted a quick response to > > > this list in a different email. > > > > This being my big concern with flash. > > related is this topic on SSD / flashes: > > what's the life time when changing the same file frequently? > aka "mapping block writes to cell erases" > aka "how many erases are possible?" > > In the days of yore, that was the limitation on using flash, as > writing a block to the same physical location on the flash (for some > to be defined sense of physical location :)) requires a whole slew of > blocks (let's call it a 'cell', maybe containing a few dozens or > thousands of blocks?) to be erased and a subset of them to be written. > > Does anyone have current and uptodate info or researched this issue > already? > > > === > Some of the questions I see before checking recent kernel sources > would be: > > - is there some remapping in the hardware of the ide emulation chip > space of say compactflash or usb sticks? > > - is part of this possible in the ide-emulation in the kernel? > > - or is part of this in the filesystem, that is suddenly after a > decade or more, the fs has to cope again with frequent bad blocks, > like the old bad blocks lists of the SCSI days 2 decades past? > > [basically: is there some 'newish' balancing to limit / redistribute > the number of erases over all cells? Is there a way to relocate cells > that resist erasing, ...?] > > - can I place a filesystem containing some files that are always > rewritten on flash and use say ordinary ext2 or vfat for this? > > - might I even be able to SWAP on flash nowadays? > > - Or do I still have to do voodoo with FUSE overlays or other tricks > to reduce the number of writes leading to cell erases? Maybe check if > there's a real log-structured filesystem available, that has seen > production use outside of labs (and doesn't fail by keeping its some > of its frequently changing metadata in always in the same location). > > -- > cu > Peter > jakobi@acm.org Sun and Micron recently reported a million plus cycles for a single level flash product. Current shipping product is on the order of 100000 cycles. A spinning 54000 rpm disk could possibly hammer a single block to the current limit in about 15 min or so... in a write never read scenario... but that would never happen in reality. This appears to imply that old spinning disk oriented file system structures would quickly cause failure were it not for buffering. 
File systems like ext[23] combined with kernel data buffering might never touch a disk but once if 100000 writes were to be issued back to back to back. Swap IO will also be subject to buffering/ cache. i.e. if a page was constantly being touched it would not be pushed to physical swap media -- other pages would (about once). Laptop suspend to swap... 100000 cycles -- 5 times an hour, 8 hour work day might be 6 years. Files that never change would be a reservoir of good bits. A lazy daemon that rewrote (with a copy strategy) the oldest file one at a time would expose good bits and sequester bits with larger rw cycle history. Many complex file system optimizations for data locality could be eliminated in ext[23] code to advantage. Metadata writes to a journal would insulate the meta data of the file system itself from numerous rewrites. Spinning media has a number of strong and well tested ECC and defect management features that flash does not yet have (as far as I know). Disk controllers hide this stuff from the OS today... older unix systems had dumber controllers so this stuff could be dusted off. I suspect that the current IO and filesystem code in the Linux system is less stressful than a first blush look at flash might lead one to believe. I have recently added swap to an SD card partion on my OLPC XO. Nine dollars of SD flash based swap goes a long way on the XO with it's activity work flow model. I also build ext file systems on USB keys to keep a growing pile of pdf reference documents handy. So far so good. -- T o m M i t c h e l l Found me a new hat, now what? From niftyompi at niftyegg.com Wed Dec 17 17:05:36 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] ntpd wonky? In-Reply-To: <20081215003733.GA2394@bx9> References: <20081209180633.GA21193@bx9> <20081215003733.GA2394@bx9> Message-ID: <20081218010536.GB3292@compegg.wr.niftyegg.com> On Sun, Dec 14, 2008 at 04:37:33PM -0800, Greg Lindahl wrote: > On Thu, Dec 11, 2008 at 11:01:31PM -0800, Bernard Li wrote: > > > Have you tried other pools eg. pool.ntp.org? > > That is stable for me. So it's not me, it's Red Hat's pool that's > wonky. > > I see that CentOS switched to using ntp.org in 5.2, which I didn't > automagically get thanks to rpm creating ntpservers.rpmnew, even > though I hadn't modified the ntpservers file. Mmf. Does it make sense to have a small local pool between the local cluster and the internet? A couple of venerable 'clock[1,2,...]' boxes with a single network interface sitting on a DMZ could have a buffering effect for a thousand boxes one NTP stratum behind them. Major co-location sites should have some provisioning for quality local NTP time references as well. Given the design of pool.ntp.org it makes sense for a business to do some quality audits of ntp.org hosts and contact local high quality ones directly. -- T o m M i t c h e l l Found me a new hat, now what? From csamuel at vpac.org Wed Dec 17 20:31:34 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> Message-ID: <1607549355.453581229574694805.JavaMail.root@mail.vpac.org> ----- "Vincent Diepeveen" wrote: > It sure takes a couple of minutes to render animations in high > resolutions, yet i'm quite amazed you need more hardware for this, > yes even a cluster. 
I might be missing something with your argument, but surely if this was the case then there would no need for the *only* Australian HPC system on the Top500 to be at Animal Logic (Happy Feet) and the 4 New Zealand systems on the Top500 to all to be identical clusters at Weta Digital (Kong, LotR, etc) ? Sad that neither country can manage to get a scientific HPC system on there (for now).. :-( cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From hearnsj at googlemail.com Thu Dec 18 01:15:33 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:06 2009 Subject: [Beowulf] Newbie Question: Racks versus boxes and good rack solutions for commodity hardware In-Reply-To: <5D1C0A92-A5F7-42F7-9265-283C70831D56@xs4all.nl> References: <16A5C694-180E-47FA-BF4E-AED9FF8D8A68@xs4all.nl> <5D1C0A92-A5F7-42F7-9265-283C70831D56@xs4all.nl> Message-ID: <9f8092cc0812180115m6c42342bq8d4d80bfd70382bd@mail.gmail.com> 2008/12/16 Vincent Diepeveen > > That's what we did do (not in cuda, simply at the pc). > A single core P4 can render a second or 10 of animations of an entire scene > within a few minutes. > That's a 3d engine in C/C++ code, not even using SSE2 assembler. > > Where do you get these figures from? The rule of thumb I was used to was that a single frame of an animated feature film will take an hour to render. What happens when CPU horsepower increases is that animators use more sophisticated rendering. Let's think of the original Toy Story - the toys all happened to have flat surfaces in the main. Move on a couple of years to Monsters Inv and Shrek and you start to see the creatures having realistic moving fur - but the rendering time per frame stays about constant. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081218/792b63c8/attachment.html From laytonjb at att.net Thu Dec 18 04:42:07 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? In-Reply-To: <20081218003124.GA3292@compegg.wr.niftyegg.com> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> <4942726F.1050705@gmail.com> <20081212153526.GA31871@anuurn.compact> <20081218003124.GA3292@compegg.wr.niftyegg.com> Message-ID: <494A451F.8010900@att.net> Nifty Tom Mitchell wrote: > On Fri, Dec 12, 2008 at 04:35:26PM +0100, Peter Jakobi wrote: > > > Sun and Micron recently reported a million plus cycles for a single level flash > product. Current shipping product is on the order of 100000 cycles. > From what I understand these were cherry picked parts from a normal production run. Has anyone heard if this is a new production process or a new concept in SSDs? BTW - Samsung has reported 500K rewrites from cherry picked parts. If they are cherry picked, then it's not really a turning point in SSDs. It also bothers me that a normal production run can have parts with rewrites at 100K and 1M. Sounds like there are some variabilities that can't be controlled. Don't forget that 100K rewrites are SLC products and MLC's are general 10K. Although the Intel drives state 100K with MLC (and pretty decent performance). I'm not sure of the magic in the Intel drives though. Could be over provisioning but I don't know for sure. Anyone care to comment? 
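For a rough sense of what these cycle figures mean in service, the usual back-of-the-envelope estimate is that lifetime in days is roughly capacity times P/E cycles divided by the data written per day, assuming ideal wear levelling and ignoring write amplification. All numbers below are illustrative assumptions, not vendor specifications:

    # capacity_GB * PE_cycles / GB_written_per_day = approximate lifetime in days
    echo "16 * 100000 / 50" | bc    # 100K-cycle (SLC-class) part: 32000 days
    echo "16 * 10000 / 50" | bc     # 10K-cycle (MLC-class) part:  3200 days

Real-world figures come out lower once write amplification and uneven wear are taken into account.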
Jeff From landman at scalableinformatics.com Thu Dec 18 06:03:19 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? In-Reply-To: <494A451F.8010900@att.net> References: <20081211213857.GA7355@bx9> <494263C7.6030009@att.net> <4942726F.1050705@gmail.com> <20081212153526.GA31871@anuurn.compact> <20081218003124.GA3292@compegg.wr.niftyegg.com> <494A451F.8010900@att.net> Message-ID: <494A5827.8000700@scalableinformatics.com> Jeff Layton wrote: > Nifty Tom Mitchell wrote: >> On Fri, Dec 12, 2008 at 04:35:26PM +0100, Peter Jakobi wrote: >> >> Sun and Micron recently reported a million plus cycles for a single >> level flash >> product. Current shipping product is on the order of 100000 cycles. >> > > From what I understand these were cherry picked parts from > a normal production run. Has anyone heard if this is a new > production process or a new concept in SSDs? BTW - Samsung > has reported 500K rewrites from cherry picked parts. I thought the Samsung parts were actually a production run (low yield). I hadn't heard that they had done a cherry pick on the existing process. > > If they are cherry picked, then it's not really a turning point > in SSDs. It also bothers me that a normal production run can > have parts with rewrites at 100K and 1M. Sounds like there > are some variabilities that can't be controlled. Similar things happen with RAM. > Don't forget that 100K rewrites are SLC products and MLC's > are general 10K. Although the Intel drives state 100K with > MLC (and pretty decent performance). I'm not sure of the Hmmm ... http://www.ritekusa.com/family_ssd.asp?family_id=9&group=2&division_id=4&group_id=Business I can't find the Intel P/E cycle, but the Ridata units are 2x10^6 (2E+6). -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From james.p.lux at jpl.nasa.gov Thu Dec 18 06:39:16 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? In-Reply-To: <494A5827.8000700@scalableinformatics.com> Message-ID: >> >> If they are cherry picked, then it's not really a turning point >> in SSDs. It also bothers me that a normal production run can >> have parts with rewrites at 100K and 1M. Sounds like there >> are some variabilities that can't be controlled. > > Similar things happen with RAM. And in CPUs with respect to speed grades, for instance. It's not inherently a bad thing, as long as the selection process is well understood by the manufacturer (e.g. They obviously can't do a long term life test on them, so they're using some other non-destructive parameter or process indicator that correlates with life) > >> Don't forget that 100K rewrites are SLC products and MLC's >> are general 10K. Although the Intel drives state 100K with >> MLC (and pretty decent performance). I'm not sure of the > > Hmmm ... > http://www.ritekusa.com/family_ssd.asp?family_id=9&group=2&division_id=4&group > _id=Business > > I can't find the Intel P/E cycle, but the Ridata units are 2x10^6 (2E+6). > Is that the underlying device wearout life, or is it the apparent life at the "integrated unit"'s external interface. 
For instance, if they had a wear leveler and some smart EDAC inside an ASIC that provides the interface, and just added extra capacity to account for the life. After all, it's not like at N cycles, the device stops working. It just starts working "less well" and throwing more errors, and I'll guess (since I don't have the data here in front of me) that there's a fair amount of variability, even within a single device. Consider the testing needed to exhaustively verify the 2E6 number.. 16GB of 512 byte sectors.. That's 160E6 sectors, roughly. They don't give an "erase time" spec, but let's just say 1 millisecond to make things easy. So, to do one erase on ALL sectors takes 160,000 seconds, or about 2 days. In a mere 4 million or so days, one could actually verify the erase life. One can beat a single sector to death in 20,000 seconds or 6 hours. But, is a single sector a valid test? Nope.. You KNOW the EDAC is going to get in the way, not to mention that a single sector test doesn't address the variability across the device issue. You'd probably want to sample, oh, 100 or 1000 or so sectors of the 160 million, to get a reasonable statistical estimate. Now you're back in the days and weeks and months of testing (6000 hours is the better part of a year) regime. You could run accelerated life tests (very common for other electronics), where you run it hot. BUT... That's where the wear out model vs temperature becomes important, and I don't know that Flash is sufficiently well understood for that. Sure, for 2N2222 silicon junction transistors, accelerated life testing works, or even for a ?ium CPU.. But for a device where the basic mode of operation is tied to leakage currents and charge storage? Jim From diep at xs4all.nl Thu Dec 18 06:46:13 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Sporthall 500 considerations In-Reply-To: <1607549355.453581229574694805.JavaMail.root@mail.vpac.org> References: <1607549355.453581229574694805.JavaMail.root@mail.vpac.org> Message-ID: <8B2AEF59-7C38-467F-B641-9BBD733B60E0@xs4all.nl> Heh Chris, Thanks for dropping a line. Heh, on your own department i happen to see that in systems support you've got a collegue Andrew Underwood. Would you mind asking him whether he's family from Paul Underwood, with 3 persons (Tony Reix, Paul and me says the fool) we're searching for Wagstaffs see http://wapedia.mobi/en/Wagstaff_prime Paul lives in Glouchestershire, UK. I see the Animal Logic machine has L5420's, which are 50 watt TDP. Very interesting. Didn't know Intel had printed that many of the L5420, more common is the E5420. What amazes me is that top500 claims it delivers 10 gflop see http:// www.top500.org/system/9810 This is pretty weird, as i tend to remember it can execute handsdown 2 vector instructions a cycle (SSE2) and that times 4 cores (maybe on paper even 3 vector instructions). So that gives it a theoretic peek of 2 * 2 * 2.5Ghz * 4 = 80 Gflop. Of course scaling to 4.0 is going to be tough, most software i see scaling to 6.89 at such 8 core nodes. Any explanation for this low number of gflops that top500 claims L5420 delivers? Interesting they go for energy efficient cpu's, knowing Australia is worlds biggest exporter of coals. Can you give some explanation on how much the cost is of power in Australia for supercomputers, is it a factor 20 cheaper than it is for normal households when using it in those quantities? Oh on animations and such. 
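On the peak-rate arithmetic earlier in this message: a Harpertown-class core retires 4 double-precision flops per cycle (a 2-wide SSE add plus a 2-wide SSE multiply), so the 10 GFlop figure on the Top500 entry is most likely the per-core peak (4 x 2.5 GHz), while a quad-core L5420 socket peaks at 40 GFlop/s and a dual-socket node at 80 GFlop/s. A quick check; the flops-per-cycle figure is the standard assumption for this core, not something taken from the Top500 page:

    # flops/cycle/core * GHz * cores/socket * sockets = peak GFLOP/s
    echo "4 * 2.5" | bc             # per core            -> 10.0
    echo "4 * 2.5 * 4" | bc         # per L5420 socket    -> 40.0
    echo "4 * 2.5 * 4 * 2" | bc     # per dual-socket node -> 80.0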
Yeah i worked with designers who produce animations as i needed them in my own products. They also work for companies that need animations for big movie edits and of course the same guys create TV-commercials. You know the 20 second ads on TV probably quite well. Majority of income of the best animators is making commercials and advertisement animations. Such an ad usually takes 6 months to get created (the graphics) by a team of about 2. Some sort of dual core cpu is more than enough to render everything. Of course by now that would be 2 Q6600's, one for each designer, simply because they have a cost of near nothing. Very interesting that a nation (australia) where a part of the population just receives 2 TV channels has such a big presence of Animal Logic. Maybe it is the usual salesreason. The psychology of technology really is a convincing argument sometimes to get deals for a huge price. Producer from Animal Logics enters office from a big multinational. Director from multinational: "i need a TV commercial and i need it within a month for product X". Animal Logic Producer: "we can garantuee that our huge supercomputer at animal logics can produce all what you ask for on time". Dang there you go as a 2 person company, job goes once again to a big company, even though you would've done it in 14 days and maybe at a tenth of the budget. Vincent p.s. considering it is so power efficient that cluster from animal logics, maybe they can sell some system time to interested parties, maybe google australia? On Dec 18, 2008, at 5:31 AM, Chris Samuel wrote: > > ----- "Vincent Diepeveen" wrote: > >> It sure takes a couple of minutes to render animations in high >> resolutions, yet i'm quite amazed you need more hardware for this, >> yes even a cluster. > > I might be missing something with your argument, but > surely if this was the case then there would no need > for the *only* Australian HPC system on the Top500 to > be at Animal Logic (Happy Feet) and the 4 New Zealand > systems on the Top500 to all to be identical clusters > at Weta Digital (Kong, LotR, etc) ? > > Sad that neither country can manage to get a > scientific HPC system on there (for now).. :-( > > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From landman at scalableinformatics.com Thu Dec 18 07:12:25 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] SSD prices - q: how many writes/erases??? In-Reply-To: References: Message-ID: <494A6859.8070306@scalableinformatics.com> Lux, James P wrote: >> I can't find the Intel P/E cycle, but the Ridata units are 2x10^6 (2E+6). >> > > Is that the underlying device wearout life, or is it the apparent life at > the "integrated unit"'s external interface. For instance, if they had a The latter I believe, as consumers (the mass market consumers) generally don't care about the former. > wear leveler and some smart EDAC inside an ASIC that provides the interface, > and just added extra capacity to account for the life. > > After all, it's not like at N cycles, the device stops working. 
It just > starts working "less well" and throwing more errors, and I'll guess (since I > don't have the data here in front of me) that there's a fair amount of > variability, even within a single device. > > Consider the testing needed to exhaustively verify the 2E6 number.. 16GB of > 512 byte sectors.. That's 160E6 sectors, roughly. They don't give an "erase > time" spec, but let's just say 1 millisecond to make things easy. So, to do > one erase on ALL sectors takes 160,000 seconds, or about 2 days. > > In a mere 4 million or so days, one could actually verify the erase life. Of course, this is why they do the statistical testing. > One can beat a single sector to death in 20,000 seconds or 6 hours. But, is > a single sector a valid test? Nope.. You KNOW the EDAC is going to get in > the way, not to mention that a single sector test doesn't address the > variability across the device issue. You'd probably want to sample, oh, 100 > or 1000 or so sectors of the 160 million, to get a reasonable statistical > estimate. Now you're back in the days and weeks and months of testing (6000 > hours is the better part of a year) regime. Yup. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From michf at post.tau.ac.il Wed Dec 17 23:10:49 2008 From: michf at post.tau.ac.il (Micha Feigin) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Book recomendations for cluster programing Message-ID: <20081218091049.60e0f783@hubert.tau.ac.il> I'm trying to learn cluster/parallel programing properly. I've got some information on MPI, although I'm not sure if it's the best books. I was wondering if you have some book recommendations regarding the more specialized things, especially the cpu vs gpu paralelization issue (or as far as I understood, data intensive vs memory intensive programming and how to convert programs from one to the other) proper program/cluster topology considerations and such. Thanks From prentice at ias.edu Thu Dec 18 10:23:58 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Book recomendations for cluster programing In-Reply-To: <20081218091049.60e0f783@hubert.tau.ac.il> References: <20081218091049.60e0f783@hubert.tau.ac.il> Message-ID: <494A953E.7070405@ias.edu> Micha Feigin wrote: > I'm trying to learn cluster/parallel programing properly. I've got some > information on MPI, although I'm not sure if it's the best books. I was > wondering if you have some book recommendations regarding the more specialized > things, especially the cpu vs gpu paralelization issue (or as far as I > understood, data intensive vs memory intensive programming and how to > convert programs from one to the other) proper program/cluster topology > considerations and such. As similar question was asked a couple of weeks ago: http://www.beowulf.org/archive/2008-December/024078.html http://www.beowulf.org/archive/2008-December/024079.html GPU parallelization is relatively new, so I doubt there are any complete books on the topic. There have been plenty of talks on it and research done in the area, so google is your best bet for that. 
NVidia does have plenty of documentation related to CUDA on their website: http://www.nvidia.com/object/cuda_home.html -- Prentice From gus at ldeo.columbia.edu Thu Dec 18 11:32:15 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Book recomendations for cluster programing In-Reply-To: <494A953E.7070405@ias.edu> References: <20081218091049.60e0f783@hubert.tau.ac.il> <494A953E.7070405@ias.edu> Message-ID: <494AA53F.3000303@ldeo.columbia.edu> Prentice Bisbal wrote: > Micha Feigin wrote: > >> I'm trying to learn cluster/parallel programing properly. I've got some >> information on MPI, although I'm not sure if it's the best books. I was >> wondering if you have some book recommendations regarding the more specialized >> things, especially the cpu vs gpu paralelization issue (or as far as I >> understood, data intensive vs memory intensive programming and how to >> convert programs from one to the other) proper program/cluster topology >> considerations and such. >> > > As similar question was asked a couple of weeks ago: > > http://www.beowulf.org/archive/2008-December/024078.html > http://www.beowulf.org/archive/2008-December/024079.html > > GPU parallelization is relatively new, so I doubt there are any complete > books on the topic. There have been plenty of talks on it and research > done in the area, so google is your best bet for that. NVidia does have > plenty of documentation related to CUDA on their website: > > http://www.nvidia.com/object/cuda_home.html > > Hello Mischa and list Prentice already referred to Peter Pacheco's book "Parallel Programming with MPI": http://www.cs.usfca.edu/mpi/ and RGB to Ian Foster's more general book "Designing and Building Parallel Programs": http://www-unix.mcs.anl.gov/dbpp/ GPU programming lacks a common standard API, the literature is small and vendor-dependent, and Prentice already pointed you to the current leader, NVidia and CUDA: http://www.nvidia.com/object/cuda_home.html Another pair of good MPI books (with some typos), are: William Gropp et al., Using MPI: http://www-unix.mcs.anl.gov/mpi/usingmpi/ and William Gropp et al., Using MPI-2: http://www-unix.mcs.anl.gov/mpi/usingmpi2/index.html You can find the detailed syntax and semantics of MPI commands on" Marc Snir et al. MPI: The Complete Reference (Vol. 1), 2nd Edition: *Volume 1 - The MPI Core *http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=5045 and William Gropp et al. MPI: The Complete Reference (Vol. 2) *Volume 2 - The MPI-2 Extensions * http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=4579 Lawrence Livermore National Lab has good tutorials, and other references (MPI and OpenMP): https://computing.llnl.gov/mpi/documentation.html https://computing.llnl.gov/tutorials/mpi/ For OpenMP a good source is Rohit Chandra et al. 
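To round off the NFS thread-count discussion above: RPCNFSDCOUNT is indeed the knob, and the kernel exports a utilisation metric that helps pick a value. A minimal sketch for an EL/Fedora-style server; the value 32 is only an example, and Debian keeps the same setting in /etc/default/nfs-kernel-server:

    # set it persistently in /etc/sysconfig/nfs, e.g.
    #   RPCNFSDCOUNT=32
    # then restart the server
    service nfs restart

    # or resize the knfsd thread pool on the fly, without a restart
    rpc.nfsd 32

    # confirm how many threads are actually running
    ps auxf | grep [n]fsd

    # gauge utilisation: the "th" line records how often various fractions
    # of the thread pool were busy; persistently large values in the last
    # couple of buckets suggest adding more threads
    grep ^th /proc/net/rpc/nfsd

Idle nfsd threads cost little beyond a small amount of memory, so erring somewhat on the high side is usually cheaper than making clients wait.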
"Parallel Programming in OpenMP": http://www.amazon.com/Parallel-Programming-OpenMP-Rohit-Chandra/dp/1558606718 There are also two slide tutorials on OpenMP from Ohio Supercomputer Center: http://www.osc.edu/supercomputing/training/openmp/ http://www.osc.edu/supercomputing/training/openmp/openmp_0704.pdf http://www.osc.edu/supercomputing/training/openmp/openmp_0311.pdf I collected some parallel programming resources for our users here: http://fats-raid.ldeo.columbia.edu/pages/parallel_programming.html I hope this helps, Gus Correa --------------------------------------------------------------------- Gustavo Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA --------------------------------------------------------------------- From rpnabar at gmail.com Thu Dec 18 11:59:06 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization Message-ID: For a while I've been seeing errors of this sort in my /var/log/messages kernel: nfsd: too many open TCP sockets, consider increasing the number of nfsd threads I googled a bunch and think the solution might be to boost RPCNFSDCOUNT in the line "[ -z "$RPCNFSDCOUNT" ] && RPCNFSDCOUNT=8" in the file /etc/init.d/nfs. Question: This seems a suspicious place to change it. Isn't there a nfs config file somewhere else? Question2: How high can I boost the number of NFS threads? Or how I should I? Is there any metric I can track to decide an optimum number? Most recomendations were for 32, 64 or 128. What do people suggest? Any downsides to having numbers that are too high? In case it matters, this is our master-node Linux server and has the /home directories for each user exported to about 200 odd compute-nodes. -- Rahul From hearnsj at googlemail.com Thu Dec 18 12:47:10 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: References: Message-ID: <9f8092cc0812181247t73a593can90bcc1a81ddbc64f@mail.gmail.com> 2008/12/18 Rahul Nabar > For a while I've been seeing errors of this sort in my /var/log/messages > > > > Question: This seems a suspicious place to change it. Isn't there a nfs > config file somewhere else? > This depends on your distribution. On SuSE Linux, look in /etc/sysconfig/nfs I used to set the number of NFS kernel threads to the number of clients when building a cluster. This is very probably misguided, and far too much. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081218/8cb9ee00/attachment.html From csamuel at vpac.org Thu Dec 18 14:39:35 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: <1145003785.500711229639781060.JavaMail.root@mail.vpac.org> Message-ID: <1869758310.500771229639975924.JavaMail.root@mail.vpac.org> ----- "Rahul Nabar" wrote: > For a while I've been seeing errors of this sort in my > /var/log/messages Are you using ext3 for that filesystem by some chance ? We long ago switched from RHEL to to Debian for our NFS servers so we could use XFS rather than ext3 as ext3 just couldn't keep up with the workload even back then. 
My understanding from a talk at LCA on Linux scalability many moons ago was that ext3 is single threaded through the journal daemon, so our theory was that as soon as you start to get a lot of writes to the filesystem they all start backing up waiting for the their request to be serviced. :-( We've seen the same with Zimbra on ext3 filesystems, but there we were able to ameliorate the problem by extending the ext3 commit interval from 5 to 15 seconds (it's just a mount option). Could be worth a shot, and you can try it live by doing: mount -o remount,commit=15 /home You can set it back to the default with: mount -o remount,commit=5 /home YMMV, no warranties, batteries not included, if it breaks you get to keep both parts.. ;-) Best of luck! Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Thu Dec 18 14:41:06 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Book recomendations for cluster programing In-Reply-To: <494AA53F.3000303@ldeo.columbia.edu> Message-ID: <38169185.500871229640066702.JavaMail.root@mail.vpac.org> ----- "Gus Correa" wrote: > GPU programming lacks a common standard API, Hopefully OpenCL will help address this! -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From rpnabar at gmail.com Thu Dec 18 15:10:54 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: <20081218160224.7d157626.chekh@pcbi.upenn.edu> References: <20081218160224.7d157626.chekh@pcbi.upenn.edu> Message-ID: On Thu, Dec 18, 2008 at 3:02 PM, Alex Chekholko wrote: > > Up it to 32, see if you keep getting the message? :) Okie! Thanks Alex! I'm going to try 32 now. > > Also check that you're actually running just 8: > ps auxf|grep nfs Just checked that. Yes. It is truely 8. -- Rahul From rpnabar at gmail.com Thu Dec 18 15:19:44 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: <9f8092cc0812181246s458f659esd0a019b91b1ec42e@mail.gmail.com> References: <9f8092cc0812181246s458f659esd0a019b91b1ec42e@mail.gmail.com> Message-ID: On Thu, Dec 18, 2008 at 2:46 PM, John Hearns wrote: > This depends on your distribution. > On SuSE Linux, look in /etc/sysconfig/nfs Found it. Its the same file on Fedora, my distro. > > I used to set the number of NFS kernel threads to the number of clients when > building a cluster. This is very probably misguided, and far too much. > Oh! Well, I have 256 compute-nodes here right now. Not sure if that will work then. I've boosted my threads to 32 from 8. But 256 might be an overkill I suspect! 
-- Rahul From chekh at pcbi.upenn.edu Thu Dec 18 13:02:24 2008 From: chekh at pcbi.upenn.edu (Alex Chekholko) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: References: Message-ID: <20081218160224.7d157626.chekh@pcbi.upenn.edu> On Thu, 18 Dec 2008 13:59:06 -0600 "Rahul Nabar" wrote: > For a while I've been seeing errors of this sort in my /var/log/messages > > kernel: nfsd: too many open TCP sockets, consider increasing the number of > nfsd threads > > I googled a bunch and think the solution might be to boost RPCNFSDCOUNT in > the line "[ -z "$RPCNFSDCOUNT" ] && RPCNFSDCOUNT=8" in the file > /etc/init.d/nfs. > > Question: This seems a suspicious place to change it. Isn't there a nfs > config file somewhere else? > > Question2: How high can I boost the number of NFS threads? Or how I should > I? Is there any metric I can track to decide an optimum number? Most > recomendations were for 32, 64 or 128. What do people suggest? Any > downsides to having numbers that are too high? > > In case it matters, this is our master-node Linux server and has the > /home directories for each user > exported to about 200 odd compute-nodes. On EL, there's /etc/sysconfig/nfs, and /etc/defaults/nfs-kernel-server on Debian-based systems. Up it to 32, see if you keep getting the message? :) Also check that you're actually running just 8: ps auxf|grep nfs Regards, -- Alex Chekholko chekh@pcbi.upenn.edu From eric.l.2046 at gmail.com Thu Dec 18 23:58:50 2008 From: eric.l.2046 at gmail.com (Eric Liang) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? Message-ID: <494B543A.9040103@gmail.com> Hi all, Clusters, Grids, MPPs, MPI, OpenMP, HA, LB, GPGPU, FPGA, SMP, NUMA, SSE etc.. These abbreviations and terms almost cram my head, so I have to redvelop and re-index them in my memory(brain). As a newbie, when I read the articles in wikipekia, I got confused. In the segment Cluster categorizations of the article about Cluster , there're three classes: High-avaiablity clusters, Load-balancing clusters and Grid computing(?!). While in the article about Parallel computing , both "Cluster computing" and "Grid computing" are the subclasses of "Distribute computing" ,and the third one is "Massive parallel processing ". IMHO, the latter category is more reasonable(right or not?) . However, since there are too many cluster software products, how can I categorize Beowulf like clusters( loosely coupled, use MPI)? or what's the category of Beowulf like clusters? Thanks in advance for any suggestions from your knowledge. Eric -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20081219/71c5cec6/signature.bin From james.p.lux at jpl.nasa.gov Fri Dec 19 06:34:36 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: <494B543A.9040103@gmail.com> Message-ID: On 12/18/08 11:58 PM, "Eric Liang" wrote: > Hi all, > Clusters, Grids, MPPs, MPI, OpenMP, HA, LB, GPGPU, FPGA, SMP, NUMA, > SSE etc.. > These abbreviations and terms almost cram my head, so I have to > redvelop and re-index them in my memory(brain). > > As a newbie, when I read the articles in wikipekia, I got confused. 
> In the segment Cluster categorizations > > of the article about Cluster > , there're three classes: > High-avaiablity clusters, Load-balancing clusters and Grid > computing(?!). While in the article about Parallel computing > rs> > , both "Cluster computing" and "Grid computing" are the subclasses of > "Distribute computing" ,and the third one is "Massive parallel > processing ". IMHO, the latter category is more reasonable(right or not?) . > However, since there are too many cluster software products, how > can I categorize Beowulf like clusters( loosely coupled, use MPI)? or > what's the category of Beowulf like clusters? > I've been looking over the Beowulf book by Seamus Heaney, and I can't find any reference to clusters in it, so maybe Wikipedia has been misled? Perhaps it's a translation issue? Beowulf -> the definition evolves.. But.. I'd say it's a bunch of interconnected commodity computers intended to form a single computational resource. Some would add "running open source software" Commodity is important. It's not special purpose hardware, but leverages the economies of scale to minimize cost. Bunch of interconnected is important.(no vector processors). Single computational resource -> you can devote the entire cluster to just one problem, as opposed to, say, a mass of transaction processing boxes or web servers in a high availability cluster. From hahn at mcmaster.ca Fri Dec 19 09:07:23 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: <494B543A.9040103@gmail.com> References: <494B543A.9040103@gmail.com> Message-ID: > Clusters, Grids, MPPs, MPI, OpenMP, HA, LB, GPGPU, FPGA, SMP, NUMA, > SSE etc.. > These abbreviations and terms almost cram my head, so I have to > redvelop and re-index them in my memory(brain). think of what the acronym is abbreviating, and the logic of that name. > As a newbie, when I read the articles in wikipekia, I got confused. > In the segment Cluster categorizations > that's horrible and incomplete. IMO, HA and load-balancing are not really distinctly different, since LB is really just active-active HA. (or HA is active-passive LB). other than HA/LB, clusters are computational. within compute clusters, the main distinction is how tightly coupled they (or the programs they run) are. grids are the extreme loose end: basically no inter-node communication, often geographically distributed, often ad-hoc collections of different kinds of machines run by different organizations. the opposite is a homogenous, tightly-coupled cluster with a dedicated local network optimized for inter-node communication and running few multi-node jobs. > , both "Cluster computing" and "Grid computing" are the subclasses of "cluster computing" is descriptive: the entity is a set of nodes somehow combined, usually by a local communication fabric. by definition, the nodes are separate, so distributed. (the 'distributed' here means that communication is by explicit message passing; the opposite is shared-memory, where communication is implicit and done by read/write operations to memory.) "grid" is a marketing term for "loosely coupled distributed clustering"; it was a trendy word 10 years ago, but has fallen into disuse because it's so generic (and not all that widely applicable). > "Distribute computing" ,and the third one is "Massive parallel > processing ". IMHO, the latter category is more reasonable(right or not?) . 
MPP doesn't mean much; its best to avoid the term and stick to more specific ones. > However, since there are too many cluster software products, how > can I categorize Beowulf like clusters( loosely coupled, use MPI)? or beowulf certainly does not imply loose coupling (or rule out PVM.) > what's the category of Beowulf like clusters? beowulf is compute clustering using mostly commodity hardware and mostly open-source software. From marcelosoaressouza at gmail.com Fri Dec 19 11:13:07 2008 From: marcelosoaressouza at gmail.com (Marcelo Souza) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] mpich2 1.0.8 package for Slackware 12.2 (i486) Message-ID: <12c9ca330812191113n4e313d0bx31ba7b018c4f71ce@mail.gmail.com> mpich2 1.0.8 package for Slackware 12.2 (i486) at: http://www.cebacad.net/files/mpich2-1.0.8-i486-goa.tgz -- Marcelo Soares Souza (marcelo@cebacad.net) http://marcelo.cebacad.net From rgb at phy.duke.edu Fri Dec 19 11:46:24 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: <494B543A.9040103@gmail.com> Message-ID: On Fri, 19 Dec 2008, Mark Hahn wrote: >> what's the category of Beowulf like clusters? > > beowulf is compute clustering using mostly commodity hardware and mostly > open-source software. And if you want to be really picky, it should be an architecture that "looks like a supercomputer" in the sense that it has an inside network containing nodes and a dual-network head node that "is the cluster" as far as the outside networks are concerned. That is, the original beowulf was designed to run relatively tightly coupled and synchronous code. However, at this point beowulf is like kleenex -- a brand name that has become synonymous with more than its strict original definition. People will refer to nearly any HPC cluster as "a beowulf", and quite a few people misuse the term to include some classes of HA clusters as well. It may be that in the end beowulf simply means "compute cluster", but on this list we try to rule out HA most of the time because it has some different concerns (as well as some concerns that are the same). My own clusters, for example, have always been much more in the distributed non-beowulf flavor, but they are definitely HPC and so I get to play. Grids are even further away from strict beowulfs, but we've certainly had many grid discussions on list. So this is really the "HPC compute cluster list" with rare nods to an HA topic, with the understanding that the most "interesting" list discussions focus on parallel computation coding (tools, languages etc), cluster design, and cluster networking advanced or mundane. rgb > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb at phy.duke.edu Fri Dec 19 11:50:00 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: On Fri, 19 Dec 2008, Lux, James P wrote: > Beowulf -> the definition evolves.. But.. I'd say it's a bunch of > interconnected commodity computers intended to form a single computational > resource. 
Some would add "running open source software" Commodity is Some would, and it absolutely is in the original definition and perhaps even worth fighting for, or about. Not necessarily ONLY open source software, but the operating system should be open source if nothing else. > important. It's not special purpose hardware, but leverages the economies of > scale to minimize cost. Bunch of interconnected is important.(no vector > processors). Single computational resource -> you can devote the entire No SINGLE vector processors. A cluster of systems containing vector processors is just fine...;-) rgb > cluster to just one problem, as opposed to, say, a mass of transaction > processing boxes or web servers in a high availability cluster. > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From james.p.lux at jpl.nasa.gov Fri Dec 19 12:26:33 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: Message-ID: On 12/19/08 11:50 AM, "Robert G. Brown" wrote: > On Fri, 19 Dec 2008, Lux, James P wrote: > >> Beowulf -> the definition evolves.. But.. I'd say it's a bunch of >> interconnected commodity computers intended to form a single computational >> resource. Some would add "running open source software" Commodity is > > Some would, and it absolutely is in the original definition and perhaps > even worth fighting for, or about. Not necessarily ONLY open source > software, but the operating system should be open source if nothing > else. I agree philosophically, but one has to wonder if open source OS is an essential part of beowulf-ness, or just happened to be an enabling aspect (partly because Don B. was cranking out drivers for Linux.. Another enabling aspect). That is, beowulfs wouldn't have existed without open source OS, but neither would they have existed without cheap commodity PCs, which are hardly open source. And I think the real enabler was the realization that you could use the proverbial "pile o' PCs" to do real work. > >> important. It's not special purpose hardware, but leverages the economies of >> scale to minimize cost. Bunch of interconnected is important.(no vector >> processors). Single computational resource -> you can devote the entire > > No SINGLE vector processors. A cluster of systems containing vector > processors is just fine...;-) Is a group of vector processors a matrix processor (or tensor processor, depending on the rank?)? Or isn't it just a bigger vector processor? I think parallelism, in some sense (even if EP) is an essential part of beowulfness. As is "potentially single task-ness".. But I still wonder about the prevalence of clusters in Geatland. I'm rereading the historical document now, and I find mentions of halls of warriors (hmm. Parallelism, commodity, open source, all potentially working on a single task.. But that puts our man Beowulf really functioning as the head node, and nobody refers to the army as Beowulf. Finally, when it comes to really useful stuff (slaying of Grendel, etc.), B does it himself) Maybe the clustering stuff is in that digression in the middle of the wedding, which I always skip over. 
From prentice at ias.edu Fri Dec 19 13:39:44 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] OOM errors when running HPL Message-ID: <494C14A0.3070402@ias.edu> I've got a new problem with my cluster. Some of this problem may be with my queuing system (SGE), but I figured I'd post here first. I've been using hpl to test my new cluster. I generally run a small problem size (Ns=60000)so the job only runs 15-20 minutes. Last night, I upped the problem size by a factor of 10 to Ns=600000). Shortly after submitting the job, have the nodes were shown as down in Ganglia. I killed the job with qdel, and the majority of the nodes came back, but about 1/3 did not. When I came in this morning, there were kernel panic/OOM type messages on the consoles of the systems that never came back. I used to run hpl jobs much bigger than this on my cluster w/o a problem. There's nothing I actively changes, but there might have been some updates to the OS (kernel, libs, etc) since the last time I ran a job this big. Any ideas where I should begin looking? -- Prentice From rgb at phy.duke.edu Fri Dec 19 14:20:31 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: On Fri, 19 Dec 2008, Lux, James P wrote: > > > > On 12/19/08 11:50 AM, "Robert G. Brown" wrote: > >> On Fri, 19 Dec 2008, Lux, James P wrote: >> >>> Beowulf -> the definition evolves.. But.. I'd say it's a bunch of >>> interconnected commodity computers intended to form a single computational >>> resource. Some would add "running open source software" Commodity is >> >> Some would, and it absolutely is in the original definition and perhaps >> even worth fighting for, or about. Not necessarily ONLY open source >> software, but the operating system should be open source if nothing >> else. > > I agree philosophically, but one has to wonder if open source OS is an > essential part of beowulf-ness, or just happened to be an enabling aspect Absolutely. Read the original statement and definition on the beowulf.org website and it explains why. As have I a time or two. Too many things upon which performance critically depends are in the OS, and who wants to rely on Microsoft for bugfixes? These are the folks that left a critical exploit in Explorer for six months just last year, so that every Windows machine on the planet at that time had at least spyware infecting it at the end of it. > (partly because Don B. was cranking out drivers for Linux.. Another enabling > aspect). That is, beowulfs wouldn't have existed without open source OS, > but neither would they have existed without cheap commodity PCs, which are > hardly open source. And I think the real enabler was the realization that > you could use the proverbial "pile o' PCs" to do real work. It was (and is) both, or all three: Commodity PCs, Commodity network, and Open Source Operating System. Microsoft can sell a cluster product if they want, but they cannot call that product a beowulf cluster. > Is a group of vector processors a matrix processor (or tensor processor, > depending on the rank?)? Or isn't it just a bigger vector processor? I think it is either a tensor processor of some rank or possibly a graded division algebra, personally. Row AND column indices on your parallelization... > I think parallelism, in some sense (even if EP) is an essential part of > beowulfness. As is "potentially single task-ness".. 
I would certainly never argue, but at the very first PVM talk I ever attended, back in the fall of 1992, the demonstration by Vaidy involved a pile of (IIRC) a mixture of DECs, Suns, and (no kidding) a Cray. And of course I would argue that Geist, Dongarra et. al. are the ones that REALLY "invented" the HPC commodity parallel cluster. Lots of people were doing massively parallel computing on piles of workstations before "the beowulf", myself among them. The only real difference was that linux let one use REALLY cheap PCs, and to be honest the difference in performance between the P5/Pentium and a Sun workstation was so large that there wasn't much of a PRICE/performance advantage in a cluster of PCs vs a cluster of the cheaper Suns. The P6 -- the Pentium Pro -- was another matter. A dual PPro gave you a competive number of raw FLOPS compared to (say) an inexpensive Sun and let you buy a LOT more aggregate FLOPS/dollar. The point being that parallelizing vector processors has a longer history that the beowulf (CM5, anyone?) and is certainly "permitted" by the model, as long as they are commodity vector processors. > But I still wonder about the prevalence of clusters in Geatland. I'm > rereading the historical document now, and I find mentions of halls of > warriors (hmm. Parallelism, commodity, open source, all potentially working > on a single task.. But that puts our man Beowulf really functioning as the > head node, and nobody refers to the army as Beowulf. Finally, when it comes > to really useful stuff (slaying of Grendel, etc.), B does it himself) Maybe > the clustering stuff is in that digression in the middle of the wedding, > which I always skip over. There you've got me. But hey, I named my first efforts "distributed parallel supercomputers" and look where it got me:-) Not exactly a household word, eh? Where everybody (with a bit of nerd in them everybody, admittedly) knows what a beowulf is, sort of. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rpnabar at gmail.com Fri Dec 19 15:53:46 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization Message-ID: >Are you using ext3 for that filesystem by some chance ? Thanks Chris! It is indeed an ext3. I will give the commit interval solution a shot. Can't destroy things too badly I think. Worst case I can remount. -- Rahul From dnlombar at ichips.intel.com Fri Dec 19 16:03:25 2008 From: dnlombar at ichips.intel.com (David N. Lombard) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: References: Message-ID: <20081220000325.GA21071@nlxdcldnl2.cl.intel.com> On Thu, Dec 18, 2008 at 12:59:06PM -0700, Rahul Nabar wrote: > For a while I've been seeing errors of this sort in my /var/log/messages > > kernel: nfsd: too many open TCP sockets, consider increasing the number of > nfsd threads > > I googled a bunch and think the solution might be to boost RPCNFSDCOUNT in > the line "[ -z "$RPCNFSDCOUNT" ] && RPCNFSDCOUNT=8" in the file > /etc/init.d/nfs. > > Question: This seems a suspicious place to change it. Isn't there a nfs > config file somewhere else? That would be a Bad Thing(TM). That line is looking to see if the shell variable RPCNFSDCOUNT has a value set; if not, it sets a default value of 8. Look up in the file. 
In my system (Fedora 7), this line is earlier in the file [ -f /etc/sysconfig/nfs ] && . /etc/sysconfig/nfs This line looks for the file /etc/sysconfig/nfs and sources it. So, you would put a line that looks like RPCNFSDCOUNT=16 if you wanted to set the value to 16. > Question2: How high can I boost the number of NFS threads? Or how I should > I? Is there any metric I can track to decide an optimum number? Most > recomendations were for 32, 64 or 128. What do people suggest? Any > downsides to having numbers that are too high? How many concurrent requests do you think you'll need to satisfy? Try that number. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From skylar at cs.earlham.edu Fri Dec 19 16:49:18 2008 From: skylar at cs.earlham.edu (Skylar Thompson) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] OOM errors when running HPL In-Reply-To: <494C14A0.3070402@ias.edu> References: <494C14A0.3070402@ias.edu> Message-ID: <494C410E.3070002@cs.earlham.edu> Prentice Bisbal wrote: > I've got a new problem with my cluster. Some of this problem may be with > my queuing system (SGE), but I figured I'd post here first. > > I've been using hpl to test my new cluster. I generally run a small > problem size (Ns=60000)so the job only runs 15-20 minutes. Last night, I > upped the problem size by a factor of 10 to Ns=600000). Shortly after > submitting the job, have the nodes were shown as down in Ganglia. > > I killed the job with qdel, and the majority of the nodes came back, but > about 1/3 did not. When I came in this morning, there were kernel > panic/OOM type messages on the consoles of the systems that never came > back. > > I used to run hpl jobs much bigger than this on my cluster w/o a > problem. There's nothing I actively changes, but there might have been > some updates to the OS (kernel, libs, etc) since the last time I ran a > job this big. Any ideas where I should begin looking? I've run into similar problems, and traced it to the way Linux overcommits RAM. What are your vm.overcommit_memory and vm.overcommit_ratio sysctls set to, and how much swap and RAM do the nodes have? -- -- Skylar Thompson (skylar@cs.earlham.edu) -- http://www.cs.earlham.edu/~skylar/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20081219/a5214870/signature.bin From rpnabar at gmail.com Fri Dec 19 17:32:36 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: <20081220000325.GA21071@nlxdcldnl2.cl.intel.com> References: <20081220000325.GA21071@nlxdcldnl2.cl.intel.com> Message-ID: On Fri, Dec 19, 2008 at 6:03 PM, David N. Lombard wrote: > This line looks for the file /etc/sysconfig/nfs and sources it. So, > you would put a line that looks like > > RPCNFSDCOUNT=16 > > if you wanted to set the value to 16. Thanks David. I did exactly that. > How many concurrent requests do you think you'll need to satisfy? > Try that number. How do I interpret "concurrent". I have the /home mounted by about 256 clients via NFS. Not each node would be accessing the same file at the same time though. Besides there could be two processes on a node accessing two different files on /home via NFS. What time-granularity of access are we looking at. 
Maybe some of my queries are irrelevant; sorry I am quite a newbiee about the internal workings of file-systems. More relevant though: Any ways / hacks to monitor , say, for 24 hours of system run how many concurrent requests I've had? Perhaps I can script something together for diagnostics? -- Rahul From gdjacobs at gmail.com Fri Dec 19 17:58:14 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: <494B543A.9040103@gmail.com> Message-ID: <494C5136.6080107@gmail.com> Mark Hahn wrote: >> Clusters, Grids, MPPs, MPI, OpenMP, HA, LB, GPGPU, FPGA, SMP, NUMA, >> SSE etc.. >> These abbreviations and terms almost cram my head, so I have to >> redvelop and re-index them in my memory(brain). > > think of what the acronym is abbreviating, and the logic of that name. > >> As a newbie, when I read the articles in wikipekia, I got confused. >> In the segment Cluster categorizations >> > > that's horrible and incomplete. It's wiki. Why don't we fix it? > IMO, HA and load-balancing are not really distinctly different, since LB > is really just active-active HA. (or HA is active-passive LB). > other than HA/LB, clusters are computational. within compute clusters, > the main distinction is how tightly coupled they (or the programs they run) > are. grids are the extreme loose end: basically no inter-node > communication, > often geographically distributed, often ad-hoc collections of different > kinds > of machines run by different organizations. the opposite is a homogenous, > tightly-coupled cluster with a dedicated local network optimized for > inter-node communication and running few multi-node jobs. > >> , both "Cluster computing" and "Grid computing" are the subclasses of > > "cluster computing" is descriptive: the entity is a set of nodes somehow > combined, usually by a local communication fabric. by definition, the > nodes > are separate, so distributed. (the 'distributed' here means that > communication is by explicit message passing; the opposite is > shared-memory, > where communication is implicit and done by read/write operations to > memory.) > > "grid" is a marketing term for "loosely coupled distributed clustering"; > it was a trendy word 10 years ago, but has fallen into disuse because > it's so generic (and not all that widely applicable). I always thought the idea was to charge for computing as a service (just like the electrical utility). Actually, many firms are doing this now. Amazon, for example. >> "Distribute computing" ,and the third one is "Massive parallel >> processing ". IMHO, the latter category is more reasonable(right or >> not?) . > > MPP doesn't mean much; its best to avoid the term and stick to more > specific ones. > >> However, since there are too many cluster software products, how >> can I categorize Beowulf like clusters( loosely coupled, use MPI)? or > > beowulf certainly does not imply loose coupling (or rule out PVM.) Far from it. In fact, a great deal of work goes into optimizing the interconnect and the software payload for tightly coupled, fine granularity workloads. >> what's the category of Beowulf like clusters? > > beowulf is compute clustering using mostly commodity hardware and mostly > open-source software. Designed to reduce the clock time and/or increase the maximum practical problem size of computational problems. -- Geoffrey D. 
Jacobs From jlb17 at duke.edu Fri Dec 19 19:29:22 2008 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: References: <20081220000325.GA21071@nlxdcldnl2.cl.intel.com> Message-ID: On Fri, 19 Dec 2008 at 7:32pm, Rahul Nabar wrote > More relevant though: Any ways / hacks to monitor , say, for 24 hours > of system run how many concurrent requests I've had? Perhaps I can > script something together for diagnostics? cat /proc/net/rpc/nfsd Have a look at the line that starts with "th". The numbers after th represent, from left to right: o The number of threads o The number of times requests have had to wait b/c all threads were busy o The amount of time 10% of the threads have been busy o The amount of time 20% of the threads have been busy o and so on, up to 100% of the threads. You want that 2nd number (and, by extension, the last number) to be 0 or at least very low. -- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF From hahn at mcmaster.ca Fri Dec 19 21:44:16 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: <494C5136.6080107@gmail.com> References: <494B543A.9040103@gmail.com> <494C5136.6080107@gmail.com> Message-ID: >> "grid" is a marketing term for "loosely coupled distributed clustering"; >> it was a trendy word 10 years ago, but has fallen into disuse because >> it's so generic (and not all that widely applicable). > > I always thought the idea was to charge for computing as a service (just > like the electrical utility). Actually, many firms are doing this now. > Amazon, for example. the Cloud is the new Grid ;) I don't think Grid did much to make money, though there was a lot of talk about how Grid would bring utility computing - as easy as plugging into a wall socket. From james.p.lux at jpl.nasa.gov Sat Dec 20 07:46:42 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: Message-ID: On 12/19/08 9:44 PM, "Mark Hahn" wrote: >>> "grid" is a marketing term for "loosely coupled distributed clustering"; >>> it was a trendy word 10 years ago, but has fallen into disuse because >>> it's so generic (and not all that widely applicable). >> >> I always thought the idea was to charge for computing as a service (just >> like the electrical utility). Actually, many firms are doing this now. >> Amazon, for example. > > the Cloud is the new Grid ;) > > I don't think Grid did much to make money, though there was a lot of talk > about how Grid would bring utility computing - as easy as plugging into a > wall socket. I think the problems with that model are more to do with non-computing issues than with "making it work". Things like assuring information security are a big deal for anyone who has serious money to spend on the computation. It's one thing to distribute SETI@home, totally another to distribute, say, MRI processing or FEM models of your new widget design or... From james.p.lux at jpl.nasa.gov Sat Dec 20 07:57:00 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? 
In-Reply-To: <494D0432.4050806@gmail.com> Message-ID: On 12/20/08 6:41 AM, "Liang Yupeng" wrote: > Lux, James P wrote: >> >> I've been looking over the Beowulf book by Seamus Heaney, and I can't find >> any reference to clus http://www.amazon.com/Beowulf-New-Verse-Translation-Bilingual/dp/0393320979ters in it, so maybe Wikipedia has been misled? Perhaps >> it's a translation issue? >> >> > Thanks for your reply, but "no reference in the book" does not mean that > Beowulf is not a cluster, does it? Especially since that particular book is a translation of the epic poem about Beowulf. http://www.amazon.com/Beowulf-New-Verse-Translation-Bilingual/dp/0393320979 (my daughter is using an older translation in her 10th grade english class, and prefers the new one) >> Beowulf -> the definition evolves.. But.. I'd say it's a bunch of >> interconnected commodity computers intended to form a single computational >> resource. Some would add "running open source software" Commodity is >> important. It's not special purpose hardware, but leverages the economies of >> scale to minimize cost. Bunch of interconnected is important.(no vector >> processors). Single computational resource -> you can devote the entire >> cluster to just one problem, as opposed to, say, a mass of transaction >> processing boxes or web servers in a high availability cluster. >> >> > Maybe I didn't get you (sorry for that), it looks like that your > explanation on Beowulf not the category about it but the description. > Thank you all the same. True enough.. But I'm not sure that developing a HPC taxonomy adds value to the process of doing the computation, or, more to the point, to selecting an appropriate way to solve one's particular computational needs. It's not like most folks sit there with their problem and try to analyze it in terms of some computing taxonomy. The "spectrum" of possible ways to build a computer is small enough that you can conceptualize them all simultaneously. It's not like trying to decide where to put a particular hummingbird in the Linnaean scheme. (K P C F G S, etc.) > From eric.l.2046 at gmail.com Sat Dec 20 08:48:37 2008 From: eric.l.2046 at gmail.com (Eric Liang) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: <494D21E5.8080400@gmail.com> First of all, sorry for sending mails by the address: start2046@gmail.com. Just like the pgp signature says , I use that one in my country(yes,china). Lux, James P wrote: > > On 12/20/08 6:41 AM, "Liang Yupeng" wrote: > > >> Lux, James P wrote: >> >>> I've been looking over the Beowulf book by Seamus Heaney, and I can't find >>> any reference to clus >>> > http://www.amazon.com/Beowulf-New-Verse-Translation-Bilingual/dp/0393320979ters > in it, so maybe Wikipedia has been misled? Perhaps > the amzon said: Looking for something? We're sorry. The Web address you entered is not a functioning page on our site >>> it's a translation issue? >>> >>> > > >> Thanks for your reply, but "no reference in the book" does not mean that >> Beowulf is not a cluster, does it? >> > > Especially since that particular book is a translation of the epic poem > about Beowulf. > http://www.amazon.com/Beowulf-New-Verse-Translation-Bilingual/dp/0393320979 > (my daughter is using an older translation in her 10th grade english class, > and prefers the new one) > > Maybe I've read the story on wiki and somebody have talk about why use the name(beowulf) in this list(IIRC). 
:-P > > >>> Beowulf -> the definition evolves.. But.. I'd say it's a bunch of >>> interconnected commodity computers intended to form a single computational >>> resource. Some would add "running open source software" Commodity is >>> important. It's not special purpose hardware, but leverages the economies of >>> scale to minimize cost. Bunch of interconnected is important.(no vector >>> processors). Single computational resource -> you can devote the entire >>> cluster to just one problem, as opposed to, say, a mass of transaction >>> processing boxes or web servers in a high availability cluster. >>> >>> >>> >> Maybe I didn't get you (sorry for that), it looks like that your >> explanation on Beowulf not the category about it but the description. >> Thank you all the same. >> > > True enough.. But I'm not sure that developing a HPC taxonomy adds value to > the process of doing the computation, or, more to the point, to selecting an > appropriate way to solve one's particular computational needs. > Yes, if I've decided use beowulf(or another one), but how can I make the decision if I couldn't tell the differences of the cluster softwares? > It's not like most folks sit there with their problem and try to analyze it > in terms of some computing taxonomy. The "spectrum" of possible ways to > build a computer is small enough that you can conceptualize them all > simultaneously. It's not like trying to decide where to put a particular > hummingbird in the Linnaean scheme. (K P C F G S, etc.) > Hmm, I didn't mean that. I put the question here just because the wiki's explanations have puzzled me, and IMO, this mailling list is a appropriately place where most fellows should have the related knowledge.For me, if I am *interested* in one software, I'll learn more and more details about it and compare it to the others, not just use it.If I have offended you,I ask your pardon. Many thanks for your reply and the books. Eric -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20081221/b5011851/signature.bin From deadline at eadline.org Sat Dec 20 12:29:13 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: <33557.192.168.1.213.1229804953.squirrel@mail.eadline.org> When the topic of history comes up I always suggest getting it from the source: Beowulf Breakthroughs: The Path to Commodity Supercomputing by Tom Sterling http://www.linux-mag.com/id/1378 Plus I always point back to the definition introduced in the original "How to Build a Beowulf book (Thomas Sterling, John Salmon, Donald J. Becker and Daniel F. Savarese, MIT Press, ISBN 0-262-69218-X): A Beowulf is a collection of personal computers interconnected by widely-available networking technology running one of several open source, Unix- like operating systems. Note that "personal computers" was used as the whole "x86 server" market had not even developed (think Pentium Pro era) as the web was in it's infancy. -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From marcelosoaressouza at gmail.com Sat Dec 20 12:49:53 2008 From: marcelosoaressouza at gmail.com (Marcelo Souza) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] mpich2 1.0.8 package for Debian 5.0 Lenny Message-ID: <12c9ca330812201249m4c1126c0i137fe61cef8568d1@mail.gmail.com> mpich2 1.0.8 complete package (bin and libs) for Debian 5.0 Lenny (i386) at: http://www.cebacad.net/files/mpich2_1.0.8_i386.deb http://www.cebacad.net/files/mpich2_1.0.8_i386.deb.md5 Soon x86_64 arch too... -- Marcelo Soares Souza (marcelo@cebacad.net) http://marcelo.cebacad.net From deadline at eadline.org Sat Dec 20 12:53:53 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: <494D21E5.8080400@gmail.com> References: <494D21E5.8080400@gmail.com> Message-ID: <41669.192.168.1.213.1229806433.squirrel@mail.eadline.org> >> > Maybe I've read the story on wiki and somebody have talk about why use > the name(beowulf) in this list(IIRC). :-P >> >> Authoritative answer: http://www.linux-mag.com/id/1378 -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Sat Dec 20 16:08:00 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: <494D21E5.8080400@gmail.com> Message-ID: On 12/20/08 8:48 AM, "Eric Liang" wrote: > First of all, sorry for sending mails by the address: > start2046@gmail.com. Just like the pgp signature says , I use that one > in my country(yes,china). > > Lux, James P wrote: >> >> On 12/20/08 6:41 AM, "Liang Yupeng" wrote: >> >> >>> Lux, James P wrote: >>> >>>> I've been looking over the Beowulf book by Seamus Heaney, and I can't find >>>> any reference to clus >>>> >> http://www.amazon.com/Beowulf-New-Verse-Translation-Bilingual/dp/0393320979te >> rs >> in it, so maybe Wikipedia has been misled? Perhaps >> > the amzon said: > Looking for something? > We're sorry. The Web address you entered is not a functioning page on > our site Interesting.. In any case, the book is a recent new translation of the epic poem written around 1000 years ago. Much more readable, but also of no connection to beowulf (the cluster computing). >> > Maybe I've read the story on wiki and somebody have talk about why use > the name(beowulf) in this list(IIRC). :-P You'd have to ask Don Becker about that. He chose the name when he built the first one a few years ago. http://www.beowulf.org/ has a little history. >>> Maybe I didn't get you (sorry for that), it looks like that your >>> explanation on Beowulf not the category about it but the description. >>> Thank you all the same. >>> >> >> True enough.. But I'm not sure that developing a HPC taxonomy adds value to >> the process of doing the computation, or, more to the point, to selecting an >> appropriate way to solve one's particular computational needs. >> > Yes, if I've decided use beowulf(or another one), but how can I make the > decision if I couldn't tell the differences of the cluster softwares? Oh.. That's a bit more difficult. And, you're right, the Wikipedia entry isn't necessarily the best way to go about it. Asking here on the list is probably a better way. If you describe your problem, then folks will most likely respond with their opinions of what sort of computational architecture is well suited. 
Opinions WILL differ, but I think everyone here will justify their opinion (some at more length than others) and you can use that to see if it's applicable to your particular problem. >> > Hmm, I didn't mean that. I put the question here just because the wiki's > explanations have puzzled me, and IMO, this mailling list is a > appropriately place where most fellows should have the related > knowledge. You've got that right. This *is* the place to ask. > For me, if I am *interested* in one software, I'll learn more > and more details about it and compare it to the others, not just use > it.If I have offended you,I ask your pardon. > Many thanks for your reply and the books. > Gosh.. No offense at all. But you should be aware that "beowulf" is a fairly generic description of a general approach to computing, so there's no "Beowulf software" in the sense of a particular installable program or set of programs or a Linux distro. Check out Robert G. Brown's book online.. (there's a link on the http://www.beowulf.org/ site, if you can't find it elsewhere) Jim From eric.l.2046 at gmail.com Sat Dec 20 19:13:53 2008 From: eric.l.2046 at gmail.com (Eric Liang) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: <41669.192.168.1.213.1229806433.squirrel@mail.eadline.org> References: <494D21E5.8080400@gmail.com> <41669.192.168.1.213.1229806433.squirrel@mail.eadline.org> Message-ID: <494DB471.4040003@gmail.com> Douglas Eadline wrote: >> Maybe I've read the story on wiki and somebody have talk about why use >> the name(beowulf) in this list(IIRC). :-P >> >>> > > Authoritative answer: > > http://www.linux-mag.com/id/1378 > > > Thanks Doug, I'll read it. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20081221/0e893e33/signature.bin From eric.l.2046 at gmail.com Sat Dec 20 20:10:34 2008 From: eric.l.2046 at gmail.com (Eric Liang) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: <494DC1BA.6090602@gmail.com> > > On 12/20/08 8:48 AM, "Eric Liang" wrote: > > >> First of all, sorry for sending mails by the address: >> start2046@gmail.com. Just like the pgp signature says , I use that one >> in my country(yes,china). >> >> Lux, James P wrote: >> >>> On 12/20/08 6:41 AM, "Liang Yupeng" wrote: >>> >>> >>> >>>> Lux, James P wrote: >>>> >>>> >>>>> I've been looking over the Beowulf book by Seamus Heaney, and I can't find >>>>> any reference to clus >>>>> >>>>> >>> http://www.amazon.com/Beowulf-New-Verse-Translation-Bilingual/dp/0393320979ters >>> in it, so maybe Wikipedia has been misled? Perhaps >>> >>> >> the amzon said: >> Looking for something? >> We're sorry. The Web address you entered is not a functioning page on >> our site >> > > > Interesting.. In any case, the book is a recent new translation of the epic > poem written around 1000 years ago. Much more readable, but also of no > connection to beowulf (the cluster computing). > >> Maybe I've read the story on wiki and somebody have talk about why use >> the name(beowulf) in this list(IIRC). :-P >> > > You'd have to ask Don Becker about that. He chose the name when he built the > first one a few years ago. http://www.beowulf.org/ has a little history.
> > OK,but the history is something additional, and what I've got on that is enough (from you guys and websites) , so no need to bother him. > >>>> Maybe I didn't get you (sorry for that), it looks like that your >>>> explanation on Beowulf not the category about it but the description. >>>> Thank you all the same. >>>> >>>> >>> True enough.. But I'm not sure that developing a HPC taxonomy adds value to >>> the process of doing the computation, or, more to the point, to selecting an >>> appropriate way to solve one's particular computational needs. >>> >>> >> Yes, if I've decided use beowulf(or another one), but how can I make the >> decision if I couldn't tell the differences of the cluster softwares? >> > > Oh.. That's a bit more difficult. And, you're right, the Wikipedia entry > isn't necessarily the best way to go about it. Asking here on the list is > probably a better way. If you describe your problem, then folks will most > likely respond with their opinions of what sort of computational > architecture is well suited. Opinions WILL differ, but I think everyone > here will justify their opinion (some at more length than others) and you > can use that to see if it's applicable to your particular problem. > > >> Hmm, I didn't mean that. I put the question here just because the wiki's >> explanations have puzzled me, and IMO, this mailling list is a >> appropriately place where most fellows should have the related >> knowledge. >> > > You've got that right. This *is* the place to ask. > > > >> For me, if I am *interested* in one software, I'll learn more >> and more details about it and compare it to the others, not just use >> it.If I have offended you,I ask your pardon. >> Many thanks for your reply and the books. >> >> > > Gosh.. No offense at all. Just in case, for those rustic words because of my unskilled English :-[ > But you should be aware that "beowulf" is a fairly > generic description of a general approach to computing, so there's no > "Beowulf software" in the sense of a particular installable program or set > of programs or a Linux distro. > Hmm, maybe the word "project" is more appropriate. > Check out Robert G. Brown's book online.. (there's a link on the > http://www.beowulf.org/ site, if you can't find it elsewhere) > > Jim > Got that link from his signature several days ago, it looks like that I should read the book />. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20081221/0d07e89e/signature.bin From deadline at eadline.org Sun Dec 21 11:08:50 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: <33928.192.168.1.213.1229886530.squirrel@mail.eadline.org> > > You'd have to ask Don Becker about that. He chose the name when he built > the > first one a few years ago. http://www.beowulf.org/ has a little history. > Actually, it was Tom Sterling, from the Linux mag article: (http://www.linux-mag.com/id/1378) Why "Beowulf"? It's possible that you've read this far for the single purpose of finding out where the name "Beowulf" came from. For a long time, we used the cover story about seeking a metaphor for the little guy defeating the big bully, as in "David and Goliath."
The cover story continued that "Goliath" had already been used by a famous historic computer, and the giant was the wrong side of the metaphor anyway. However, I didn't think I would ever get famous building a computer named "David." Searching other cultures for equivalent folklore, I recalled the epic saga of Beowulf, the Scandinavian hero who saved the Hrothgar's Danish kingdom by defeating the monster Grendel. (I didn't need to search the archives of ancient civilizations to discover the saga of Beowulf. My mother did her graduate studies in old and Middle English literature and I was brought up on the stuff.) Good story, but completely untrue; so, I discarded the name as a possibility. But in truth, I'd been struggling to come up with some cutesy acronym and failing miserably. With some small embarrassment, you can find examples of this in our early papers, which included such terms as "piles of PCs" and even "PoPC." The first term was picked up by others at least briefly. Thankfully, the second never was. Then one afternoon, Lisa, Jim Fischer's accounts manager, called me and said, "I've got to file paperwork in 15 minutes and I need the name of your project fast!" or some words to that effect. I was desperate. I looked around my office for inspiration, which had eluded me the entire previous month, and my eyes happened on my old, hardbound copy of Beowulf, which was lying on top of a pile of boxes in the corner. Honestly, I haven't a clue why it was there. As I said, I was desperate. With the phone still in my hand and Lisa waiting not all that patiently on the other end, I said, "What the hell, call it 'Beowulf.' No one will ever hear of it anyway." End of story. But the other truth is I didn't actually name Beowulf, the computer. I only named Beowulf, the project. Someone out there in the land of the press coined the term "Beowulf-class system," not me. I would love to know who it was. That's the real irony: I get the credit for naming Beowulf clusters and actually I didn't do it. -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Sun Dec 21 19:48:04 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: Message-ID: >> I don't think Grid did much to make money, though there was a lot of talk >> about how Grid would bring utility computing - as easy as plugging into a >> wall socket. > > I think the problems with that model are more to do with non-computing > issues than with "making it work". Things like assuring information > security are a big deal for anyone who has serious money to spend on the > computation. I guess I don't really see why security is non-computing issue - to me, Grid is basically irrelevant to HPC because computing is not a fungible commodity. that is, there are vastly different, incompatible kinds of computation, and it really matters where the computing is located and how it's connected. if Java had conquered the world and if the network had become too cheap to meter and all running at a gazillabit, maybe Grid might have worked. (in spite of the fact that latency isn't really subject to technical fixes, and even within a city, nodes will be milliseconds apart - that is millions of clock cycles apart.)
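To put rough numbers on that last point (the figures below are generic assumptions, not anything measured on a particular cluster): a 3 GHz core sits idle for about three million cycles for every millisecond of latency, and light in fibre covers only on the order of 200 km per millisecond before any switching or protocol overhead is counted. A quick back-of-the-envelope check from the shell:

  # cycles spent waiting per millisecond of latency, assuming a 3 GHz clock
  awk 'BEGIN { printf "%.0f cycles per ms\n", 3.0e9 * 1.0e-3 }'

  # one-way propagation delay for an assumed 100 km metro fibre span,
  # taking signal speed in fibre as roughly 200,000 km/s
  awk 'BEGIN { printf "%.2f ms one-way per 100 km\n", 100 / 200000 * 1000 }'

Real wide-area latencies come out higher still once routers and retransmissions are added, which is exactly the problem for tightly coupled codes spread across a grid.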
From prentice at ias.edu Mon Dec 22 05:52:44 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] OOM errors when running HPL In-Reply-To: <494C1A2E.4020109@tuffmail.us> References: <494C14A0.3070402@ias.edu> <494C1A2E.4020109@tuffmail.us> Message-ID: <494F9BAC.1000409@ias.edu> Alan Louis Scheinine wrote: > A year ago large memory jobs would cause AMD nodes to crash > on the cluster for which I was system administrator. > /var/log/messages showed out of memory errors before the crash. > I can't say that the problem has been solved, I refer to last > year because I changed jobs. > > In order to understand if the problem is a known bug (as in the > case cited above) please specify the main board, the amount of > memory, the number of cores and the version of the kernel. > > You wrote: >> I used to run hpl jobs much bigger than this on my cluster w/o a >> problem. > > How does the amount of memory on the new cluster compare to the cluster > in which you did not have a problem. In particular, the amount of > memory per core, assuming all cores were used in your testing. Alan, thanks for the reply. It's the same cluster - jobs that ran on it a few weeks ago, are no longer running. There has been no hardware changes, so I don't think it's a hardware problem. The only difference I can think if is that I'm now using SGE to launch these jobs, which I may not have been doing the last time I ran a job this big. The only other possible software changes are kernel package updates that may have occurred since the last successful run of a job this big. -- Prentice From prentice at ias.edu Mon Dec 22 14:12:39 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] OOM errors when running HPL In-Reply-To: <494C410E.3070002@cs.earlham.edu> References: <494C14A0.3070402@ias.edu> <494C410E.3070002@cs.earlham.edu> Message-ID: <495010D7.4090802@ias.edu> Skylar Thompson wrote: > Prentice Bisbal wrote: >> I've got a new problem with my cluster. Some of this problem may be with >> my queuing system (SGE), but I figured I'd post here first. >> >> I've been using hpl to test my new cluster. I generally run a small >> problem size (Ns=60000)so the job only runs 15-20 minutes. Last night, I >> upped the problem size by a factor of 10 to Ns=600000). Shortly after >> submitting the job, have the nodes were shown as down in Ganglia. >> >> I killed the job with qdel, and the majority of the nodes came back, but >> about 1/3 did not. When I came in this morning, there were kernel >> panic/OOM type messages on the consoles of the systems that never came >> back. >> >> I used to run hpl jobs much bigger than this on my cluster w/o a >> problem. There's nothing I actively changes, but there might have been >> some updates to the OS (kernel, libs, etc) since the last time I ran a >> job this big. Any ideas where I should begin looking? > > I've run into similar problems, and traced it to the way Linux > overcommits RAM. What are your vm.overcommit_memory and > vm.overcommit_ratio sysctls set to, and how much swap and RAM do the > nodes have? > I found the problem - it was me. I never ran HPL problems with Ns=600k. The largest job I ran was ~320k. I figured this out after checking my notes. Sorry for the trouble. However, I did want to configure my systems so that they handle requests for more memory more gracefully, so I added this to my sysctl.conf file (Thanks for the reminder, Skylar!) 
vm.overcommit_memory=2 vm.overcommit_ratio=100 I am actually using this on many of my other computational servers to prevent OOM crashes, but forgot to add this to my cluster nodes. Thanks to everyone for the replies. -- Prentice From rpnabar at gmail.com Mon Dec 22 17:28:51 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh Message-ID: I just installed Nagios to try and monitor my 256 compute nodes centrally. It seems to work like a charm for all the public services (ping, ssh etc.) but now I was getting more ambitious and wanted to try to monitor the private services too (disk usage; process loads; torque ; pbs etc.). I was just confused whether (1) to use the NPRE plugin (seems like a pain to deploy onto all 256 nodes) or (2) go via the check_by_ssh route. (I already have paswordless logins from master-nodes to slave-nodes) I'd like (2) because it is more secure and seems easier to deploy but I'm a bit afraid if this will overtax my central server. Any suggestions? Are other users using Nagios here? -- Rahul From hearnsj at googlemail.com Tue Dec 23 01:30:31 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh In-Reply-To: References: Message-ID: <9f8092cc0812230130x1d639f83uc33b2b7345fdae1a@mail.gmail.com> 2008/12/23 Rahul Nabar > > I'd like (2) because it is more secure and seems easier to deploy but > I'm a bit afraid if this will overtax my central server. > > Any suggestions? Are other users using Nagios here? > Rahul, I'm not a Nagios expert, but I have used NRPE for monitoring quite some time on the past. My answer would be to give it a try - I'm sure that for the workload type monitoring you are talkign about you could turn down the frequency of the checks if the load is too high. It would be interesting to hear your findings on the list when you try it. Re NRPE, why do you say it would be too difficult to deploy? The method would be to get it right on one node, then set it up on other nodes using a parallel command shell, such as pdsh or cexec. You do have something like that in place I trust? John H -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081223/4cc9a762/attachment.html From hearnsj at googlemail.com Tue Dec 23 03:54:39 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Not all cores are created equal Message-ID: <9f8092cc0812230354t3ac23c6y222d07deddc7ec9f@mail.gmail.com> I'm surprised this has not been flagged up yet. Shamelessly passed on from Slashdot: http://www.gcn.com/online/vol1_no1/47765-1.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081223/9513f893/attachment.html From prentice at ias.edu Tue Dec 23 06:41:14 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <9f8092cc0812230354t3ac23c6y222d07deddc7ec9f@mail.gmail.com> References: <9f8092cc0812230354t3ac23c6y222d07deddc7ec9f@mail.gmail.com> Message-ID: <4950F88A.3070102@ias.edu> John Hearns wrote: > I'm surprised this has not been flagged up yet. Shamelessly passed on > from Slashdot: > http://www.gcn.com/online/vol1_no1/47765-1.html John, Thanks for posting. I just read it. 
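Coming back to the overcommit sysctls quoted earlier in this digest: a minimal sketch of applying them to a node and making them persistent might look like the following (the pdsh node list is illustrative, not taken from the thread):

  # Make the settings persistent across reboots, then load them now:
  echo 'vm.overcommit_memory = 2' >> /etc/sysctl.conf
  echo 'vm.overcommit_ratio = 100' >> /etc/sysctl.conf
  sysctl -p

  # Or push the same change to every compute node at once with pdsh
  # (hypothetical node list; adjust to the local naming scheme):
  pdsh -w node[001-256] 'sysctl -w vm.overcommit_memory=2 vm.overcommit_ratio=100'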
The problems the researchers cite are nothing new. Those issues have been around, and written about, for years; the only difference is that the processors are now on the same die.

The issue of handling interrupts and how that can affect performance is discussed in this paper, which determined that in multiprocessor systems it's best to let one processor do all the interrupt handling (and nothing else): http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf

The other problem they mention - when data one processor needs is in another processor's cache, and how that affects performance - is nothing new, either. That's why we have processor affinity. I learned about that in my computer architecture class years ago - before multi-core processors existed.

The above is based on the GCN article, not the actual paper it refers to. I haven't read all 12 pages of it yet, but I did skim through it. It's interesting to note that the research was conducted only on the Intel Architecture. They mention that the AMD NUMA architecture is more complicated and that they hope to do future research on it. Not including the AMDs in this research is lame. Showing the difference in performance between architectures - *that* would be meaningful information that we could all use. But I guess that leaves the authors an opportunity to publish an additional paper on the same topic, so they can list two publications on their CVs instead of one.

-- Prentice

From richard.walsh at comcast.net Tue Dec 23 06:57:37 2008 From: richard.walsh at comcast.net (richard.walsh@comcast.net) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <9f8092cc0812230354t3ac23c6y222d07deddc7ec9f@mail.gmail.com> Message-ID: <1427211240.468231230044257986.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net>

John/All,

It is a useful reminder I guess, but I have to assume that this is something that folks on this list are familiar with, no? State is more complicated for parallel work and less deterministic. As these qualities accumulate in the processing of any workload, a stochastic outcome (timings in this case, but it can also affect precision, as anyone who has compared vector to scalar results remembers) is the result. This effect is another manifestation of the behavior demonstrated by Galton in the late 1800s when he dropped English pence through a pinned board (the pins are the identical cores, the pence parallel processes) ... he got a Gaussian distribution. As the article suggests, you can reduce the variance by limiting the non-determinism in the process, and that is what eXludus (a Montreal-based software scheduling company) is doing at the job and process level. As the number of cores and/or number of processes (virtual or otherwise) grows, so does the variance in outcomes. This is an on-die manifestation of job skew, is it not, and another manifestation of the second law of thermodynamics ...

rbw

----- Original Message ----- From: "John Hearns" To: beowulf@beowulf.org Sent: Tuesday, December 23, 2008 6:54:39 AM GMT -05:00 US/Canada Eastern Subject: [Beowulf] Not all cores are created equal

I'm surprised this has not been flagged up yet. Shamelessly passed on from Slashdot: http://www.gcn.com/online/vol1_no1/47765-1.html

_______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-------------- next part -------------- An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081223/a3f0d3e5/attachment.html From hearnsj at googlemail.com Tue Dec 23 07:01:57 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <4950F88A.3070102@ias.edu> References: <9f8092cc0812230354t3ac23c6y222d07deddc7ec9f@mail.gmail.com> <4950F88A.3070102@ias.edu> Message-ID: <9f8092cc0812230701p5fbb7da9g64042a2a559cd5f7@mail.gmail.com> 2008/12/23 Prentice Bisbal > Regarding the issue handling interrupts and how that can affect > performance is discussed in this paper, which determined that in > multiprocessors systems, it's best to let one processor do all the > interrupt handling (and nothing else) Indeed. SGI Altix have 'bootcpusets' which means you can slice off one or two processors to take care of OS housekeeping tasks, -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081223/e91969c9/attachment.html From landman at scalableinformatics.com Tue Dec 23 07:21:33 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:07 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <4950F88A.3070102@ias.edu> References: <9f8092cc0812230354t3ac23c6y222d07deddc7ec9f@mail.gmail.com> <4950F88A.3070102@ias.edu> Message-ID: <495101FD.8030708@scalableinformatics.com> Prentice Bisbal wrote: > They mention that the AMD NUMA architecture is more complicated and that > they hope to do future research on it. Not including the AMDs in this > research lame. Showing the difference in performance between I am biased in that I know one of the authors. He and his work is anything but "lame". A more appropriate adjective would be "very good". Science tends to proceed in short papers like this one (I did read it), and narrow focus, with suggestions on additional work of potential direct/indirect interest. Which they (as responsible scientists/engineers) include. > architectures - *that* would be meaningful information that we could all > use. But I guess that leaves the authors an opportunity to publish an > additional paper on the same topic, so they can list two publications on > their CVs instead of one. No. Think page limits. We wrote a paper for AINA 2006. After it was accepted, we were told we had to whittle down our 12 pages to 6. For researchers, this means that you will likely get more papers out of the work, as a single large tome is less likely to be accepted at the publication than smaller more focused work. Given where you work, I would think you would be sensitive to this :) ... -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From rpnabar at gmail.com Tue Dec 23 10:24:23 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh In-Reply-To: <67fea89c0812222023j2f692a31ube02570a2b638a36@mail.gmail.com> References: <67fea89c0812222023j2f692a31ube02570a2b638a36@mail.gmail.com> Message-ID: On Mon, Dec 22, 2008 at 10:23 PM, Alex Younts wrote: > At my employer, we use a variety of monitoring tools for our various > clusters. Our nagios box is a VM with a single processor and 512MB of > memory. 
Currently, we monitor 1700 hosts, each with three or four > service checks a piece (two of which SSH to nodes to run scripts). We > check services about every 30 minutes. Thanks Alex! I will give that a shot now! Are there any torque / pbs / maui monitoring Nagios scripts out there? I wanted to avoid reinventing the wheel if at all possible! -- Rahul From csamuel at vpac.org Wed Dec 24 02:03:38 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <9f8092cc0812230701p5fbb7da9g64042a2a559cd5f7@mail.gmail.com> Message-ID: <1273738995.751491230113018208.JavaMail.root@mail.vpac.org> ----- "John Hearns" wrote: > SGI Altix have 'bootcpusets' which means you can slice > off one or two processors to take care of OS housekeeping > tasks, Now that cpusets have been in the mainline kernel for some time you should be able to do this with any modern distro. I contemplated doing this on our Barcelona cluster, but sacrificing 1 core in 8 was a bit too much of a high price to pay. But people with higher core counts per node might find it attractive. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Wed Dec 24 02:07:50 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: Message-ID: <274397079.751521230113270619.JavaMail.root@mail.vpac.org> ----- "Rahul Nabar" wrote: > >Are you using ext3 for that filesystem by some chance ? > > Thanks Chris! No worries! > It is indeed an ext3. I will give the commit interval > solution a shot. I'd love to know whether that helped (or not) ? Happy Newtonmas [1] all! Chris [1] - http://www.paeps.cx/weblog/activism/newtonmas.html -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From hearnsj at googlemail.com Wed Dec 24 02:30:35 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <1273738995.751491230113018208.JavaMail.root@mail.vpac.org> References: <9f8092cc0812230701p5fbb7da9g64042a2a559cd5f7@mail.gmail.com> <1273738995.751491230113018208.JavaMail.root@mail.vpac.org> Message-ID: <9f8092cc0812240230j64bd592axaa9e4eb05b61860a@mail.gmail.com> > I contemplated doing this on our Barcelona cluster, but > sacrificing 1 core in 8 was a bit too much of a high price > to pay. But people with higher core counts per node might > find it attractive. > > My prediction for the New Year - someone will produce a dedicated HPC node with multicore Nehalems, plus a cheap, single core processor for OS 'housekeeping' tasks. Maybe such a board will also have slots for GPU accelerators too. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081224/de2a4e74/attachment.html From hearnsj at googlemail.com Wed Dec 24 02:34:53 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] max number of NFS threads: NFS config optimization In-Reply-To: <274397079.751521230113270619.JavaMail.root@mail.vpac.org> References: <274397079.751521230113270619.JavaMail.root@mail.vpac.org> Message-ID: <9f8092cc0812240234p7f0ce8aaqb854904425f004d8@mail.gmail.com> 2008/12/24 Chris Samuel > > > Happy Newtonmas [1] all! > > Chris > > [1] - http://www.paeps.cx/weblog/activism/newtonmas.html > > Hey! I recognise that picture - Philip Paeps. A friend of mine, and he organises the annual beer drinking on the eve of the FOSDEM conference in Brussels. And, since this is Brussels, there is a large amount of very fine beer drunk. http://www.fosdem.org/2009/ February next year, if anyone wants to come along. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081224/3a1d1a79/attachment.html From csamuel at vpac.org Wed Dec 24 03:26:18 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <9f8092cc0812240230j64bd592axaa9e4eb05b61860a@mail.gmail.com> Message-ID: <980452326.751971230117978621.JavaMail.root@mail.vpac.org> ----- "John Hearns" wrote: > My prediction for the New Year - someone will > produce a dedicated HPC node with multicore Nehalems, > plus a cheap, single core processor for OS 'housekeeping' > tasks. That's, umm, interesting.. :-) Perhaps with a ULP Core 2 from a notebook as the housekeeping one (yeah, it'll have 2 cores, but you can just hotplug one out if you don't want it).. > Maybe such a board will also have slots for GPU > accelerators too. I believe that people are already making external cages with PCIe slots that connect back over a high spec PCIe link. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From rpnabar at gmail.com Wed Dec 24 09:26:58 2008 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] using SNMP to monitor disk usage and load factors on compute-nodes Message-ID: I was toying with the idea of monitoring some key stats from my compute-nodes using SNMP (eg. load factors; local disk usage; health of my pbs_moms etc.). Especially since Nagios docs. seem to recommend snmp as a recommended way to do the monitoring of private resources (as opposed to ssh or nrpe plugins). I've never been familiar with SNMP before (leave that my Dell switches have an option to export stats via SNMP that I never used!) What do the wise-Beowulf-sysadmins have to say? Any caeveats? I checked with "etc/init.d/snmpd status" which reports /etc/init.d/snmpd: Command not found." So I guess I first need to install "net-snmp". My compute-nodes are already behind a firewall so I guess security should not be an issue by running this additional service on my compute-nodes. Perhaps performance takes a tiny hit; but I doubt it! Does SNMP make for a sound monitoring-philosophy and are others using t on their clusters? 
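As a concrete sketch of the SNMP route asked about above -- the community string, host name, and snmpd.conf thresholds here are made-up examples, not values from the thread -- net-snmp's UCD MIB already exposes load and disk usage once snmpd is told to watch them:

  # /etc/snmp/snmpd.conf fragment on a compute node:
  #   load 12 10 5     # 1/5/15-minute load average thresholds
  #   disk / 10%       # complain when / drops below 10% free

  # Query a node's 1-minute load average and root-filesystem usage
  # from the head node:
  snmpget -v 2c -c public node001 UCD-SNMP-MIB::laLoad.1
  snmpget -v 2c -c public node001 UCD-SNMP-MIB::dskPercent.1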
-- Rahul From james.p.lux at jpl.nasa.gov Wed Dec 24 13:15:31 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? Message-ID: Ran across the following quoted line from a SciGen created paper that was accepted to a conference and is getting some play on slashdot: "We performed a quantized emulation on Intel?(TM)s mobile telephones to prove the work of Italian mad scientist J. Dongarra." Recognizing the name, I'm prompted to ask the real question, is Jack an Italian mad scientist? The rest of the paper is full of interesting sentences: " While such a hypothesis is entirely a theoretical ambition, it rarely conflicts with the need to provide operating systems to computational biologists." http://entertainment.slashdot.org/article.pl?sid=08/12/23/2321242 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081224/e3759317/attachment.html From rgb at phy.duke.edu Wed Dec 24 13:56:52 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: References: Message-ID: On Wed, 24 Dec 2008, Lux, James P wrote: > Ran across the following quoted line from a SciGen created paper that was > accepted to a conference and is getting some play on slashdot: > > "We performed a quantized emulation on Intel?(TM)s mobile telephones to > prove the work of Italian mad scientist J. Dongarra.? > > Recognizing the name, I?m prompted to ask the real question, is Jack an > Italian mad scientist? Italian I couldn't say -- his name sounds Irish to me. Mad -- well, he didn't act mad in our last conversation. Not even angry. Crazy? Possibly. All the Irish are Crazy. rgb > > The rest of the paper is full of interesting sentences: > " While such a hypothesis is entirely a theoretical ambition, it rarely > conflicts with the need to provide operating systems to computational > biologists.? As are, apparently, the authors of random slashdot authors...;-) > http://entertainment.slashdot.org/article.pl?sid=08/12/23/2321242 > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From james.p.lux at jpl.nasa.gov Wed Dec 24 14:30:45 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: Message-ID: On 12/24/08 1:56 PM, "Robert G. Brown" wrote: > On Wed, 24 Dec 2008, Lux, James P wrote: > >> Ran across the following quoted line from a SciGen created paper that was >> accepted to a conference and is getting some play on slashdot: >> >> "We performed a quantized emulation on Intel?(TM)s mobile telephones to >> prove the work of Italian mad scientist J. Dongarra.? >> >> Recognizing the name, I?m prompted to ask the real question, is Jack an >> Italian mad scientist? > > Italian I couldn't say -- his name sounds Irish to me. Mad -- well, he > didn't act mad in our last conversation. Not even angry. > > Crazy? Possibly. All the Irish are Crazy. > > I just figured that the authors of SciGen may have had him as a professor, and this was their not-so-subtle dig at him. When you spoke with him did he have an Intel mobile phone? 
Has the effect of SciGen spread even wider than for submitting papers to conferences with low standards in exotic locations (how come I never get invitations in my email to vague multidisciplinary conferences in, say, Elko NV.. Just Orlando and various other places with nice beaches or other scenery..) From rgb at phy.duke.edu Wed Dec 24 14:46:40 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: References: Message-ID: On Wed, 24 Dec 2008, Lux, James P wrote: > When you spoke with him did he have an Intel mobile phone? Has the effect of If he did, he didn't answer it:-) rgb > SciGen spread even wider than for submitting papers to conferences with low > standards in exotic locations (how come I never get invitations in my email > to vague multidisciplinary conferences in, say, Elko NV.. Just Orlando and > various other places with nice beaches or other scenery..) > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From richard.walsh at comcast.net Wed Dec 24 15:28:50 2008 From: richard.walsh at comcast.net (richard.walsh@comcast.net) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: Message-ID: <395100395.648631230161330459.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> All, Actually the name is Sicilian ... although Jack is from Chicago as I recall. rbw ----- Original Message ----- From: "Robert G. Brown" To: "James P Lux" Cc: "Beowulf Mailing List" Sent: Wednesday, December 24, 2008 4:56:52 PM GMT -05:00 US/Canada Eastern Subject: Re: [Beowulf] Is this the J. Dongarra of Beowulf fame? On Wed, 24 Dec 2008, Lux, James P wrote: > Ran across the following quoted line from a SciGen created paper that was > accepted to a conference and is getting some play on slashdot: > > "We performed a quantized emulation on Intel?(TM)s mobile telephones to > prove the work of Italian mad scientist J. Dongarra.? > > Recognizing the name, I?m prompted to ask the real question, is Jack an > Italian mad scientist? Italian I couldn't say -- his name sounds Irish to me. Mad -- well, he didn't act mad in our last conversation. Not even angry. Crazy? Possibly. All the Irish are Crazy. rgb > > The rest of the paper is full of interesting sentences: > " While such a hypothesis is entirely a theoretical ambition, it rarely > conflicts with the need to provide operating systems to computational > biologists.? As are, apparently, the authors of random slashdot authors...;-) > http://entertainment.slashdot.org/article.pl?sid=08/12/23/2321242 > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081224/a7f26caf/attachment.html From patrick at myri.com Wed Dec 24 15:46:23 2008 From: patrick at myri.com (Patrick Geoffray) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? 
In-Reply-To: References: Message-ID: <4952C9CF.3010800@myri.com> Lux, James P wrote: > Recognizing the name, I?m prompted to ask the real question, is Jack an > Italian mad scientist? Jack has Sicilian roots. Patrick From kspaans at student.math.uwaterloo.ca Wed Dec 24 18:13:14 2008 From: kspaans at student.math.uwaterloo.ca (Kyle Spaans) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: References: Message-ID: <20081225021314.GA22381@student.math> Personally, as a 20-year-old enthusiast of beowulfish interests, I've only heard of Mr. Dongarra twice. First from a Swedish mathematics grad student in the #fortran IRC channel, talking about the FORTRAN legacy that Dongarra left behind with NETLIB code. In particular his coding style was mentioned. Secondly I've heard of Prof (I suppose I should say Prof) Dongarra through an Ottawa University professor that I conversed with about FORTRAN related exploits. (The professor is a Prof. Nash, of statistics, from the University of Ottawa.) Take that as you will, but for me it only means that Prof. Dongarra is only tengentially related to beowulf through NETLIB FORTRAN code. And thusly, probably is not a ``mad scientist'' of beowful fame. ;-) From rgb at phy.duke.edu Wed Dec 24 18:59:53 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: <20081225021314.GA22381@student.math> References: <20081225021314.GA22381@student.math> Message-ID: On Wed, 24 Dec 2008, Kyle Spaans wrote: > Take that as you will, but for me it only means that Prof. Dongarra is only > tengentially related to beowulf through NETLIB FORTRAN code. And thusly, probably is > not a ``mad scientist'' of beowful fame. ;-) Dongarra was one of the primary people involved in the original development of PVM, which was the original cross-platform parallel/network programming support package. MPI had very different roots and was not originally designed to support network-based parallel computing. So in a sense, Dongarra was one of the real inventors of the commodity cluster. Post PVM, it was EASY to take existing workstations on a TCP/IP network and write parallel code, and the beowulf design was nearly inevitable as soon as Linux matured to where it could support it. Dongarra is also one of the people who intiated the ATLAS project, a linear algebra package that can yield as much as a factor of 2-3 performance edge over non-tuned linear algebra packages. For people with LA-intensive code, that's like doubling or trippling the size of their clusters. A bit of a mad scientist, sure. Or at least, a Very Smart Guy. rgb > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From patrick at myri.com Wed Dec 24 19:10:48 2008 From: patrick at myri.com (Patrick Geoffray) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: <20081225021314.GA22381@student.math> References: <20081225021314.GA22381@student.math> Message-ID: <4952F9B8.10904@myri.com> Kyle, Kyle Spaans wrote: > Take that as you will, but for me it only means that Prof. 
Dongarra is only > tengentially related to beowulf through NETLIB FORTRAN code. And thusly, probably is > not a ``mad scientist'' of beowful fame. ;-) Jack Dongarra's group has produced a large set of free and open-source middlewares that are still widely used on clusters: HPL, Atlas, Lapack, Scalapack, PAPI and MPI/PVM at large. This was essential for commodity computing to develop. Patrick From alscheinine at tuffmail.us Wed Dec 24 20:51:42 2008 From: alscheinine at tuffmail.us (Alan Louis Scheinine) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Is this the J. Dongarra of Beowulf fame? In-Reply-To: <4952F9B8.10904@myri.com> References: <20081225021314.GA22381@student.math> <4952F9B8.10904@myri.com> Message-ID: <4953115E.2080800@tuffmail.us> I have received almost no E-mail today, aside from 15 messages in my Beowulf folder. The significance is left to the reader to interpret. http://www.netlib.org/utk/people/JackDongarra/ From alsimao at gmail.com Fri Dec 19 09:51:24 2008 From: alsimao at gmail.com (Alcides Simao) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 58, Issue 49 In-Reply-To: <200812191715.mBJHFaTJ017148@bluewest.scyld.com> References: <200812191715.mBJHFaTJ017148@bluewest.scyld.com> Message-ID: <7be8c36b0812190951q6c435c35v8d9e67e02e8d23db@mail.gmail.com> Hello All! Puting it bluntly, Beowulf just refers to the type of equipment used, that is, commodity hardware and usually, Free OpenSource Software! Beowulfs can impersonate a great many type of cluster types, from HA, HPC, Grid, and so on :) If you want a jerkier way of putting this, consider Beowulf a 'Trademark'! You have BlueGenes, and so on, and you have Beowulf :) Just that! Beowulf, made in your own room :) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081219/fa63acca/attachment.html From ayounts at tinkergeek.com Mon Dec 22 20:23:46 2008 From: ayounts at tinkergeek.com (Alex Younts) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh In-Reply-To: References: Message-ID: <67fea89c0812222023j2f692a31ube02570a2b638a36@mail.gmail.com> At my employer, we use a variety of monitoring tools for our various clusters. Our nagios box is a VM with a single processor and 512MB of memory. Currently, we monitor 1700 hosts, each with three or four service checks a piece (two of which SSH to nodes to run scripts). We check services about every 30 minutes. The load on the central box does get up there are at times, but it is generally responsive and there's not much additional network load. We chose SSH based checks because we were already running Ganglia for statistics monitoring on the nodes and no one wanted to maintain yet another daemon.. It seemed like the best option for us. Best of luck with your cluster monitoring! Alex Younts On Mon, Dec 22, 2008 at 8:28 PM, Rahul Nabar wrote: > I just installed Nagios to try and monitor my 256 compute nodes > centrally. It seems to work like a charm for all the public services > (ping, ssh etc.) but now I was getting more ambitious and wanted to > try to monitor the private services too (disk usage; process loads; > torque ; pbs etc.). > > I was just confused whether (1) to use the NPRE plugin (seems like a > pain to deploy onto all 256 nodes) or (2) go via the check_by_ssh > route. 
(I already have paswordless logins from master-nodes to > slave-nodes) > > I'd like (2) because it is more secure and seems easier to deploy but > I'm a bit afraid if this will overtax my central server. > > Any suggestions? Are other users using Nagios here? > > -- > Rahul From ayounts at tinkergeek.com Tue Dec 23 11:05:07 2008 From: ayounts at tinkergeek.com (Alex Younts) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh In-Reply-To: References: <67fea89c0812222023j2f692a31ube02570a2b638a36@mail.gmail.com> Message-ID: <67fea89c0812231105w1105fbd1i630e6c9bf8f6ae8f@mail.gmail.com> We have quite a few different PBS servers running PBSPro 9.x. Our Nagios box has a bare install of the PBSPro and we wrote a check script that runs "pbsnodes -s $cluster-head-node $nodehostname" and checks to see if PBS thinks the node is happy. (We determine which PBS server to hit up based on the host name of the node.) Alex Younts On Tue, Dec 23, 2008 at 1:24 PM, Rahul Nabar wrote: > On Mon, Dec 22, 2008 at 10:23 PM, Alex Younts wrote: >> At my employer, we use a variety of monitoring tools for our various >> clusters. Our nagios box is a VM with a single processor and 512MB of >> memory. Currently, we monitor 1700 hosts, each with three or four >> service checks a piece (two of which SSH to nodes to run scripts). We >> check services about every 30 minutes. > > Thanks Alex! I will give that a shot now! Are there any torque / pbs / > maui monitoring Nagios scripts out there? I wanted to avoid > reinventing the wheel if at all possible! > > -- > Rahul > From gerry.creager at tamu.edu Fri Dec 26 15:16:04 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop Message-ID: <495565B4.4090701@tamu.edu> The subject line says it all: Hadoop: Anyone got any experience with it on clusters (OK, so Google does, but that really wasn't the question, was it?). We've a user who has requested its installation on one of our clusters, a high-throughput system. I'm a bit concerned that it's not gonna be real compatible with, say, Torque/Maui and Gluster, unless we were to install Xen across the whole cluster and instantiate it within Xen VMs. However, before I push all MY fears out into the discussion I'd prefer to see if anyone else has experience and can shed light on compatibility. Thanks, Gerry -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From shaeffer at neuralscape.com Fri Dec 26 20:23:38 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <495565B4.4090701@tamu.edu> References: <495565B4.4090701@tamu.edu> Message-ID: <20081227042338.GA5408@synapse.neuralscape.com> On Fri, Dec 26, 2008 at 05:16:04PM -0600, Gerry Creager wrote: > The subject line says it all: Hadoop: Anyone got any experience with it > on clusters (OK, so Google does, but that really wasn't the question, > was it?). Hi, Google doesn't use Hadoop. Google published some papers on their distributed computing environment that they invented. Then some Java programmers implemented Hadoop after reading the papers published by Google. Hadoop is grossly inefficient, as it is written in Java. But it does work. Folks who use Hadoop include Yahoo. I believe Amazon uses it as well. 
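A minimal sketch of the pbsnodes-based Nagios check Alex describes above -- the script name, the awk parsing, and the set of "happy" states are assumptions on my part, since the original script isn't shown:

  #!/bin/sh
  # check_pbsnode <pbs-server> <node-hostname>
  # Ask the PBS server for the node's state and map it to a Nagios result.
  SERVER=$1
  NODE=$2
  STATE=$(pbsnodes -s "$SERVER" "$NODE" 2>/dev/null | awk '/state =/ {print $3; exit}')
  case "$STATE" in
      free|job-exclusive|job-sharing|busy)
          echo "OK - $NODE is $STATE"
          exit 0 ;;
      *)
          echo "CRITICAL - $NODE state is ${STATE:-unknown}"
          exit 2 ;;
  esac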
If you care about CPU cycles, then you really don't want to get involved with Hadoop. Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From csamuel at vpac.org Sat Dec 27 06:09:02 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <7518762.879191230386922600.JavaMail.root@mail.vpac.org> Message-ID: <6315530.879211230386942982.JavaMail.root@mail.vpac.org> Hi Gerry, ----- "Gerry Creager" wrote: > I'm a bit concerned that it's not gonna be real compatible > with, say, Torque/Maui and Gluster There is the Hadoop on Demand (HOD) project to integrate Hadoop with Torque: http://hadoop.apache.org/core/docs/current/hod.html No idea how well it works though, not something we've been approached about yet. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From laytonjb at att.net Sat Dec 27 06:35:41 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop Message-ID: <373782.24529.qm@web80706.mail.mud.yahoo.com> Sorry for top-posting (I hate these on-line email tools...) Did the person requesting Hadoop ever say why they wanted it? For example, do they have code written in MapReduce or do they think that Hadoop will give them faster throughput than something else? Hadoop is a project that really has 2 parts to it - an open-source MapReduce implementation, and a file system. From people I've talked to, the MapReduce part is used far more than the file system. But I've talked to some of the developers of the file system and there are some people who use the file system. In general the file system is basically a virtual file system ala' PVFS, GlusterFS or any object based storage (Panasas, Lustre). However it understand the idea of locality - that is where useful storage is in relation to the compute part of the problem. The idea being that you can reduce the time to transmit the data because the storage is closer. But, in general, the improvement you get is due to the network topology, not necessarily the file system itself. That's because, in general, MapReduce systems have network topologies with bottlenecks all over the place because they don't really need a full bi-sectional bandwidth network everywhere. So for example they may have good bandwidth to a switch within the rack, but outside the rack, they bandwidth is not so hot. But again, these are generalizations, and the details are always in the implementation. HadoopFS (lack of a better phrase on my part) is really designed for MapReduce codes - transactional codes. So if the person's code(s) fit this model, then it might be an interesting experiment to try. Otherwise, there are much better file systems for HPC :) BTW - I saw Karen's post about using Java with HadoopFS. Be sure to pay attention to that since getting a good 64-bit Java implementation for Linux is not always easy. There are a few out there (Sun has an early access program to a 64-bit Java) but the reports I've heard are that it's still early. Hope this helps. 
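As a concrete taste of the MapReduce side described above, the bundled word-count example is the usual first run; the input file and the exact examples jar name below are illustrative and depend on the release installed:

  # Copy some input into HDFS, run the stock WordCount job, and read
  # the result back out.
  hadoop fs -mkdir input
  hadoop fs -put /tmp/sample.txt input/
  hadoop jar hadoop-*-examples.jar wordcount input output
  hadoop fs -cat 'output/part-*'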
Jeff ________________________________ From: Gerry Creager To: Beowulf Mailing List Sent: Friday, December 26, 2008 6:16:04 PM Subject: [Beowulf] Hadoop The subject line says it all: Hadoop: Anyone got any experience with it on clusters (OK, so Google does, but that really wasn't the question, was it?). We've a user who has requested its installation on one of our clusters, a high-throughput system. I'm a bit concerned that it's not gonna be real compatible with, say, Torque/Maui and Gluster, unless we were to install Xen across the whole cluster and instantiate it within Xen VMs. However, before I push all MY fears out into the discussion I'd prefer to see if anyone else has experience and can shed light on compatibility. Thanks, Gerry -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081227/0111e150/attachment.html From hearnsj at googlemail.com Sat Dec 27 07:03:44 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Tales of Desperaux Message-ID: <9f8092cc0812270703q3e6c5b81x46b04266500b580c@mail.gmail.com> http://www.theregister.co.uk/2008/12/26/soho_rendering_supercomputer/ Excellent article on the 4000 core render farm for Tales of Desperaux, and the Lustre file system behind it. My only connection with the project being that I used to work for Steve Prescott yea long ago... back when the dinosaurs roamed the earth, if anyone gets the reference. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081227/2fb6b65b/attachment.html From gerry.creager at tamu.edu Sat Dec 27 07:39:44 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <20081227042338.GA5408@synapse.neuralscape.com> References: <495565B4.4090701@tamu.edu> <20081227042338.GA5408@synapse.neuralscape.com> Message-ID: <49564C40.2020804@tamu.edu> Karen, Thanks for the clarifications. I'm concerned about the software, but it looks like we'll install Hadoop On Demand, as someone has already promised a user we'd do it... If there were serious pitfalls, I might be able to slow it down some, but simply inefficient isn't sufficient... we have users writing MPI code who daily redefine "inefficient"! Again, thanks! gerry Karen Shaeffer wrote: > On Fri, Dec 26, 2008 at 05:16:04PM -0600, Gerry Creager wrote: >> The subject line says it all: Hadoop: Anyone got any experience with it >> on clusters (OK, so Google does, but that really wasn't the question, >> was it?). > > Hi, > Google doesn't use Hadoop. Google published some papers on their > distributed computing environment that they invented. Then some > Java programmers implemented Hadoop after reading the papers > published by Google. Hadoop is grossly inefficient, as it is written > in Java. But it does work. Folks who use Hadoop include Yahoo. I > believe Amazon uses it as well. If you care about CPU cycles, then > you really don't want to get involved with Hadoop. 
> > Karen -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From gerry.creager at tamu.edu Sat Dec 27 07:59:54 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <373782.24529.qm@web80706.mail.mud.yahoo.com> References: <373782.24529.qm@web80706.mail.mud.yahoo.com> Message-ID: <495650FA.2010105@tamu.edu> Jeff, I'm an old, guy and don't mind top-posts! Thanks for the insight! gerry Jeff Layton wrote: > Sorry for top-posting (I hate these on-line email tools...) > > Did the person requesting Hadoop ever say why they wanted it? For > example, do they have code written in MapReduce or do they think that > Hadoop will give them faster throughput than something else? > > Hadoop is a project that really has 2 parts to it - an open-source > MapReduce implementation, and a file system. From people I've talked to, > the MapReduce part is used far more than the file system. But I've > talked to some of the developers of the file system and there are some > people who use the file system. > > In general the file system is basically a virtual file system ala' PVFS, > GlusterFS or any object based storage (Panasas, Lustre). However it > understand the idea of locality - that is where useful storage is in > relation to the compute part of the problem. The idea being that you can > reduce the time to transmit the data because the storage is closer. But, > in general, the improvement you get is due to the network topology, not > necessarily the file system itself. That's because, in general, > MapReduce systems have network topologies with bottlenecks all over the > place because they don't really need a full bi-sectional bandwidth > network everywhere. So for example they may have good bandwidth to a > switch within the rack, but outside the rack, they bandwidth is not so > hot. But again, these are generalizations, and the details are always in > the implementation. > > HadoopFS (lack of a better phrase on my part) is really designed for > MapReduce codes - transactional codes. So if the person's code(s) fit > this model, then it might be an interesting experiment to try. > Otherwise, there are much better file systems for HPC :) > > BTW - I saw Karen's post about using Java with HadoopFS. Be sure to pay > attention to that since getting a good 64-bit Java implementation for > Linux is not always easy. There are a few out there (Sun has an early > access program to a 64-bit Java) but the reports I've heard are that > it's still early. > > Hope this helps. > > Jeff > > > ------------------------------------------------------------------------ > *From:* Gerry Creager > *To:* Beowulf Mailing List > *Sent:* Friday, December 26, 2008 6:16:04 PM > *Subject:* [Beowulf] Hadoop > > The subject line says it all: Hadoop: Anyone got any experience with it > on clusters (OK, so Google does, but that really wasn't the question, > was it?). > > We've a user who has requested its installation on one of our clusters, > a high-throughput system. I'm a bit concerned that it's not gonna be > real compatible with, say, Torque/Maui and Gluster, unless we were to > install Xen across the whole cluster and instantiate it within Xen VMs. > > However, before I push all MY fears out into the discussion I'd prefer > to see if anyone else has experience and can shed light on compatibility. 
> > Thanks, Gerry > -- > Gerry Creager -- gerry.creager@tamu.edu > Texas Mesonet -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 > Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From landman at scalableinformatics.com Sat Dec 27 08:11:20 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <373782.24529.qm@web80706.mail.mud.yahoo.com> References: <373782.24529.qm@web80706.mail.mud.yahoo.com> Message-ID: <495653A8.5090305@scalableinformatics.com> Jeff Layton wrote: > BTW - I saw Karen's post about using Java with HadoopFS. Be sure to pay > attention to that since getting a good 64-bit Java implementation for > Linux is not always easy. There are a few out there (Sun has an early > access program to a 64-bit Java) but the reports I've heard are that > it's still early. Yeah, 64 bit java is sorta-kinda working. Sun just released a 64 bit Java plugin for nsapi (e.g. firefox/mozilla) oh, only ... 5 years after the first RFE. Not sure how well baked it is, I am playing with it for some of our customers. 64 bit Java shouldn't be hard, as Java VM's are supposed to hide details of the underlying architecture. It is a VM. But at the end of day, there could be (considerable) differences in execution due to object size differences ... that is, unless you completely ignore the underlying native intrinsic data sizes, your execution could have some ... er ... unexpected results. Which might by why Java 64 is so hard to create. They work so hard to hide the details of the underlying system (OS, CPU, memory, IO, network) from you, that in moving to a new ABI, there is so much to change, that it is ... non-trivial ... to do so. This said, I hear of Java's use in HPC every now and then. Some apps are interesting in that they leverage some capability of the underlying platform, like the Pervasive Software DataRush effort, and allow you to hide latency by massively threading their analyses. But as we have noted to many, I don't see the great unwashed masses/hoards of HPC developers rushing to Java due to its (many) downsides. There is a real tangible measurable performance penalty to abstraction. Introduce too much and you spend more time traversing the abstraction classes than you do doing the computation. Heck, we can't even write good compilers for non-OO code (e.g. compilers that generate near optimal instruction paths on existing CPUs on significant fraction of HPC code bases). Are we expecting to write even better JIT compilers and optimizers to solve a more difficult problem than the one we have basically punted on? I am a huge believer in programmer productivity (though I dispute the notion that Java's incredibly draconian type system coupled with its verbosity actually contributes to productivity), but underlying code performance is still one of the most important aspects in HPC. DataRush solves this by hiding latency of each task, by having so many tasks to work on. Sort of a Java version (weak analogy) of the old Tera MTA system. 
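As an aside on the 64-bit Java question raised earlier in this message: checking which data model a given JVM will actually run with is a one-liner (the -d64 flag is specific to Sun/HotSpot builds of that era):

  # Report the JVM's data model; on Sun JVMs the -d64 flag selects the
  # 64-bit VM, or fails if none is installed.
  java -version
  java -d64 -version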
Other codes like Hadoop could do similar things ... schedule so much work that some actually gets done. A nascent (yet very real) problem for Java in addition to the above mentioned, for HPC usage going forward, is their complete lack of support for accelerators. Maybe someday, in another decade or so, they will start to support GPU computation ... not talking about OpenCL support, but real execution on the many cores that accelerators supply. The underlying architecture is changing fast enough that I don't think they can keep up. And end users want the performance. This provides a net incentive not to use Java, as it can't currently (or in the foreseeable future) support the emerging personal supercomputing systems with accelerators. Sure it can run on the CPUs, but then like all other codes, it runs head first into the memory wall, the IO bandwidth walls, and so forth. Sun of course will claim that the trick is to massively multithread the code, which means you don't focus on individual thread performance but on overall throughput. Which somewhat flies in the face of what HPC developers have been talking about for decades (tune for single processors first, then for parallel). So I won't disparage the users or use of Java in HPC, other than to note that the future on that platform in HPC may not be as bright as some marketeers might suggest. N.B. the recent MPI class we gave suggested that we need to re-tool it to focus more upon Fortran than C. There was no interest in Java from the class I polled. Some researchers want to use Matlab for their work, but most university computing facilities are loathe to spend the money to get site licenses for Matlab. Unfortunate, as Matlab is a very cool tool (been playing with it first in 1988 ...) its just not fast. The folks at Interactive Supercomputing might be able to help with this with their compiler. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From laytonjb at att.net Sat Dec 27 13:04:55 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop Message-ID: <626490.10539.qm@web80706.mail.mud.yahoo.com> I hate to tangent (hijack?) this subject, but I'm curious about your class poll. Did the people who were interested in Matlab consider Octave? Thanks! Jeff ________________________________ From: Joe Landman To: Jeff Layton Cc: Gerry Creager ; Beowulf Mailing List Sent: Saturday, December 27, 2008 11:11:20 AM Subject: Re: [Beowulf] Hadoop N.B. the recent MPI class we gave suggested that we need to re-tool it to focus more upon Fortran than C. There was no interest in Java from the class I polled. Some researchers want to use Matlab for their work, but most university computing facilities are loathe to spend the money to get site licenses for Matlab. Unfortunate, as Matlab is a very cool tool (been playing with it first in 1988 ...) its just not fast. The folks at Interactive Supercomputing might be able to help with this with their compiler. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081227/30bdb4fd/attachment.html From tjrc at sanger.ac.uk Sun Dec 28 01:24:51 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <626490.10539.qm@web80706.mail.mud.yahoo.com> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> Message-ID: <80FA8839-5AB1-4165-B2DF-AA58F5E93486@sanger.ac.uk> On 27 Dec 2008, at 9:04 pm, Jeff Layton wrote: > I hate to tangent (hijack?) this subject, but I'm curious about your > class poll. Did the people who were interested in Matlab consider > Octave? I can't speak for Joe's class, but when I've asked Matlab users here about using Octave instead, they're generally not interested. Partly this is a somewhat irrational "it doesn't have Matlab on the cover" thing, but largely it's the Matlab toolboxes they want access to. A recurrent theme we find with anyone doing stuff with any similar package, be it Matlab, Stata, R or whatever, is that they always hit massive difficulties when they try to scale up their analysis to larger problem sizes. If they come to me early in their project, I will point this out in advance, and suggest that while such languages are great for prototyping, they don't scale well. Usually, of course, the warning is ignored (they didn't pay for this advice, so no need to take it, right?) and six months down the line their stuff doesn't work any more when they try to run it on large production datasets, and I have to avoid the "I told you so" speech... :-) People have mentioned Hadoop here, and it might work for some types of work being done (I'm interested in the data-locality feature of its filesystem that was mentioned earlier in the thread) but no-one's using it at this site yet as far as I know. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From csamuel at vpac.org Sun Dec 28 02:13:18 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <80FA8839-5AB1-4165-B2DF-AA58F5E93486@sanger.ac.uk> Message-ID: <462378153.921871230459198422.JavaMail.root@mail.vpac.org> ----- "Tim Cutts" wrote: > I can't speak for Joe's class, but when I've asked Matlab users here > about using Octave instead, they're generally not interested. Partly > this is a somewhat irrational "it doesn't have Matlab on the cover" > thing, but largely it's the Matlab toolboxes they want access to. Very similar experiences here, even when it was impossible for us to get a 3rd party license for MATLAB (i.e. one our users at the member Universities could use). Now we've crawled over broken glass to get MATLAB DCS working for them and we do have some people taking it up, but it's a pretty hard configuration curve for them. :-( cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sun Dec 28 02:15:59 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <495653A8.5090305@scalableinformatics.com> Message-ID: <1200996302.921901230459359077.JavaMail.root@mail.vpac.org> ----- "Joe Landman" wrote: Hi Joe, hope you're feeling better! 
> This said, I hear of Java's use in HPC every now and then. We have a few people using Java on the clusters here, our suspicion is mainly because that's all they've been taught (or have taught themselves). :-( cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From laytonjb at att.net Sun Dec 28 07:17:37 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop Message-ID: <420088.84425.qm@web80702.mail.mud.yahoo.com> I think I understand why people want the toolboxes - it makes coding easy. From what I've seen people then stay with the "prototype" code and never move to a compiled language such as C or Fortran. It's been a long, long time, but I did all of my code prototyping for my PhD in Matlab and rewrote it in Fortran. I was easily able to get a 10x speedup. I guess people don't like 10x improvements in performance any more :) An option for people is to use the Matlab compiler to build faster code. I've never used it but the reports I've seen is that it works quite well. I'm not sure about how the toolboxes work with it - whether you compile them as well or they run as interpreted along with the compiled code. Octave has a number of toolboxes. I'm not sure if it covers what various people are doing, but they are out there. For larger problems, what about using ScaleMP? You can get a very large SMP system for running large problems. You may not need the extra CPUs, but they do get you the extra memory. That's one of the neat things about ScaleMP. You buy a couple of nodes with really fast CPUs and memory, and then buy the remaining nodes with really slow processors (cheap) with lots of memory. This allows you to tailor the SMP box to the application. Just be sure to run the code on the fastest CPUs (numactl). Thanks for the feedback. Jeff ________________________________ From: Tim Cutts To: Jeff Layton Cc: landman@scalableinformatics.com; Beowulf Mailing List Sent: Sunday, December 28, 2008 4:24:51 AM Subject: Re: [Beowulf] Hadoop On 27 Dec 2008, at 9:04 pm, Jeff Layton wrote: > I hate to tangent (hijack?) this subject, but I'm curious about your > class poll. Did the people who were interested in Matlab consider > Octave? I can't speak for Joe's class, but when I've asked Matlab users here about using Octave instead, they're generally not interested. Partly this is a somewhat irrational "it doesn't have Matlab on the cover" thing, but largely it's the Matlab toolboxes they want access to. A recurrent theme we find with anyone doing stuff with any similar package, be it Matlab, Stata, R or whatever, is that they always hit massive difficulties when they try to scale up their analysis to larger problem sizes. If they come to me early in their project, I will point this out in advance, and suggest that while such languages are great for prototyping, they don't scale well. Usually, of course, the warning is ignored (they didn't pay for this advice, so no need to take it, right?) and six months down the line their stuff doesn't work any more when they try to run it on large production datasets, and I have to avoid the "I told you so" speech... :-) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20081228/9b1c1a44/attachment.html From james.p.lux at jpl.nasa.gov Sun Dec 28 10:28:59 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <420088.84425.qm@web80702.mail.mud.yahoo.com> Message-ID: On 12/28/08 7:17 AM, "Jeff Layton" wrote: > I think I understand why people want the toolboxes - it makes coding easy. > From what I've seen people then stay with the "prototype" code and never move > to a compiled language such as C or Fortran. It's been a long, long time, but > I did all of my code prototyping for my PhD in Matlab and rewrote it in > Fortran. I was easily able to get a 10x speedup. I guess people don't like 10x > improvements in performance any more :) Bearing in mind Hamming's admonition "the purpose of computing is insight, not numbers", it could well be that for a "research" application (contrasted with production) you don't need the 10x speedup. If the slow, easy to code, version gives you the answers you need in reasonable time, why change. OTOH, if your dissertation problem requires computation that takes months in Matlab, you've got two paths (at least): 1) spend some time (less than months) to learn how to code in a faster style/language/syste 2) reframe your problem so it doesn't require the computation (and then convince your committee of this, which could take longer than just doing the computation, eh?) If you're interested in faster than real time modeling, though, (say, you're doing real time control in robotics), then fast speed is essential. > > An option for people is to use the Matlab compiler to build faster code. I've > never used it but the reports I've seen is that it works quite well. I'm not > sure about how the toolboxes work with it - whether you compile them as well > or they run as interpreted along with the compiled code. Works quite well (a lot of the toolboxes are compiled as well).. Just don't expect to look at and debug the output of the compiler, so good test cases are important. Ad hoc development styles tend to be dicey. > > Octave has a number of toolboxes. I'm not sure if it covers what various > people are doing, but they are out there. Octave also does compilation. That is, just like Matlab, it does a just in time style of compilation to an intermediate form. A loop doesn't get reinterpreted on each pass through the loop. From laytonjb at att.net Sun Dec 28 11:43:45 2008 From: laytonjb at att.net (Jeff Layton) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop Message-ID: <707005.68683.qm@web80703.mail.mud.yahoo.com> Good points (as always). I've seen good examples of people using an interpreted language. It is easy to develop new code and algorithms and test them (I do that quite often). As you point out, this is great for some people who don't need to run the code many times. I've also seen people use the prototypes to develop compiled applications resulting in huge performance gains when they finalize an algorithm or as you point out, need the speed for their particular application. What I worry about are the "in-between" cases where people stick with interpreted code for whatever reason, even though they need the speed of a compiled application. I've seen example of this all too often. In many of the cases I've seen, these interpreted applications don't evolve and then the users start screaming that their applications don't scale so they need faster hardware, implying $$$. 
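For what it's worth, the usual middle ground is to keep the Matlab/Octave front end and move only the hot loop into a compiled extension (a MEX function) rather than rewriting the whole application. The sketch below is purely illustrative: the file name, the sumsq kernel and the build lines are made up for this note, not taken from anyone's real code, so check them against your local install. It builds with "mex sumsq.cpp" under Matlab or "mkoctfile --mex sumsq.cpp" under Octave, and is then called as s = sumsq(x) from the interpreter.

/* sumsq.cpp -- hypothetical sketch: move a hot loop out of interpreted
 * M-code into a compiled MEX function.  The surrounding script stays as
 * the prototype; only this kernel is frozen into C++.
 */
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nrhs != 1 || !mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]))
        mexErrMsgTxt("sumsq: expected one real double array");

    const double *x = mxGetPr(prhs[0]);               /* input data    */
    const mwSize  n = mxGetNumberOfElements(prhs[0]); /* element count */

    double acc = 0.0;
    for (mwSize i = 0; i < n; ++i)     /* the loop that was slow when   */
        acc += x[i] * x[i];            /* interpreted element by element */

    plhs[0] = mxCreateDoubleScalar(acc); /* return a scalar to the caller */
}

Much of the 10x mentioned above usually lives in a handful of loops like this one, and the approach avoids forcing users to abandon the environment they prototyped in.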
Ultimately, I think it's a trade-off between $$ spent on rewriting the application for for compiled languages or spending $$ on faster hardware. So, it's a case by case decision :) Thanks for the comments. Good insight that I missed. Plus, as usual, I've gotten way off track and I hate this silly web based email tool that doesn't have a good way to do quoted or indented replies. :) So I'll stop here. Jeff ________________________________ From: "Lux, James P" To: Jeff Layton ; Tim Cutts Cc: Beowulf Mailing List Sent: Sunday, December 28, 2008 1:28:59 PM Subject: Re: [Beowulf] Hadoop On 12/28/08 7:17 AM, "Jeff Layton" wrote: > I think I understand why people want the toolboxes - it makes coding easy. > From what I've seen people then stay with the "prototype" code and never move > to a compiled language such as C or Fortran. It's been a long, long time, but > I did all of my code prototyping for my PhD in Matlab and rewrote it in > Fortran. I was easily able to get a 10x speedup. I guess people don't like 10x > improvements in performance any more :) Bearing in mind Hamming's admonition "the purpose of computing is insight, not numbers", it could well be that for a "research" application (contrasted with production) you don't need the 10x speedup. If the slow, easy to code, version gives you the answers you need in reasonable time, why change. OTOH, if your dissertation problem requires computation that takes months in Matlab, you've got two paths (at least): 1) spend some time (less than months) to learn how to code in a faster style/language/syste 2) reframe your problem so it doesn't require the computation (and then convince your committee of this, which could take longer than just doing the computation, eh?) If you're interested in faster than real time modeling, though, (say, you're doing real time control in robotics), then fast speed is essential. > > An option for people is to use the Matlab compiler to build faster code. I've > never used it but the reports I've seen is that it works quite well. I'm not > sure about how the toolboxes work with it - whether you compile them as well > or they run as interpreted along with the compiled code. Works quite well (a lot of the toolboxes are compiled as well).. Just don't expect to look at and debug the output of the compiler, so good test cases are important. Ad hoc development styles tend to be dicey. > > Octave has a number of toolboxes. I'm not sure if it covers what various > people are doing, but they are out there. Octave also does compilation. That is, just like Matlab, it does a just in time style of compilation to an intermediate form. A loop doesn't get reinterpreted on each pass through the loop. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081228/01277535/attachment.html From gerry.creager at tamu.edu Mon Dec 29 07:01:21 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <626490.10539.qm@web80706.mail.mud.yahoo.com> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> Message-ID: <4958E641.9050307@tamu.edu> OUR users are willing to pony up the funds to buy Matlab. We're already running Octave but they claimed they didn't know how to use it. Even after we showed them Matlab scripts that "just ran" on Octave. As for Fortran vs C, "real scientists program in Fortran. Real Old Scientists program in Fortran-66. 
Carbon-dated scientists can still recall IBM FORTRAN-G and -H." Actually, a number of our mathematicians use C for their codes, but don't seem to be doing much more than theoretical codes. The guys who're wwriting/rewriting practical codes (weather models, computational chemistry, reservoir simulations in solid earth) seem to stick to Fortran here. gerry Jeff Layton wrote: > I hate to tangent (hijack?) this subject, but I'm curious about your > class poll. Did the people who were interested in Matlab consider Octave? > > Thanks! > > Jeff > > ------------------------------------------------------------------------ > *From:* Joe Landman > *To:* Jeff Layton > *Cc:* Gerry Creager ; Beowulf Mailing List > > *Sent:* Saturday, December 27, 2008 11:11:20 AM > *Subject:* Re: [Beowulf] Hadoop > > N.B. the recent MPI class we gave suggested that we need to re-tool it > to focus more upon Fortran than C. There was no interest in Java from > the class I polled. Some researchers want to use Matlab for their work, > but most university computing facilities are loathe to spend the money > to get site licenses for Matlab. Unfortunate, as Matlab is a very cool > tool (been playing with it first in 1988 ...) its just not fast. The > folks at Interactive Supercomputing might be able to help with this with > their compiler. > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From kus at free.net Mon Dec 29 08:33:23 2008 From: kus at free.net (Mikhail Kuzminsky) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <4958E641.9050307@tamu.edu> Message-ID: In message from Gerry Creager (Mon, 29 Dec 2008 09:01:21 -0600): > >As for Fortran vs C, "real scientists program in Fortran. Real Old >Scientists program in Fortran-66. Carbon-dated scientists can still >recall IBM FORTRAN-G and -H." :-) I didn't check, but may be I just have Fortran-G and H on my PC - as a part of free Turnkey MVS distribution working w/(free) Hercules emulator for IBM mainframes. > >Actually, a number of our mathematicians use C for their codes, but >don't seem to be doing much more than theoretical codes. The guys >who're wwriting/rewriting practical codes (weather models, >computational chemistry, reservoir simulations in solid earth) seem > to stick to Fortran here. Our group works in area of computational chemistry, and of course we write the programs on Fortran (95) :-) But I'm afraid that we'll start here the new cycle of "religious language war" :-) Mikhail Kuzminsky Computer Assistance to Chemical Research Center Zelinsky Institute of Organic Chemistry Moscow >gerry > >Jeff Layton wrote: >> I hate to tangent (hijack?) this subject, but I'm curious about your >> class poll. Did the people who were interested in Matlab consider >>Octave? >> >> Thanks! >> >> Jeff >> >> ------------------------------------------------------------------------ >> *From:* Joe Landman >> *To:* Jeff Layton >> *Cc:* Gerry Creager ; Beowulf Mailing List >> >> *Sent:* Saturday, December 27, 2008 11:11:20 AM >> *Subject:* Re: [Beowulf] Hadoop >> >> N.B. the recent MPI class we gave suggested that we need to re-tool >>it >> to focus more upon Fortran than C. There was no interest in Java >>from >> the class I polled. Some researchers want to use Matlab for their >>work, >> but most university computing facilities are loathe to spend the >>money >> to get site licenses for Matlab. 
Unfortunate, as Matlab is a very >>cool >> tool (been playing with it first in 1988 ...) its just not fast. >> The >> folks at Interactive Supercomputing might be able to help with this >>with >> their compiler. >> > >-- >Gerry Creager -- gerry.creager@tamu.edu >Texas Mesonet -- AATLT, Texas A&M University >Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 >Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 >_______________________________________________ >Beowulf mailing list, Beowulf@beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf > >-- >This message has been scanned for viruses and >dangerous content by MailScanner, and is >believed to be clean. > From james.p.lux at jpl.nasa.gov Mon Dec 29 08:48:37 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:08 2009 Subject: Octave vs Matlab Re: [Beowulf] Hadoop In-Reply-To: <4958E641.9050307@tamu.edu> Message-ID: On 12/29/08 7:01 AM, "Gerry Creager" wrote: > OUR users are willing to pony up the funds to buy Matlab. We're already > running Octave but they claimed they didn't know how to use it. Even > after we showed them Matlab scripts that "just ran" on Octave. I use both on Windows and MacOS X.. While it's true that Matlab scripts just run on Octave, the "setup" process is somewhat trickier for Octave on Windows. For instance, the whole Octave thing is very unix-like, especially in terms of paths, file functions, etc. so it's not quite so load and go as Matlab. Matlab also has a GUI which makes doing things like setting the path, working directory, etc, much easier for an unsophisticated user. Matlab's graphics and visualization functions are far more sophisticated than Octave's too. Octave has nothing comparable to Simulink or RealTimeWorkbench, which are sort of a step beyond a simple "toolbox" (e.g. Cranking out something like ccmatlab to make ccoctave shouldn't be a big challenge, or, for instance, the mapping toolbox.. But duplicating Simulink would be a lot of work) The other thing is that there are useful things out there for Matlab (e.g. A BSD Sockets library) that aren't trivially obtainable for Octave. To a certain extent, this is just because the originator has Matlab, and has only bothered to develop/compile/package things up for Matlab (i.e. They're basically making available something they had to build for their own purposes, not trying to create a product, per se). This is particularly the case when your Matlab app is trying to talk to other processes or hardware. The man or woman with the weird hardware is going to slog through doing the drivers once, perhaps distribute them for free once done (because they were paid by the taxpayer, for instance), but isn't going to do it again for a different environment, just out of the goodness of their heart. So, for just running standalone computational stuff, Octave gives Matlab a good run for the money, but as soon as you start thinking of Matlab as a generalized tool for tinkering about and such, it starts to lag behind. You pay for the ease and slickness of the Matlab tool. What might be interesting is a sort of cross development model:i.e. Develop in Matlab, which is nicer for interactive stuff, then use Octave, etc, to run the production job on a cluster (so you don't have to deal with Mathworks weird licensing on clusters). Octave vs Matlab is sort of like OO vs MSOffice.. It mostly works and for many tasks is perfectly adequate, but it's not the same. 
Jim Lux From hearnsj at googlemail.com Mon Dec 29 09:15:35 2008 From: hearnsj at googlemail.com (John Hearns) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: References: <4958E641.9050307@tamu.edu> Message-ID: <9f8092cc0812290915x45ee8808sc226b0f0c3de407e@mail.gmail.com> > > > :-) I didn't check, but may be I just have Fortran-G and H on my PC - as a > part of free Turnkey MVS distribution working w/(free) Hercules emulator for > IBM mainframes. >> >> > > Ah... Job Control Language. Deep, deep joy. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081229/bd1679d6/attachment.html From gus at ldeo.columbia.edu Mon Dec 29 09:18:32 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <4958E641.9050307@tamu.edu> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> <4958E641.9050307@tamu.edu> Message-ID: <49590668.4030007@ldeo.columbia.edu> Hello Beowulfers (This thread should be renamed "Matlab and Octave".) Matlab is the "lingua franca" for computing among students and young scientists, at least in Earth Sciences (solid earth, atmosphere/oceans/climate, geochemistry, etc), as I observe it here. A number of our students come from Physics, Chemistry, Biology, etc, hence the trend is probably more widespread. Some can get by graduate school even with Excel only. As others observed on this thread, Matlab is a great prototyping tool, which makes it very attractive. Integrated environment, with GUI, editor, online help, programming examples and tips, and instant visualization of results, is yet another high point of Matlab. For most people this type of environment is not only convenient, but also addictive. I like Octave, the command line is virtually identical to Matlab, but couldn't get all these GUI-sh bells and whistles to work in Octave. Because of this dependence, our Observatory has a Matlab site license. Moreover, several top numerical models in oceans and climate depend heavily on Matlab scripts for post-processing and data analysis. This may be the case in other areas too. For instance, not long ago I saw several job ads for Matlab programmers in the Princeton Plasma Physics Lab. As Matlab scripts and tasks get bigger and bigger, the positive feedback created the need and market for parallel versions of Matlab. In many cases Matlab is the only programming environment that science and engineering students came across with. It is introduced on Linear Algebra, Numerical Analysis, Signal Processing, and other classes, and it sticks, it settles down. As James, Gerry and others observed, a lot of people only need to do prototyping anyway: proof of concept, one-time calculations of modest size, and for this Matlab works very well. Matlab's cavalier approach to memory management - or perhaps the inadvertent cavalier approach to Matlab by naive users - may be the main cause for the scaling problems. Most failures I've seen in Matlab scripts come from exhaustion of computer resources, particularly memory. Even when you free memory judiciously, problems may arise. This happens here very often with people trying to do, say, singular value decomposition or principal component analysis on huge and dense matrices / datasets, etc. In the old days of punched cards, Fortran was part of the engineering and scientific training. Fortran was king in Intro to Computers classes or similar. That is no longer true. 
Fortran lost its charm and status among computer scientists (even John Backus abandoned it). In addition, today most college scientific curricula take for granted the computer literacy of its freshmen students. A mistake, I think. (A few students are great hackers, but most only know Skype, Facebook, MS Word.) I think Intro to Computers courses would continue to be useful for engineers and science majors. (Not for prospective computer scientists, of course, who need much more than that.) These courses should include basic Unix/Linux literacy, shell scripting (or Perl, or Python), the old-fashioned but effective principles of "structured programming" (call it "modular programming" to make it palatable), and the rudiments of a language of choice. This language may be Fortran, which continues to be the dominant one in science and engineering code, or perhaps C. However, when these Intro to Computers courses exist, they try to teach Java, C++, etc, often using Microsoft Studio, or another programming environment that traps the user, and doesn't give him/her the required computer craftsmanship (and autonomy) for their professional life. For most prospective engineers and general scientists a computer is more of a tool than a theoretical model. OO-languages, Turing machines, cellular automata, make nice class discussion topics, but can't replace the development of basic computer literacy and skills. My $0.02. Gus Correa --------------------------------------------------------------------- Gustavo Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA --------------------------------------------------------------------- Gerry Creager wrote: > OUR users are willing to pony up the funds to buy Matlab. We're > already running Octave but they claimed they didn't know how to use > it. Even after we showed them Matlab scripts that "just ran" on Octave. > > As for Fortran vs C, "real scientists program in Fortran. Real Old > Scientists program in Fortran-66. Carbon-dated scientists can still > recall IBM FORTRAN-G and -H." > > Actually, a number of our mathematicians use C for their codes, but > don't seem to be doing much more than theoretical codes. The guys > who're wwriting/rewriting practical codes (weather models, > computational chemistry, reservoir simulations in solid earth) seem to > stick to Fortran here. > > gerry > > Jeff Layton wrote: >> I hate to tangent (hijack?) this subject, but I'm curious about your >> class poll. Did the people who were interested in Matlab consider >> Octave? >> >> Thanks! >> >> Jeff >> >> ------------------------------------------------------------------------ >> *From:* Joe Landman >> *To:* Jeff Layton >> *Cc:* Gerry Creager ; Beowulf Mailing List >> >> *Sent:* Saturday, December 27, 2008 11:11:20 AM >> *Subject:* Re: [Beowulf] Hadoop >> >> N.B. the recent MPI class we gave suggested that we need to re-tool it >> to focus more upon Fortran than C. There was no interest in Java from >> the class I polled. Some researchers want to use Matlab for their work, >> but most university computing facilities are loathe to spend the money >> to get site licenses for Matlab. Unfortunate, as Matlab is a very cool >> tool (been playing with it first in 1988 ...) its just not fast. The >> folks at Interactive Supercomputing might be able to help with this with >> their compiler. 
>> > From gus at ldeo.columbia.edu Mon Dec 29 12:38:15 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Re: Matlab and Octave In-Reply-To: <20081229182145.GB9919@bx9> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> <4958E641.9050307@tamu.edu> <49590668.4030007@ldeo.columbia.edu> <20081229182145.GB9919@bx9> Message-ID: <49593537.9020001@ldeo.columbia.edu> Greg Lindahl wrote: > On Mon, Dec 29, 2008 at 12:18:32PM -0500, Gus Correa wrote: > > >> (This thread should be renamed "Matlab and Octave".) >> > > Indeed, it only takes a few seconds to change a subject... > > I'm surprised that Columbia doesn't still have a Fortran or > computing-for-scientists class; they are often found in the Physics > department, or in CS but taught by a physicist. > > -- greg > Hi Greg, Beowulfers My statements about Intro to Computers courses were generic, not specific about this university. Our online course catalog (Fall/08, Spring/09) shows four Intro to Computers course flavors: Programming Matlab (two sections, one specific for Life Sciences) Programming C, Programming Java, and one generic Intro to Information Science (a survey of different things: WWW, databases, human-computer interfaces, etc) A search with the keyword "Fortran" didn't show any result. Besides Matlab, some courses also use Mathematica or Maple for programming. Prototyping tools seem to be preferred to computer languages. OO-languages seem to be preferred to procedural ones. C is preferred to Fortran. Basic Unix/Linux skills don't seem to be covered anywhere. I don't have any statistics or data, but I guess this is the picture across the country. However, there is computer and programming expertise spread across many departments, and there may be courses that use Fortran, as you supposed. To name a few, the QCD people in Physics (associated to the IBM BlueGene prototype), the computational chemists, the engineers, the professors at Applied Physics and Applied Math, and the medical imaging and genetics folks, etc, most likely are skilled in Fortran, teaching to and learning from their peers. However, these may not be general introductory courses for a broad audience, at least I couldn't find one with these characteristics. Our Earth Science Department has a small number of undergraduate students, and a large number in graduate school, coming from across the country and from abroad, not necessarily from Columbia. This is the sample I interact with. Our foreign students tend to have had more exposure to Unix/Linux and to programming than those from the US. My daughter's recent freshman Intro to Computers class at another high-ranked college consisted of C programming (K&R was the textbook) with OpenGL examples. I would guess fashionable/pedantic approaches push young people with no previous exposure to Unix and programming towards the more comfortable, useful, and sensible Matlab, and turn them away from other (equally useful and important) computer tools. 
Gus Correa --------------------------------------------------------------------- Lamont-Doherty Earth Observatory - Columbia University --------------------------------------------------------------------- From lindahl at pbm.com Mon Dec 29 13:46:48 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Re: Matlab and Octave In-Reply-To: <49593537.9020001@ldeo.columbia.edu> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> <4958E641.9050307@tamu.edu> <49590668.4030007@ldeo.columbia.edu> <20081229182145.GB9919@bx9> <49593537.9020001@ldeo.columbia.edu> Message-ID: <20081229214648.GF20059@bx9> On Mon, Dec 29, 2008 at 03:38:15PM -0500, Gus Correa wrote: > I don't have any statistics or data, but I guess this is the picture > across the country. Most people need 2 data points before they generalize to the entire country. Many useless threads on this mailing list come from debating generalizations from 1 or 2 data points. I suggest that we not do that. -- greg From lindahl at pbm.com Mon Dec 29 14:09:34 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Hadoop In-Reply-To: <495565B4.4090701@tamu.edu> References: <495565B4.4090701@tamu.edu> Message-ID: <20081229220934.GG20059@bx9> On Fri, Dec 26, 2008 at 05:16:04PM -0600, Gerry Creager wrote: > We've a user who has requested its installation on one of our clusters, > a high-throughput system. You didn't say anything about what they wanted to do. Hadoop is designed to store a lot of data, and then enable what we HPC people would call nearly-embarrassingly-parallel computation with good locality -- it takes shards of mapreduce computation to run on the same system as the disk shards being processed. This means you'll have to dedicate systems over the long term to store the data (much like PVFS), and all of these systems will have to be a part of their mapreduce jobs. So if your queue system can run whole-cluster jobs easily, no problem. If, instead, they're just looking for a simple way to do embarrassingly parallel computations, without lots of persistent data, then you can probably point them at something easier and more friendly to your queue system. -- greg From gus at ldeo.columbia.edu Mon Dec 29 14:46:12 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Re: Matlab and Octave In-Reply-To: <20081229214648.GF20059@bx9> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> <4958E641.9050307@tamu.edu> <49590668.4030007@ldeo.columbia.edu> <20081229182145.GB9919@bx9> <49593537.9020001@ldeo.columbia.edu> <20081229214648.GF20059@bx9> Message-ID: <49595334.5040207@ldeo.columbia.edu> Greg Lindahl wrote: > On Mon, Dec 29, 2008 at 03:38:15PM -0500, Gus Correa wrote: > > >> I don't have any statistics or data, but I guess this is the picture >> across the country. >> > > Most people need 2 data points before they generalize to the entire > country. Many useless threads on this mailing list come from debating > generalizations from 1 or 2 data points. I suggest that we not do > that. > > -- greg > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Hi Greg, Beowulfers One sentence out of context is also only one data point to generalize from, right? If anything, blame me, don't blame most, please. 
:) Our student sample here is quite varied, and on average confirms what I said. There are exceptions, as always. I agree this doesn't make proper statistical evidence, but it is an indication. Other people's comments on this thread suggested the same too: prototyping tools (Matlab, Mathematica, etc) seem to be preferentially used by (and taught to) science and engineering students and young professionals, not Fortran or C, not the seasoned Unix/Linux programming environment and tools. Regardless of whether one thinks this trend is good or bad, progressive or not. It would be interesting to see the statistics of the "Computers 101" course syllabuses for non-CS majors across the country, what has been the emphasis on those courses, if anybody on this list knows an article or something about it. I would rather be proven wrong than right. I don't want to dwell on this, though. Thank you. Gus Correa --------------------------------------------------------------------- Lamont-Doherty Earth Observatory - Columbia University --------------------------------------------------------------------- From james.p.lux at jpl.nasa.gov Mon Dec 29 15:11:27 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Re: Matlab and Octave In-Reply-To: <49593537.9020001@ldeo.columbia.edu> Message-ID: On 1 > > My daughter's recent freshman Intro to Computers class > at another high-ranked college consisted of > C programming (K&R was the textbook) with OpenGL examples. > I would guess fashionable/pedantic approaches push young people > with no previous exposure to Unix and programming > towards the more comfortable, useful, and sensible Matlab, > and turn them away from other (equally useful and important) computer tools. > OTOH, computing is just a tool. We don't expect biologists to make their own microscope, or even understand optics, just how to effectively use the instrument. The vast majority of folks using a computer should do it by whatever makes their job easier. Certainly, some amount of knowledge of how software engineering or software development is done helps in the "being an informed consumer" area, but I'd hardly say that, for instance, all electrical engineers should be able to write good code in language X. If folks like those populating this list can make a good and seamless blend between the user facility of Matlab and the inexpensive computational horsepower available from a Beowulf, then all the better. There will, of course, always need to be folks who can really eke out the maximum in performance, and if they have some application domain specific knowledge, all the better. There are also certain research questions which are sufficiently complex that only someone who really understands the question can ask it effectively and who must also be a software whiz to get the answer in finite time, even with that hulking Beowulf in the room next door, but I would contend that they are in the minority. Thus, those sorts of applications (which we discuss daily, here) should not be the driver of undergraduate courses. 
Jim From niftyompi at niftyegg.com Mon Dec 29 16:11:59 2008 From: niftyompi at niftyegg.com (Nifty Tom Mitchell) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <1273738995.751491230113018208.JavaMail.root@mail.vpac.org> References: <9f8092cc0812230701p5fbb7da9g64042a2a559cd5f7@mail.gmail.com> <1273738995.751491230113018208.JavaMail.root@mail.vpac.org> Message-ID: <20081230001159.GA3126@compegg.wr.niftyegg.com> On Wed, Dec 24, 2008 at 09:03:38PM +1100, Chris Samuel wrote: > ----- "John Hearns" wrote: > > > SGI Altix have 'bootcpusets' which means you can slice > > off one or two processors to take care of OS housekeeping > > tasks, > > Now that cpusets have been in the mainline kernel for > some time you should be able to do this with any modern > distro. > > I contemplated doing this on our Barcelona cluster, but > sacrificing 1 core in 8 was a bit too much of a high price > to pay. But people with higher core counts per node might > find it attractive. This seems like a be a benchmark decision based on application load and 'implied IO+OS' loading as well as the ability to localize the IO+OS activity to the sacrificed CPU core. Of interest CPU and system designers and OS engineers are set on the SMP model where all the parts are considered equal. This simplification ignores the reality that interrupts, networking, encryption and file IO are not floating point intensive and thus leave FPU core transistors idle. The decisions are different when dedicated IO channel processors or vector processors are built into the hardware of the system. Today the apparent cut and paste model of multi core CPU design where the most critical design issues are at the memory (cache) interface pushes the issue out to the cluster user/ manager and perhaps into the batch system. Outside of heat issues adding yet another FPU core is almost free given today's transistor budgets. For a long time I felt that the Intel Hyper-Threading was an interesting decision in that it all but stated that floating point was a second class activity in the system. However the complexity to add more execution units may have nixed more hyper-threading efforts. The benchmarking (combined with CPU affinity) work might be interesting. Leaving 12.5% of the FPU resource on the table might look like a lot at first but since the other seven cores might be idled by a slow rank sidetracked by interrupts and IO the benchmark FPU delta per rank need only be about one seventh of that (i.e 2%) to generate a net gain. This might be an easy percentage to gain by localizing interrupts and IO so user space activity affinity does not conflict. But this is not strictly SMP so the hardware and OS design may limit the gains. Two percent in an eight core system does not seem intuitive. Did I get this turned inside out? -- T o m M i t c h e l l Found me a new hat, now what? From csamuel at vpac.org Mon Dec 29 16:27:59 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <20081230001159.GA3126@compegg.wr.niftyegg.com> Message-ID: <715201515.1016031230596879734.JavaMail.root@mail.vpac.org> ----- "Nifty Tom Mitchell" wrote: > On Wed, Dec 24, 2008 at 09:03:38PM +1100, Chris Samuel wrote: > > > I contemplated doing this on our Barcelona cluster, but > > sacrificing 1 core in 8 was a bit too much of a high price > > to pay. But people with higher core counts per node might > > find it attractive. 
> > This seems like a be a benchmark decision based on application > load and 'implied IO+OS' loading as well as the ability to > localize the IO+OS activity to the sacrificed CPU core. I'll leave that to sites that have a benchmarkable and characterisable workload. :-) We've got over 600 random users running random code (some very random indeed [1]) that covers all categories from self-written, through open source to commercial apps. cheers, Chris [1] - including a commercial code that segfaults in one particular program in libmsxml.so - yes, that appears to be a 3rd party implementation of the M$ XML library on Linux. When reported they claimed it was because we were running CentOS5 not RHEL4. Can't reproduce on RHEL4 because it crashes *before* that point on that distro. Gah. -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Mon Dec 29 22:14:40 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] CFP: 3rd International Workshop on Virtualization Technologies in Distributed Computing (VTDC-09) Message-ID: <2095550439.1019121230617680438.JavaMail.root@mail.vpac.org> ======================================================================== Call for Papers 3rd International Workshop on Virtualization Technologies in Distributed Computing (VTDC-09) http://grid-appliance.org/vtdc09 In conjunction with ICAC 2009 Barcelona, Spain, June 15 2009 ======================================================================== Workshop scope: --------------- Virtualization has proven to be a powerful enabler in the field of distributed computing and has led to the emergence of the cloud computing paradigm and the provisioning of Infrastructure-as-a-Service (IaaS). This new paradigm raises challenges ranging from performance evaluation of IaaS platforms, through new methods of resource management including providing Service Level Agreements (SLAs) and energy- and cost-efficient schedules, to the emergence of supporting technologies such as virtual appliance management. For the last three years, the VTDC workshop has served as a forum for the exchange of ideas and experiences studying the challenges and opportunities created by IaaS/cloud computing and virtualization technologies. VTDC brings together researchers in academia and industry who are involved in research, development and planning activities involving the use of virtualization in the context of distributed systems, where the opportunities and challenges with respect to the management of such virtualized systems is of interest to the ICAC community at large. Important dates: ---------------- * Submission deadline: February 20th, 2009 * Notification of acceptance: March 23rd, 2009 * Final manuscripts due: April 6, 2009 * Workshop: June 15, 2009 Topics: ------ Authors are invited to submit original and unpublished work that exposes a new problem, advocates a specific solution, or reports on actual experience. Papers should be submitted as full-length 8 page papers of double column text using single space 10pt size type on an 8.5x11 paper. Papers will be published in the proceedings of the workshop. 
VTDC 2009 topics of interest include, but are not limited to: * Infrastructure as a service (IaaS) * Virtualization in data centers * Virtualization for resource management and QoS assurance * Security aspects of using virtualization in a distributed environment * Virtual networks * Virtual data, storage as a service * Fault tolerance in virtualized environments * Virtualization in P2P systems * Virtualization-based adaptive/autonomic systems * The creation and management of environments/appliances * Virtualization technologies * Performance modeling (applications and systems) * Virtualization techniques for energy/thermal management * Case studies of applications on IaaS platforms * Deployment studies of virtualization technologies * Tools relevant to virtualization Organization: ------------- * General Chair: o Kate Keahey, University of Chicago, Argonne National Laboratory * Program Chair: o Renato Figueiredo, University of Florida * Steering Committee Chair: o Jose A. B. Fortes, University of Florida * Program Committee: o Franck Cappello, INRIA o Jeff Chase, Duke University o Peter Dinda, Northwestern University o Ian Foster, University of Chicago, Argonne National Laboratory o Sebastien Goasguen, Clemson University o Sverre Jarp, CERN o John Lange, Northwestern University o Matei Ripeanu, University of British Columbia o Paul Ruth, University of Mississippi o Kyung Ryu, IBM o Chris Samuel, Victorian Partnership for Advanced Computing o Frank Siebenlist, Argonne National Laboratory o Dilma da Silva, IBM o Mike Wray, HP o Dongyan Xu, Purdue University o Mazin Yousif, Avirtec o Ming Zhao, Florida International University * Publicity Chair: o Ming Zhao, Florida International University For more information: --------------------- VTDC-09 Web site: http://grid-appliance.org/vtdc09 ICAC-09 Web site: http://icac2009.acis.ufl.edu -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From jason at acm.org Tue Dec 30 06:13:02 2008 From: jason at acm.org (Jason Riedy) Date: Wed Nov 25 01:08:08 2009 Subject: [Beowulf] Re: Octave vs Matlab [was Re: Hadoop] In-Reply-To: (James P. Lux's message of "Mon, 29 Dec 2008 08:48:37 -0800") References: <4958E641.9050307@tamu.edu> Message-ID: <87y6xxu4y9.fsf_-_@sparse.dyndns.org> And James P. Lux writes: > Matlab's graphics and visualization functions are far more sophisticated > than Octave's too. Depends on what you're doing. jHandles, graceplot, octaviz (vtk), zenity, etc. all are good for different uses. However, I tend to use R for analysis. (Someday I'll polish and publish my simple Octave-SQLite interface. Makes communicating between environments easy even if not incredibly fast.) > Octave has nothing comparable to Simulink or RealTimeWorkbench, which are > sort of a step beyond a simple "toolbox" (e.g. Cranking out something like > ccmatlab to make ccoctave shouldn't be a big challenge, or, for instance, > the mapping toolbox.. But duplicating Simulink would be a lot of work) And Matlab(TM) has nothing comparable to Octave's multiple platform support or low-level interfaces. Once I needed something for Linux/IA-64 and AIX/Power, I moved to GNU Octave. I think many free software folks go to Scilab for Simulink-like functionality. I've never used Simulink or Scilab, though. The three environments (Matlab(TM), Octave, Scilab) are *different*, but they share a common language subset. 
This causes many people to expect any one to be a drop-in replacement for the others, just like many people get upset when a C++ compiler won't work on C99 code. > The other thing is that there are useful things out there for Matlab (e.g. A > BSD Sockets library) that aren't trivially obtainable for Octave. Define "trivially obtainable": http://octave.sourceforge.net/sockets/index.html Or try searching for "octave sockets" via Google... OctaveForge has many items. Unfortunately (or fortunately?) many are clones of Matlab(TM) gizmos and don't always take advantage of Octave features. > The man or woman with the weird hardware is going to slog through doing the > drivers once, perhaps distribute them for free once done (because they were > paid by the taxpayer, for instance), but isn't going to do it again for a > different environment, just out of the goodness of their heart. Indisputably, many users of Octave, Matlab(TM), R, etc. are reluctant programmers. I'd bet most HPC users are reluctant programmers... And sometimes it's amazing what corner cases people find and then rely upon. > So, for just running standalone computational stuff, Octave gives Matlab a > good run for the money, but as soon as you start thinking of Matlab as a > generalized tool for tinkering about and such, it starts to lag > behind. Depends on your use (and your interpretation of "it" above). For me, Matlab(TM) is pretty much useless. I need an environment on platforms where there is zero commercial interest. Also, I can't (or couldn't, last time I looked) extend the environment with compiled sparse matrix types and routines on par with the built-in ones. Then there is the cluster and parallel issue. And I'm not a reluctant programmer. When I needed changes, I dove into Octave and sent the changes upstream. Not so easy with Matlab(TM), as some ever-recurring bugs have shown. > You pay for the ease and slickness of the Matlab tool. Remember that a tool isn't easy if it doesn't exist *at all* for a required environment. > Octave vs Matlab is sort of like OO vs MSOffice.. It mostly > works and for many tasks is perfectly adequate, but it's not > the same. Octave is in no way as annoying as OpenOffice. ;) Jason From atp at piskorski.com Tue Dec 30 12:31:42 2008 From: atp at piskorski.com (Andrew Piskorski) Date: Wed Nov 25 01:08:09 2009 Subject: [Beowulf] Re: Octave vs Matlab [was Re: Hadoop] In-Reply-To: <87y6xxu4y9.fsf_-_@sparse.dyndns.org> References: <87y6xxu4y9.fsf_-_@sparse.dyndns.org> Message-ID: <20081230203142.GA38610@piskorski.com> On Tue, Dec 30, 2008 at 09:13:02AM -0500, Jason Riedy wrote: > And I'm not a reluctant programmer. When I needed changes, I > dove into Octave and sent the changes upstream. Not so easy with > Matlab(TM), as some ever-recurring bugs have shown. FWIW, I've never used Matlab, but a few years back (c. 2005 or so), a friend of mine was using it fairly extensively in her grad school research. The interesting bit, was she was using Matlab on both MS Windows and Linux, and noticed that many seemingly generic, non-platform-specific bugs were eventually fixed on Windows, but years later, they still hadn't been fixed on Linux. That made me wonder about their bug tracking and source control practices over there at the Mathworks... It also suggested that Matlab for Linux was very much a second class citizen in their product line. 
-- Andrew Piskorski http://www.piskorski.com/ From toon at moene.org Mon Dec 29 10:48:22 2008 From: toon at moene.org (Toon Moene) Date: Wed Nov 25 01:08:09 2009 Subject: [Beowulf] What's the category of Beowulf among Clusters? In-Reply-To: References: <494B543A.9040103@gmail.com> Message-ID: <49591B76.70505@moene.org> Robert G. Brown wrote: > On Fri, 19 Dec 2008, Mark Hahn wrote: > >>> what's the category of Beowulf like clusters? >> >> beowulf is compute clustering using mostly commodity hardware and >> mostly open-source software. > > And if you want to be really picky, it should be an architecture that > "looks like a supercomputer" "looks like a supercomputer" ? Funny. The last time I made a 4-CPU, 64-bit vector machine with 512 MWords of memory ready for operations (that description now fits the home machine sitting 3 feet from the tip of my nose), I had a Cray engineer on site to guide me through the process. "Supercomputer" - hah, humbug. -- Toon Moene - e-mail: toon@moene.org (*NEW*) - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/ Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html From toon at moene.org Mon Dec 29 11:39:27 2008 From: toon at moene.org (Toon Moene) Date: Wed Nov 25 01:08:09 2009 Subject: [Beowulf] Hadoop In-Reply-To: <626490.10539.qm@web80706.mail.mud.yahoo.com> References: <626490.10539.qm@web80706.mail.mud.yahoo.com> Message-ID: <4959276F.6050000@moene.org> [ I can't determine anymore who I'm replying to ... ] > N.B. the recent MPI class we gave suggested that we need to re-tool it > to focus more upon Fortran than C. There was no interest in Java from > the class I polled. Some researchers want to use Matlab for their work, > but most university computing facilities are loathe to spend the money > to get site licenses for Matlab. Unfortunate, as Matlab is a very cool > tool (been playing with it first in 1988 ...) its just not fast. The > folks at Interactive Supercomputing might be able to help with this with > their compiler. I think you (and they) are waging a "back-end" war. I have had problems for at least a decade with University students being baffled by Fortran because they were used to use Matlab. The tide is turning. People streaming in from academia into my Institute (Dutch Weather Forecasting Centre) now are clever enough to overcome the small syntactical differences between the Matlab language and Fortran 90+ and just proceed to use the latter, integrating flawlessly with our other software. -- Toon Moene - e-mail: toon@moene.org (*NEW*) - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/ Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.4/changes.html From mgrhferlamb2002 at yahoo.it Tue Dec 30 06:56:23 2008 From: mgrhferlamb2002 at yahoo.it (matheus reimann) Date: Wed Nov 25 01:08:09 2009 Subject: [Beowulf] An Ask about Compilation Message-ID: <431845.87523.qm@web25501.mail.ukl.yahoo.com> Hi, I am a newbie in the group, and I have an ask about Compilation of programs in seriel and paralell. I wanna have a programm which decides if the run is parallel or seriell. P.ex.: myexe.exe -> runs in seriell, and when I write: myexe.exe -parallel -> runs in parallel. I dont wanna compile ever when I need the program in seriel or in parallel. Usually is the seriell compilation with g++ and the parallel with mpiCC. Has anybody done a program like this before??? 
Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081230/388f1075/attachment.html From coutinho at dcc.ufmg.br Tue Dec 30 13:12:58 2008 From: coutinho at dcc.ufmg.br (Bruno Coutinho) Date: Wed Nov 25 01:08:09 2009 Subject: [Beowulf] An Ask about Compilation In-Reply-To: <431845.87523.qm@web25501.mail.ukl.yahoo.com> References: <431845.87523.qm@web25501.mail.ukl.yahoo.com> Message-ID: 2008/12/30 matheus reimann > > Hi, > > I am a newbie in the group, and I have an ask about Compilation of programs in seriel and paralell. > > I wanna have a programm which decides if the run is parallel or seriell. P.ex.: myexe.exe -> runs in seriell, and when I write: myexe.exe -parallel -> runs in parallel. > > I dont wanna compile ever when I need the program in seriel or in parallel. Usually is the seriell compilation with g++ and the parallel with mpiCC. Has anybody done a program like this before??? mpiCC is a wrap to g++ that adds flags to find MPI include and library files. You can compile both with mpiCC. To run the serial version, you can call the binary directly and to run the parallel version run your program through mpirun to run inside the MPI enveronment. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20081230/a92c59af/attachment.html From niftyompi at niftyegg.com Tue Dec 30 20:41:22 2008 From: niftyompi at niftyegg.com (NiftyOMPI Mitch) Date: Wed Nov 25 01:08:09 2009 Subject: [Beowulf] Not all cores are created equal In-Reply-To: <715201515.1016031230596879734.JavaMail.root@mail.vpac.org> References: <20081230001159.GA3126@compegg.wr.niftyegg.com> <715201515.1016031230596879734.JavaMail.root@mail.vpac.org> Message-ID: <88815dc10812302041x683ae81fhf312a6ec0b131c56@mail.gmail.com> On Mon, Dec 29, 2008 at 4:27 PM, Chris Samuel wrote: > > ----- "Nifty Tom Mitchell" wrote: > >> On Wed, Dec 24, 2008 at 09:03:38PM +1100, Chris Samuel wrote: >> >> > I contemplated doing this on our Barcelona cluster, but >> > sacrificing 1 core in 8 was a bit too much of a high price >> > to pay. But people with higher core counts per node might >> > find it attractive. >> >> This seems like a be a benchmark decision based on application >> load and 'implied IO+OS' loading as well as the ability to >> localize the IO+OS activity to the sacrificed CPU core. > > I'll leave that to sites that have a benchmarkable and > characterisable workload. :-) We've got over 600 random > users running random code (some very random indeed [1]) > that covers all categories from self-written, through open > source to commercial apps. > > cheers, > Chris > > [1] - including a commercial code that segfaults in one > particular program in libmsxml.so - yes, that appears to > be a 3rd party implementation of the M$ XML library on Linux. > When reported they claimed it was because we were running > CentOS5 not RHEL4. Can't reproduce on RHEL4 because it > crashes *before* that point on that distro. Gah. > > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager Benchmarking with a long list of random applications is problematic. One additional hard to benchmark aspect of a big cluster is "between job" legacy I/O. After a process exits it is possible for pending data I/O to slow the startup of the next process. It may be simpler to sample the system state watching for waitio and any other activity measure you can track with a light hand. 
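As a concrete illustration of that kind of light-handed sampler on Linux (a hypothetical sketch, not a tested tool): read the aggregate "cpu" line of /proc/stat twice, a few seconds apart, and log the fraction of the interval spent in iowait. The field order is the one documented in proc(5); irq, softirq and steal are ignored here for brevity, so the numbers are only approximate.

/* iowait_sample.cpp -- hypothetical sketch: periodically sample the first
 * "cpu" line of /proc/stat and report the iowait fraction per interval.
 * Linux-specific.  Build with:  g++ -O2 iowait_sample.cpp -o iowait_sample
 */
#include <fstream>
#include <iostream>
#include <string>
#include <unistd.h>

struct CpuTimes { unsigned long long user, nice, sys, idle, iowait; };

static CpuTimes read_cpu_times()
{
    std::ifstream f("/proc/stat");
    std::string label;                    /* the leading "cpu" token */
    CpuTimes t = {0, 0, 0, 0, 0};
    f >> label >> t.user >> t.nice >> t.sys >> t.idle >> t.iowait;
    return t;
}

int main()
{
    for (;;) {
        CpuTimes a = read_cpu_times();
        sleep(5);                         /* sampling interval, seconds */
        CpuTimes b = read_cpu_times();

        /* ticks across the fields we read (irq/softirq/steal omitted) */
        unsigned long long total =
            (b.user - a.user) + (b.nice - a.nice) + (b.sys - a.sys) +
            (b.idle - a.idle) + (b.iowait - a.iowait);
        double frac = total ? double(b.iowait - a.iowait) / double(total) : 0.0;

        std::cout << "iowait fraction over last interval: " << frac << std::endl;
    }
}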
Statistical analysis and charting tools are available... sample-oriented benchmarks on cluster workloads are not common, but I suspect they can tell us a lot. -- NiftyOMPI T o m M i t c h e l l
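To round out Bruno's reply in the earlier "An Ask about Compilation" thread: below is a minimal sketch (not a definitive recipe) of one binary that runs either way. The file name and the -parallel flag come from the original question; the toy work loop and everything else are made up for illustration. Build it once with mpiCC, start it as ./myexe for a serial run, or as mpirun -np 4 ./myexe -parallel for a parallel one.

/* myexe.cpp -- hypothetical sketch: one executable, built once with
 *   mpiCC myexe.cpp -o myexe
 * Serial run:    ./myexe
 * Parallel run:  mpirun -np 4 ./myexe -parallel
 */
#include <mpi.h>
#include <cstring>
#include <iostream>

/* each rank handles a strided share of the items; a serial run passes 0,1 */
static double do_work(long total, int rank, int size)
{
    double sum = 0.0;
    for (long i = rank; i < total; i += size)
        sum += static_cast<double>(i) * i;     /* stand-in for real work */
    return sum;
}

int main(int argc, char **argv)
{
    bool parallel = false;
    for (int i = 1; i < argc; ++i)
        if (std::strcmp(argv[i], "-parallel") == 0)
            parallel = true;

    const long total = 1000000L;

    if (!parallel) {                  /* serial path: MPI is never touched */
        std::cout << "serial result: " << do_work(total, 0, 1) << std::endl;
        return 0;
    }

    MPI_Init(&argc, &argv);           /* parallel path */
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = do_work(total, rank, size), global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        std::cout << "parallel result on " << size << " ranks: "
                  << global << std::endl;

    MPI_Finalize();
    return 0;
}

With most MPI implementations you could also drop the flag, always call MPI_Init, and simply treat a communicator of size 1 as the serial case; the flag just keeps the serial run completely independent of the MPI runtime, which is closer to what was asked.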