From gropp at mcs.anl.gov Sun Apr 2 08:07:01 2006 From: gropp at mcs.anl.gov (William Gropp) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] MPICH P4 error In-Reply-To: <20060331205522.17497.qmail@web8610.mail.in.yahoo.com> References: <20060331205522.17497.qmail@web8610.mail.in.yahoo.com> Message-ID: <6.2.1.2.2.20060402100539.051fcdb0@pop.mcs.anl.gov> An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20060402/0ade7da1/attachment.html
From dzila at tassadar.physics.auth.gr Sun Apr 2 09:57:36 2006 From: dzila at tassadar.physics.auth.gr (Dimitris Zilaskos) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] wrf + mpich p4 problem with vanilla kernels Message-ID: Hi all, I am trying to run the WRF model on a dual-core, dual-CPU Opteron system with Intel C/Fortran 9, Scientific Linux (RHEL compatible; tried 3.0.4, 3.0.5, 4.2) and MPICH 1.2.7p1. As long as I use the vendor-supplied kernels everything works fine. However, when I use kernels compiled on my own, I am getting erratic behaviour: the model will either crash, produce invalid results, or complete successfully approximately once in 20 attempts. If I run it on one CPU it completes successfully with all kernels. I have tried 2.6.14.3, 2.6.16.6 and 2.6.9; all of them show the same erratic behaviour. Kernels 2.6.9-22 and 2.6.9-34 as supplied by Scientific Linux 4.2 work fine, as does 2.4.21-37.0.1 in 3.0.4. All kernels are SMP enabled. Any help is appreciated. Best regards, -- ============================================================================ Dimitris Zilaskos Department of Physics @ Aristotle University of Thessaloniki , Greece PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc MD5sum : de2bd8f73d545f0e4caf3096894ad83f pgp_public_key.asc ============================================================================
From michael.creel at uab.es Mon Apr 3 03:46:12 2006 From: michael.creel at uab.es (Michael Creel) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] wireless cluster Message-ID: <4430FCF4.8050202@uab.es> A few days ago there was a thread on wireless clusters. Here's an example: http://157.181.66.70/wmlc/english.html This is not a late April Fools' joke! Michael
From edkarns at firewirestuff.com Mon Apr 3 12:57:20 2006 From: edkarns at firewirestuff.com (Ed Karns) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: next gen / parallel processing In-Reply-To: <200604031900.k33J0CVn007352@bluewest.scyld.com> References: <200604031900.k33J0CVn007352@bluewest.scyld.com> Message-ID: <36A77385-6F86-45F9-8D6A-9E9C8EE36D0E@firewirestuff.com> " ... the HyperTransport Consortium is weeks away from launching HT 3.0, which is expected to at least double the bandwidth of that interconnect while lowering latency and leaving the underlying protocol largely unaltered. ..." From http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=181502574 ... article on Cray's latest efforts. This next generation of parallel processing scenarios appears to dovetail nicely with the Beowulf concept. Anyone looking into this? Ed Karns FireWireStuff.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20060403/fc8677db/attachment.html
From diep at xs4all.nl Mon Apr 3 15:06:47 2006 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: next gen / parallel processing References: <200604031900.k33J0CVn007352@bluewest.scyld.com> <36A77385-6F86-45F9-8D6A-9E9C8EE36D0E@firewirestuff.com> Message-ID: <002d01c6576a$e7616ef0$9600000a@gourmandises> Well, doubling the bandwidth is great of course, but will it sell more machines? What matters is the price per socket. I seem to remember that the initial bare price of a node (12 sockets) is around 50,000 dollars, and within one node you get a one-way pingpong latency of 1.5 us from CPU to CPU. That's pretty expensive, considering that the bare machine itself, without much RAM and without much I/O, already costs a lot. A 2400-socket machine then quickly comes to something like 200 * 50k = 10 million dollars, without the additional network, without additional RAM, and with little I/O. So basically money talks too. 50k for a 12-socket node in a great supercomputer isn't much, but for that price you really want something that is great everywhere. Further, the latency within one node is not exactly magnificent. The entire machine has great node-to-node latency, but the majority of scientists in the first place run something on 4-12 CPUs and want that application to perform fast. In general such an application simply can't tolerate 1.5 us latencies on such a low number of CPUs. IMHO doubling the bandwidth is nice on paper, but 90% of the jobs use just a few processors; if *those* run slowly then the machine simply doesn't get bought. So in theory the improvement is great; in practice I doubt it will change purchasing decisions. Vincent ----- Original Message ----- From: Ed Karns To: beowulf@beowulf.org Sent: Monday, April 03, 2006 8:57 PM Subject: [Beowulf] Re: next gen / parallel processing " ... the HyperTransport Consortium is weeks away from launching HT 3.0, which is expected to at least double the bandwidth of that interconnect while lowering latency and leaving the underlying protocol largely unaltered. ..." From http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=181502574 ... article on Cray's latest efforts. This next generation of parallel processing scenarios appears to dovetail nicely with the Beowulf concept. Anyone looking into this? Ed Karns FireWireStuff.com ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20060403/2a17d71d/attachment.html
From deadline at clustermonkey.net Tue Apr 4 08:14:24 2006 From: deadline at clustermonkey.net (Douglas Eadline) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Cluster Optimization by Monkey's Message-ID: <60485.192.168.1.1.1144163664.squirrel@eadline.org> Check out the Cluster Monkey front page: http://www.clustermonkey.net and you will find: Life, The Universe, and Your Cluster - A Study in Cluster Optimization (and yes we use HPL as a test code!!!)
MPI: Debugging in Parallel (in Parallel) (more MPI debugging from Jeff Squyres) Cluster Cooling, Noise and Benchmark Timing (More Mailing list highlights by Jeff Layton) Enjoy -- Doug From James.P.Lux at jpl.nasa.gov Wed Apr 5 16:58:44 2006 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] multiport RS232 to ethernet (or USB) Message-ID: <6.1.1.1.2.20060405165002.02a9ee70@mail.jpl.nasa.gov> Looking for cheap device to give me 8 serial ports (doesn't have to be particularly high speed: 9600bps is fine) from either ethernet or USB. Digi makes a variety of widgets that do this for USB, at about $50/port. (that is, a 8 port box is $400) Cyclades has some high end multi port devices. Comtrol has a 8 port serial device server at $750 There's a variety of PCI cards (e.g. RocketPort) that do this.. And, I suppose one could gang up a raft of $9 USB/Serial dongles on a USB hub (somehow, I suspect that this is fraught with peril) The device has to allow independent control of RTS and reading CTS (which may not be used in the usual flow control scheme) Thanks James Lux, P.E. Spacecraft Radio Frequency Subsystems Group Flight Communications Systems Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 From jlb17 at duke.edu Thu Apr 6 09:06:06 2006 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Gigabit switch recommendations In-Reply-To: References: <20060330051838.GA87266@tehun.pair.com> Message-ID: On Thu, 30 Mar 2006 at 9:33am, Joshua Baker-LePain wrote > On Thu, 30 Mar 2006 at 9:04am, Tim Mattox wrote > >> FYI - here are some links to a variety of 48-port commodity GigE switches, >> that may be worth looking at (but as another poster indicated, these >> might actually all be the same switch built by an OEM and just rebadged. >> One can only really tell by opening one up and looking at the PCB and chips >> that are inside.): >> >> SMC8648T TigerSwitch >> http://www.smc.com/index.cfm?event=viewProductDetail&localeCode=EN_USA&pid=1192 > > The SMC8748L2 I went with is here: > > http://www.smc.com/index.cfm?event=viewProduct&localeCode=EN_USA&pid=1498 > > It's cheaper than the 8648T while apparently newer and with some "better" > specs. As I mentioned before, I'm decidedly skeptical and intend to test it > hard while I'm still well within the return period. I've got my new switch in hand and I've done some preliminary testing. So far, so good. Total bandwidth between 2 hosts connected to the switch was quite comparable to that of the hosts being directly connected for just about all the MTUs I tested (see ). The hosts are centos 4.3 using onboard BCM5704 NICs (tg3 driver), and I tested with netperf. Now I just need to get it hooked up to more hosts and really have at it. -- Joshua Baker-LePain Department of Biomedical Engineering Duke University From hahn at physics.mcmaster.ca Thu Apr 6 11:36:45 2006 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] multiport RS232 to ethernet (or USB) In-Reply-To: <6.1.1.1.2.20060405165002.02a9ee70@mail.jpl.nasa.gov> Message-ID: > Looking for cheap device to give me 8 serial ports (doesn't have to be > particularly high speed: 9600bps is fine) from either ethernet or USB. http://www.saelig.com/miva/merchant.mvc?Screen=CTGY&Category_Code=U not an endorsement - I just ran across them recently. 
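A note on the netperf switch test mentioned above: a minimal sketch of that kind of host-to-host throughput check (the interface name eth0, the peer hostname "receiver", and the MTU values are assumptions, not the exact commands used in that test):

# on the receiving host
netserver

# on the sending host, once per MTU under test (set the same MTU on both ends)
for mtu in 1500 4000 9000; do
    ifconfig eth0 mtu $mtu
    netperf -H receiver -t TCP_STREAM -l 30    # 30-second bulk TCP throughput test
done

Running the same loop once with the hosts cabled back-to-back and once through the switch gives the kind of comparison described in that message.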
From edkarns at firewirestuff.com Thu Apr 6 14:22:31 2006 From: edkarns at firewirestuff.com (Ed Karns) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <200604061900.k36J0CZ1027050@bluewest.scyld.com> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> Message-ID: <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> On Apr 6, 2006, at 12:00 PM, beowulf-request@beowulf.org wrote: > > And, I suppose one could gang up a raft of $9 USB/Serial dongles on > a USB hub (somehow, I suspect that this is fraught with peril) > > The device has to allow independent control of RTS and reading CTS > (which may not be used in the usual flow control scheme) It might be a tall order to get the ":cheap" $9 USB to Serial adapter to do Request to Send / Clear to Send / hardware handshaking ... probably should consider Xon / Xoff / software handshaking instead = not a problem for the least expensive USB to Serial adapters. (Ctrl S / Ctrl Q being easy from the command line or script to get the port to respond.) Ed Karns FireWireStuff.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20060406/4d08d235/attachment.html From joelja at darkwing.uoregon.edu Thu Apr 6 15:13:28 2006 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> Message-ID: On Thu, 6 Apr 2006, Ed Karns wrote: > > On Apr 6, 2006, at 12:00 PM, beowulf-request@beowulf.org wrote: > >> >> And, I suppose one could gang up a raft of $9 USB/Serial dongles on a USB >> hub (somehow, I suspect that this is fraught with peril) we use 16 port rackmount digi edgeport usb-serial adapters http://www.digi.com/products/usb/edgeport.jsp The have an onboard usb hub though we haven't found a reason to cascade them as yet. >> The device has to allow independent control of RTS and reading CTS (which >> may not be used in the usual flow control scheme) > > > It might be a tall order to get the ":cheap" $9 USB to Serial adapter to do > Request to Send / Clear to Send / hardware handshaking ... probably should > consider Xon / Xoff / software handshaking instead = not a problem for the > least expensive USB to Serial adapters. (Ctrl S / Ctrl Q being easy from the > command line or script to get the port to respond.) 
> > Ed Karns > FireWireStuff.com > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja@darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 From James.P.Lux at jpl.nasa.gov Thu Apr 6 15:50:34 2006 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> Message-ID: <6.1.1.1.2.20060406154459.028bc9d8@mail.jpl.nasa.gov> At 02:22 PM 4/6/2006, Ed Karns wrote: >On Apr 6, 2006, at 12:00 PM, >beowulf-request@beowulf.org wrote: > >> >>And, I suppose one could gang up a raft of $9 USB/Serial dongles on a USB >>hub (somehow, I suspect that this is fraught with peril) >> >> >>The device has to allow independent control of RTS and reading CTS (which >>may not be used in the usual flow control scheme) > > >It might be a tall order to get the ":cheap" $9 USB to Serial adapter to >do Request to Send / Clear to Send / hardware handshaking ... probably >should consider Xon / Xoff / software handshaking instead = not a problem >for the least expensive USB to Serial adapters. (Ctrl S / Ctrl Q being >easy from the command line or script to get the port to respond.) Except that the piece of hardware the serial port is talking to uses RTS and CTS, not as handshaking lines, but to wake up the box, and to return status. However, I did just try it with a cheap USB/RS232 adapter, and I can toggle RTS just fine from a program talking to a Windows COM port. And, they "claim" that they have a Linux driver when I get to that step. I'm sure it's the same FTDI USB/Serial chip that everyone uses. Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20060406/a141c40a/attachment.html From mathog at caltech.edu Fri Apr 7 16:06:01 2006 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) (Jim Lux) Message-ID: On Thu, 06 Apr 2006 15:50:34 -0700 Jim Lux wrote: > > Except that the piece of hardware the serial port is talking to uses RTS > and CTS, not as handshaking lines, but to wake up the box, and to return > status. In that case be very careful about bit states on power transitions affecting the USB hub and the controlling computer. I had all sorts of fun controlling some TrippLite UPS's that used a similar "bit level" serial line control methodology. Some computers would come up with the serial port lines bouncing around at random, which could send an "inverter kill" signal to the UPS at just the wrong time. You may well find that when you reboot the master machiine all hell breaks loose on the USB->serial controlled slave devices. 
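On the driver side, a minimal sketch for checking that Linux has bound the usual FTDI driver and created a device node for one of these adapters (module and device names here are the common defaults, given as assumptions rather than verified against any particular distribution):

modprobe ftdi_sio                  # normally auto-loaded when the adapter is plugged in
dmesg | grep -i 'ftdi\|ttyusb'     # should report the adapter and its assigned ttyUSB number
ls -l /dev/ttyUSB*                 # one device node per attached adapter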
Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From James.P.Lux at jpl.nasa.gov Fri Apr 7 16:27:26 2006 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) (Jim Lux) In-Reply-To: References: Message-ID: <6.1.1.1.2.20060407162459.03082338@mail.jpl.nasa.gov> At 04:06 PM 4/7/2006, David Mathog wrote: >On Thu, 06 Apr 2006 15:50:34 -0700 >Jim Lux wrote: > > > > > Except that the piece of hardware the serial port is talking to uses RTS > > and CTS, not as handshaking lines, but to wake up the box, and to return > > status. > >In that case be very careful about bit states on power >transitions affecting the USB hub and the controlling computer. >I had all sorts of fun controlling some TrippLite UPS's that >used a similar "bit level" serial line control methodology. >Some computers would come up with the serial port lines >bouncing around at random, which could send an "inverter kill" >signal to the UPS at just the wrong time. > >You may well find that when you reboot the master machiine all >hell breaks loose on the USB->serial controlled slave devices. In this case, that's not a problem. The box uses RTS going asserted and then deasserted to bring it up out of sleep mode. If nothing happens after that, it just goes back to sleep. So wild fluctations won't cause any problems. But you're right. I have a USB->parallel interface that has just that problem, and in that particular case, there are "bad things" that can happen (mind you the same problem exists with a hardware parallel port, because the BIOS goes out and does a reset). In that case, the external box doesn't get power applied until the PC is up and running. Jim From nmoore at winona.edu Mon Apr 3 13:08:26 2006 From: nmoore at winona.edu (Nathan Moore) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] XGrid on old macs In-Reply-To: <4430FCF4.8050202@uab.es> References: <4430FCF4.8050202@uab.es> Message-ID: At our university we've got 5-10 old (g4) powermacs floating around the department that were talking about organizing as a cluster for pedagogical work in computational physics. Apple's Xgrid utility seems like and interesting utility to use for the job - I'm curious if anyone on the list has experience with this scheduler. Specifically, is it easy to link against? Is the MPI layer standard (my previous experience is mostly with IBM machines) Nathan Moore Assistant Professor, Physics Winona State University AIM:nmoorewsu nmoore@winona.edu From rzewnickie at rfa.org Mon Apr 3 16:34:04 2006 From: rzewnickie at rfa.org (Eric Dantan Rzewnicki) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: BWBUG live stream today In-Reply-To: <20060330175604.GB23550@rfa.org> References: <20060330175604.GB23550@rfa.org> Message-ID: <20060403233404.GA22323@rfa.org> On Thu, Mar 30, 2006 at 12:56:04PM -0500, Eric Dantan Rzewnicki wrote: > Appologies for the short notice, but I just managed to get every thing > in order just in time to stream today's BWBUG meeting: > http://www.bwbug.org/ > > The URL for the live video stream is: > http://streamer2.rfa.org:8000 > (currently me sitting at my desk typing this, but I'll be moving down to > the conference room soon.) > > Please pass this on to anyone who might be interested. Again, sorry for > the short notice, but now that things are in order we can have these > streams for future BWBUG and DCLUG meetings. 
An ogg theora encoded version of the talk is now available: http://techweb.rfa.org/images/bwbug/ -- Eric Dantan Rzewnicki Apprentice Linux Audio Developer and Mostly Harmless Sysadmin (text below this line mandated by management) Technical Operations Division | Radio Free Asia 2025 M Street, NW | Washington, DC 20036 | 202-530-4900 CONFIDENTIAL COMMUNICATION This e-mail message is intended only for the use of the addressee and may contain information that is privileged and confidential. Any unauthorized dissemination, distribution, or copying is strictly prohibited. If you receive this transmission in error, please contact network@rfa.org.
From benone.marcos at gmail.com Wed Apr 5 18:04:26 2006 From: benone.marcos at gmail.com (Benone Marcos) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] xCAT 1.2.0 finally released Message-ID: <4434691A.5090801@gmail.com> From xCAT dev team: xCAT 1.2.0 (after 3+ years of development, and 2+ years late) has been released. FYI, 1.2.0 development started January 2003 and xCAT development started October 1999. Changes since 1.1.x: 1. Too many to list. Read the change log. Changes since 1.2.0-RC3: 1. RH4U3 support. 2. CentOS 4.3 support. 3. CentOS 3.6 support. 4. RHFC5 support. 5. Minor fixes. 1.2.1 is under development. Now that 1.2.0 is released you can safely upgrade to any 1.2.x release without updating your tables or templates. YMMV. 1.2.x will be in maintenance mode while 1.3.x is under development. New OSes and other features will continue to be added to 1.2.x, but no major architectural changes will be made. 1.3.x goals (goals, not promises): 1. Drop ksh. 2. Rewrite in Perl. Bash, Expect, and Awk OK (OK, Python too, but no Ruby). 3. Drop ia64 and x86 (x86_64, ppc64, cell support only). Others can be added if required. 4. 2.6 kernel and later OSes only. Windows OK. 5. Drop support for legacy service processors. IPMI, Bladecenter, APC as a base, more to be added as needed. 6. All commands to be client/server (no more NFS mounting /opt/xcat). Audit, logging, queuing, permissions, etc... enforced. 7. Tables optional (can use DB). 8. Some sort of GUI. 9. Beta delivered before the next total eclipse of our sun. 10. What do you want? -- Benone Marcos Computer Scientist
From shahriar222 at gmail.com Thu Apr 6 09:14:06 2006 From: shahriar222 at gmail.com (Shahriar Sharghi) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] A question on PBS Message-ID: <80d357ff0604060914t2f1517cdq7dcc4944cd821437@mail.gmail.com> Hi everyone, Can anyone help with the following: how can a PBS script be written that uses different numbers of CPUs on different nodes? The goal is to make the head node part of the compute pool for one of the small clusters I am dealing with. It is a cluster of 4 dual-core, dual-processor nodes, and I want to devote one of the processors in the head node to computations. The following leads to the problem that PBS gives the error: "qsub: Job exceeds queue resource limits". Any idea?
#!/bin/sh #PBS -N my_job #PBS -o my_stdout1.txt #PBS -e my_stderr1.txt #PBS -q workq #PBS -l nodes=4:ppn=4 #PBS -l ncpus=14 #PBS -l cput=0:04:00 echo Launchnode is `hostname` mpirun -machinefile $HOME/machines.LINUX -np 14 /home/shahriar/examples/cpi # All done From shahriar222 at gmail.com Thu Apr 6 13:42:07 2006 From: shahriar222 at gmail.com (Shahriar Sharghi) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] multiport RS232 to ethernet (or USB) In-Reply-To: References: <6.1.1.1.2.20060405165002.02a9ee70@mail.jpl.nasa.gov> Message-ID: <80d357ff0604061342l24dfcc0fleeb7a7d9b061ce08@mail.gmail.com> You may want to try a portmaster 2e. You probably can get a cheap one in ebay or get one from www.portmasters.com. I am not endorsing them either. Shahriar Sharghi President www.gocluster.com 1-646-709-2713 info@gocluster.com On 4/6/06, Mark Hahn wrote: > > Looking for cheap device to give me 8 serial ports (doesn't have to be > > particularly high speed: 9600bps is fine) from either ethernet or USB. > > http://www.saelig.com/miva/merchant.mvc?Screen=CTGY&Category_Code=U > > not an endorsement - I just ran across them recently. > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From chris.colemansmith at gmail.com Thu Apr 6 23:05:19 2006 From: chris.colemansmith at gmail.com (Chris) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> Message-ID: <9503ce760604062305n4a40ab6erc8cdd62fce5a2a94@mail.gmail.com> As a student hiding in the corners and under the stairs on this list i was wondering why you guys would want to be connecting things to your marvelous clusters with RS-232 type ports. i mean there's many other ways to interface right? So what'cha plugging into these contraptions? On 4/6/06, Ed Karns wrote: > > > On Apr 6, 2006, at 12:00 PM, beowulf-request@beowulf.org wrote: > > > And, I suppose one could gang up a raft of $9 USB/Serial dongles on a USB hub > (somehow, I suspect that this is fraught with peril) > > > The device has to allow independent control of RTS and reading CTS (which may > not be used in the usual flow control scheme) > > > > > > It might be a tall order to get the ":cheap" $9 USB to Serial adapter to > do Request to Send / Clear to Send / hardware handshaking ... probably > should consider Xon / Xoff / software handshaking instead = not a problem > for the least expensive USB to Serial adapters. (Ctrl S / Ctrl Q being > easy from the command line or script to get the port to respond.) > > Ed Karns > FireWireStuff.com > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > -- : : www.relain.co.uk : : -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20060407/f44d2319/attachment.html From Julien.Leduc at lri.fr Fri Apr 7 01:02:51 2006 From: Julien.Leduc at lri.fr (Julien.Leduc@lri.fr) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <6.1.1.1.2.20060406154459.028bc9d8@mail.jpl.nasa.gov> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> <6.1.1.1.2.20060406154459.028bc9d8@mail.jpl.nasa.gov> Message-ID: <49291.212.224.231.236.1144396971.squirrel@www.lri.fr> > However, I did just try it with a cheap USB/RS232 adapter, and I can > toggle > RTS just fine from a program talking to a Windows COM port. And, they > "claim" that they have a Linux driver when I get to that step. I'm sure > it's the same FTDI USB/Serial chip that everyone uses. I played with a USB<->8 x RS232 ports, it uses ftdi Linux driver, and using null modem cables between this box and the cluster's nodes, along with kermit, it is very efficient and robust (because very simple). The main problem is the strange escape sequence you have to type to close acces to the console redirection (leaving kermit is not as simple as leaving telnet). Julien Leduc From wt at atmos.colostate.edu Mon Apr 10 18:26:31 2006 From: wt at atmos.colostate.edu (Warren Turkal) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <9503ce760604062305n4a40ab6erc8cdd62fce5a2a94@mail.gmail.com> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> <9503ce760604062305n4a40ab6erc8cdd62fce5a2a94@mail.gmail.com> Message-ID: <200604101926.31333.wt@atmos.colostate.edu> On Friday 07 April 2006 00:05, Chris wrote: > As a student hiding in the corners and under the stairs on this list i was > wondering why you guys would want to be connecting things to your marvelous > clusters with RS-232 type ports. i mean there's many other ways to > interface right? So what'cha plugging into these contraptions? I am thinking they are probably using serial terminals for monitoring or something like that. wt -- Warren Turkal, Research Associate III/Systems Administrator Colorado State University, Dept. of Atmospheric Science From James.P.Lux at jpl.nasa.gov Mon Apr 10 22:12:38 2006 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <200604101926.31333.wt@atmos.colostate.edu> References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> <9503ce760604062305n4a40ab6erc8cdd62fce5a2a94@mail.gmail.com> <200604101926.31333.wt@atmos.colostate.edu> Message-ID: <6.1.1.1.2.20060410221025.03438ae0@mail.jpl.nasa.gov> At 06:26 PM 4/10/2006, Warren Turkal wrote: >On Friday 07 April 2006 00:05, Chris wrote: > > As a student hiding in the corners and under the stairs on this list i was > > wondering why you guys would want to be connecting things to your marvelous > > clusters with RS-232 type ports. i mean there's many other ways to > > interface right? So what'cha plugging into these contraptions? > >I am thinking they are probably using serial terminals for monitoring or >something like that. aka "out of band monitoring and control" There's an enormous amount of inexpensive widgets with serial ports to do things like look at voltages, currents, and temperatures, and/or actuate relays, etc. 
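A minimal sketch of that kind of out-of-band query from a shell script, assuming a sensor on /dev/ttyS0 at 9600 8N1 and a made-up "TEMP?" query string (the real command set depends entirely on the widget):

stty -F /dev/ttyS0 9600 cs8 -cstopb -parenb raw -echo   # 9600 8N1, no line-discipline translation
echo 'TEMP?' > /dev/ttyS0                               # hypothetical query command
head -n 1 < /dev/ttyS0                                  # read one line of response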
Jim From nixon at nsc.liu.se Tue Apr 11 01:04:09 2006 From: nixon at nsc.liu.se (Leif Nixon) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: multiport RS232 to ethernet (or USB) In-Reply-To: <9503ce760604062305n4a40ab6erc8cdd62fce5a2a94@mail.gmail.com> (chris.colemansmith@gmail.com's message of "Fri, 7 Apr 2006 07:05:19 +0100") References: <200604061900.k36J0CZ1027050@bluewest.scyld.com> <0FBB5804-A811-4D50-824B-A971EECD29B6@firewirestuff.com> <9503ce760604062305n4a40ab6erc8cdd62fce5a2a94@mail.gmail.com> Message-ID: Chris writes: > As a student hiding in the corners and under the stairs on this list i was > wondering why you guys would want to be connecting things to your marvelous > clusters with RS-232 type ports. i mean there's many other ways to interface > right? Serial connections tend to just work (once you've got them wired correctly), and need no fancy-schmancy OS level custom drivers. And almost every piece of equipment tends to come with a serial port for management and monitoring. Computers, switches, RAID controllers, environment sensors, UPS:es... -- Leif Nixon - Systems expert ------------------------------------------------------------ National Supercomputer Centre - Linkoping University ------------------------------------------------------------ From laytonjb at charter.net Tue Apr 11 15:14:26 2006 From: laytonjb at charter.net (Jeffrey B. Layton) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] LAM trouble Message-ID: <443C2A42.5010607@charter.net> Howdy! I apologize for posting this problem here, but I tried the LAM list and didn't hear anything, so I thought I would cast my net a bit wider in search of help. I'm having trouble starting an MPI code (NPB bt) that was built with PGI 6.1 and LAM-7.1.2. I get the following messages when I try to start the code (lamboot): n-1<24201> ssi:boot:base:linear: booting n0 (n2004) n-1<24201> ssi:boot:base:linear: booting n1 (n2005) n-1<24201> ssi:boot:base:linear: booting n2 (n2006) n-1<24201> ssi:boot:base:linear: booting n3 (n2007) n-1<24201> ssi:boot:base:linear: booting n4 (n2008) n-1<24201> ssi:boot:base:linear: booting n5 (n2009) n-1<24201> ssi:boot:base:linear: booting n6 (n2010) n-1<24201> ssi:boot:base:linear: booting n7 (n2011) n-1<24201> ssi:boot:base:linear: finished ----------------------------------------------------------------------------- It seems that [at least] one of the processes that was started with mpirun chose a different RPI than its peers. For example, at least the following two processes mismatched in their RPI selections: MPI_COMM_WORLD rank 0: tcp (v7.1.0) MPI_COMM_WORLD rank 3: usysv (v7.1.0) All MPI processes must choose the same RPI module and version when they start. Check your SSI settings and/or the local environment variables on each node. I'm using PBS to start the job and here are the relevant parts of the script: NET=tcp lamboot -b -v -ssh rpi $NET $PBS_NODEFILE mpirun -O -v C ./${EXE} >> ${OUTFILE} lamhalt where $EXE and $OUTFILE are defined appropriately in the script. Does anyone have any ideas? TIA! Jeff From mathog at caltech.edu Wed Apr 12 09:05:27 2006 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] delayed savings time crashes Message-ID: This is an odd one. I just realized that 9 of 20 nodes rebooted on Apr 4. (Since they all rebooted successfully everything was working and there was no reason to think that this had taken place.) This appears to be related to the daylights savings time change two days before. 
The reason I think that is that the nodes that rebooted have /var/log/messages files like: Apr 4 08:01:00 nodename CROND ... /cron/hourly Apr 4 09:01:00 nodename CROND ... /cron/hourly Apr 4 08:24:33 nodename syslogd 1.4.1; restart Notice the time shift backwards between the last normal record and the first reboot record. As if it finally caught on that the clock had changed and that somehow triggered a reboot. Unfortunately none of the log files contain a message that indicated exactly what it was that ordered the reboot. Unclear to me what piece of software could have triggered this. Presumably something that had it's own clock stuck one hour off on the previous time standard and also has the ability to restart the system. ntpd? Ganglia? They were both running. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From kewley at gps.caltech.edu Wed Apr 12 10:42:31 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] delayed savings time crashes In-Reply-To: References: Message-ID: <200604121042.31923.kewley@gps.caltech.edu> David, The reboots were due to a City of Pasadena power glitch at 9:17 that morning. :) It was raining, and a 34kV city feeder line that runs between the generating plant at the entrance of the 110 and a substation at Del Mar & Los Robles faulted. The responsible breaker took 13 cycles to break, during which time the single-phase voltage seen at Caltech dropped to about 75V. This info comes from the responsible EE at Caltech. As for its effects, believe me, I know about it the hard way, as it took down 2/3 of our compute nodes, 1/3 of our disk shelves, and 3/4 of our fileservers. Our UPS has been on bypass these past 6+ months as we wait for our UPS vendor to install a fix so that the UPS can handle the tendency of our computer power supplies' internal Power Factor Correction feedback circuitry to lock up & induce massive 12Hz oscillations on the room's power lines. As for the time glitch, that is probably induced by the fact that Daylight Savings Time changes only take place on the "system" clock, and in a standard Red Hat system those changes only get synced to the hardware clock upon a clean shutdown. So if your machine crashes after a DST change, then upon bootup syslogd gets its time from the hardware clock, which is wrong. The system clock is only corrected later in the bootup sequence, when ntpd starts. The best solution is probably to set the hardware clock to UCT rather than local time. UCT doesn't undergo step changes like most timezones in the U.S. do, so the compensation for DST happens dynamically in software, rather than requiring a hardware clock change. David On Wednesday 12 April 2006 09:05, David Mathog wrote: > This is an odd one. I just realized that 9 of 20 nodes > rebooted on Apr 4. (Since they all rebooted successfully everything > was working and there was no reason to think that this had > taken place.) This appears to be related to the daylights > savings time change two days before. The reason I think that is > that the nodes that rebooted have /var/log/messages files like: > > Apr 4 08:01:00 nodename CROND ... /cron/hourly > Apr 4 09:01:00 nodename CROND ... /cron/hourly > Apr 4 08:24:33 nodename syslogd 1.4.1; restart > > Notice the time shift backwards between the last normal > record and the first reboot record. > > As if it finally caught on that the clock had changed and that > somehow triggered a reboot. 
Unfortunately none of the log files > contain a message that indicated exactly what it was that ordered > the reboot. > > Unclear to me what piece of software could have triggered this. > Presumably something that had it's own clock stuck one hour off > on the previous time standard and also has the ability to restart > the system. ntpd? Ganglia? They were both running. > > Regards, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From mathog at caltech.edu Wed Apr 12 11:34:33 2006 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] delayed savings time crashes Message-ID: > The reboots were due to a City of Pasadena power glitch at 9:17 that > morning. :) It was raining, and a 34kV city feeder line that runs between > the generating plant at the entrance of the 110 and a substation at Del Mar > & Los Robles faulted. The responsible breaker took 13 cycles to break, > during which time the single-phase voltage seen at Caltech dropped to about > 75V. I was on campus at that time and didn't notice it. My desktop machine didn't even hiccup. Hmm, now that we know the cause of it that might explain why all those that did reboot were plugged into just 2 surge suppressors, where the loss was 9/10 machines, whereas the other 2 surge suppressors lost 0/10 machines. Each surge suppressor is on its own circuit which is 1/3rd of a 3 phase line. Maybe only one phase had the glitch and by good luck the two circuits which lost no machines were wired between the two good phases? Usually power glitches just crash the nodes and they stay down but this one may have looked enough like power off/power to have allowed a reboot. The servers are all plugged into UPS's so they saw none of this. > This info comes from the responsible EE at Caltech. As for its effects, > believe me, I know about it the hard way, as it took down 2/3 of our > compute nodes, 1/3 of our disk shelves, and 3/4 of our fileservers. That's a lot of machines in your case. Did any sustain permanent damage? > As for the time glitch, that is probably induced by the fact that Daylight > Savings Time changes only take place on the "system" clock, Right, that makes perfect sense. There had been no planned shutdown since the DST change and they would have come up an hour off. Thanks, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From florent.calvayrac at univ-lemans.fr Wed Apr 12 12:12:16 2006 From: florent.calvayrac at univ-lemans.fr (Florent Calvayrac) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] delayed savings time crashes In-Reply-To: <200604121042.31923.kewley@gps.caltech.edu> References: <200604121042.31923.kewley@gps.caltech.edu> Message-ID: <443D5110.20406@univ-lemans.fr> David Kewley wrote: >David, > >The reboots were due to a City of Pasadena power glitch at 9:17 that >morning. :) It was raining, and a 34kV city feeder line that runs between >the generating plant at the entrance of the 110 and a substation at Del Mar >& Los Robles faulted. The responsible breaker took 13 cycles to break, >during which time the single-phase voltage seen at Caltech dropped to about >75V. >induce massive 12Hz oscillations on the room's power lines. 
> >As for the time glitch, that is probably induced by the fact that Daylight >Savings Time changes only take place on the "system" clock, and in a >standard Red Hat system those changes only get synced to the hardware clock >upon a clean shutdown. So if your machine crashes after a DST change, then >upon bootup syslogd gets its time from the hardware clock, which is wrong. >The system clock is only corrected later in the bootup sequence, when ntpd >starts. The best solution is probably to set the hardware clock to UCT >rather than local time. UCT doesn't undergo step changes like most >timezones in the U.S. do, so the compensation for DST happens dynamically >in software, rather than requiring a hardware clock change. > > > > Why don't you use a line conditioner ? It's much cheaper than a similar powered UPS ($1000 for 10kW), many UPS dont guarantee voltage and waveform excepted during clean power cuts, anyway. A line conditioner is very handy in time of brownouts like during August thunderstorms. We have a Salicru one on our cluster and have a much better MTBF on our compute nodes in comparison with the ones of a computing room nearby. Besides, I was confronted with the same problem about daylight saving time. Just added a clock -w after the NTP synchronization in cron.daily so that the time would be corrected automatically on the hardware clock. (which is automatic on Windows btw). From kewley at gps.caltech.edu Wed Apr 12 12:57:30 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] delayed savings time crashes In-Reply-To: References: Message-ID: <200604121257.31228.kewley@gps.caltech.edu> On Wednesday 12 April 2006 11:34, David Mathog wrote: > Hmm, now that we know the cause of it that might explain > why all those that did reboot were plugged into just 2 surge > suppressors, where the loss was 9/10 machines, whereas the > other 2 surge suppressors lost 0/10 machines. Each surge > suppressor is on its own circuit which is 1/3rd of a 3 phase line. > Maybe only one phase had the glitch and by good luck the > two circuits which lost no machines were wired between the > two good phases? I do not know how this worked, but I did see something similar but even stranger. Our UPS feeds two PDUs, each responsible for about 1/2 the computers. One PDU saw all computers on phases 1 & 2 fail, and the other saw all computers on phases 1 & 3 fail. On both PDUs, the third, unaffected phase saw all its computers stay up. I have no idea how to explain this. > > This info comes from the responsible EE at Caltech. As for its > > effects, believe me, I know about it the hard way, as it took down 2/3 > > of our compute nodes, 1/3 of our disk shelves, and 3/4 of our > > fileservers. > > That's a lot of machines in your case. Did any sustain permanent > damage? It was a voltage drop rather than a spike, and that probably explains why we had no hardware damage. Just quite a bit of filesystem corruption to clean up (which leaves lost files & corrupted file data for some small subset of user files). 
David From kewley at gps.caltech.edu Wed Apr 12 13:49:52 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] delayed savings time crashes In-Reply-To: <443D5110.20406@univ-lemans.fr> References: <200604121042.31923.kewley@gps.caltech.edu> <443D5110.20406@univ-lemans.fr> Message-ID: <200604121349.53010.kewley@gps.caltech.edu> On Wednesday 12 April 2006 12:12, Florent Calvayrac wrote: > David Kewley wrote: > >David, > > > >The reboots were due to a City of Pasadena power glitch at 9:17 that > >morning. :) It was raining, and a 34kV city feeder line that runs > > between the generating plant at the entrance of the 110 and a > > substation at Del Mar & Los Robles faulted. The responsible breaker > > took 13 cycles to break, during which time the single-phase voltage > > seen at Caltech dropped to about 75V. > >induce massive 12Hz oscillations on the room's power lines. > > > >As for the time glitch, that is probably induced by the fact that > > Daylight Savings Time changes only take place on the "system" clock, > > and in a standard Red Hat system those changes only get synced to the > > hardware clock upon a clean shutdown. So if your machine crashes after > > a DST change, then upon bootup syslogd gets its time from the hardware > > clock, which is wrong. The system clock is only corrected later in the > > bootup sequence, when ntpd starts. The best solution is probably to > > set the hardware clock to UCT rather than local time. UCT doesn't > > undergo step changes like most timezones in the U.S. do, so the > > compensation for DST happens dynamically in software, rather than > > requiring a hardware clock change. > > Why don't you use a line conditioner ? It's much cheaper than a similar > powered UPS ($1000 for 10kW), many UPS dont guarantee voltage and > waveform excepted during clean power cuts, anyway. A line conditioner is > very handy in time of brownouts like during August thunderstorms. We have > a Salicru one on our cluster and have a much better MTBF on our compute > nodes in comparison with the ones of a computing room nearby. I wasn't here when the room was designed. But I understand that at the time, it was unknown what the machine load in the room would be -- it was quite possible that we'd get several professor-specific clusters rather than the monolith that we have now. So rather than try to make only certain outlets in the room UPS-backed, or make individual users supply their own UPSes, they decided to get a big UPS to feed all the outlets. If I were involved in the design of another room like this, with a monolithic cluster, I'd advocate for getting a UPS big enough to handle the servers and network gear, and a wiring plan to distribute UPS power to just those boxes. The, as you say, put a power conditioner in front of the remaining outlets. Our peak power is 320kVA -- any idea what the offerings are for a conditioner of that size, or a set of conditioners (vendor, price, and cubic footage of the box(es))? I'm just curious. Our UPS is a Liebert Series 600, which (although I don't actually know the specs) I'd expect would handle any power event very well. Liebert makes a more modern large UPS (the nPower series), but I'd expect them to have all their best tricks at the time of design rolled into the Series 600. Problem is, the Series 600 was designed before modern computer Power Factor Correction power supplies (at least before they were *mandated* by the NEC), and it doesn't handle a large load of that type very well. 
That's the issue that Liebert is still working on, and the reason we're running in bypass. These PFC power supplies' feedback circuits lock up if the power source impedance at ~10-40 Hz isn't low enough (frequency depending on the specific power supplies, and perhaps on details of the room's power line diagram). Our PDUs see major phase current oscillation at ~12Hz when the UPS is powering a full load of 320kVA, and there's even a tiny bit of oscillation when we're in bypass (powered directly by our own substation). > Besides, I was confronted with the same problem about daylight saving > time. Just added a > > clock -w > > after the NTP synchronization in cron.daily so that the time would be > corrected automatically on the hardware clock. Are you saying that you're both running ntpd and doing a daily ntpdate? If so, why? On my systems, there is no NTP syncronization in cron.daily, and I've never seen a need to set the system clock outside the ntp initscript (and /etc/rc.d/rc.sysinit). On distributions by Red Hat (RHL, RHEL, Fedora), here is how the system (software-tracked) and hardware (CMOS on x86) clocks get set and maintained: Early in boot, the system clock is set from the hardware clock using this code snippet in /etc/rc.d/rc.sysinit (taken from RHEL3): -------------------------- # Set the system clock. update_boot_stage RCclock ARC=0 SRM=0 UTC=0 if [ -f /etc/sysconfig/clock ]; then . /etc/sysconfig/clock # convert old style clock config to new values if [ "${CLOCKMODE}" = "GMT" ]; then UTC=true elif [ "${CLOCKMODE}" = "ARC" ]; then ARC=true fi fi CLOCKDEF="" CLOCKFLAGS="$CLOCKFLAGS --hctosys" case "$UTC" in yes|true) CLOCKFLAGS="$CLOCKFLAGS --utc" CLOCKDEF="$CLOCKDEF (utc)" ;; no|false) CLOCKFLAGS="$CLOCKFLAGS --localtime" CLOCKDEF="$CLOCKDEF (localtime)" ;; esac case "$ARC" in yes|true) CLOCKFLAGS="$CLOCKFLAGS --arc" CLOCKDEF="$CLOCKDEF (arc)" ;; esac case "$SRM" in yes|true) CLOCKFLAGS="$CLOCKFLAGS --srm" CLOCKDEF="$CLOCKDEF (srm)" ;; esac /sbin/hwclock $CLOCKFLAGS action $"Setting clock $CLOCKDEF: `date`" date -------------------------- Next, the ntpd initscript (if enabled in the present runlevel) slews the system clock to the correct time using ntpdate, then starts ntpd. After that, ntpd should keep the system clock well-synced. At system reboot or halt, the hardware clock setting is done by this code snippet in /etc/init.d/halt (taken here from RHEL3), which you'll notice is almost but not identical to the /etc/rc.d/rc.sysinit code; the main difference is --systohc versus --hctosys: -------------------------- # Sync the system clock. ARC=0 SRM=0 UTC=0 if [ -f /etc/sysconfig/clock ]; then . /etc/sysconfig/clock # convert old style clock config to new values if [ "${CLOCKMODE}" = "GMT" ]; then UTC=true elif [ "${CLOCKMODE}" = "ARC" ]; then ARC=true fi fi CLOCKDEF="" CLOCKFLAGS="$CLOCKFLAGS --systohc" case "$UTC" in yes|true) CLOCKFLAGS="$CLOCKFLAGS -u"; CLOCKDEF="$CLOCKDEF (utc)"; ;; no|false) CLOCKFLAGS="$CLOCKFLAGS --localtime"; CLOCKDEF="$CLOCKDEF (localtime)"; ;; esac case "$ARC" in yes|true) CLOCKFLAGS="$CLOCKFLAGS -A"; CLOCKDEF="$CLOCKDEF (arc)"; ;; esac case "$SRM" in yes|true) CLOCKFLAGS="$CLOCKFLAGS -S"; CLOCKDEF="$CLOCKDEF (srm)"; ;; esac runcmd $"Syncing hardware clock to system time" /sbin/hwclock $CLOCKFLAGS --------------- If I were to recommend changes to Red Hat (or to sysadmins of Red Hat systems), I'd say do these three things: * In the ntpd initscript, after slewing the system clock, set the hardware clock from the system clock. 
* Add a script to /etc/cron.hourly/ that sets the hardware clock from the system clock. Because the scripts in cron.hourly are by default run at 1 minute past every hour (see /etc/crontab), this will take care of Daylight Saving Time changes that happen on the hour (do any timezones have shifts on the half-hour?), with only 1 minute's window for the system to go down & miss the setting of the hardware clock. It also serves generally to keep the hardware clock well-synced to true time. * Just make UTC the default for the hardware clock, avoiding Daylight Savings Time issues altogether. Now that I've laid all that out, I will strongly consider doing all that on my own systems. :) David From rzewnickie at rfa.org Mon Apr 10 16:51:18 2006 From: rzewnickie at rfa.org (Eric Dantan Rzewnicki) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: BWBUG live stream tomorrow, 2006-04-11 In-Reply-To: <20060410234915.GH4905@rfa.org> References: <20060410234915.GH4905@rfa.org> Message-ID: <20060410235116.GI4905@rfa.org> On Mon, Apr 10, 2006 at 07:49:15PM -0400, Eric Dantan Rzewnicki wrote: > Tomorrow's BWBUG meeting will again be streamed live: > http://www.bwbug.org/ > > The URL for the live video stream will be: > http://streamer2.rfa.org:8000 > > Please pass this on to anyone who might be interested. At the least > there will be a live ogg theora video stream and feedback in #bwbug on > irc.freenode.net as for last month's meeting. I'm also working on > getting an ogg vorbis audio only stream going for those who lack > sufficient bandwidth for the video stream. Doh, forgot to mention the time. Meeting is scheduled to start at 14:30 EDT (utc-4). -- Eric Dantan Rzewnicki Apprentice Linux Audio Developer and Mostly Harmless Sysadmin (text below this line mandated by management) Technical Operations Division | Radio Free Asia 2025 M Street, NW | Washington, DC 20036 | 202-530-4900 CONFIDENTIAL COMMUNICATION This e-mail message is intended only for the use of the addressee and may contain information that is privileged and confidential. Any unauthorized dissemination, distribution, or copying is strictly prohibited. If you receive this transmission in error, please contact network@rfa.org. From kario at bu.edu Tue Apr 11 13:21:07 2006 From: kario at bu.edu (kario@bu.edu) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Problems installing ROCKS Message-ID: <20060411162107.l3kb8ssrs0g44w08@www.bu.edu> Problems installing ROCKS We're trying to install ROCKS cluster on a flat network with one managing node (HP DL360 g3, 1x3.06 GHz Xeon ) and 2 compute nodes (HP DL 145, 2x2.4 GHz Opteron 250). When trying to install ROCKS we get this error at 80%: " Error installing fonts-xorg-75dpi Can indicate media failure, lack of disk-space or hardware problems Fatal Error. Verify media" We have verified the media and checked the disk-space as well as the hardware. 1) How can we configure a flat network in ROCKS? 2) What does the error mean? 3) The two compute nodes do not have CD-ROM nor floppy disk. How can we install ROCKS on one of these machines if we instead choose one of these as managing node? 
Karianne From rzewnickie at rfa.org Wed Apr 12 15:01:42 2006 From: rzewnickie at rfa.org (Eric Dantan Rzewnicki) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: Reminder: BWBUG/LCUG live stream today, 2006-04-11 In-Reply-To: <20060411175407.GA5246@rfa.org> References: <20060411175407.GA5246@rfa.org> Message-ID: <20060412220140.GA6301@rfa.org> On Tue, Apr 11, 2006 at 01:54:09PM -0400, Eric Dantan Rzewnicki wrote: > Today's Linux Cluster User Group meeting will be streamed live. > > The URL for the live video stream will be: > http://streamer2.rfa.org:8000 > > For information on viewing theora streams see this FAQ entry: > http://www.theora.org/theorafaq.html#40 > > The camera is up now and I'm in #bwbug on irc.freenode.net. I'm still > working on getting an ogg vorbis audio only stream going for those who > lack sufficient bandwidth for the video stream, but might not make it in > time for this meeting. Hopefully I'll have it going for next week's > DCLUG meeting. The file of yesterday's talk is available for download now: http://techweb.rfa.org/images/bwbug/ -- Eric Dantan Rzewnicki Apprentice Linux Audio Developer and Mostly Harmless Sysadmin (text below this line mandated by management) Technical Operations Division | Radio Free Asia 2025 M Street, NW | Washington, DC 20036 | 202-530-4900 CONFIDENTIAL COMMUNICATION This e-mail message is intended only for the use of the addressee and may contain information that is privileged and confidential. Any unauthorized dissemination, distribution, or copying is strictly prohibited. If you receive this transmission in error, please contact network@rfa.org. From schen at hpl.umces.edu Wed Apr 12 17:35:47 2006 From: schen at hpl.umces.edu (Shihnan Chen) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] can't use both processors on each node Message-ID: <443D9CE3.4010409@hpl.umces.edu> hi all, we have a microway 10-node cluster (2 processors on each node) with OpenPBS and MAUI scheduler installed. I can submit a MPI job through qsub but encountered some strange problems. Would somebody kindly give me some hints. Thanks 1. I can only use 1 cpu on 1 node when submitting a job. I tried the following in my script but failed. #PBS -l nodes=2:ppn=2 mpirun -np 4 ./oceanM I attach the /var/spool/pbs/server_priv/nodes and "pbsnodes -a" below. Is the configuaration correct (np=2) ? 2. Every time we submit a job, the master node will be used, which should not happen. Here is the output of /var/spool/pbs/server_priv/nodes node2 np=2 node3 np=2 node4 np=2 node5 np=2 node6 np=2 node7 np=2 node8 np=2 node9 np=2 node10 np=2 Here is the output of "pbsnodes -a" node2 state = free np = 2 ntype = cluster node3 state = free np = 2 ntype = cluster node4 state = free np = 2 ntype = cluster node5 state = free np = 2 ntype = cluster node6 state = free np = 2 ntype = cluster node7 state = free np = 2 ntype = cluster node8 state = free np = 2 ntype = cluster node9 state = free np = 2 ntype = cluster node10 state = free np = 2 ntype = cluster From tsariysk at craft-tech.com Thu Apr 13 04:24:34 2006 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] request for comments for NFS v.3 vs NFS v.4 Message-ID: <200604130724.34789.tsariysk@craft-tech.com> Hi, I used to follow discussions in this group closely but lately I was distracted from other issues. I used to think that as a rule NFS require carefull tuning. 
Now I've been said that NFS v.4 takes care of most of the issues that v.2 and v.3 have and it is mounted with its defaults. The problem is that under load I see issues (like stalled file system) that I used to associate with NFS (and I'm talking about a small clusters, say 30-50 nodes). I checked the mpich home page and found the same old warnings that nfs must be carefully tuned and found nothing specific for NFS v.4. Could somebody comment on how much tuning NFS v.4 reqiere compared with NFS v.3, please? Thanks in advance, Ted From rzewnickie at rfa.org Tue Apr 11 15:33:46 2006 From: rzewnickie at rfa.org (Eric Dantan Rzewnicki) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Re: Reminder: BWBUG/LCUG live stream today, 2006-04-11 In-Reply-To: <20060411175407.GA5246@rfa.org> References: <20060411175407.GA5246@rfa.org> Message-ID: <20060411223344.GA5312@rfa.org> On Tue, Apr 11, 2006 at 01:54:09PM -0400, Eric Dantan Rzewnicki wrote: > Today's Linux Cluster User Group meeting will be streamed live. > > The URL for the live video stream will be: > http://streamer2.rfa.org:8000 > > For information on viewing theora streams see this FAQ entry: > http://www.theora.org/theorafaq.html#40 > > The camera is up now and I'm in #bwbug on irc.freenode.net. I'm still > working on getting an ogg vorbis audio only stream going for those who > lack sufficient bandwidth for the video stream, but might not make it in > time for this meeting. Hopefully I'll have it going for next week's > DCLUG meeting. I welcome any feedback, positive or negative, from anyone who watched the streams. According to icecast we had as many as 11 viewers. Most of the time there were 7-8 watching. If possible, please note the player used, your bandwidth and other details you think might be pertinent Thanks to Don Jr. and Wade Hampton for being on irc. An ogg theora file of the tape will be available within a few days. Thanks, -- Eric Dantan Rzewnicki Apprentice Linux Audio Developer and Mostly Harmless Sysadmin (text below this line mandated by management) Technical Operations Division | Radio Free Asia 2025 M Street, NW | Washington, DC 20036 | 202-530-4900 CONFIDENTIAL COMMUNICATION This e-mail message is intended only for the use of the addressee and may contain information that is privileged and confidential. Any unauthorized dissemination, distribution, or copying is strictly prohibited. If you receive this transmission in error, please contact network@rfa.org. From leandro at ep.petrobras.com.br Tue Apr 18 07:28:46 2006 From: leandro at ep.petrobras.com.br (Leandro Tavares Carneiro) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Questions about a large job Message-ID: <4444F79E.50207@ep.petrobras.com.br> Hi, I tried this weekend run HPL on our largest cluster, 1172 dual Opteron nodes.The network is Gigabit ethernet as our applications don't need and don't use a lot of process intercommunication. I have available 1148 dual nodes, 2296 CPUs and configured HPL.dat to run on that. I already have tested the parameters so i know it was good for this cluster. So, I have compiled HPL with Pathscale using ACML mathematical library. The MPI used was LAM-MPI. I have run some tests with 10 nodes and it runs well. But, when I tried to run with 2296 CPUs, the job won't start. Various errors happened, one for each try. The Torque version installed is 2.0.0p8 and is working fine with other largers jobs, with 1000 CPUs. I must admit, I never have tried to run a job with this size. 
I know, I can made some mistake, but what I wish know is about timeouts. The processes takes a long time to start and don't start. When it start run, I saw it because the HPL.out was created, ir dies. Do you guys have jobs larger than that running OK with Torque and LAM-MPI? There are something I can do to accelerate the start of the job? I know i lost the list, but any help will be great! Thanks a lot. -- Leandro Tavares Carneiro Petrobras TI/TI-E&P/STEP Suporte Tecnico de E&P Av Chile, 65 sala 1501 EDISE - Rio de Janeiro / RJ Tel: (0xx21) 3224-1427 From jlkaiser at fnal.gov Mon Apr 17 12:31:09 2006 From: jlkaiser at fnal.gov (Joseph L. Kaiser) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Problems installing ROCKS In-Reply-To: <20060411162107.l3kb8ssrs0g44w08@www.bu.edu> References: <20060411162107.l3kb8ssrs0g44w08@www.bu.edu> Message-ID: <4443ECFD.3080609@fnal.gov> Hi, First, beowulf is an indirect way to get your Rocks questions answered. You should be posting here: npaci-rocks-discussion@sdsc.edu. The Rocks developers are on this list. Second, how did you verify your media. With MD5SUMs? Does this error occur during the install of the frontend or a compute node? How much memory does the fronend have? How big are the disks? You have a mixed frontend/compute node environment. What are you attempting to install on the frontend? Are you going to install the i386 on the frontend and then follow the cross-kickstart instructions in the user guide? Thanks, Joe kario@bu.edu wrote: > Problems installing ROCKS > > > > We're trying to install ROCKS cluster on a flat network with one > managing node (HP DL360 g3, 1x3.06 GHz Xeon > ) and 2 compute nodes (HP DL 145, 2x2.4 GHz Opteron 250). > > When trying to install ROCKS we get this error at 80%: > " Error installing fonts-xorg-75dpi Can indicate media failure, lack > of disk-space or hardware problems > Fatal Error. Verify media" > > We have verified the media and checked the disk-space as well as the > hardware. > > > > 1) How can we configure a flat network in ROCKS? > 2) What does the error mean? 3) The two compute nodes do not have > CD-ROM nor floppy disk. How can we install ROCKS on one of these > machines if > we instead choose one of these as managing node? > > > > Karianne > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From erwan at seanodes.com Tue Apr 18 08:20:29 2006 From: erwan at seanodes.com (Velu Erwan) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Questions about a large job In-Reply-To: <4444F79E.50207@ep.petrobras.com.br> References: <4444F79E.50207@ep.petrobras.com.br> Message-ID: <444503BD.3080409@seanodes.com> Leandro Tavares Carneiro a ?crit : [...] > Do you guys have jobs larger than that running OK with Torque and > LAM-MPI? There are something I can do to accelerate the start of the job? > > I know i lost the list, but any help will be great! Thanks a lot. > Did you try talking about that with the Torque and/or LAM-MPI team ? Both are very kind, and must be able to help you in such trouble. Erwan, From Bogdan.Costescu at iwr.uni-heidelberg.de Tue Apr 18 12:35:03 2006 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Questions about a large job In-Reply-To: <4444F79E.50207@ep.petrobras.com.br> Message-ID: On Tue, 18 Apr 2006, Leandro Tavares Carneiro wrote: > The MPI used was LAM-MPI. 
I have run some tests with 10 nodes and it > runs well. But, when I tried to run with 2296 CPUs, the job won't start. Are you able to run a simple "hello world" test ? If not, you might be hitting the per-process descriptor limit, as each process will try to open a TCP connection to each other process - in this case you should still be able to run a job on something like 500 nodes (=1000 processes, slightly less than the 1024 maximum descriptors per process). > Various errors happened, one for each try. The Torque version installed > is 2.0.0p8 and is working fine with other largers jobs, with 1000 CPUs. This just confirms my suspicion expressed above. To change the limits on a Red Hat like system, add a line like: * - nofile 4096 to /etc/security/limits.conf. -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De From deadline at clustermonkey.net Wed Apr 19 09:20:43 2006 From: deadline at clustermonkey.net (Douglas Eadline) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Cluster Interconnects: The Whole Shebang Message-ID: <47267.192.168.1.1.1145463643.squirrel@www.eadline.org> I am pleased to announce the publication of our latest Cluster Monkey feature: "Cluster Interconnects: The Whole Shebang" http://www.clustermonkey.net/ Jeff Layton has updated and expand a prior version that was published in ClusterWorld. We believe it is the only review of its type. We provide an overview of the available technologies and conclude with two tables summarizing key features and ballpark pricing for various size clusters. While you are there, also check the other recent articles: * MPI: Is Your Application Spawnworthy? * MPI: The Spawn of MPI * Using the PIO Benchmark * Using the Globus Toolkit with Firewalls -- Doug From konstantin_kudin at yahoo.com Wed Apr 19 12:54:03 2006 From: konstantin_kudin at yahoo.com (Konstantin Kudin) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster Message-ID: <20060419195403.31341.qmail@web52015.mail.yahoo.com> Hi all, Is there any good solution to find out which user is loading the NFS the most in a cluster configuration? Basically, a bunch of nodes are hooked up to an NFS server, and when somebody forgets to use the local scratch space for a job, the system slows down to a crawl. So figuring out who is doing that would be extremely helpful. Thanks! Konstantin __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From rgb at phy.duke.edu Wed Apr 19 13:39:01 2006 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster In-Reply-To: <20060419195403.31341.qmail@web52015.mail.yahoo.com> References: <20060419195403.31341.qmail@web52015.mail.yahoo.com> Message-ID: On Wed, 19 Apr 2006, Konstantin Kudin wrote: > Hi all, > > Is there any good solution to find out which user is loading the NFS > the most in a cluster configuration? > > Basically, a bunch of nodes are hooked up to an NFS server, and when > somebody forgets to use the local scratch space for a job, the system > slows down to a crawl. > > So figuring out who is doing that would be extremely helpful. > > Thanks! > Konstantin If you install e.g. 
xmlsysd/wulfstat/wulflogger (the former on the nodes, the latter on a monitor machine) you can probably find out by monitoring a mix of job and network traffic per node, especially if the nodes are typically used for grid-like applications that don't generate a lot of network traffic. The easiest way to get an idea of what/who to look for is to run "lsof -N" on the nodes to see what NSF based files are open on the node. Note that you'll likely need to do more -- monitor likely looking files directly for being modified in association with active process and burst of network traffic -- to really run this down. rgb > > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From gmpc at sanger.ac.uk Thu Apr 20 00:54:34 2006 From: gmpc at sanger.ac.uk (Guy Coates) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster In-Reply-To: <20060419195403.31341.qmail@web52015.mail.yahoo.com> References: <20060419195403.31341.qmail@web52015.mail.yahoo.com> Message-ID: <44473E3A.2000907@sanger.ac.uk> Konstantin Kudin wrote: > Hi all, > > Is there any good solution to find out which user is loading the NFS > the most in a cluster configuration? > iftop is quite a handy tool; it displays traffic on a per-host basis, so if a particular node is hammering your NFS server, it will show right up. http://www.ex-parrot.com/~pdw/iftop/ Good old tcpdump/ethereal will help you get finer details if you need them. Cheers, Guy > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > -- Dr Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH, UK Tel: +44 (0)1223 834244 ex 6925 Fax: +44 (0)1223 496802 From m.janssens at opencfd.co.uk Wed Apr 19 14:01:55 2006 From: m.janssens at opencfd.co.uk (mattijs) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster In-Reply-To: <20060419195403.31341.qmail@web52015.mail.yahoo.com> References: <20060419195403.31341.qmail@web52015.mail.yahoo.com> Message-ID: <200604192201.55631.m.janssens@opencfd.co.uk> fuser might help. fuser -mv file-on-nfs will show all the usage of the file system the 'file-on-nfs' is on. On Wednesday 19 April 2006 20:54, Konstantin Kudin wrote: > Hi all, > > Is there any good solution to find out which user is loading the NFS > the most in a cluster configuration? > > Basically, a bunch of nodes are hooked up to an NFS server, and when > somebody forgets to use the local scratch space for a job, the system > slows down to a crawl. > > So figuring out who is doing that would be extremely helpful. > > Thanks! > Konstantin > > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! 
Mail has the best spam protection around > http://mail.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From Lester.Langford at ssc.nasa.gov Thu Apr 20 08:11:44 2006 From: Lester.Langford at ssc.nasa.gov (Langford, Lester) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] HPL Runtime error Message-ID: Hello all, I am kind of new to cluster operation, but have built 4 small sized (16 ? 48 nodes) for other researchers. Now, I?m trying to run HPL on our 48-node dual Opteron 246 cluster. Got it to build the xhpl file, but when I try a test run I get the following error: a48:/local-io/linux_bench/hpl/bin/Linux_Op_246 # mpirun -np 4 xhpl /local-io/linux_bench/hpl/bin/Linux_Op_246/xhpl: Command not found. p0_4305: p4_error: Child process exited while making connection to remote process on ath64: 0 p0_4305: (46.070312) net_send: could not write to fd=4, errno = 32 ath64 is a workstation I am not using for this task. Any and all help on this matter would be greatly appreciated. Thanks, Les Langford Lester Langford Technology Development & Transfer NASA Test Operations Group Jacobs Sverdrup/ERC Bldg 8306 Stennis Space Center, MS 39529 - Lester.Langford@ssc.nasa.gov ( (228) 688-7221 Fax (228) 688-1106 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20060420/ec89c6eb/attachment.html From ctierney at hypermall.net Thu Apr 20 08:41:12 2006 From: ctierney at hypermall.net (Craig Tierney) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] HPL Runtime error In-Reply-To: References: Message-ID: <4447AB98.6060707@hypermall.net> Langford, Lester wrote: > Hello all, > > I am kind of new to cluster operation, but have built 4 small sized (16 > ? 48 nodes) for other researchers. > Now, I?m trying to run HPL on our 48-node dual Opteron 246 cluster. Got > it to build > the xhpl file, but when I try a test run I get the following error: > > a48:/local-io/linux_bench/hpl/bin/Linux_Op_246 # mpirun -np 4 xhpl > /local-io/linux_bench/hpl/bin/Linux_Op_246/xhpl: Command not found. > p0_4305: p4_error: Child process exited while making connection to > remote process on ath64: 0 > p0_4305: (46.070312) net_send: could not write to fd=4, errno = 32 Is the /local-io filesystem shared amongst all nodes, or is it the local disk? The executable needs to be on a shared filesystem where all applications can see it. This is probably why you got the "command not found' error. Craig > > ath64 is a workstation I am not using for this task. > > Any and all help on this matter would be greatly appreciated. 
> Thanks, > > Les Langford > > > Lester Langford > Technology Development & Transfer > NASA Test Operations Group > Jacobs Sverdrup/ERC > Bldg 8306 > Stennis Space Center, MS 39529 > > *- *Lester.Langford@ssc.nasa.gov > ( (228) 688-7221 > Fax (228) 688-1106 > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jlb17 at duke.edu Thu Apr 20 16:26:26 2006 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Gigabit switch recommendations In-Reply-To: References: <20060330051838.GA87266@tehun.pair.com> Message-ID: On Thu, 6 Apr 2006 at 12:06pm, Joshua Baker-LePain wrote > I've got my new switch in hand and I've done some preliminary testing. So > far, so good. Total bandwidth between 2 hosts connected to the switch was > quite comparable to that of the hosts being directly connected for just about > all the MTUs I tested (see ). The > hosts are centos 4.3 using onboard BCM5704 NICs (tg3 driver), and I tested > with netperf. > > Now I just need to get it hooked up to more hosts and really have at it. Responding to myself one last time, I've put the switch (SMC8748L2) through some more testing, and I'm pretty happy with it. I set up increasing numbers of node pairs firing netperf at each other, and compared netpipe runs as the switch load increased. Unfortunately, node count limited me to 24 netperf instances (50% load only), but the switch didn't even blink: http://www.duke.edu/~jlb17/np-comp.png All that testing was done with jumbo frames. I've put the cluster back into production, but if there's something in particular someone would like to see, let me know and I'll see if I can do it. -- Joshua Baker-LePain Department of Biomedical Engineering Duke University From john.hearns at streamline-computing.com Fri Apr 21 00:45:15 2006 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster In-Reply-To: <44473E3A.2000907@sanger.ac.uk> References: <20060419195403.31341.qmail@web52015.mail.yahoo.com> <44473E3A.2000907@sanger.ac.uk> Message-ID: <1145605516.3698.74.camel@localhost.localdomain> On Thu, 2006-04-20 at 08:54 +0100, Guy Coates wrote: > Konstantin Kudin wrote: > > Hi all, > > > > Is there any good solution to find out which user is loading the NFS > > the most in a cluster configuration? > > > > iftop is quite a handy tool; it displays traffic on a per-host basis, so > if a particular node is hammering your NFS server, it will show right up. > > http://www.ex-parrot.com/~pdw/iftop/ Etherape might be a useful tool also. http://etherape.sourceforge.net/ It has a graphical 'radar display' of traffic to/from network hosts. I have found it very useful in the past for monitoring/debugging. I'd hazard a guess though that in a cluster on full song you'll get a busy display, but you should be able to spot any 'hot' nodes. From i.kozin at dl.ac.uk Fri Apr 21 02:43:22 2006 From: i.kozin at dl.ac.uk (Kozin, I (Igor)) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Gigabit switch recommendations Message-ID: <77673C9ECE12AB4791B5AC0A7BF40C8F0147D8F4@exchange02.fed.cclrc.ac.uk> Joshua, no need for excuses. 
I hope others on the list will agree that all information about reliable switches (as any good practice regarding clusters for that matter) is being read with high interest. A positive post might not get a lot of feedback but can be still extremely useful to many. Igor > -----Original Message----- > From: beowulf-bounces@beowulf.org > [mailto:beowulf-bounces@beowulf.org]On > Behalf Of Joshua Baker-LePain > Sent: 21 April 2006 00:26 > To: beowulf@beowulf.org > Subject: Re: [Beowulf] Gigabit switch recommendations > > > On Thu, 6 Apr 2006 at 12:06pm, Joshua Baker-LePain wrote > > > I've got my new switch in hand and I've done some > preliminary testing. So > > far, so good. Total bandwidth between 2 hosts connected to > the switch was > > quite comparable to that of the hosts being directly > connected for just about > > all the MTUs I tested (see > ). The > > hosts are centos 4.3 using onboard BCM5704 NICs (tg3 > driver), and I tested > > with netperf. > > > > Now I just need to get it hooked up to more hosts and > really have at it. > > Responding to myself one last time, I've put the switch (SMC8748L2) > through some more testing, and I'm pretty happy with it. I set up > increasing numbers of node pairs firing netperf at each other, and > compared netpipe runs as the switch load increased. > Unfortunately, node > count limited me to 24 netperf instances (50% load only), but > the switch > didn't even blink: > > http://www.duke.edu/~jlb17/np-comp.png > > All that testing was done with jumbo frames. I've put the > cluster back > into production, but if there's something in particular > someone would like > to see, let me know and I'll see if I can do it. > > -- > Joshua Baker-LePain > Department of Biomedical Engineering > Duke University > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) > visit http://www.beowulf.org/mailman/listinfo/beowulf > From joachim at ccrl-nece.de Fri Apr 21 04:14:02 2006 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Final Call for Papers EuroPVM/MPI 2006 (Deadline Extended May 8th) Message-ID: <4448BE7A.3080100@ccrl-nece.de> ************************************************************************ *** *** *** FINAL CALL FOR PAPERS *** *** *** *** DEADLINE EXTENDED TO MAY 8th, 2006 *** *** (no further extensions will be given) *** *** *** ************************************************************************ EuroPVM/MPI 2006 13th European PVMMPI Users' Group Meeting Wissenschaftszentrum Bonn, Ahrstrasse 45, D-53175 Bonn Bonn, Germany, September 17-20, 2006 web: http://www.pvmmpi06.org e-mail: chair@pvmmpi06.org organized by C&C Research Laboratories, NEC Europe Ltd (http://www.ccrl-nece.de) Forschungszentrum Juelich (http://www.fz-juelich.de) BACKGROUND AND TOPICS PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) have evolved into the standard interfaces for high-performance parallel programming in the message-passing paradigm. EuroPVM/MPI is the most prominent meeting dedicated to the latest developments of PVM and MPI, their use, including support tools, and implementation, and to applications using these interfaces. The EuroPVM/MPI meeting naturally encourages discussions of new message-passing and other parallel and distributed programming paradigms beyond MPI and PVM. 
The 13th European PVM/MPI Users' Group Meeting will be a forum for users and developers of PVM, MPI, and other message-passing programming environments. Through presentation of contributed papers, vendor presentations, poster presentations and invited talks, they will have the opportunity to share ideas and experiences to contribute to the improvement and furthering of message-passing and related parallel programming paradigms. Topics of interest for the meeting include, but are not limited to: * PVM and MPI implementation issues and improvements * Extensions to PVM and MPI * PVM and MPI for high-performance computing, clusters and grid environments * New message-passing and hybrid parallel programming paradigms * Formal methods for reasoning about message-passing programs * Interaction between message-passing software and hardware * Performance evaluation of PVM and MPI applications * Tools and environments for PVM and MPI * Algorithms using the message-passing paradigm * Applications in science and engineering based on message-passing This year special emphasis will be put on new message-passing paradigms and programming models, addressing perceived or demonstrated shortcomings of either PVM or MPI, enabling a better fit with new hardware and interconnect technologies, or higher-productivity programming. As in the preceeding years, the special session 'ParSim' will focus on numerical simulation for parallel engineering environments. EuroPVM/MPI 2006 also introduces two new session types which are 'Outstanding Papers' and 'Late and Breaking Results' for up-to-date information which can be submitted after the deadline for full papers. SUBMISSION INFORMATION Contributors are invited to submit a full paper as PDF (or Postscript) document not exceeding 8 pages in English (2 pages for poster abstracts and Late and Breaking Results). The title page should contain a 100-word abstract and five specific keywords. The paper needs to be formatted according to the Springer LNCS guidelines [2]. The usage of LaTeX for preparation of the contribution as well as the submission in camera ready format is strongly recommended. Style files can be found at the URL [2]. New work that is not yet mature for a full paper, short observations, and similar brief announcements are invited for the poster session. Contributions to the poster session should be submitted in the form of a two page abstract. All these contributions will be fully peer reviewed by the program committee. Submissions to the special session 'Current Trends in Numerical Simulation for Parallel Engineering Environments' (ParSim 2006) are handled and reviewed by the respecitve session chairs. For more information please refer to the ParSim website [1]. Additionally, submission of two page abstracts for the special session Late and Breaking Results is possible up to three weeks before the event. These submissions will be reviewed by the General and Program Chairs only. The chosen contributions will not appear in the proceedings, but will be published on the web site. All accepted submissions are expected to be presented at the conference by one of the authors, which requires registration for the conference. 
IMPORTANT DATES Submission of full papers and poster abstracts May 8th, 2006 (extended) (Submission is now open) Notification of authors June 6th, 2006 Camera ready papers July 3rd, 2006 Early registration deadline August 18th, 2006 (Registration is now open) Submission of Late and Breaking Results September 1st, 2006 Tutorials September 17th, 2006 Conference September 18th - 20th, 2006 For up-to-date information, visit the conference web site at http://www.pvmmpi06.org. PROCEEDINGS The conference proceedings consisting of abstracts of invited talks, full papers, and two page abstracts for the posters will be published by Springer in the Lecture Notes in Computer Science series. In addition, selected papers of the conference, including those from the 'Outstanding Papers' session, will be considered for publication in a special issue of 'Parallel Computing' in an extended format. GENERAL CHAIR * Jack Dongarra (University of Tennessee) PROGRAM CHAIRS * Bernd Mohr (Forschungszentrum Juelich) * Jesper Larsson Traff (C&C Research Labs, NEC Europe) * Joachim Worringen (C&C Research Labs, NEC Europe) PROGRAM COMMITTEE * George Almasi (IBM, USA) * Ranieri Baraglia (CNUCE Institute, Italy) * Richard Barrett (ORNL, USA) * Gil Bloch (Mellanox, Israel) * Arndt Bode (Technical Univ. of Munich, Germany) * Marian Bubak (AGH Cracow, Poland) * Hakon Bugge (Scali, Norway) * Franck Capello (University of Paris-Sud, France) * Barbara Chapman (University of Houston, USA) * Brian Coghlan (Trinity College Dublin, Ireland) * Yiannis Cotronis (University of Athens, Greece) * Jose Cunha (New University of Lisbon, Portugal) * Marco Danelutto (University of Pisa, Italy) * Frank Dehne (Carleton University, Canada) * Luiz DeRose (Cray, USA) * Frederic Desprez (INRIA, France) * Erik D'Hollander (University of Ghent, Belgium) * Beniamino Di Martino (Second University of Naples, Italy) * Jack Dongarra (University of Tennessee, USA) * Graham Fagg (University of Tennessee, USA) * Edgar Gabriel (University of Houston, USA) * Al Geist (OakRidge National Laboratory, USA) * Patrick Geoffray (Myricom, USA) * Michael Gerndt (Technical Univ. of Munich, Germany) * Andrzej Goscinski (Deakin University, Australia) * Richard L. Graham (LANL, USA) * William Gropp (Argonne National Laboratory, USA) * Erez Haba (Microsoft, USA) * Rolf Hempel (DLR - German Aerospace Center, Germany) * Dieter Kranzlm?ller (Joh. Kepler University Linz, Austria) * Rainer Keller (HLRS, Germany) * Stefan Lankes (RWTH Aachen, Germany) * Erwin Laure (CERN, Switzerland) * Laurent Lefevre (INRIA/LIP, France) * Greg Lindahl (Pathscale, USA) * Thomas Ludwig (University of Heidelberg, Germany) * Emilio Luque (University Autonoma de Barcelona, Spain) * Ewing Rusty Lusk (Argonne National Laboratory, USA) * Tomas Margalef (University Autonoma de Barcelona, Spain) * Bart Miller (University of Wisconsin, USA) * Bernd Mohr (Forschungszentrum J?lich, Germany) * Matthias M?ller (Dresden University of Technology, Germany) * Salvatore Orlando (University of Venice, Italy) * Fabrizio Petrini (PNNL, USA) * Neil Pundit (Sandia National Labs, USA) * Rolf Rabenseifner (HLRS, Germany) * Thomas Rauber (Universit?t Bayreuth, Germany) * Wolfgang Rehm (TU Chemnitz, Germany) * Casiano Rodriguez-Leon (University de La Laguna, Spain) * Michiel Ronsse (University of Ghent, Belgium) * Peter Sanders (Karlsruhe) * Martin Schulz (Lawrence Livermore National Laboratory, USA) * Jeffrey Squyres (Open System hs LAB, Indiana) * Thomas M. 
Stricker (Google European Engineering Center, Switzerland) * Vaidy Sunderam (Emory University, USA) * Bernard Tourancheau (Universite de Lyon / INRIA, France) * Jesper Larsson Tr?ff (C&C Research Labs, NEC Europe, Germany) * Carsten Trinitis (Technische Universit?t M?nchen, Germnay) * Pavel Tvrdik (Czech Technical University, Czech Republic) * Jerzy Wasniewski (Danish Technical University, Denmark) * Roland Wismueller (University Siegen, Germany) * Felix Wolf (Forschungszentrum J?lich, Germany) * Joachim Worringen (C&C Research Labs, NEC Europe, Germany) * Laurence T. Yang (St. Francis Xavier University, Canada) CONFERENCE VENUE The conference will be held in Bonn, Germany, the former capital of Germany, an attractive and relaxed city at the Rhine, 30km south of Cologne, frequent host to international conferences in science and politics. The conference will take place at the Wissenschaftszentrum, which is part of the Deutsches Museum (of technology) in Bonn. Bonn is an ideal starting point for exploring the romantic Rhine valley, Roman relics in the museums of Cologne and Bonn, or recent German history in the Haus der Geschichte. Bonn was also the birthplace of Ludwig van Beethoven, and the annual Beethoven Fest will take place in the days during the meeting. The Rheinisches Landesmuseum in Bonn exhibits the famous Neandertal skull, which will be highlighted in the Roots exhibition during the conference. Bonn is reachable by plane to the Cologne-Bonn Airport (with express bus to downtown Bonn) or Frankfurt Airport (high-speed train in about half-an-hour to Siegburg/Bonn, tram to downtown Bonn from Siegburg). There are excellent train connections from all of Europe either to Bonn central station or Siegburg/Bonn. Cologne is reachable by local train in about 30 minutes. REFERENCES [1] ParSim 2006: http://www.lrr.in.tum.de/Par/arch/events/parsim06/ [2] Springer Guidelines: http://www.springer.de/comp/lncs/authors.html -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de From list-beowulf at onerussian.com Fri Apr 21 08:47:25 2006 From: list-beowulf at onerussian.com (Yaroslav Halchenko) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] Gigabit switch recommendations In-Reply-To: <20060329053308.GO6679@washoe.onerussian.com> References: <20060329053308.GO6679@washoe.onerussian.com> Message-ID: <20060421154724.GE16161@washoe.onerussian.com> Hi All Beowulfers, Continuing reports Joshua was nice to share, I want to report some pilot testing of the switch I got -- I went for a cheap (~700$ for 44 ports) D-Link DGS-1248T. Did firmware upgrade on it as soon as it arrived (had to use laptop with windows since firmware upgrade is available only in the tool which shipped along -- web interface of the switch doesn't allow it) I banged by hand against the way for a few minutes since I could not find any nice tool to perform parametric analysis/plotting of network performance (please advice if there is any), thus I wrote two small scripts - one to collect and another to plot the results. By default it uses netperf and plots throughput between two nodes depending on MTU sizes. 
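For the impatient, the collection part boils down to a loop of roughly this shape -- a stripped-down sketch, not the real script: it assumes netserver is already running on the peer, passwordless ssh and root on both ends to change the MTU, and the interface/peer names (eth0, node02) and the MTU list are only placeholders:

#!/bin/sh
# sweep the MTU and record netperf TCP_STREAM throughput against a peer node
PEER=node02
IFACE=eth0
OUT=netperf_mtu.dat
for MTU in 1500 2000 2500 3000 3500 4000 4500 5000 6000 7000 8000 9000; do
    ifconfig $IFACE mtu $MTU
    ssh $PEER ifconfig $IFACE mtu $MTU
    sleep 2                                   # let the link settle
    # -P 0 suppresses headers; -f K reports throughput in KB/s (5th field)
    BW=`netperf -H $PEER -l 30 -f K -P 0 | awk '{print $5}'`
    echo "$MTU $BW" >> $OUT
done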
Scripts are available from http://www.onerussian.com/Linux/scripts/netperf/test_netperf.sh http://www.onerussian.com/Linux/scripts/netperf/plot_test.py Graphs are available from http://www.onerussian.com/Linux/scripts/netperf/examples/ Tuned parameters were (blindly copied from Joshua :-)) net.ipv4.tcp_wmem = 4096 65536 16777216 net.ipv4.tcp_rmem = 4096 16777216 16777216 net.core.rmem_max = 8388608 net.core.wmem_max = 8388608 whenever defaults are net.ipv4.tcp_wmem = 4096 16384 131072 net.ipv4.tcp_rmem = 4096 87380 174760 net.core.wmem_max = 131071 net.core.rmem_max = 131071 Results: weirdo -- needs to rerun may be -- there is a range of MTUs (0 - ~4500) when switch performed better than crossover with the same network parameters MTUs 3500-4000 seems to be the best in terms of throughput (~185000 KBps duplex) and tuning the params seems to help in general but for 3500 seems to be not that important (need to run with smaller step of MTU) going away from default MTU (1024) helps a lot for CPU utilization, so my NFS server should be happier than now. but "tuned" params incure their own impact on cpu utilization, thus they also has to be checked for their optimal values for the task generally this DLink seems to be not as good as Joshua's SMC on high MTU sizes -- speed between crosslinked and via-switch diverges considerably (around 30000-35000 KBps loss) I hope this is of help for anyone ;) -- .-. =------------------------------ /v\ ----------------------------= Keep in touch // \\ (yoh@|www.)onerussian.com Yaroslav Halchenko /( )\ ICQ#: 60653192 Linux User ^^-^^ [175555] -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://www.scyld.com/pipermail/beowulf/attachments/20060421/5f9b2518/attachment.bin From hahn at physics.mcmaster.ca Fri Apr 21 14:46:53 2006 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] HPL Runtime error In-Reply-To: <4447AB98.6060707@hypermall.net> Message-ID: > local disk? The executable needs to be on a shared filesystem where > all applications can see it. not to be persnickety (hah), but the executable _could_ be replicated atthe same name on non-shared filesystems. the particular MPI's startup might well assume other files are shared, though. -mark From James.P.Lux at jpl.nasa.gov Sun Apr 23 20:12:42 2006 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:04:58 2009 Subject: [Beowulf] ClusterMonkey/Doug E. got slashdotted Message-ID: <6.1.1.1.2.20060423201006.035edc08@mail.jpl.nasa.gov> Apparently clustermonkey has achieved some degree of slashdot fame, judging from the comments about being unable to get to the article. http://hardware.slashdot.org/hardware/06/04/22/1822258.shtml However, it would appear that some of the posters on /. don't actually understand what clusters are or are trying to do. James Lux, P.E. Spacecraft Radio Frequency Subsystems Group Flight Communications Systems Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 From deadline at clustermonkey.net Mon Apr 24 04:33:59 2006 From: deadline at clustermonkey.net (Douglas Eadline) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] ClusterMonkey/Doug E. 
got slashdotted In-Reply-To: <6.1.1.1.2.20060423201006.035edc08@mail.jpl.nasa.gov> References: <6.1.1.1.2.20060423201006.035edc08@mail.jpl.nasa.gov> Message-ID: <46956.192.168.1.1.1145878439.squirrel@www.eadline.org> > Apparently clustermonkey has achieved some degree of slashdot fame, > judging > from the comments about being unable to get to the article. > http://hardware.slashdot.org/hardware/06/04/22/1822258.shtml > It was fun in a weird sort of way watching my server run out of memory. But, 20,000 views in 24 hours is not to bad if it did not come on like tsunami. > > However, it would appear that some of the posters on /. don't actually > understand what clusters are or are trying to do. Which is why I wrote this some time ago http://www.clustermonkey.net//content/view/16/33/ Like most things, Slashdot does not seem to be what it used to be. What ever that was. -- Doug From eugen at leitl.org Mon Apr 24 09:23:00 2006 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] FPGA coprocessor for the Opteron slot Message-ID: <20060424162300.GF31486@leitl.org> http://www.theregister.co.uk/2006/04/21/drc_fpga_module/print.html Start-up could kick Opteron into overdrive By Ashlee Vance in Mountain View Published Friday 21st April 2006 19:44 GMT Exclusive The 2002 film Death to Smoochy reminds us that "friends come in all sizes." AMD executives must embrace this observation on a daily basis, especially when a company such as DRC Computer appears. The tiny DRC works out of a no frills Santa Clara office, producing technology that has the potential to give servers based on AMD's Opteron chip a real edge over competing Xeon-based boxes. DRC has developed a type of reprogrammable co-processor that can slot straight into Opteron sockets. Customers can then offload a wide variety of software jobs to the co-processor running in a standard server, instead of buying unique, more expensive types of accelerators from third parties as they have in the past. "Current accelerators costs about $15,000 each and deliver little performance improvements beyond what you could achieve by buying more blade servers for that same price," Larry Laurich, the CEO of DRC, told us in an interview. "We have taken the approach that we must deliver three times the price-performance of a standard blade." Neither standalone server accelerators nor FPGAs (field-programmable gate arrays), which is what the DRC modules are, stand as novel concepts in the hardware industry. Server customers, however, have largely shied from buying pricey, specialized co-processors even when such devices demonstrated dramatic performance improvements on certain workloads. The high costs of accelerators, a lack of supporting software and a large amount of custom design work needed to make the devices work well have made them not worth the trouble to most customers. It's this tradition of disdain for accelerators that DRC will have to fight. "People have tried a lot of special purpose processing devices over the years and, with the exceptions of graphics units and arguably floating point units, general purpose processors have always won out in the end," said Gordon Haff, an analyst at Illuminata. DRC thinks it has solved the price and performance problems by playing off AMD's open Hypertransport specification. "DRC's flagship product is the DRC Coprocessor Module that plugs directly into an open processor socket in a multi-way Opteron system," the company notes on its web site. 
"This provides direct access to DDR memory and any adjacent Opteron processor at full Hypertransport bandwidth [12.8 GBps] and ?75 nanosecond latency." AMD's decision to open Hypertransport could end up being a key factor in Opteron's future success. Intel looks set to compete better with AMD later this year when it releases a revamped line of Xeon processors. AMD, however, can now turn to third parties such as DRC for performance boosts unavailable with Intel's chip line. A DRC module in a person's handDRC appears to be making the most of its AMD ties by sliding right into Opteron sockets. That means that customers can outfit an Opteron motherboard with any combination of Opteron chips and DRC modules. Illuminata's Haff sees the DRC implementation as one way of overcoming past aversions to accelerators. "It is true that one of the issues around PCI-based FPGA products and really anything specialized is that by the time you transfer the calculation over the special purpose board, you have often lost much of the benefit you had," Haff said. "So, putting the product within the CPU fabric certainly does help address this particular problem." The notion of offloading certain routines to an FPGA should prove attractive to a wide variety of industries, stretching from the oil and gas sector to high performance computing buffs and possibly even mainstream server customers. Today, for example, companies like Boeing that need specialized, embedded devices will buy a PCI board with an FPGA and do custom work designing software and a hardware unit for their system. "Those products could end up in something the size of a telephone or a bread box," said Laurich. "It may take them about six months to lay out that type of custom design." With the DRC module, customers can pick from standard hardware ranging from blade servers on up to Opteron-based SMPs instead of building their own breadboxes. Each DRC module will cost around $4,500 this year and likely drop to around $3,000 next year, Laurich said. That compares to products from companies such as SGI that cost well over $10,000. So far, DRC has seen the most interest from oil and gas companies looking to put specific algorithms on the FPGAs. Manufacturing firms and financial services companies have also looked at the DRC products for help with their own routines. It's not hard to imagine companies such as Linux Networx, Cray or SGI (when it does the inevitable and backs Opteron) wanting to move away from more expensive FPGA products as well in order to service the high-performance computing market. Eventually, standard server makers could turn to the FPGAs to help with security or networking workloads. "There does seem to be this kind of general feeling in places like IBM and Sun that the time may be here to use some special purpose processors or parts of processors for various things," Haff said. "The FPGA approach is certainly one way of doing that. It does have the advantage that you're not locked into a particular function at any time because you can dynamically reprogram it." The DRC products also come with potential energy cost savings that could be a plus for end users and server vendors that have started hawking "green computing." Power has become the most expensive item for many large data centers. The first set of DRC modules will consume about 10 - 20 watts versus close to 80 watts for an Opteron chip. An upcoming larger DRC module will consume twice the power and be able to handle larger algorithms. 
"We believe we will get 10 to 20x application acceleration at 40 per cent of the power," Laurich said. "At the same time, we're looking at a 2 to 3x price performance advantage." A motherboard with DRC and Opteron chip It will, of course, take some time to build out the software for the DRC modules. The company has started shipping its first machines to channel partners that specialize in developing applications for FPGAs. An oil and gas company wanting to move its code to the product could expect the process to take about 6 months. If DRC takes off, the company plans to bulk up from its current 13-person operation and to tap partners in different verticals to help out with the software work. DRC also thinks it can maintain a competitive advantage over potential rivals via its patent portfolio. The modules result from work done by FPGA pioneer Steve Casselman, who is a co-founder and CTO of the company. Casselman told us that he had been waiting for something like Hypertransport to come along for years and that AMD's opening up of the specification almost brought tears to his eyes. It's always difficult to judge how well a start-up will pan out, especially one that needs to build out systems and software to make it a success. DRC, however, does have - at the moment - that rare feeling of something special. It's playing off standard server components and riding the Opteron wave. In addition, it is reducing the cost of acceleration modules in a dramatic fashion. That combination of serious horsepower with much lower costs is typically the right recipe for a decent start-up, and we'll be curious to see how things progress in the coming months. You can have a look at the DRC kit here (http://www.drccomputer.com/pages/products.html). ? -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: Digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20060424/6dc9773e/attachment.bin From sigut at id.ethz.ch Sun Apr 23 23:04:20 2006 From: sigut at id.ethz.ch (G.M.Sigut) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster Message-ID: <1145858660.13364.8.camel@gms2.ethz.ch> > > > Is there any good solution to find out which user is loading the NFS > > > the most in a cluster configuration? > > > > > > > iftop is quite a handy tool; it displays traffic on a per-host basis, so > > if a particular node is hammering your NFS server, it will show right up. > > > > http://www.ex-parrot.com/~pdw/iftop/ > > > Etherape might be a useful tool also. > > http://etherape.sourceforge.net/ If you want to find users, you might try to run tcpdump -i eth1 -c 1000 -n -p -s 192 -v -u udp and port 2049 | \ awk '/ids/{A=match($0,"ids"); print substr($0,A+4,3)}' dudu|grep -v '/' on the NFS server. This will give you a list of UIDs you might process further by building sums etc. Not nice, not 100%, but the only way to sum up USER usage as far as I was able to find out until now. Regards, George -- >>>>>>>>>>>>>>>>>>>>>>>>> George M. Sigut <<<<<<<<<<<<<<<<<<<<<<<<<<< ETH Zurich, Informatikdienste, Sektion Systemdienste, CH-8092 Zurich Swiss Federal Inst. 
of Technology, Computing Services, System Services e-mail: sigut@id.ethz.ch, Phone: +41 1 632 5763, Fax: +41 1 632 1022 >>>> if my regular address does not work, try "sigut@pop.agri.ch" <<<< From jastreich at gmail.com Mon Apr 24 16:11:04 2006 From: jastreich at gmail.com (Jeremy Streich) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] ClusterMonkey/Doug E. got slashdotted Message-ID: <444D5B08.4020408@uwm.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 |> However, it would appear that some of the posters on /. don't |> actually |> understand what clusters are or are trying to do. | | Which is why I wrote this some time ago | | http://www.clustermonkey.net//content/view/16/33/ I think most slashdoters do get what clustering is and isn't; but think that it's a witty and funny comment to blame whatever the article is about when the server gets slashdotted. | Like most things, Slashdot does not seem to be what it used to be. | What ever that was. That, however, is very true and very sad. - -- - --------------------------------------------------- | .--. Jeremy Streich | | |o_o | | | |:_/ | IEEE-CS at UWM: | | // \ \ http://ieeecs.soc.uwm.edu | | (| | ) | |/'\_ _/`\ Tech for L&S IT Office at UWM | |\___)=(___/ | |Personal Site: http://game-master.us/phpx-3.4.0/ | - --------------------------------------------------- -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.9.20 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFETVsFGmL5INBXKmQRAjcPAJ9PFFRHYEHRY5ru/4ZKfSdlib+M4QCfd/bI xVa3U0P1Pw7NUNfGFLeKzMU= =cPE4 -----END PGP SIGNATURE----- From samson at freemail.hu Fri Apr 21 09:35:31 2006 From: samson at freemail.hu (=?ISO-8859-2?Q?s=E1mson_g=E1bor?=) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] (no subject) Message-ID: please help my where can i download beowulf ? thanks , Gabor Samson _______________________________________________________________________________ Szem?lyi k?lcs?n interneten kereszt?l, hitelb?r?lati d?j n?lk?l, ak?r 72 h?napos futamid?re! www.klikkbank.hu From wt at atmos.colostate.edu Mon Apr 24 21:29:44 2006 From: wt at atmos.colostate.edu (Warren Turkal) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] (no subject) In-Reply-To: References: Message-ID: <200604242229.45075.wt@atmos.colostate.edu> On Friday 21 April 2006 10:35, s?mson g?bor wrote: > please help my where can i download beowulf ? > thanks , Gabor Samson Technically speaking, beowulf is more of a concept than a product. wt -- Warren Turkal, Research Associate III/Systems Administrator Colorado State University, Dept. of Atmospheric Research http://www.atmos.colostate.edu/ From eugen at leitl.org Mon Apr 24 23:42:01 2006 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] /. [HyperTransport 3.0 Ratified] Message-ID: <20060425064201.GW31486@leitl.org> Link: http://slashdot.org/article.pl?sid=06/04/24/203238 Posted by: ScuttleMonkey, on 2006-04-24 21:17:00 Hack Jandy writes "The HyperTransport consortium just released the [1]3.0 specification of HyperTransport. The new specification allows for external HyperTransport interconnects, basically meaning you might plug your next generation Opteron into the equivalent of a USB port at the back of your computer. Among other things, the new specification also includes hot swap, on-the-fly reconfigurable HT links and also a hefty increase in bandwidth." References 1. 
http://dailytech.com/article.aspx?newsid=1943 ----- End forwarded message ----- -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: Digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20060425/67ae431b/attachment.bin From steve_heaton at iinet.net.au Tue Apr 25 00:34:51 2006 From: steve_heaton at iinet.net.au (SIM DOG) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Re: ClusterMonkey/Doug E. got slashdotted In-Reply-To: <200604241901.k3OJ0FL9013895@bluewest.scyld.com> References: <200604241901.k3OJ0FL9013895@bluewest.scyld.com> Message-ID: <444DD11B.40906@iinet.net.au> > Like most things, Slashdot does not seem to be what it used to be. > What ever that was. > > -- > Doug Slashdot used to be a source of distilled "interesting things" for me. These days it seems more for kiddies to complain about the latest troubles in WoW, mindless M$ bashing and 'if Linux is to get onto the desktop..." article sightings. Still... if a couple of young'uns learn something from ClusterMonkey then it's a good thing (TM). The interconnect article was a good one :) Stevo From john.hearns at streamline-computing.com Tue Apr 25 01:03:53 2006 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] /. [HyperTransport 3.0 Ratified] In-Reply-To: <20060425064201.GW31486@leitl.org> References: <20060425064201.GW31486@leitl.org> Message-ID: <1145952234.3961.31.camel@Vigor12> On Tue, 2006-04-25 at 08:42 +0200, Eugen Leitl wrote: > Link: http://slashdot.org/article.pl?sid=06/04/24/203238 > Posted by: ScuttleMonkey, on 2006-04-24 21:17:00 > > Hack Jandy writes "The HyperTransport consortium just released the > [1]3.0 specification of HyperTransport. The new specification allows > for external HyperTransport interconnects, basically meaning you might > plug your next generation Opteron into the equivalent of a USB port at > the back of your computer. Among other things, the new specification > also includes hot swap, on-the-fly reconfigurable HT links and also a > hefty increase in bandwidth." What goes around, comes around I suppose. Better round up those old-style SMP gurus. "I'll show these young wolfpups a thing or two. Why, my old Origin did 0-60Mips in ten seconds, and she was stock with just a gas port of the inlet connectors." From gerry.creager at tamu.edu Tue Apr 25 07:33:55 2006 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] /. [HyperTransport 3.0 Ratified] In-Reply-To: <1145952234.3961.31.camel@Vigor12> References: <20060425064201.GW31486@leitl.org> <1145952234.3961.31.camel@Vigor12> Message-ID: <444E3353.8090005@tamu.edu> Interestingly, w're seeing that argument/discussion (it's not reached the Pythonesque level of comedy just yet) on our campus. John Hearns wrote: > On Tue, 2006-04-25 at 08:42 +0200, Eugen Leitl wrote: > >>Link: http://slashdot.org/article.pl?sid=06/04/24/203238 >>Posted by: ScuttleMonkey, on 2006-04-24 21:17:00 >> >> Hack Jandy writes "The HyperTransport consortium just released the >> [1]3.0 specification of HyperTransport. 
The new specification allows >> for external HyperTransport interconnects, basically meaning you might >> plug your next generation Opteron into the equivalent of a USB port at >> the back of your computer. Among other things, the new specification >> also includes hot swap, on-the-fly reconfigurable HT links and also a >> hefty increase in bandwidth." > > > What goes around, comes around I suppose. Better round up those > old-style SMP gurus. "I'll show these young wolfpups a thing or two. > Why, my old Origin did 0-60Mips in ten seconds, and she was stock with > just a gas port of the inlet connectors." > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From gerry.creager at tamu.edu Tue Apr 25 07:40:20 2006 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Re: ClusterMonkey/Doug E. got slashdotted In-Reply-To: <444DD11B.40906@iinet.net.au> References: <200604241901.k3OJ0FL9013895@bluewest.scyld.com> <444DD11B.40906@iinet.net.au> Message-ID: <444E34D4.4070909@tamu.edu> I still scan it daily. Not so much for the comments posted as for the news they catch I might be interested in. I rarely read the comments anymore... SIM DOG wrote: > >> Like most things, Slashdot does not seem to be what it used to be. >> What ever that was. >> >> -- >> Doug > > > Slashdot used to be a source of distilled "interesting things" for me. > > These days it seems more for kiddies to complain about the latest > troubles in WoW, mindless M$ bashing and 'if Linux is to get onto the > desktop..." article sightings. > > Still... if a couple of young'uns learn something from ClusterMonkey > then it's a good thing (TM). The interconnect article was a good one :) > > Stevo > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From eugen at leitl.org Tue Apr 25 08:52:07 2006 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Re: ClusterMonkey/Doug E. got slashdotted In-Reply-To: <444E34D4.4070909@tamu.edu> References: <200604241901.k3OJ0FL9013895@bluewest.scyld.com> <444DD11B.40906@iinet.net.au> <444E34D4.4070909@tamu.edu> Message-ID: <20060425155207.GO23772@leitl.org> On Tue, Apr 25, 2006 at 09:40:20AM -0500, Gerry Creager N5JXS wrote: > I still scan it daily. Not so much for the comments posted as for the > news they catch I might be interested in. I rarely read the comments > anymore... I only read the /. headlines for the same reasons, via an email newsletter. The noise is still overwhelming. You can assume anything Beowulf-relevant will land here. 
-- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: Digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20060425/16957776/attachment.bin From sigut at id.ethz.ch Wed Apr 26 01:53:06 2006 From: sigut at id.ethz.ch (G.M.Sigut) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Determining NFS usage by user on a cluster In-Reply-To: <200604251901.k3PJ0Df8009130@bluewest.scyld.com> References: <200604251901.k3PJ0Df8009130@bluewest.scyld.com> Message-ID: <1146041586.2060.20.camel@gms2.ethz.ch> > Date: Mon, 24 Apr 2006 08:04:20 +0200 > From: "G.M.Sigut" > Subject: Re: [Beowulf] Determining NFS usage by user on a cluster > To: john.hearns@streamline-computing.com > Cc: beowulf@beowulf.org ... > If you want to find users, you might try to run > > tcpdump -i eth1 -c 1000 -n -p -s 192 -v -u udp and port 2049 | \ > awk '/ids/{A=match($0,"ids"); print substr($0,A+4,3)}' dudu|grep -v '/' Of course the second line is awk '/ids/{A=match($0,"ids"); print substr($0,A+4,3)}'|grep -v '/' the "dudu" bit is a rest from the original script which was using a file to store the results of tcpdump. It's not in the way, but confuses the issue. George -- >>>>>>>>>>>>>>>>>>>>>>>>> George M. Sigut <<<<<<<<<<<<<<<<<<<<<<<<<<< ETH Zurich, Informatikdienste, Sektion Systemdienste, CH-8092 Zurich Swiss Federal Inst. of Technology, Computing Services, System Services e-mail: sigut@id.ethz.ch, Phone: +41 1 632 5763, Fax: +41 1 632 1022 >>>> if my regular address does not work, try "sigut@pop.agri.ch" <<<< From deadline at clustermonkey.net Wed Apr 26 07:41:33 2006 From: deadline at clustermonkey.net (Douglas Eadline) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] (no subject) In-Reply-To: References: Message-ID: <37681.192.168.1.1.1146062493.squirrel@www.eadline.org> Samson Beowulf is more of a concept than an actual a software package. You may be thinking of the Scyld Beowulf package as well. http://scyld.com/ Here are some links to help you get started The official Beowulf page: http://www.beowulf.org/ Getting started with Clusters may also be of help: http://www.clustermonkey.net//content/view/91/44/ -- Doug > > please help my where can i download beowulf ? > thanks , Gabor Samson > > > _______________________________________________________________________________ > Szem?lyi k?lcs?n interneten kereszt?l, hitelb?r?lati d?j n?lk?l, ak?r 72 > h?napos futamid?re! > www.klikkbank.hu > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Doug From strombrg at dcs.nac.uci.edu Wed Apr 26 12:57:11 2006 From: strombrg at dcs.nac.uci.edu (Dan Stromberg) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Re: ClusterMonkey/Doug E. 
got slashdotted In-Reply-To: <20060425155207.GO23772@leitl.org> References: <200604241901.k3OJ0FL9013895@bluewest.scyld.com> <444DD11B.40906@iinet.net.au> <444E34D4.4070909@tamu.edu> <20060425155207.GO23772@leitl.org> Message-ID: <1146081432.32203.23.camel@seki.nac.uci.edu> On Tue, 2006-04-25 at 17:52 +0200, Eugen Leitl wrote: > On Tue, Apr 25, 2006 at 09:40:20AM -0500, Gerry Creager N5JXS wrote: > > I still scan it daily. Not so much for the comments posted as for the > > news they catch I might be interested in. I rarely read the comments > > anymore... > > I only read the /. headlines for the same reasons, via an email > newsletter. The noise is still overwhelming. > > You can assume anything Beowulf-relevant will land here. Seems pretty off topic, but what I do is to put "the palm version" on slashdot on my PDA every morning, and read it while I'm standing in line or whatever. The Palm version has just the top of the article, and the five highest-rated comments. Sadly though, the page is designed pretty poorly for spidering it - each successive article is deeper into the hierarchy, so if you aren't careful (say, if someone links back to the main slashdot page from one of the top five articles), you can easily end up trying to spider all of slashdot. From bill at cse.ucdavis.edu Thu Apr 27 14:54:54 2006 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? Message-ID: <20060427215454.GA16179@cse.ucdavis.edu> I'm writing a spec for future opteron cluster purchases. The issue of airflow came up. I've seen a surprising variety of configurations, some with a giant rotating cylinder (think paddle wheel), most with a variety of 40x28 or 40x56mm fans, or horizontal blowers. Anyone have a fan vendor they prefer? Ideally well known for making fans that last 3+ years when in use 24/7. A target node CFM for a dual socket dual core opteron? A target maximum CPU temp? I assume it's wise to stay well below the 70C or so thermal max on most of the dual core Opterons. Seems like there is a huge variation in the number of fans and total CFM from various chassis/node manufacturers. A single core single socket 1u opteron I got from sun has two 40mm x 56mm, and 4 * 40mm x 28mm fans. Not bad for a node starting at $750. Additionally some chassis designs form a fairly decent wall across the node for the fans to insure a good front to back airflow. Others seem to place fans willy nilly, I've even seen some that suck air sideways across the rear opteron. In any case, the nature of the campus purchasing process is that we can put in any specification, but can't buy from a single vendor, or award bids for better engineering. So basically lowest bid wins that meets the spec. Thus the need for a better spec. Any feedback appreciated. -- Bill Broadley Computational Science and Engineering UC Davis From walid.shaari at gmail.com Wed Apr 26 03:34:01 2006 From: walid.shaari at gmail.com (Walid) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges Message-ID: Hi all, Does any one know what types of problems/challanges for big clusters? we are considering having a 512 node cluster that will be using Myrinet as its main interconnect, and would like to do our homework The cluster is meant to run an inhouse fluid simulation application that is I/O intensve, and requires large memory models. any hints, pointers will be apperciated TIA Walid. 
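On Bill Broadley's question above about a target CPU temperature: before writing a number into the spec, it can help to see what a sample node from each bidder actually does under sustained load. A minimal sketch, assuming lm_sensors is already configured on the node and using a crude busy-loop as a stand-in for your real code (the sensor labels and the right number of busy-loops vary by board and core count):

  # one CPU-bound busy-loop per core (substitute your real application if you can)
  for i in 1 2 3 4; do md5sum /dev/zero & done
  # poll temperatures and fan speeds for a couple of minutes while the load runs
  for n in $(seq 1 24); do date; sensors | grep -Ei 'temp|fan'; sleep 5; done
  # stop the busy-loops
  killall md5sum

If the steady-state readings sit close to the 70C-ish thermal ceiling Bill mentions, that is worth knowing before the bid goes out.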
From hpc at gurban.org Wed Apr 26 09:04:52 2006 From: hpc at gurban.org (hpc@gurban.org) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 Message-ID: <46566.62.60.196.3.1146067492.squirrel@gurban.org> Hi, I'm going to build a beowulf linux cluster for my coledge. Thay want 64 core cluster with Opteron processors. Now I want to know which is better (performance & price)? * 64 x Opteron Single Core * 32 x Opteron Dual Core cheers -- Gurban M. Tewekgeli From mwill at penguincomputing.com Thu Apr 27 16:18:07 2006 From: mwill at penguincomputing.com (Michael Will) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 Message-ID: <433093DF7AD7444DA65EFAFE3987879C107CE0@jellyfish.highlyscyld.com> It does not matter as long as you buy from us. ;-) No really it depends on your applications. Are they CPU intensive, RAM intensive, IO intensive? Do they have a lot of interprocesscommuncation going on? In what size groups? The technical parameters that those question allow to weigh are: 1. core speed the fastest dual core cpu you can buy is the 285 (2x 2.6ghz) and the fastest single core is the 254 (1x 2.8 ghz) for some reason we already have the 4-way opteron 856 3.0Ghz but not the 256. So generally speaking, the dual core cpu's are one or two speed steps behind the single core version. Price wise in the middle field, the opteron 250 (one 2.4ghz core per cpu) is about the same price as the opteron 265 (two 1.8ghz core per cpu), which means if you only run one thread on the node, it's only 75% of the speed that it would be on the single core system. However if you run four threads on the dual core system, then the total performance in the best case could be 133% of the single core system, if it scales linearily and there is no contention from I/O or memory. 2. memory architecture. A node with two single core opteron cpu's has one memory controller per core with two channels each, whereas a node with two dual core opteron cpu's still has only one memory controller per cpu socket, and so two cores share one memory controller, which can lead to a slowdown if all four cores are mostly accessing ram most of the time. Examples are signal processing style applications that mostly iterate over data that does not fit into cache and so hit the RAM all the time. 3. I/O 3.1 interconnect If you happen to have a process that uses four cores, and it can run within the same node on a dual dual core node doing message passing in RAM instead of having to do go through ethernet or infiniband between two separate single core nodes, you will see a speedup. There also are The more typical case is that either you don't do much interprocesscommunication and then it does not matter, or you are having a lot of it, and then having four processes going through the same NIC instead of just two is a disadvantage (network I/O bound application). If you have up to 8 processes talking in one group, buying quad-opteron dual core nodes will result in the fastest solution. 3.2 disk I/O Same goes for accessing local disk from four threads instead of two. dual core is a disadvantage if you are I/O bound. One way to mitigate that is to use 1U nodes with four drives and have each process use it's own drive with a separate filesystem for local scratch storage. 
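One way to see how much the shared memory controller in point 2 matters for your own mix of codes is to run a bandwidth micro-benchmark at increasing levels of concurrency on a loaner node. A rough sketch using John McCalpin's STREAM benchmark (assuming you have a copy of stream.c on hand; compiler flags will vary, and pinning each copy with taskset or numactl gives cleaner numbers):

  # build a plain single-threaded STREAM binary
  gcc -O3 stream.c -o stream
  # run 1, 2 and 4 concurrent copies and compare the per-copy Triad figures;
  # a sharp per-copy drop from 2 to 4 copies means the cores are already
  # saturating their shared memory controllers
  for n in 1 2 4; do
      echo "=== $n concurrent copies ==="
      for i in $(seq 1 $n); do ./stream > stream.$n.$i & done
      wait
      grep Triad stream.$n.*
  done

If per-copy bandwidth holds up at four copies, the dual-core price advantage is essentially free; if it collapses, the single-core option (or running fewer jobs per node) starts to look better.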
Michael Will / SE Technical Lead / Penguin Computing / www.penguincomputing.com -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of hpc@gurban.org Sent: Wednesday, April 26, 2006 9:05 AM To: beowulf@beowulf.org Subject: [Beowulf] Which is better: 32x2 or 64x1 Hi, I'm going to build a beowulf linux cluster for my coledge. Thay want 64 core cluster with Opteron processors. Now I want to know which is better (performance & price)? * 64 x Opteron Single Core * 32 x Opteron Dual Core cheers -- Gurban M. Tewekgeli _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Thu Apr 27 16:18:37 2006 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 In-Reply-To: <46566.62.60.196.3.1146067492.squirrel@gurban.org> References: <46566.62.60.196.3.1146067492.squirrel@gurban.org> Message-ID: <20060427231837.GB16179@cse.ucdavis.edu> On Wed, Apr 26, 2006 at 11:04:52AM -0500, hpc@gurban.org wrote: > Hi, > > I'm going to build a beowulf linux cluster for my > coledge. Thay want 64 core cluster with Opteron > processors. > Now I want to know which is better (performance & price)? > * 64 x Opteron Single Core Twice the memory bandwidth. Great if you are memory bandwidth limited. > * 32 x Opteron Dual Core Cheaper, no slower if you are CPU or cache limited. Ideally you would take the codes that justify spending the money and benchmark it on a single core vs dual core node and establish some kind of price/performance metric. If you pay for power, cooling, or rack space you might want to include that in your calculation. The dual core will take 1/2 the space, and somewhat less power. -- Bill Broadley Computational Science and Engineering UC Davis From agshew at gmail.com Thu Apr 27 18:56:40 2006 From: agshew at gmail.com (Andrew Shewmaker) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 In-Reply-To: <46566.62.60.196.3.1146067492.squirrel@gurban.org> References: <46566.62.60.196.3.1146067492.squirrel@gurban.org> Message-ID: On 4/26/06, hpc@gurban.org wrote: > Hi, > > I'm going to build a beowulf linux cluster for my > coledge. Thay want 64 core cluster with Opteron > processors. > Now I want to know which is better (performance & price)? > * 64 x Opteron Single Core > * 32 x Opteron Dual Core The memory performance of Opteron Dual Cores is less than Single cores in a cluster because there are fewer memory controllers and they will experience more contention. Dual Cores are also generally slower, so they have a lower theoretical performance and their integrated memory controller is slower. Opteron Dual Cores have a higher memory latency than single cores (5-6 ns for single and dual socket systems) . That is much less than the increase in latency seen by accessing memory over Hypertransport (roughly 30 ns). Note that Hypertransport is still very good. http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_8800~97042,00.html The fourth page of the following document has a nice table that shows what kind of Stream performance you can expect with different numbers of threads using dual and single core Opterons. Each thread gets fewer MB/s, but Dual Cores with more threads can achieve a greater aggregate rate. 
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/AMD_Opteron_Streams_041405_LA.pdf Another thing to consider is the quantity of memory you are able to purchase and fit in your cluster. Because you will have half the number of memory controllers, you can have half the amount of memory. With Single Cores you could also buy less dense, cheaper memory while still getting the same total amount. If your code is CPU or cache bound, or if this is just a cluster to learn on, then I would go with Dual Cores because I would expect them to be cheaper. If your code is memory or network bound, or if your code is limited in how far it can scale across processors; then I would go with the faster Single Cores that will also have greater memory and network resources per core. -- Andrew Shewmaker From eugen at leitl.org Thu Apr 27 23:58:16 2006 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 In-Reply-To: <46566.62.60.196.3.1146067492.squirrel@gurban.org> References: <46566.62.60.196.3.1146067492.squirrel@gurban.org> Message-ID: <20060428065816.GY22800@leitl.org> On Wed, Apr 26, 2006 at 11:04:52AM -0500, hpc@gurban.org wrote: > Hi, > > I'm going to build a beowulf linux cluster for my > coledge. Thay want 64 core cluster with Opteron > processors. > Now I want to know which is better (performance & price)? > * 64 x Opteron Single Core > * 32 x Opteron Dual Core Which kind of codes will your college be running, and which kind of interconnect will you be using? -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: Digital signature Url : http://www.scyld.com/pipermail/beowulf/attachments/20060428/7adfb09c/attachment.bin From hahn at physics.mcmaster.ca Fri Apr 28 05:04:53 2006 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: Message-ID: > Does any one know what types of problems/challanges for big clusters? cooling, power, managability, reliability, delivering IO, space. > we are considering having a 512 node cluster that will be using > Myrinet as its main interconnect, and would like to do our homework how confident are you at addressing especially the physical issues above? cooling and power happen to be prominent in my awareness right now because of a 768-node cluster I'm working on. but even ~200 node clusters need to have some careful thought applied to managability (cleaining up dead jobs, making sure the scheduler doesn't let jobs hang around consuming myrinet ports, for instance.) reliability is a fairly cut and dried issue, IMO - either you make the right hardware decisions at purchase, or not. > The cluster is meant to run an inhouse fluid simulation application > that is I/O intensve, and requires large memory models. what parallel-cluster filesystem are you planning to run? how many fileservers? (or is the IO intensivity handlable using per-node disks?) From leon at lost.co.nz Thu Apr 27 20:08:39 2006 From: leon at lost.co.nz (Leon Matthews) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 In-Reply-To: Message-ID: > > Now I want to know which is better (performance & price)? 
> > * 64 x Opteron Single Core > > * 32 x Opteron Dual Core How does the performance of Dual Core Opterons compare to that of Dual Core Athlon64 processors? It is /much/ cheaper (30- 50% -- at least down here in the antipodes) to build a system with the later, especially when you take into account the price of the motherboards. Cheers, Leon From jaime at iaa.es Fri Apr 28 00:27:19 2006 From: jaime at iaa.es (Jaime Perea) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: References: Message-ID: <200604280927.20058.jaime@iaa.es> El Mi?rcoles, 26 de Abril de 2006 12:34, Walid escribi?: > Hi all, > > Does any one know what types of problems/challanges for big clusters? > > we are considering having a 512 node cluster that will be using > Myrinet as its main interconnect, and would like to do our homework > > The cluster is meant to run an inhouse fluid simulation application > that is I/O intensve, and requires large memory models. > > any hints, pointers will be apperciated > > TIA > > Walid. > I'm not sure if this is going to be my first posting. We have a small (16 dual xeon nodes) with myrinet. It works quite well and having mpich-gm is a plus, it gives very low latency and good bandwith. Also we are doing some work with the MareNostrum and although I heard that perhaps there are scaling problems when you are going to a really large number of tasks, that is not really my experience. Perhaps myrinet forces the use of mpi instead of pvm, although there are alternatives. (in principle you can use an ethernet emulation, while being quite fast is not the same at all) >From my point of view, the big problem there is the IO, we installed on our small cluster the pvfs2 system, it works well, using the myrinet gm for the passing mechanism, the pvfs2 is only a solution for parallel IO, since mpi can use it. On the other hand it can not be used for the normal user stuff, so you have to take that into account and think carefully on how to install a good poweful nfs server machine which has to be on an alternative standard network. On the "other" architecture IBM's gpfs is really a nice alternative. All the best -- Jaime D. Perea Duarte. Linux registered user #10472 Dep. Astrofisica Extragalactica. Instituto de Astrofisica de Andalucia (CSIC) Apdo. 3004, 18080 Granada, Spain. From ballen at gravity.phys.uwm.edu Fri Apr 28 07:49:44 2006 From: ballen at gravity.phys.uwm.edu (Bruce Allen) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Which is better: 32x2 or 64x1 In-Reply-To: References: Message-ID: For what it's worth, the Opteron 175 (2 x 2.2 GHz cores) is identical silicon to the AMD64 x2 4400+. The qualification/testing process is apparently NOT the same for the two types of chip, and the register that returns 'chip ID' is programmed differently. Other than that, I have been told by AMD engineers that they are identical. Cheers, Bruce On Fri, 28 Apr 2006, Leon Matthews wrote: >>> Now I want to know which is better (performance & price)? >>> * 64 x Opteron Single Core >>> * 32 x Opteron Dual Core > > How does the performance of Dual Core Opterons compare to that of Dual Core > Athlon64 processors? > > It is /much/ cheaper (30- 50% -- at least down here in the antipodes) to > build a system with the later, especially when you take into account the > price of the motherboards. 
> > Cheers, > > Leon > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From ballen at gravity.phys.uwm.edu Fri Apr 28 08:18:08 2006 From: ballen at gravity.phys.uwm.edu (Bruce Allen) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <20060427215454.GA16179@cse.ucdavis.edu> References: <20060427215454.GA16179@cse.ucdavis.edu> Message-ID: Bill, I have similar issues in purchasing. U. of Wisconsin requires that we use a sealed bid system, so writing good specs is important. You can get some protection by requiring that in their submisssion a bidder provide you with MTBF (Mean Time Between Failure) statistical data for ALL fans in a system and for the power supply. You can also require that (example language) 'when system is fully loaded the CPU temperature must be at least 10 Celsuis below the CPU manufacturers maximum recommended operating temperature'. Then have your purchasing people add language which allows you to take overall system reliability and total cost of ownership into account in making a purchase descision (rather than just the contract cost). This allows you to exclude a system if the MTBF numbers are too low. Cheers, Bruce On Thu, 27 Apr 2006, Bill Broadley wrote: > > I'm writing a spec for future opteron cluster purchases. The issue of > airflow came up. > > I've seen a surprising variety of configurations, some with a giant rotating > cylinder (think paddle wheel), most with a variety of 40x28 or 40x56mm > fans, or horizontal blowers. > > Anyone have a fan vendor they prefer? Ideally well known for making > fans that last 3+ years when in use 24/7. > > A target node CFM for a dual socket dual core opteron? > > A target maximum CPU temp? I assume it's wise to stay well below the > 70C or so thermal max on most of the dual core Opterons. > > Seems like there is a huge variation in the number of fans and total CFM > from various chassis/node manufacturers. A single core single socket > 1u opteron I got from sun has two 40mm x 56mm, and 4 * 40mm x 28mm fans. > Not bad for a node starting at $750. > > Additionally some chassis designs form a fairly decent wall across the > node for the fans to insure a good front to back airflow. Others seem > to place fans willy nilly, I've even seen some that suck air sideways > across the rear opteron. > > In any case, the nature of the campus purchasing process is that we can > put in any specification, but can't buy from a single vendor, or award > bids for better engineering. So basically lowest bid wins that meets > the spec. Thus the need for a better spec. > > Any feedback appreciated. > > -- > Bill Broadley > Computational Science and Engineering > UC Davis > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From kewley at gps.caltech.edu Fri Apr 28 14:10:35 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: References: Message-ID: <200604281410.35245.kewley@gps.caltech.edu> On Friday 28 April 2006 05:04, Mark Hahn wrote: > > Does any one know what types of problems/challanges for big clusters? > > cooling, power, managability, reliability, delivering IO, space. 
I'd add: sysadmin or other professional resources to manage the cluster. Certainly, the more manageable and reliable the cluster is, the less time the admin(s) will have to spend at basically keeping the cluster in good health. But given manageability and reliability, the bigger issue is: How many users and how many different codebases do you have? Given the variety in individual needs, you can end up spending quite a bit of time helping users get new code working well, and/or making adjustments to the cluster software to accommodate their needs. At least this has been my experience. I'm the only admin for a 1024-node cluster with 70+ authorized users (49 unique users in the past 31 days, about 30of whom are frequent users, I'd estimate), and probably a couple dozen user applications. Having other non-sysadmin local staff helping me, as well as having good hardware and software vendor support, has been critical to multiply the force I can bring to bear in solving problems. You know all those best practices you hear about when you're a sysadmin managing a departmental network? Well, when you have a large cluster, best practices become critical -- you have to arrange things so that you don't have to touch hardware but rarely, nor login to fix problems on individual nodes. Such attention to individual nodes takes far too much time away from more productive pursuits, and will lead to lower cluster availability, which means extra frustration and stress for you and your users. A few elements of manageability that I use all the time: * the ability to turn nodes on or off in a remote, scripted, customizable manner * the ability to reinstall the OS on all your nodes, or specific nodes, trivially (e.g. as provided by Rocks or Warewulf) * the ability to get remote console so you can fix problems without getting out the crash cart -- hopefully you don't have to use this much (because it means paying attention to individual systems), but when you need it, it will speed up your work compared to the alternative * the ability to gather and analyze node health information trivially, using embedded hardware management tools and system software tools * the ability to administratively close a node that has problems, so that you can deal with the problem later, and meanwhile jobs won't get assigned to it Think of your compute nodes not as individuals, but as indistinguishable members of a Borg Collective. You shouldn't care very much about individual nodes, but only about the overall health of the cluster. Is the Collective running smoothly? If so, great -- make sure you don't have to sweat the details very much. > > we are considering having a 512 node cluster that will be using > > Myrinet as its main interconnect, and would like to do our homework I've had excellent experience with Myrinet, in terms of reliability, functionality, and technical support. It's probably the most trouble-free part of my cluster and my best overall vendor experience. Myrinet gets used continuously by my users, but I rarely have to pay attention to it at all. > how confident are you at addressing especially the physical issues above? > cooling and power happen to be prominent in my awareness right now > because of a 768-node cluster I'm working on. but even ~200 node > clusters need to have some careful thought applied to managability > (cleaining up dead jobs, making sure the scheduler doesn't let jobs hang > around consuming myrinet ports, for instance.) 
reliability is a fairly > cut and dried issue, IMO - either you make the right hardware decisions > at purchase, or not. A few comments from my personal experience. On my cluster, perhaps 1 in 10,000 or 100,000 job processes ends up unkilled, taking up compute node resources. It's not been a big problem for me, although it certainly does come up. Generally the undead processes have been a handful out of a set of processes that have something in common -- a bad run, a user doing something weird, or some anomalous system state (e.g. central filesystem going down). I've never had a problem with consumed Myrinet ports, but I'm sure that's going to depend on the details of your local cluster usage patterns. Most often the problem has been a job spinning using CPU, slowing down legitimate jobs. If I configured my scheduler properly (LSF), I'm pretty sure I could avoid even that problem -- just set a threshold on CPU idleness or load level. I *have* made a couple of scripts to find nodes that are busier than they should be, or quieter than they should be, based on the load that the scheduler has placed on them versus the load they're actually carrying. That helps identify problems, and more frequently it helps to give confidence that there *aren't* any problems. :) I'm not sure I agree with Mark that reliability is cut and dried, depending only on initial hardware decisions. (Yes, I removed or changed a couple of important qualifying words in there from what Mark wrote. :) Vendor support methods are critical -- consider that part of the initial hardware choice if you like. My point here is that it's hardware and vendor choice taken together, not just hardware choice. By the way, the idea of rolling-your-own hardware on a large cluster, and planning on having a small technical team, makes me shiver in horror. If you go that route, you better have *lots* of experience in clusters. and make very good decisions about cluster components and management methods. If you don't, your users will suffer mightily, which means you will suffer mightily too. David From kewley at gps.caltech.edu Fri Apr 28 14:13:58 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: References: <20060427215454.GA16179@cse.ucdavis.edu> Message-ID: <200604281413.58888.kewley@gps.caltech.edu> On Friday 28 April 2006 08:18, Bruce Allen wrote: > You can also require > that (example language) 'when system is fully loaded the CPU temperature > must be at least 10 Celsuis below the CPU manufacturers maximum > recommended operating temperature'. That might work if the vendor is providing the entire room environment in addition to the computer hardware, but not if they're only supplying the computer hardware. Temperature of the CPU depends directly on ambient temperature (which involves the HVAC units) and airflow patterns (which involves the room and rack designs). David From mwill at penguincomputing.com Fri Apr 28 14:22:51 2006 From: mwill at penguincomputing.com (Michael Will) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? Message-ID: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> A good set of specs according to our engineers could be: 1. No side vending of hot air from the case. The systems will be placed into 19" racks and there is no place for the air to go if it's blown into the side of the rack. Even if you take the sides off then you still will have racks placed next to each other. 
Airflow should be 100% front to back. 2. Along with that, there should be no "cheat holes" in the top, bottom or sides of the case. All "fresh" air should be drawn in from the front of the chassis. Again, the system will be racked in a 19" rack and there is no "fresh air" to be drawn in from the sides of the case (see 1 above) nor will the holes be open when nodes are stacked one on top of the other in a fully populated rack (32 nodes per rack). 3. There should be a mechanical separation between the hot and cold sections of the chassis to prevent the internal fans from sucking in hot air from the rear of the chassis. 4. The power supply *must* vent directly to the outside of the case and not inside the chassis. The power supply produces approximately 20% of the heat in the system. That hot air must be vented directly out of the chassis to prevent it from heating other components in the system. 5. The system should employ fan speed control. Running high speed fans at less than rated speed prolongs their life and reduces power usage for the platform as a whole. Fan speed should be controlled by either ambient temperature or preferably by CPU temperature. 6. The system must have a way of measuring fan speed and reporting a fan failure so that failed fans can be replaced quickly. Michael Will / SE Technical Lead / Penguin Computing -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Bill Broadley Sent: Thursday, April 27, 2006 2:55 PM To: beowulf@beowulf.org Subject: [Beowulf] Opteron cooling specifications? I'm writing a spec for future opteron cluster purchases. The issue of airflow came up. I've seen a surprising variety of configurations, some with a giant rotating cylinder (think paddle wheel), most with a variety of 40x28 or 40x56mm fans, or horizontal blowers. Anyone have a fan vendor they prefer? Ideally well known for making fans that last 3+ years when in use 24/7. A target node CFM for a dual socket dual core opteron? A target maximum CPU temp? I assume it's wise to stay well below the 70C or so thermal max on most of the dual core Opterons. Seems like there is a huge variation in the number of fans and total CFM from various chassis/node manufacturers. A single core single socket 1u opteron I got from sun has two 40mm x 56mm, and 4 * 40mm x 28mm fans. Not bad for a node starting at $750. Additionally some chassis designs form a fairly decent wall across the node for the fans to insure a good front to back airflow. Others seem to place fans willy nilly, I've even seen some that suck air sideways across the rear opteron. In any case, the nature of the campus purchasing process is that we can put in any specification, but can't buy from a single vendor, or award bids for better engineering. So basically lowest bid wins that meets the spec. Thus the need for a better spec. Any feedback appreciated. -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wt at atmos.colostate.edu Fri Apr 28 14:31:50 2006 From: wt at atmos.colostate.edu (Warren Turkal) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] maui and uids Message-ID: <200604281531.51063.wt@atmos.colostate.edu> Is there any way to make maui run with a 100 < uid < 1000? 
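Going back to point 6 of the cooling spec above (measuring fan speed and reporting failures): if the nodes carry IPMI BMCs, the reporting side can be scripted from the head node rather than left to vendor tools. A sketch only -- the node-BMC hostnames and credentials here are made up, and older IPMI 1.5 BMCs want -I lan instead of lanplus:

  # list any fan sensor whose status is not "ok" on each node's BMC
  for n in $(seq -w 1 32); do
      ipmitool -I lanplus -H node$n-bmc -U admin -P secret sdr type Fan \
          | grep -v '| ok' | sed "s/^/node$n: /"
  done

Run from cron, something like this catches a dead fan well before the node starts throttling or cooking its neighbours.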
wt -- Warren Turkal, Research Associate III/Systems Administrator Colorado State University, Dept. of Atmospheric Science From rgb at phy.duke.edu Fri Apr 28 16:36:22 2006 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: <200604281410.35245.kewley@gps.caltech.edu> References: <200604281410.35245.kewley@gps.caltech.edu> Message-ID: On Fri, 28 Apr 2006, David Kewley wrote: > By the way, the idea of rolling-your-own hardware on a large cluster, and > planning on having a small technical team, makes me shiver in horror. If > you go that route, you better have *lots* of experience in clusters. and > make very good decisions about cluster components and management methods. > If you don't, your users will suffer mightily, which means you will suffer > mightily too. I >>have<< lots of experience in clusters and have tried rolling my own nodes for a variety of small and medium sized clusters. Let me clarify. For clusters with more than perhaps 16 nodes, or EVEN 32 if you're feeling masochistic and inclined to heartache: Don't. Or you will have a really high probability of being very, very sorry. 16 node clusters I've done "ok" with, in the sense that the problems were manageable. >32 node clusters, especially if you encounter ANY ex post facto problems with the hardware configuration -- including ones that passed through your original prototyping runs (and yeah, they exist) -- rapidly descend into circle of hell type experiences. Expensive ones. Much more expensive in real money, let alone time, than just buy nodes from a quality vendor of nodes with a 3-4 year onsite service contract, so if they break they'll come fix them (but they don't break -- see word "quality" in the above:-). Other than thinking that "shiver in horror" is somehow inadequate to describe the potential for misery, I endorse pretty much everything else David (and Mark) said -- both these guys know whereof they speak. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From kewley at gps.caltech.edu Fri Apr 28 17:00:33 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> References: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> Message-ID: <200604281700.33950.kewley@gps.caltech.edu> I totally agree with Michael. Except I'd assert that 32 nodes per rack is not a fully populated rack. :) Twenty-four of our 42U racks have 40 nodes apiece, plus 1U for a network switch and 1U blank. It works fine for us and allows us comfortably fit 1024 nodes in our pre-existing room. At full power, one 40-node rack burns about 13kW. Heat has not been a problem -- the only anomaly is that the topmost node in a 40-node rack tends to experience ambient temps a few degrees C higher than the others. That's presumably because some of the rear-vented hot air is recirculating within the rack back around to the front, and rising as it goes. There may well be some other subtle airflow issues involved. But yeah, initially Dell tried to tell me that 32 nodes is a full rack. Pfft. 
:) In case it's of interest (and because I'm proud of our room :), our arrangement is: A single 3-rack group for infrastructure (SAN, fileservers, master nodes, work nodes, tape backup, central Myrinet switch), placed in the middle of the room. Four groups of 7 racks apiece, each holding 256 compute nodes and associated network equipment. Racks 1-3 and 5-7 are 40-node racks (20 modes at the bottom, then 2U of GigE switch & blank space, then 20 nodes at the top). Rack 4 is 16 nodes at the bottom, a GigE switch in the middle, a Myrinet edge switch at the top, with quite a bit of blank space left over. In the room, these are arranged in a long row: [walkway][7racks][7racks][3racks][walkway][7racks][7racks][walkway] And that *just* barely fits in the room. :) One interesting element: Our switches are 48-port Nortel BayStack switches, so we have a natural arrangement: The 7-rack and 3-rack groups each have one switch stack. The stacking cables go rack-to-rack horizontally between racks (only the end racks have side panels). David On Friday 28 April 2006 14:22, Michael Will wrote: > A good set of specs according to our engineers could be: > > 1. No side vending of hot air from the case. The systems will be placed > into 19" racks and there is no place for the air to go if it's blown > into the side of the rack. Even if you take the sides off then you still > will have racks placed next to each other. Airflow should be 100% front > to back. > > 2. Along with that, there should be no "cheat holes" in the top, bottom > or sides of the case. All "fresh" air should be drawn in from the front > of the chassis. Again, the system will be racked in a 19" rack and there > is no "fresh air" to be drawn in from the sides of the case (see 1 > above) nor will the holes be open when nodes are stacked one on top of > the other in a fully populated rack (32 nodes per rack). > > 3. There should be a mechanical separation between the hot and cold > sections of the chassis to prevent the internal fans from sucking in hot > air from the rear of the chassis. > > 4. The power supply *must* vent directly to the outside of the case and > not inside the chassis. The power supply produces approximately 20% of > the heat in the system. That hot air must be vented directly out of the > chassis to prevent it from heating other components in the system. > > 5. The system should employ fan speed control. Running high speed fans > at less than rated speed prolongs their life and reduces power usage for > the platform as a whole. Fan speed should be controlled by either > ambient temperature or preferably by CPU temperature. > > 6. The system must have a way of measuring fan speed and reporting a fan > failure so that failed fans can be replaced quickly. > > Michael Will / SE Technical Lead / Penguin Computing > > -----Original Message----- > From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] > On Behalf Of Bill Broadley > Sent: Thursday, April 27, 2006 2:55 PM > To: beowulf@beowulf.org > Subject: [Beowulf] Opteron cooling specifications? > > > I'm writing a spec for future opteron cluster purchases. The issue of > airflow came up. > > I've seen a surprising variety of configurations, some with a giant > rotating cylinder (think paddle wheel), most with a variety of 40x28 or > 40x56mm fans, or horizontal blowers. > > Anyone have a fan vendor they prefer? Ideally well known for making > fans that last 3+ years when in use 24/7. > > A target node CFM for a dual socket dual core opteron? 
> > A target maximum CPU temp? I assume it's wise to stay well below the > 70C or so thermal max on most of the dual core Opterons. > > Seems like there is a huge variation in the number of fans and total CFM > from various chassis/node manufacturers. A single core single socket 1u > opteron I got from sun has two 40mm x 56mm, and 4 * 40mm x 28mm fans. > Not bad for a node starting at $750. > > Additionally some chassis designs form a fairly decent wall across the > node for the fans to insure a good front to back airflow. Others seem > to place fans willy nilly, I've even seen some that suck air sideways > across the rear opteron. > > In any case, the nature of the campus purchasing process is that we can > put in any specification, but can't buy from a single vendor, or award > bids for better engineering. So basically lowest bid wins that meets > the spec. Thus the need for a better spec. > > Any feedback appreciated. > > -- > Bill Broadley > Computational Science and Engineering > UC Davis > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org To change your subscription > (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Sat Apr 29 10:02:34 2006 From: mwill at penguincomputing.com (Michael Will) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <200604281700.33950.kewley@gps.caltech.edu> References: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> <200604281700.33950.kewley@gps.caltech.edu> Message-ID: <44539C2A.9040706@penguincomputing.com> David, 32 compute nodes per rack is just good practice, but of course not the maximum if you are willing to cut corners and do somewhat nonstandard and potentially riskier power (i.e. L21-30 208V 3phase). Where do your power cords go, and how many PDU's and cords do you have going to each rack? Our 32 node racks have only two L21-20 cords for the nodes and comply with national fire and electrical codes that limit continuous loads on a circuit to 80% of the breaker rating. The extra space in the rack is dedicated to the remaining cluster infrastructure, i.e. 2U headnode, switches, LCD-keyboard-tray, storage, and room to run cables between racks. The additional infrastructure is not powered by the two L21-20 but fits on a separate single phase 120V circuit, typically a UPS to protect headnode and storage, or just a plain 1U PDU. That means 32 compute nodes and headnode and storage and LCD/keyboard together are fed by a total of 7 120V 20A phases of your breaker panel, utilizing three cords that go to your rack. Michael Will David Kewley wrote: > I totally agree with Michael. > > Except I'd assert that 32 nodes per rack is not a fully populated rack. :) > Twenty-four of our 42U racks have 40 nodes apiece, plus 1U for a network > switch and 1U blank. It works fine for us and allows us comfortably fit > 1024 nodes in our pre-existing room. > > At full power, one 40-node rack burns about 13kW. Heat has not been a > problem -- the only anomaly is that the topmost node in a 40-node rack > tends to experience ambient temps a few degrees C higher than the others. 
> That's presumably because some of the rear-vented hot air is recirculating > within the rack back around to the front, and rising as it goes. There may > well be some other subtle airflow issues involved. > > But yeah, initially Dell tried to tell me that 32 nodes is a full rack. > Pfft. :) > > In case it's of interest (and because I'm proud of our room :), our > arrangement is: > > A single 3-rack group for infrastructure (SAN, fileservers, master nodes, > work nodes, tape backup, central Myrinet switch), placed in the middle of > the room. > > Four groups of 7 racks apiece, each holding 256 compute nodes and associated > network equipment. Racks 1-3 and 5-7 are 40-node racks (20 modes at the > bottom, then 2U of GigE switch & blank space, then 20 nodes at the top). > Rack 4 is 16 nodes at the bottom, a GigE switch in the middle, a Myrinet > edge switch at the top, with quite a bit of blank space left over. > > In the room, these are arranged in a long row: > > [walkway][7racks][7racks][3racks][walkway][7racks][7racks][walkway] > > And that *just* barely fits in the room. :) > > One interesting element: Our switches are 48-port Nortel BayStack switches, > so we have a natural arrangement: The 7-rack and 3-rack groups each have > one switch stack. The stacking cables go rack-to-rack horizontally between > racks (only the end racks have side panels). > > David > > On Friday 28 April 2006 14:22, Michael Will wrote: > >> A good set of specs according to our engineers could be: >> >> 1. No side vending of hot air from the case. The systems will be placed >> into 19" racks and there is no place for the air to go if it's blown >> into the side of the rack. Even if you take the sides off then you still >> will have racks placed next to each other. Airflow should be 100% front >> to back. >> >> 2. Along with that, there should be no "cheat holes" in the top, bottom >> or sides of the case. All "fresh" air should be drawn in from the front >> of the chassis. Again, the system will be racked in a 19" rack and there >> is no "fresh air" to be drawn in from the sides of the case (see 1 >> above) nor will the holes be open when nodes are stacked one on top of >> the other in a fully populated rack (32 nodes per rack). >> >> 3. There should be a mechanical separation between the hot and cold >> sections of the chassis to prevent the internal fans from sucking in hot >> air from the rear of the chassis. >> >> 4. The power supply *must* vent directly to the outside of the case and >> not inside the chassis. The power supply produces approximately 20% of >> the heat in the system. That hot air must be vented directly out of the >> chassis to prevent it from heating other components in the system. >> >> 5. The system should employ fan speed control. Running high speed fans >> at less than rated speed prolongs their life and reduces power usage for >> the platform as a whole. Fan speed should be controlled by either >> ambient temperature or preferably by CPU temperature. >> >> 6. The system must have a way of measuring fan speed and reporting a fan >> failure so that failed fans can be replaced quickly. >> >> Michael Will / SE Technical Lead / Penguin Computing >> >> -----Original Message----- >> From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] >> On Behalf Of Bill Broadley >> Sent: Thursday, April 27, 2006 2:55 PM >> To: beowulf@beowulf.org >> Subject: [Beowulf] Opteron cooling specifications? >> >> >> I'm writing a spec for future opteron cluster purchases. 
The issue of >> airflow came up. >> >> I've seen a surprising variety of configurations, some with a giant >> rotating cylinder (think paddle wheel), most with a variety of 40x28 or >> 40x56mm fans, or horizontal blowers. >> >> Anyone have a fan vendor they prefer? Ideally well known for making >> fans that last 3+ years when in use 24/7. >> >> A target node CFM for a dual socket dual core opteron? >> >> A target maximum CPU temp? I assume it's wise to stay well below the >> 70C or so thermal max on most of the dual core Opterons. >> >> Seems like there is a huge variation in the number of fans and total CFM >> from various chassis/node manufacturers. A single core single socket 1u >> opteron I got from sun has two 40mm x 56mm, and 4 * 40mm x 28mm fans. >> Not bad for a node starting at $750. >> >> Additionally some chassis designs form a fairly decent wall across the >> node for the fans to insure a good front to back airflow. Others seem >> to place fans willy nilly, I've even seen some that suck air sideways >> across the rear opteron. >> >> In any case, the nature of the campus purchasing process is that we can >> put in any specification, but can't buy from a single vendor, or award >> bids for better engineering. So basically lowest bid wins that meets >> the spec. Thus the need for a better spec. >> >> Any feedback appreciated. >> >> -- >> Bill Broadley >> Computational Science and Engineering >> UC Davis >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org To change your subscription >> (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> -- Michael Will Penguin Computing Corp. Sales Engineer 415-954-2822 415-954-2899 fx mwill@penguincomputing.com From kewley at gps.caltech.edu Sat Apr 29 18:08:49 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:04:59 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <44539C2A.9040706@penguincomputing.com> References: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> <200604281700.33950.kewley@gps.caltech.edu> <44539C2A.9040706@penguincomputing.com> Message-ID: <200604291808.49855.kewley@gps.caltech.edu> On Saturday 29 April 2006 10:02, Michael Will wrote: > David, > > 32 compute nodes per rack is just good practice, but of course not the > maximum if you are willing > to cut corners and do somewhat nonstandard and potentially riskier power > (i.e. L21-30 208V 3phase). > > Where do your power cords go, and how many PDU's and cords do you have > going to each rack? What does it mean to say that 32 is "standard"? Why shouldn't 40 be standard, other than perhaps 32 is more typically done? How do you conclude that putting 40 in a rack rather than 32 necessarily means you cut corners? Just because we had our sites set on fitting all those machines into the room doesn't mean that the rack setup is poor (it's actually excellent, come by and see it some time), nor that we cut corners to make things fit. We didn't. Many people who have seen our room remark that it's an outstandingly good room, so it's quite possible to do what we did and do it well -- 40 per rack is no problem if you design it well. We use L21-20 208 V 3-phase (note -20 not -30). How is that riskier? 
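For what it's worth, the back-of-envelope arithmetic behind those rack figures (a sketch only; it assumes unity power factor, the 80% continuous-load derating mentioned earlier in the thread, and a rough 330 W per node under load):

  awk 'BEGIN {
      volts = 208; amps = 20; derate = 0.80           # one L21-20 3-phase circuit
      strip = sqrt(3) * volts * amps * derate / 1000  # usable kW per power strip
      printf "per strip: %.1f kW, three strips: %.1f kW\n", strip, 3 * strip
      nodes = 40; watts = 330                         # assumed per-node draw
      printf "%d nodes at %d W: %.1f kW\n", nodes, watts, nodes * watts / 1000
  }'

That gives roughly 5.8 kW of usable capacity per strip and about 17 kW for three, comfortably above the roughly 13 kW a 40-node rack draws.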
We have 3 L21-20 based power strips per rack, APC units each rated (80% derated) at 5.7kW. The 3' cords go under the raised floor to outlets that are fastened to a rail that runs directly under the raised floor gridwork. The outlets are supplied by flexible, waterproof conduit that runs across the cement subfloor to the raised-floor-mounted PDUs. The cords are essentially invisible when the rack doors are closed, and are completely out of the way. > Our 32 node racks have only two L21-20 cords for the nodes and comply > with national fire and electrical codes > that limit continuous loads on a circuit to 80% of the breaker rating. Same here, but with 3 rack PDUs. It's not a problem for us. > The extra space in the rack is dedicated to the remaining cluster > infrastructure, i.e. 2U headnode, switches, > LCD-keyboard-tray, storage, and room to run cables between racks. The > additional infrastructure is not powered > by the two L21-20 but fits on a separate single phase 120V circuit, > typically a UPS to protect headnode and storage, > or just a plain 1U PDU. That means 32 compute nodes and headnode and > storage and LCD/keyboard together > are fed by a total of 7 120V 20A phases of your breaker panel, utilizing > three cords that go to your rack. Right, that works in your setup, especially for smaller clusters. For our large cluster, 40 per rack (for most but not all racks) works great. David > Michael Will > > David Kewley wrote: > > I totally agree with Michael. > > > > Except I'd assert that 32 nodes per rack is not a fully populated rack. > > :) Twenty-four of our 42U racks have 40 nodes apiece, plus 1U for a > > network switch and 1U blank. It works fine for us and allows us > > comfortably fit 1024 nodes in our pre-existing room. > > > > At full power, one 40-node rack burns about 13kW. Heat has not been a > > problem -- the only anomaly is that the topmost node in a 40-node rack > > tends to experience ambient temps a few degrees C higher than the > > others. That's presumably because some of the rear-vented hot air is > > recirculating within the rack back around to the front, and rising as > > it goes. There may well be some other subtle airflow issues involved. > > > > But yeah, initially Dell tried to tell me that 32 nodes is a full rack. > > Pfft. :) > > > > In case it's of interest (and because I'm proud of our room :), our > > arrangement is: > > > > A single 3-rack group for infrastructure (SAN, fileservers, master > > nodes, work nodes, tape backup, central Myrinet switch), placed in the > > middle of the room. > > > > Four groups of 7 racks apiece, each holding 256 compute nodes and > > associated network equipment. Racks 1-3 and 5-7 are 40-node racks (20 > > modes at the bottom, then 2U of GigE switch & blank space, then 20 > > nodes at the top). Rack 4 is 16 nodes at the bottom, a GigE switch in > > the middle, a Myrinet edge switch at the top, with quite a bit of blank > > space left over. > > > > In the room, these are arranged in a long row: > > > > [walkway][7racks][7racks][3racks][walkway][7racks][7racks][walkway] > > > > And that *just* barely fits in the room. :) > > > > One interesting element: Our switches are 48-port Nortel BayStack > > switches, so we have a natural arrangement: The 7-rack and 3-rack > > groups each have one switch stack. The stacking cables go rack-to-rack > > horizontally between racks (only the end racks have side panels). 
> > > > David > > > > On Friday 28 April 2006 14:22, Michael Will wrote: > >> A good set of specs according to our engineers could be: > >> > >> 1. No side vending of hot air from the case. The systems will be > >> placed into 19" racks and there is no place for the air to go if it's > >> blown into the side of the rack. Even if you take the sides off then > >> you still will have racks placed next to each other. Airflow should be > >> 100% front to back. > >> > >> 2. Along with that, there should be no "cheat holes" in the top, > >> bottom or sides of the case. All "fresh" air should be drawn in from > >> the front of the chassis. Again, the system will be racked in a 19" > >> rack and there is no "fresh air" to be drawn in from the sides of the > >> case (see 1 above) nor will the holes be open when nodes are stacked > >> one on top of the other in a fully populated rack (32 nodes per rack). > >> > >> 3. There should be a mechanical separation between the hot and cold > >> sections of the chassis to prevent the internal fans from sucking in > >> hot air from the rear of the chassis. > >> > >> 4. The power supply *must* vent directly to the outside of the case > >> and not inside the chassis. The power supply produces approximately > >> 20% of the heat in the system. That hot air must be vented directly > >> out of the chassis to prevent it from heating other components in the > >> system. > >> > >> 5. The system should employ fan speed control. Running high speed fans > >> at less than rated speed prolongs their life and reduces power usage > >> for the platform as a whole. Fan speed should be controlled by either > >> ambient temperature or preferably by CPU temperature. > >> > >> 6. The system must have a way of measuring fan speed and reporting a > >> fan failure so that failed fans can be replaced quickly. > >> > >> Michael Will / SE Technical Lead / Penguin Computing > >> > >> -----Original Message----- > >> From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] > >> On Behalf Of Bill Broadley > >> Sent: Thursday, April 27, 2006 2:55 PM > >> To: beowulf@beowulf.org > >> Subject: [Beowulf] Opteron cooling specifications? > >> > >> > >> I'm writing a spec for future opteron cluster purchases. The issue of > >> airflow came up. > >> > >> I've seen a surprising variety of configurations, some with a giant > >> rotating cylinder (think paddle wheel), most with a variety of 40x28 > >> or 40x56mm fans, or horizontal blowers. > >> > >> Anyone have a fan vendor they prefer? Ideally well known for making > >> fans that last 3+ years when in use 24/7. > >> > >> A target node CFM for a dual socket dual core opteron? > >> > >> A target maximum CPU temp? I assume it's wise to stay well below the > >> 70C or so thermal max on most of the dual core Opterons. > >> > >> Seems like there is a huge variation in the number of fans and total > >> CFM from various chassis/node manufacturers. A single core single > >> socket 1u opteron I got from sun has two 40mm x 56mm, and 4 * 40mm x > >> 28mm fans. Not bad for a node starting at $750. > >> > >> Additionally some chassis designs form a fairly decent wall across the > >> node for the fans to insure a good front to back airflow. Others seem > >> to place fans willy nilly, I've even seen some that suck air sideways > >> across the rear opteron. > >> > >> In any case, the nature of the campus purchasing process is that we > >> can put in any specification, but can't buy from a single vendor, or > >> award bids for better engineering. 
So basically lowest bid wins that > >> meets the spec. Thus the need for a better spec. > >> > >> Any feedback appreciated. > >> > >> -- > >> Bill Broadley > >> Computational Science and Engineering > >> UC Davis > >> _______________________________________________ > >> Beowulf mailing list, Beowulf@beowulf.org To change your subscription > >> (digest mode or unsubscribe) visit > >> http://www.beowulf.org/mailman/listinfo/beowulf > >> > >> _______________________________________________ > >> Beowulf mailing list, Beowulf@beowulf.org > >> To change your subscription (digest mode or unsubscribe) visit > >> http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at streamline-computing.com Sun Apr 30 00:12:17 2006 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <200604291808.49855.kewley@gps.caltech.edu> References: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> <200604281700.33950.kewley@gps.caltech.edu> <44539C2A.9040706@penguincomputing.com> <200604291808.49855.kewley@gps.caltech.edu> Message-ID: <1146381138.4990.2.camel@Vigor12> On Sat, 2006-04-29 at 18:08 -0700, David Kewley wrote: > What does it mean to say that 32 is "standard"? Why shouldn't 40 be > standard, other than perhaps 32 is more typically done? > Surely related to the maximum amperage you supply to the rack? With dual cores and now quad socket systems current draws per node are going up. In European installs, I see 2 times 32 Amp feeds per rack being normal. From csamuel at vpac.org Sun Apr 30 01:40:30 2006 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] Faulty Single Core Opterons - Can Overheat, Cause Floating Point Errors Message-ID: <200604301840.31026.csamuel@vpac.org> Hi folks, http://www.amd.com/us-en/0,,3715_13965,00.html AMD has identified, and subsequently corrected, a test escape that occurred in our post-manufacturing product testing process for a limited number of single-core AMD Opteron? processor models x52 and x54. No other single-core AMD Opteron processors, and no dual-core AMD Opteron processors, are affected. [...] You must be operating single-core AMD Opteron x52 (2.6 GHz) or x54 (2.8 GHz) processor-based systems, AND you must be running floating point-intensive code sequences. [...] -- Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia From kewley at gps.caltech.edu Sun Apr 30 02:35:26 2006 From: kewley at gps.caltech.edu (David Kewley) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <1146381138.4990.2.camel@Vigor12> References: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> <200604291808.49855.kewley@gps.caltech.edu> <1146381138.4990.2.camel@Vigor12> Message-ID: <200604300235.27583.kewley@gps.caltech.edu> On Sunday 30 April 2006 00:12, John Hearns wrote: > On Sat, 2006-04-29 at 18:08 -0700, David Kewley wrote: > > What does it mean to say that 32 is "standard"? Why shouldn't 40 be > > standard, other than perhaps 32 is more typically done? > > Surely related to the maximum amperage you supply to the rack? > With dual cores and now quad socket systems current draws per node are > going up. In European installs, I see 2 times 32 Amp feeds per rack > being normal. 
But there's not a "standard" maximum amperage you can supply to a rack. As long as you have the power available, and have the means to remove the heat, you can do what you want. We're an example, and I would be astonished if we're the only example. If we are, it seems to me something is wrong with "standard" ways of thinking. My point is, the word "standard" tends to suggest that doing anything else is to be discouraged, out of the ordinary, etc. I see *no* reason that 32 nodes should be considered "standard". If you can only supply 20 A to a rack, *your* standard will be just a few nodes per rack. If you can supply 100 A to a rack, *your* standard will be much higher. So there is no "standard" that applies to all sites. No? David
From hahn at physics.mcmaster.ca Sun Apr 30 09:42:57 2006 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: Message-ID: > > By the way, the idea of rolling-your-own hardware on a large cluster, and > > planning on having a small technical team, makes me shiver in horror. If > > you go that route, you better have *lots* of experience in clusters, and > > make very good decisions about cluster components and management methods. > > If you don't, your users will suffer mightily, which means you will suffer > > mightily too. I believe that overstates the case significantly. some clusters are just plain easy. it's entirely possible to buy a significant number of conservative compute nodes, toss them onto a generic switch or two, and run the whole thing for a couple years without any real effort. I did it, and while I have a lot of experience, I didn't apply any deep voodoo for the cluster I'm thinking of. it started out with a good solid login/file/boot server (4U, 6x scsi, dual-xeon 2.4, 1G ram), a single 48pt 100bt (1G up) switch, and 48 dual-xeon nodes (diskful but not disk-booting). it was a delight to install, maintain and manage. I originally built it with APC controllable PDUs, but in the process of moving it, stripped them out as I didn't need them. (I _do_ always require net-IPMI on anything newly purchased.) I've added more nodes to the cluster since then - dual-opteron nodes and a couple GE switches. > For clusters with more than perhaps 16 nodes, or EVEN 32 if you're > feeling masochistic and inclined to heartache: with all respect to rgb, I don't think size is a primary factor in cluster building/maintaining/etc effort. certainly it does eventually become a concern, but that's primarily a statistical result of MTBF/nnodes. it's quite possible to choose hardware to maximize MTBF and minimize configuration risk. in the cluster above, I chose a chassis (AIC) which has a large centrifugal blower, rather than a bunch of 40mm axial/muffin fans. a much larger cluster I'm working on now (768 nodes) has 14 40mm muffin fans in each node! while I know I can rely on the vendor (HP) to replace failures promptly and without complaint, there's an interesting side-effect: power dissipation. 12 of the fans pointing at the CPUs are actually paired inline, and each pair is rated to dissipate up to 20W. so a node that idles at 210W and draws 265W under full load can easily consume 340W if the fans are ramped up. ouch! this is probably the most significant size-dependent factor for me. if you're doing your own 32-node cluster, it's pretty easy to manage the cooling. the difference between dissipating 300 and 400W is less than a ton of chiller capacity.
scraping up 10-20 additional tons of capacity is quite a different proposition. regards, mark hahn.
From joelja at darkwing.uoregon.edu Sun Apr 30 10:10:07 2006 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] Opteron cooling specifications? In-Reply-To: <200604300235.27583.kewley@gps.caltech.edu> References: <433093DF7AD7444DA65EFAFE3987879C107D75@jellyfish.highlyscyld.com> <200604291808.49855.kewley@gps.caltech.edu> <1146381138.4990.2.camel@Vigor12> <200604300235.27583.kewley@gps.caltech.edu> Message-ID: On Sun, 30 Apr 2006, David Kewley wrote: > On Sunday 30 April 2006 00:12, John Hearns wrote: >> On Sat, 2006-04-29 at 18:08 -0700, David Kewley wrote: >>> What does it mean to say that 32 is "standard"? Why shouldn't 40 be >>> standard, other than perhaps 32 is more typically done? >> >> Surely related to the maximum amperage you supply to the rack? >> With dual cores and now quad socket systems current draws per node are >> going up. In European installs, I see 2 times 32 Amp feeds per rack >> being normal. > > But there's not a "standard" maximum amperage you can supply to a rack. As > long as you have the power available, and have the means to remove the > heat, you can do what you want. We're an example, and I would be > astonished if we're the only example. If we are, it seems to me something > is wrong with "standard" ways of thinking. > > My point is, the word "standard" tends to suggest that doing anything else > is to be discouraged, out of the ordinary, etc. I see *no* reason that 32 > nodes should be considered "standard". If you can only supply 20 A to a > rack, *your* standard will be just a few nodes per rack. If you can supply > 100 A to a rack, *your* standard will be much higher. So there is no > "standard" that applies to all sites. > > No? 50U racks are also available (I've seen them in colo facilities)... personally I don't like racking things that far over my head. We've been switching cabinets in one of our datacenters to 220 volt service to support the sort of density we're seeing without running new conductors. > David > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja@darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2
From landman at scalableinformatics.com Sun Apr 30 10:49:13 2006 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: References: Message-ID: <4454F899.9090803@scalableinformatics.com> Mark Hahn wrote: > in the cluster above, I chose a chassis (AIC) which has a large centrifugal > blower, rather than a bunch of 40mm axial/muffin fans. a much larger cluster > I'm working on now (768 nodes) has 14 40mm muffin fans in each node! while > I know I can rely on the vendor (HP) to replace failures promptly and without > complaint, there's an interesting side-effect: power dissipation. 12 of the fans > pointing at the CPUs are actually paired inline, and each pair is rated to > dissipate up to 20W. so a node that idles at 210W and draws 265W under full load > can easily consume 340W if the fans are ramped up. ouch! Heh...
Load up a Sun v20z with a CPU pounding calculation, and as the CPUs heat up, they wind the fans up. I had several parallel GAMESS runs going and it reminded me of a turbo pump winding up .... It's neat to probe the hardware status and see the fans wind up from 4-5000 RPM through 12kRPM ... -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615
From hahn at physics.mcmaster.ca Sun Apr 30 11:08:54 2006 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] 512 nodes Myrinet cluster Challanges In-Reply-To: <4454F899.9090803@scalableinformatics.com> Message-ID: > Heh... Load up a Sun v20z with a CPU pounding calculation, and as the > CPUs heat up, they wind the fans up. I had several parallel GAMESS runs yes, I actually have no problem with thermostatically-controlled fans. in my case, the control algorithm appears to need some work, though, since it's common to find fairly cool air jetting out of our racks. > going and it reminded me of a turbo pump winding up .... It's neat to > probe the hardware status and see the fans wind up from 4-5000 RPM > through 12kRPM ... 12K is moderate - these, at least according to IPMI, go up to 16.8K ;)
From markc.westwood at gmail.com Fri Apr 28 08:47:42 2006 From: markc.westwood at gmail.com (MarkC Westwood) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] Re: 512 nodes Myrinet cluster Challanges (Walid) Message-ID: Walid, Once you've surmounted the challenges of power and cooling and maintenance, you can start tackling the challenges of making your codes run efficiently on such a large cluster. And that's where the fun begins, and that's what makes us do it. Good luck Mark Westwood Principal Software Engineer MTEM Ltd www.mtem.com
From stalin_ni at rediffmail.com Sun Apr 30 03:29:10 2006 From: stalin_ni at rediffmail.com (stalin natesan) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] RARP request:MAC address match but no netmask match Message-ID: <20060430102910.1756.qmail@webmail26.rediffmail.com> Hi all! I am using Red Hat 9 for my master node with a VIA Rhine network card (on the motherboard itself). When I try to boot a slave node from floppy, everything goes fine until "RARP: Sending RARP requests." After some time the system reports the PCI devices list and restarts. When I opened the log file /var/log/messages, it says "RARP: MAC address match but no netmask match" for the node. The last message repeats .. times. (nodeadd does add the MAC address to the config file, so no problem there.) Is there any solution for my problem? Please let me know. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20060430/88081098/attachment.html
From Michael.Fitzmaurice at gtsi.com Fri Apr 28 12:09:06 2006 From: Michael.Fitzmaurice at gtsi.com (Michael Fitzmaurice) Date: Wed Nov 25 01:05:00 2009 Subject: [Beowulf] bwbug: New iGrid Storage Technology at Beowulf Meeting & Web Cast - May 9, 2006 Message-ID: Skipped content of type multipart/alternative -------------- next part -------------- _______________________________________________ bwbug mailing list bwbug@bwbug.org http://www.pbm.com/mailman/listinfo/bwbug
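
The Landman/Hahn exchange above about watching fan speeds ramp from roughly 4-5000 RPM up to 12-16.8 kRPM under load can be scripted rather than eyeballed. The short sketch below is illustrative only and does not come from any of the posts: it assumes ipmitool is installed, that the local BMC answers "ipmitool sensor", and that fan tachometer sensors report in RPM with "fan" somewhere in the sensor name (naming varies by BMC vendor, so treat that filter as an assumption).

#!/usr/bin/env python3
# Illustrative sketch only (not from the list archive): poll fan tachometer
# readings the way the posts above describe doing by hand.  Assumes ipmitool
# is installed, the local BMC answers "ipmitool sensor", and fan sensors
# report in RPM with "fan" somewhere in their name (vendor-dependent).
import subprocess

def read_fan_rpms():
    """Return {sensor_name: rpm} parsed from `ipmitool sensor` output."""
    out = subprocess.check_output(["ipmitool", "sensor"], text=True)
    rpms = {}
    for line in out.splitlines():
        # ipmitool sensor prints pipe-separated columns: name | value | unit | status | ...
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 3 and fields[2] == "RPM" and "fan" in fields[0].lower():
            try:
                rpms[fields[0]] = float(fields[1])
            except ValueError:
                pass  # sensor present but no reading (value printed as "na")
    return rpms

if __name__ == "__main__":
    for name, rpm in sorted(read_fan_rpms().items()):
        print("%-24s %8.0f RPM" % (name, rpm))

Run it periodically (for example under watch) while a CPU-heavy job spins up to see the ramp the posters describe.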