From eagles051387 at gmail.com Tue Jul 1 00:26:15 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <6D1C4C9B-432F-4547-93F4-391B0847951D@xs4all.nl> References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <6D1C4C9B-432F-4547-93F4-391B0847951D@xs4all.nl> Message-ID: not sure if this applies to all kinds of senarios that clusters are used in but isnt the more ram you have the better? On 6/30/08, Vincent Diepeveen wrote: > > Toon, > > Can you drop a line on how important RAM is for weather forecasting in > latest type of calculations you're performing? > > Thanks, > Vincent > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > Jim Lux wrote: >> >> Yep. And for good reason. Even a big DoD job is still tiny in Nvidia's >>> scale of operations. We face this all the time with NASA work. >>> Semiconductor manufacturers have no real reason to produce special purpose >>> or customized versions of their products for space use, because they can >>> sell all they can make to the consumer market. More than once, I've had a >>> phone call along the lines of this: >>> "Jim: I'm interested in your new ABC321 part." >>> "Rep: Great. I'll just send the NDA over and we can talk about it." >>> "Jim: Great, you have my email and my fax # is..." >>> "Rep: By the way, what sort of volume are you going to be using?" >>> "Jim: Oh, 10-12.." >>> "Rep: thousand per week, excellent..." >>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe every >>> year." >>> "Rep: Oh..." >>> {Well, to be fair, it's not that bad, they don't hang up on you.. >>> >> >> Since about a year, it's been clear to me that weather forecasting (i.e., >> running a more or less sophisticated atmospheric model to provide weather >> predictions) is going to be "mainstream" in the sense that every business >> that needs such forecasts for its operations can simply run them in-house. >> >> Case in point: I bought a $1100 HP box (the obvious target group being >> teenage downloaders) which performs the HIRLAM limited area model *on the >> grid that we used until October 2006* in December last year. >> >> It's about twice as slow as our then-operational 50-CPU Sun Fire 15K. >> >> I wonder what effect this will have on CPU developments ... >> >> -- >> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 >> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands >> At home: http://moene.indiv.nluug.nl/~toon/ >> Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html >> > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/3a4124c6/attachment.html From andrew at moonet.co.uk Tue Jul 1 01:20:56 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> Message-ID: Hi Jon, We have our own stack which we stick on top of the customers favourite red hat clone. Usually Scientific Linux. Here is a bit more about it. http://www.clustervision.com/products_os.php We sell as a standalone product and it does quite well. I could even go so far to say that it is 'stack of choice' in many European institutions. We have done a couple of M$ installations too. Ta Andy On Sat, Jun 28, 2008 at 12:09 PM, Jon Aquilina wrote: > congrats. just wondering what distro is being used on your clusters? > > On Thu, Jun 26, 2008 at 8:52 PM, Joe Landman > wrote: >> >> andrew holway wrote: >>> >>> http://www.clustervision.com/pr_top500_uk.php >> >> cool ... congratulations to ClusterVision! >> >> -- >> Joseph Landman, Ph.D >> Founder and CEO >> Scalable Informatics LLC, >> email: landman@scalableinformatics.com >> web : http://www.scalableinformatics.com >> http://jackrabbit.scalableinformatics.com >> phone: +1 734 786 8423 >> fax : +1 866 888 3112 >> cell : +1 734 612 4615 >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > > > -- > Jonathan Aquilina From Dan.Kidger at quadrics.com Tue Jul 1 01:42:59 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> >Hi Jon, >We have our own stack which we stick on top of the customers favourite >red hat clone. Usually Scientific Linux. > >Here is a bit more about it. > >http://www.clustervision.com/products_os.php > >We sell as a standalone product and it does quite well. I could even >go so far to say that it is 'stack of choice' in many European >institutions. Every throught of getting a job in Sales and Marketing? :-) Daniel. On Sat, Jun 28, 2008 at 12:09 PM, Jon Aquilina wrote: > congrats. just wondering what distro is being used on your clusters? > > On Thu, Jun 26, 2008 at 8:52 PM, Joe Landman > wrote: >> >> andrew holway wrote: >>> >>> http://www.clustervision.com/pr_top500_uk.php >> >> cool ... congratulations to ClusterVision! >> >> -- >> Joseph Landman, Ph.D >> Founder and CEO >> Scalable Informatics LLC, From Dan.Kidger at quadrics.com Tue Jul 1 01:46:14 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <1214864562.6912.29.camel@Vigor13> References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <1214864562.6912.29.camel@Vigor13> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AD@quadbrsex1.quadrics.com> John is correct here. 
It is one thing to do long range climate prediction yourself using distributed computing and tweaking the stochastics based on a set of starting conditions, and another to try and work out if it will be sunny next Tuesday. Weather modelling is a different animal to CP- you need a supply of fresh input data - and a sophisticated infrastructure to harvest , collate, sanitise and feed these numbers into your computer model. Also with CP you typically run many instances concurrently which takes weeks/months to complete, but with WM, you have maybe 6 hours to run the whole job from start to finish which implies a closely coupled cluster. Daniel ------------------------------------------------------------- Dr. Daniel Kidger, Quadrics Ltd. daniel.kidger@quadrics.com One Bridewell St., Mobile: +44 (0)779 209 1851 Bristol, BS1 2AA, UK Office: +44 (0)117 915 5519 ----------------------- www.quadrics.com -------------------- -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of John Hearns Sent: 30 June 2008 23:23 To: beowulf@beowulf.org Subject: Re: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? On Mon, 2008-06-30 at 20:20 +0200, Toon Moene wrote: > > Since about a year, it's been clear to me that weather forecasting > (i.e., running a more or less sophisticated atmospheric model to provide > weather predictions) is going to be "mainstream" in the sense that every > business that needs such forecasts for its operations can simply run > them in-house. Garbage in, garbage out. By that I mean that the CPU horsepower may be more and more readily affordable for businesses like that - let's say it is an ice-cream wholesaler who would like to have a three day forecast to allow stocking of their outlets with ice cream. However, the models depend on input from sensor networks - not my area of expertise, but I should imagine manned and unmanned weather stations, ocean buoys to measure wave height, satellite sensors. Do we see such data sources being made freely available, and in real time (ie not archived data sets)?? _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eagles051387 at gmail.com Tue Jul 1 02:28:59 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] A press release In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> Message-ID: >We have our own stack which we stick on top of the customers favourite >red hat clone. Usually Scientific Linux. does it necessarily have to be a redhat clone. can it also be a debian based clone? On 7/1/08, Dan.Kidger@quadrics.com wrote: > > >Hi Jon, > > >We have our own stack which we stick on top of the customers favourite > >red hat clone. Usually Scientific Linux. > > > >Here is a bit more about it. > > > >http://www.clustervision.com/products_os.php > > > >We sell as a standalone product and it does quite well. I could even > >go so far to say that it is 'stack of choice' in many European > >institutions. > > Every throught of getting a job in Sales and Marketing? :-) > > > Daniel. > > > On Sat, Jun 28, 2008 at 12:09 PM, Jon Aquilina > wrote: > > congrats. 
just wondering what distro is being used on your clusters? > > > > On Thu, Jun 26, 2008 at 8:52 PM, Joe Landman > > wrote: > >> > >> andrew holway wrote: > >>> > >>> http://www.clustervision.com/pr_top500_uk.php > >> > >> cool ... congratulations to ClusterVision! > >> > >> -- > >> Joseph Landman, Ph.D > >> Founder and CEO > >> Scalable Informatics LLC, > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/5e20fcc3/attachment.html From henning.fehrmann at aei.mpg.de Tue Jul 1 02:36:43 2008 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] automount on high ports Message-ID: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> Hello, we need to automount NFS directories on high ports to increase the number of possible mounts. Currently, we are limited up to ca 360 mounts. The NFS-server exports with the option 'insecure' but the mounts still end up on ports <1024 on the client side. Is there a way to enable automounts on higher ports? How can it be done manually: mount -t nfs -o ....? We are using autofs version 5. Thank you, Henning From steve_heaton at exemail.com.au Tue Jul 1 03:28:40 2008 From: steve_heaton at exemail.com.au (Particle Boy) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf], Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <200807010728.m617S3Uc011226@bluewest.scyld.com> References: <200807010728.m617S3Uc011226@bluewest.scyld.com> Message-ID: <486A06D8.2050705@exemail.com.au> Date: Mon, 30 Jun 2008 23:22:32 +0100 From: John Hearns > However, the models depend on input from sensor networks - not my area > of expertise, but I should imagine manned and unmanned weather >stations, >ocean buoys to measure wave height, satellite sensors. >Do we see such data sources being made freely available, and in real >time (ie not archived data sets)?? G'day John and all In a nutshell yes, you can can get sets of initial conditions from various agencies around the globe. The NCEP at NOAA is a great resource. SOO/STRC at UCAR packages WRF EMS with the pointers built right in for the various feeds :) Cheers Stevo From eagles051387 at gmail.com Tue Jul 1 03:38:52 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] open mosix alternative Message-ID: does anyone know an altenative to openmosix?? would it be worth reviving the development of the kernel? -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/430935c0/attachment.html From eagles051387 at gmail.com Tue Jul 1 03:39:48 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] software for compatible with a cluster Message-ID: does anyone know of any rendering software that will work with a cluster? -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/45e77b3e/attachment.html From geoff at galitz.org Tue Jul 1 04:04:48 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: Message-ID: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> I know people who use Houdini for this: http://www.sidefx.com/index.php I cannot vouch for how well it works or what is involved, though. Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org _____ From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Jon Aquilina Sent: Dienstag, 1. Juli 2008 12:40 To: Beowulf Mailing List Subject: [Beowulf] software for compatible with a cluster does anyone know of any rendering software that will work with a cluster? -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/0ef16522/attachment.html From eagles051387 at gmail.com Tue Jul 1 04:26:38 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> Message-ID: reason i am asking is because i would like to setup a rendering cluster and provide rendering services. does this also work for 3d animated movies that require rendering or does one need somethin entierly different for that? On 7/1/08, Geoff Galitz wrote: > > > > > > I know people who use Houdini for this: > > > > http://www.sidefx.com/index.php > > > > I cannot vouch for how well it works or what is involved, though. > > > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > ------------------------------ > > *From:* beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] *On > Behalf Of *Jon Aquilina > *Sent:* Dienstag, 1. Juli 2008 12:40 > *To:* Beowulf Mailing List > *Subject:* [Beowulf] software for compatible with a cluster > > > > does anyone know of any rendering software that will work with a cluster? > > -- > Jonathan Aquilina > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/4d618ee7/attachment.html From gerry.creager at tamu.edu Tue Jul 1 04:59:03 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <48694BD5.5090303@moene.indiv.nluug.nl> References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <48693DCA.3010903@tamu.edu> <48694BD5.5090303@moene.indiv.nluug.nl> Message-ID: <486A1C07.9050208@tamu.edu> Toon Moene wrote: > Gerry Creager wrote: > >> I'm running WRF on ranger, the 580 TF Sun cluster at utexas.edu. I >> can complete the WRF single domain run, using 384 cores in ~30 min >> wall clock time. At the WRF Users Conference last week, the number of >> folks I talked to running WRF on workstations or "operationally" on >> 16-64 core clusters was impressive. 
I suspect a lot of desktop >> weather forecasting will, as you suggest, become the norm. The >> question, then, is: Are we looking at an enterprise where everyone >> with a gaming machine thinks they understand the model well enough to >> try predicting the weather, or are some still in awe of Lorenz' >> hypothesis about its complexity? > > This is where I think the pluses of the established meteorological > society will be: We know how to establish the quality of meteorological > models, how to compare them, how to dive into their parametrizations to > figure out the relevant differences and to solve the problems. > > Because we know this, we will be sought after. However, we will be > working inside the industry that needs this knowlegde, and outside > academia or institutionalized weather centres. This is already starting to happen. However, what I continue to see is managers wanting/expecting an absolute answer be generated numerically, and they're paying less attention to the modelers' concerns about the "goodness" of the model in certain settings. As an example, for our evening news programs, we've someone purporting to be a meteorologist. Over the last 10 years, the proportion of folks actually trained in meteorology has grown significantly, and talking to them one-on-one, they tend to recognize the limitations of the models they present. Yet, rather than saying the temperature tomorrow will be in a range from 93-98 deg F (with apologies to our brothers across the Pond) they're generally required to say, "96F" because their managers believe the public requires an absolute number. Perhaps, in some industries where statistical analysis is more integral, we'll see appropriate use of the data... gerry -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From gerry.creager at tamu.edu Tue Jul 1 05:13:47 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <1214864562.6912.29.camel@Vigor13> References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <1214864562.6912.29.camel@Vigor13> Message-ID: <486A1F7B.9080408@tamu.edu> John Hearns wrote: > On Mon, 2008-06-30 at 20:20 +0200, Toon Moene wrote: > >> Since about a year, it's been clear to me that weather forecasting >> (i.e., running a more or less sophisticated atmospheric model to provide >> weather predictions) is going to be "mainstream" in the sense that every >> business that needs such forecasts for its operations can simply run >> them in-house. > > Garbage in, garbage out. > > By that I mean that the CPU horsepower may be more and more readily > affordable for businesses like that - let's say it is an ice-cream > wholesaler who would like to have a three day forecast to allow stocking > of their outlets with ice cream. > However, the models depend on input from sensor networks - not my area > of expertise, but I should imagine manned and unmanned weather stations, > ocean buoys to measure wave height, satellite sensors. 
> Do we see such data sources being made freely available, and in real > time (ie not archived data sets)?? In the US, at least for academic institutions and hobbyists, surface and upper air observations of the sort you describe are generally available for incorporation into models for data assimilation. Models are generally forced and bounded using model data from other atmospheric models, also available. As I understand it from colleagues in Europe, getting similar data over there is more problemmatical. > Hopefully on topic the Manchester Guardian newspaper (you all know me > now for a Guardian reader) is running a "Free Our Data" campaign - to > pressurise Government to make freely available GIS type data and census > data which the Government has. I'm personally unconvinced of the > overwhelming justification for (say) the Ordnance Survey to give all of > its mapping data away for free. > http://www.freeourdata.org.uk/ Last summer, in Paris, I had a discussion on this subject with the Ordinance Survey's chief cartographer. It is their intent to free the data save reasonable costs of reproduction/maintenance as soon as they can establish these. In the US, this is the norm. In Texas, where I live, there's a site with State basemap data, highly accurate roadway data, land-use/land-cover, census, etc. that's just an FTP call away, or, if you want to pay roughly $10 per DVD, they'll burn a copy for you (cost of personnel for reproduction of the DVD). Some states have deemed their data proprietary. A lot have locked their data down somewhat since 9/11, as our Department of Homeland Security has called for restricting access to Critical Infrastructure data. Note that the last listing of Critical Infrastructure for Texas listed some 268 pages of delineation, description and justification. I fear it's been updated/expanded since then. It included banks, cemeteries, schools, bridges, water and sewer plants, shopping malls, high-traffic motor-ways, refrigerated facilities, supermarkets, gas stations, bridges, power transformer and generation sites, power transmission lines, petroleum pipelines, and gas stations, to name a few. There was discussion of adding individual residences to the list. As you can see, restricting access to "critical infrastructure" could result in a blank map. -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From m.janssens at opencfd.co.uk Tue Jul 1 05:48:39 2008 From: m.janssens at opencfd.co.uk (Mattijs Janssens) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: Message-ID: <200807011348.39343.m.janssens@opencfd.co.uk> On Tuesday 01 July 2008 11:38, Jon Aquilina wrote: > does anyone know an altenative to openmosix?? would it be worth reviving > the development of the kernel? maybe http://www.kerrighed.org (and that is all I know about it) Regards, Mattijs From geoff at galitz.org Tue Jul 1 05:50:33 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: Message-ID: It seems that much of the effort that was going into openMOSIX is now going into KVM. http://kvm.qumranet.com/kvmwiki I think the idea is that MOSIX functionality is more easily developed and deployed in the form of virtual machines than directly at the kernel level. There are some trade-offs, of course... 
more overhead being chief among them but the virtualization model is clearly the overall favorite. It sure does beat the heck out of having to track each kernel individually. Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org _____ From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Jon Aquilina Sent: Dienstag, 1. Juli 2008 12:39 To: Beowulf Mailing List Subject: [Beowulf] open mosix alternative does anyone know an altenative to openmosix?? would it be worth reviving the development of the kernel? -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/3427056d/attachment.html From mark.kosmowski at gmail.com Tue Jul 1 05:51:54 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 53, Issue 1 In-Reply-To: <200807010728.m617S3Ub011226@bluewest.scyld.com> References: <200807010728.m617S3Ub011226@bluewest.scyld.com> Message-ID: At some point there a cost-benefit analysis needs to be performed. If my cluster at peak usage only uses 4 Gb RAM per CPU (I live in single-core land still and do not yet differentiate between CPU and core) and my nodes all have 16 Gb per CPU then I am wasting RAM resources and would be better off buying new machines and physically transferring the RAM to and from them or running more jobs each distributed across fewer CPUs. Or saving on my electricity bill and powering down some nodes. As heretical as this last sounds, I'm tempted to throw in the towel on my PhD studies because I can no longer afford the power to run my three node cluster at home. Energy costs may end up being the straw that breaks this camel's back. Mark E. Kosmowski > From: "Jon Aquilina" > > not sure if this applies to all kinds of senarios that clusters are used in > but isnt the more ram you have the better? > > On 6/30/08, Vincent Diepeveen wrote: > > > > Toon, > > > > Can you drop a line on how important RAM is for weather forecasting in > > latest type of calculations you're performing? > > > > Thanks, > > Vincent > > > > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > > > Jim Lux wrote: > >> > >> Yep. And for good reason. Even a big DoD job is still tiny in Nvidia's > >>> scale of operations. We face this all the time with NASA work. > >>> Semiconductor manufacturers have no real reason to produce special purpose > >>> or customized versions of their products for space use, because they can > >>> sell all they can make to the consumer market. More than once, I've had a > >>> phone call along the lines of this: > >>> "Jim: I'm interested in your new ABC321 part." > >>> "Rep: Great. I'll just send the NDA over and we can talk about it." > >>> "Jim: Great, you have my email and my fax # is..." > >>> "Rep: By the way, what sort of volume are you going to be using?" > >>> "Jim: Oh, 10-12.." > >>> "Rep: thousand per week, excellent..." > >>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe every > >>> year." > >>> "Rep: Oh..." > >>> {Well, to be fair, it's not that bad, they don't hang up on you.. > >>> > >> > >> Since about a year, it's been clear to me that weather forecasting (i.e., > >> running a more or less sophisticated atmospheric model to provide weather > >> predictions) is going to be "mainstream" in the sense that every business > >> that needs such forecasts for its operations can simply run them in-house. 
> >> > >> Case in point: I bought a $1100 HP box (the obvious target group being > >> teenage downloaders) which performs the HIRLAM limited area model *on the > >> grid that we used until October 2006* in December last year. > >> > >> It's about twice as slow as our then-operational 50-CPU Sun Fire 15K. > >> > >> I wonder what effect this will have on CPU developments ... > >> > >> -- > >> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 > >> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > >> At home: http://moene.indiv.nluug.nl/~toon/ > >> Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html > >> > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > -- > Jonathan Aquilina From mark.kosmowski at gmail.com Tue Jul 1 05:53:35 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Message-ID: And I forgot to change the subject. Apologies. On 7/1/08, Mark Kosmowski wrote: > At some point there a cost-benefit analysis needs to be performed. If > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > single-core land still and do not yet differentiate between CPU and > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > resources and would be better off buying new machines and physically > transferring the RAM to and from them or running more jobs each > distributed across fewer CPUs. Or saving on my electricity bill and > powering down some nodes. > > As heretical as this last sounds, I'm tempted to throw in the towel on > my PhD studies because I can no longer afford the power to run my > three node cluster at home. Energy costs may end up being the straw > that breaks this camel's back. > > Mark E. Kosmowski > > > From: "Jon Aquilina" > > > > > not sure if this applies to all kinds of senarios that clusters are used in > > but isnt the more ram you have the better? > > > > On 6/30/08, Vincent Diepeveen wrote: > > > > > > Toon, > > > > > > Can you drop a line on how important RAM is for weather forecasting in > > > latest type of calculations you're performing? > > > > > > Thanks, > > > Vincent > > > > > > > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > > > > > Jim Lux wrote: > > >> > > >> Yep. And for good reason. Even a big DoD job is still tiny in Nvidia's > > >>> scale of operations. We face this all the time with NASA work. > > >>> Semiconductor manufacturers have no real reason to produce special purpose > > >>> or customized versions of their products for space use, because they can > > >>> sell all they can make to the consumer market. More than once, I've had a > > >>> phone call along the lines of this: > > >>> "Jim: I'm interested in your new ABC321 part." > > >>> "Rep: Great. I'll just send the NDA over and we can talk about it." > > >>> "Jim: Great, you have my email and my fax # is..." > > >>> "Rep: By the way, what sort of volume are you going to be using?" > > >>> "Jim: Oh, 10-12.." > > >>> "Rep: thousand per week, excellent..." > > >>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe every > > >>> year." > > >>> "Rep: Oh..." > > >>> {Well, to be fair, it's not that bad, they don't hang up on you.. 
> > >>> > > >> > > >> Since about a year, it's been clear to me that weather forecasting (i.e., > > >> running a more or less sophisticated atmospheric model to provide weather > > >> predictions) is going to be "mainstream" in the sense that every business > > >> that needs such forecasts for its operations can simply run them in-house. > > >> > > >> Case in point: I bought a $1100 HP box (the obvious target group being > > >> teenage downloaders) which performs the HIRLAM limited area model *on the > > >> grid that we used until October 2006* in December last year. > > >> > > >> It's about twice as slow as our then-operational 50-CPU Sun Fire 15K. > > >> > > >> I wonder what effect this will have on CPU developments ... > > >> > > >> -- > > >> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 > > >> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > > >> At home: http://moene.indiv.nluug.nl/~toon/ > > >> Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html > > >> > > > > > > _______________________________________________ > > > Beowulf mailing list, Beowulf@beowulf.org > > > To change your subscription (digest mode or unsubscribe) visit > > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > -- > > Jonathan Aquilina > From geoff at galitz.org Tue Jul 1 05:54:26 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> Message-ID: <3CB66E9F377C4961B5457896137EAD1B@geoffPC> That is out of my field of expertise. Sounds like a question for professional digital artists. I can put you in touch some folks that most likely know the answer to your questions, if you like. Anybody know of any current approaches to this? Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org _____ From: Jon Aquilina [mailto:eagles051387@gmail.com] Sent: Dienstag, 1. Juli 2008 13:27 To: Geoff Galitz Cc: Beowulf Mailing List Subject: Re: [Beowulf] software for compatible with a cluster reason i am asking is because i would like to setup a rendering cluster and provide rendering services. does this also work for 3d animated movies that require rendering or does one need somethin entierly different for that? On 7/1/08, Geoff Galitz wrote: I know people who use Houdini for this: http://www.sidefx.com/index.php I cannot vouch for how well it works or what is involved, though. Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org _____ From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Jon Aquilina Sent: Dienstag, 1. Juli 2008 12:40 To: Beowulf Mailing List Subject: [Beowulf] software for compatible with a cluster does anyone know of any rendering software that will work with a cluster? -- Jonathan Aquilina -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/8044b6fb/attachment.html From ajt at rri.sari.ac.uk Tue Jul 1 06:14:38 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: Message-ID: <486A2DBE.10302@rri.sari.ac.uk> Jon Aquilina wrote: > does anyone know an altenative to openmosix?? would it be worth reviving > the development of the kernel? Hello, Jonathan. 
I'm still running openMosix (linux-2.4.26-om1) and I did have an attempt at porting it to the 2.4.32 kernel so I could use SATA disks, but I couldn't get process migration to work. My deb's for rebuilding the openMosix kernel under Ubuntu 6.06.1 LTS are at: http://bioinformatics.rri.sari.ac.uk/openmosix We are currently evaluating Kerrighed as an alternative: http://www.kerrighed.org Kerrighed also forms the basis of 'XtreemOS': http://www.xtreemos.eu/ Although Kerrighed looks very promising, it is also quite fragile in our hands. If one node crashes, you lose the entire cluster. That said, the Kerrighed project is extremely well supported and I believe it will be a good alternative in the near future. We will continue to run openMosix in the short-term, but I may evaluate MOSIX2: http://www.mosix.org/ I was, previously, opposed to Mosix on idealogical grounds and loyal to Moshe Bar but to be fair to Mosix is now free for non-profit use and the source code is available (but not GPL). Please let me know if you are seriously considering reviving openMosix! Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From landman at scalableinformatics.com Tue Jul 1 07:00:06 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 53, Issue 1 In-Reply-To: References: <200807010728.m617S3Ub011226@bluewest.scyld.com> Message-ID: <486A3866.7030302@scalableinformatics.com> Mark Kosmowski wrote: > At some point there a cost-benefit analysis needs to be performed. If > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > single-core land still and do not yet differentiate between CPU and > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > resources and would be better off buying new machines and physically > transferring the RAM to and from them or running more jobs each > distributed across fewer CPUs. Or saving on my electricity bill and > powering down some nodes. Possible, though if you do heavy IO even with single core chips, and you are running a 64 bit OS, the extra buffer cache is not to be rejected lightly. > > As heretical as this last sounds, I'm tempted to throw in the towel on > my PhD studies because I can no longer afford the power to run my > three node cluster at home. Energy costs may end up being the straw > that breaks this camel's back. Which country are you in? You may be able to apply for "free" computing resources. Tera-grid in the US, other similar resources. Mark Hahn might give you pointers for Canada, and the folks at Streamline/Clustervision/... might be able to give you pointers for UK/EU. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From eagles051387 at gmail.com Tue Jul 1 07:18:50 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <3CB66E9F377C4961B5457896137EAD1B@geoffPC> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> Message-ID: that would be greatly appreciated On 7/1/08, Geoff Galitz wrote: > > > > That is out of my field of expertise. 
Sounds like a question for > professional digital artists. I can put you in touch some folks that most > likely know the answer to your questions, if you like. > > > > Anybody know of any current approaches to this? > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > ------------------------------ > > *From:* Jon Aquilina [mailto:eagles051387@gmail.com] > *Sent:* Dienstag, 1. Juli 2008 13:27 > *To:* Geoff Galitz > *Cc:* Beowulf Mailing List > *Subject:* Re: [Beowulf] software for compatible with a cluster > > > > reason i am asking is because i would like to setup a rendering cluster and > provide rendering services. does this also work for 3d animated movies that > require rendering or does one need somethin entierly different for that? > > On 7/1/08, *Geoff Galitz* wrote: > > > > > > I know people who use Houdini for this: > > > > http://www.sidefx.com/index.php > > > > I cannot vouch for how well it works or what is involved, though. > > > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > ------------------------------ > > *From:* beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] *On > Behalf Of *Jon Aquilina > *Sent:* Dienstag, 1. Juli 2008 12:40 > *To:* Beowulf Mailing List > *Subject:* [Beowulf] software for compatible with a cluster > > > > does anyone know of any rendering software that will work with a cluster? > > -- > Jonathan Aquilina > > > > > -- > Jonathan Aquilina > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/00aa8d76/attachment.html From vanallsburg at hope.edu Tue Jul 1 07:43:34 2008 From: vanallsburg at hope.edu (Paul Van Allsburg) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> Message-ID: <486A4296.4050501@hope.edu> I'd like to do the same, as a project for a group of students... Please keep me in the loop? Thanks! Paul -- Paul Van Allsburg Computational Science & Modeling Facilitator Natural Sciences Division, Hope College 35 East 12th Street Holland, Michigan 49423 616-395-7292 http://www.hope.edu/academic/csm/ Jon Aquilina wrote: > that would be greatly appreciated > > On 7/1/08, *Geoff Galitz* > > wrote: > > > > That is out of my field of expertise. Sounds like a question for > professional digital artists. I can put you in touch some folks > that most likely know the answer to your questions, if you like. > > > > Anybody know of any current approaches to this? > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > > * From: * Jon Aquilina [mailto:eagles051387@gmail.com > ] > *Sent:* Dienstag, 1. Juli 2008 13:27 > *To:* Geoff Galitz > *Cc:* Beowulf Mailing List > *Subject:* Re: [Beowulf] software for compatible with a cluster > > > > reason i am asking is because i would like to setup a rendering > cluster and provide rendering services. does this also work for 3d > animated movies that require rendering or does one need somethin > entierly different for that? > > On 7/1/08, *Geoff Galitz* > wrote: > > > > > > I know people who use Houdini for this: > > > > http://www.sidefx.com/index.php > > > > I cannot vouch for how well it works or what is involved, though. 
> > > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > > * From: * beowulf-bounces@beowulf.org > > [mailto:beowulf-bounces@beowulf.org > ] *On Behalf Of *Jon Aquilina > *Sent:* Dienstag, 1. Juli 2008 12:40 > *To:* Beowulf Mailing List > *Subject:* [Beowulf] software for compatible with a cluster > > > > does anyone know of any rendering software that will work with a > cluster? > > -- > Jonathan Aquilina > > > > > -- > Jonathan Aquilina > > > > > -- > Jonathan Aquilina From perry at piermont.com Tue Jul 1 07:44:48 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:19 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <48694432.4020608@scalableinformatics.com> (Joe Landman's message of "Mon\, 30 Jun 2008 16\:38\:10 -0400") References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <48693DCA.3010903@tamu.edu> <48694432.4020608@scalableinformatics.com> Message-ID: <87d4lxk6jj.fsf@snark.cb.piermont.com> Joe Landman writes: > I see a curious phenomenon going on in crash simulation and NVH. We > see an increasing "decoupling" if you will, between the detailed > issues of simulation and coding, and the end user using the simulation > system. That is, the users may know the engineering side, but don't > seem to grasp the finer aspects of the simulation ... what to take as > reasonably accurate, and what to grasp might not be. > > I don't see this in chemistry, in large part due to many of the users > also writing their own software. On the contrary. I know computational chemistry specialists who worry about users of the common commercial software (Gaussian, Jaguar, etc.) not knowing what to believe and what not to believe in the output. Since I've seen people in synthetic organic labs running the simulation software to design possible synthetic pathways without understanding the software, I think this worry is perfectly valid. The overwhelming majority of users are not computational chemists at all -- they're ordinary organic chemists, and they don't have a good gut feel for what the limitations of the tools are. I know of very few users of computational chemistry software who roll their own. Try reading the computational chemistry mailing lists for a little while, or reading the journals, and you'll get a feel for what the average user is like. There might be a lot of people writing software out there, but there are vastly more who just want to get answers and don't understand how the programs work at all. Perry -- Perry E. Metzger perry@piermont.com From perry at piermont.com Tue Jul 1 07:53:06 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> (Henning Fehrmann's message of "Tue\, 1 Jul 2008 11\:36\:43 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> Message-ID: <878wwlk65p.fsf@snark.cb.piermont.com> Henning Fehrmann writes: > we need to automount NFS directories on high ports to increase the > number of possible mounts. Currently, we are limited up to ca 360 mounts. 
A TCP socket is a 4-tuple of localhost:localport:remotehost:remoteport A given localhost:localport pair can speak to an unlimted array of remotehost:remoteport sets. For example, in theory, your SMTP port can get connections from up to 2^32 different hosts on each of 2^16 different sockets from each, for a total space of 2^48 connections to a single local socket number. This in no way restricts how many connections can come in to another port, either, because a given socket is again the full 4-tuple -- if you have an SSH port, it too can get 2^48 connections. Now, there is this (odd) convention that only root can open a socket below 1024, so hosts "trust" (what a bad idea) sockets under that number. You can still, however, get up to 1023 connections from any given remote host to a given local host's port. Thus, your problem sounds rather odd. There is no obvious reason you should be limited to 360 connections. Perhaps your problem is not what you think it is at all. Could you explain it in more detail? -- Perry E. Metzger perry@piermont.com From ajt at rri.sari.ac.uk Tue Jul 1 08:31:48 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: Message-ID: <486A4DE4.1090807@rri.sari.ac.uk> Geoff Galitz wrote: > [...] > I think the idea is that MOSIX functionality is more easily developed > and deployed in the form of virtual machines than directly at the kernel > level. There are some trade-offs, of course... more overhead being > chief among them but the virtualization model is clearly the overall > favorite. It sure does beat the heck out of having to track each kernel > individually. Hello, Geoff. MOSIX functionality is mainly about load-balancing between independent kernels, and avoiding severe memory depletion by migrating processes between kernels. In fact (open)MOSIX implements an SMP-like model, but with a high-latency interconect (usually GBit ethernet). There is no need to 'track' kernels, because the oM HPC extension does it for you. The principle objective of SSI computing is to use many small machines as if they are one big one. This is the opposite of virtualisation which uses one (or a few) BIG machines like a lot of small ones. It does this by virtually separating the kernels. There is some confusion about this because it *is* very convenient to teach about or develop and test SSI software on virtual compute nodes if you don't have a lot of real nodes, but it defeats the purpose of SSI to use this approach in production. You might be interested to know that one reason Moshe Bar gave when he announced the end of the openMosix project was that SMP is now so cheap that SSI clustering less of a factor in computing: http://sourceforge.net/forum/forum.php?forum_id=715406 I'm not sure I agree - I still find openMosix useful, and I'll continue using it on our Beowulf here until I find a better alternative. Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From tjrc at sanger.ac.uk Tue Jul 1 08:40:27 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <878wwlk65p.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> Message-ID: On 1 Jul 2008, at 3:53 pm, Perry E. 
Metzger wrote: > > Henning Fehrmann writes: >> we need to automount NFS directories on high ports to increase the >> number of possible mounts. Currently, we are limited up to ca 360 >> mounts. > > A TCP socket is a 4-tuple of localhost:localport:remotehost:remoteport > > A given localhost:localport pair can speak to an unlimted array of > remotehost:remoteport sets. For example, in theory, your SMTP port can > get connections from up to 2^32 different hosts on each of 2^16 > different sockets from each, for a total space of 2^48 connections to > a single local socket number. This in no way restricts how many > connections can come in to another port, either, because a given > socket is again the full 4-tuple -- if you have an SSH port, it too > can get 2^48 connections. > > Now, there is this (odd) convention that only root can open a socket > below 1024, so hosts "trust" (what a bad idea) sockets under that > number. You can still, however, get up to 1023 connections from any > given remote host to a given local host's port. > > Thus, your problem sounds rather odd. There is no obvious reason you > should be limited to 360 connections. > > Perhaps your problem is not what you think it is at all. Could you > explain it in more detail? Certainly on my systems where I use the am-utils automounter, I find the limit on the number of simultaneously mounted filesystems is more in the region of 1500. I've been desperately trying to reduce the number of NFS filesystems we have though. Currently our automount map has about 600 entries, I think. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From perry at piermont.com Tue Jul 1 08:48:47 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] automount on high ports In-Reply-To: (Tim Cutts's message of "Tue\, 1 Jul 2008 16\:40\:27 +0100") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> Message-ID: <87od5hip0g.fsf@snark.cb.piermont.com> Tim Cutts writes: > Certainly on my systems where I use the am-utils automounter, I find > the limit on the number of simultaneously mounted filesystems is more > in the region of 1500. And that's doubtless not from TCP port issues but because of other kinds of resources being limited. > I've been desperately trying to reduce the number of NFS filesystems > we have though. Currently our automount map has about 600 entries, > I think. Sometimes that's reasonable. I've seen large sites where everyone has a workstation in front of them and all of the thousands of users get their home dir automounted when they sit in front of a box and log in. However, one notes that in such a situation, the automount maps have thousands or tens of thousands of entries, but any given machine generally only is mounting a few file systems. -- Perry E. Metzger perry@piermont.com From kilian at stanford.edu Tue Jul 1 08:49:42 2008 From: kilian at stanford.edu (Kilian CAVALOTTI) Date: Wed Nov 25 01:07:19 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: Message-ID: <200807010849.42415.kilian@stanford.edu> Hi Jon, On Tuesday 01 July 2008 03:38:52 am Jon Aquilina wrote: > does anyone know an altenative to openmosix?? 
You may want to check out OpenSSI: http://www.openssi.org As its name says, that's a SSI clustering solution, with unified process namespace, full process migration, load-balancing, single root filesystem, etc. A complete list of features is available at: http://wiki.openssi.org/go/Features Cheers, -- Kilian From henning.fehrmann at aei.mpg.de Tue Jul 1 09:47:47 2008 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <878wwlk65p.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> Message-ID: <20080701164747.GA15901@gretchen.aei.uni-hannover.de> On Tue, Jul 01, 2008 at 10:53:06AM -0400, Perry E. Metzger wrote: > > Henning Fehrmann writes: > > we need to automount NFS directories on high ports to increase the > > number of possible mounts. Currently, we are limited up to ca 360 mounts. > > > Thus, your problem sounds rather odd. There is no obvious reason you > should be limited to 360 connections. > > Perhaps your problem is not what you think it is at all. Could you > explain it in more detail? I guess it has also something to do with the automounter. I am not able to increase this number. But even if the automounter would handle more we need to be able to use higher ports: netstat shows always ports below 1024. tcp 0 0 client:941 server:nfs We need to mount up to 1400 nfs exports. Cheers Henning From hahn at mcmaster.ca Tue Jul 1 09:51:32 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> Message-ID: >> We have our own stack which we stick on top of the customers favourite >> red hat clone. Usually Scientific Linux. > > does it necessarily have to be a redhat clone. can it also be a debian based > clone? but why? is there some concrete advantage to using Debian? I've never understood why Debian users tend to be very True Believer, or what it is that hooks them. From prentice at ias.edu Tue Jul 1 10:20:32 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> Message-ID: <486A6760.5010006@ias.edu> Mark Hahn wrote: >>> We have our own stack which we stick on top of the customers favourite >>> red hat clone. Usually Scientific Linux. >> >> does it necessarily have to be a redhat clone. can it also be a debian >> based >> clone? > > but why? is there some concrete advantage to using Debian? > I've never understood why Debian users tend to be very True Believer, > or what it is that hooks them. And the Debian users can say the same thing about Red Hat users. Or SUSE users. And if any still exist, the Slackware users could say the same thing about the both of them. But then the Slackware users could also point out that the first Linux distro was Slackware, so they are using the one true Linux distro... If you want to have a religious war about which distro to use, go somewhere else. I'm sure there are plenty of mailing lists and newsgroups where I'm sure that happens every day. 
This is a mailing list about beowulf clusters, and the last time I checked, you can create clusters using any Linux distribution you like, or even non-Linux operating systems, such as IRIX, Solaris, etc. -- Prentice From landman at scalableinformatics.com Tue Jul 1 10:46:01 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <486A6760.5010006@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: <486A6D59.7020704@scalableinformatics.com> Prentice Bisbal wrote: > Mark Hahn wrote: [...] > If you want to have a religious war about which distro to use, go > somewhere else. I'm sure there are plenty of mailing lists and > newsgroups where I'm sure that happens every day. Hmmm.... for me, its all about the kernel. Thats 90+% of the battle. Some distros use good kernels, some do not. I won't mention who I think is in the latter category. FWIW: we tend to build systems and place our own kernel on them. Basically we want them to work, and not be surprised by bad things, like crashes due to 4k stacks or backported (mis)features. We also want them to have updated drivers, and NFS/file system bits. > This is a mailing list about beowulf clusters, and the last time I > checked, you can create clusters using any Linux distribution you like, > or even non-Linux operating systems, such as IRIX, Solaris, etc. With all due respect, I think Mark knows what this list is about. There are lots of folks out there using Fedora, RHEL, Ubuntu, Debian, SuSE, ... We generally don't care which distro is used. Only that the kernel is reasonable, stable under load, and supports updated file systems/network capability. Beowulf depends upon good kernels at the end of the day. You need high performance and stability throughout. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From thpierce at gmail.com Tue Jul 1 05:07:22 2008 From: thpierce at gmail.com (Tom Pierce) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] June New York/Jersey HPC users meeting Message-ID: <25e9e5ad0807010507s74ea33e7p42abeff3d275b5a2@mail.gmail.com> Dear Dan, First, you missed a enjoyable meeting with lively discussion, good pub food and beer. I hope we meet there again in July. I attended most of the meeting. My memory summarized it: Sun Grid Engine users were the majority at the meeting ( 60% SGE users, and 40% Torqur/Maui users) The installations of the two systems are different experiences. With SGE, you are about "half-done" after you install the system. The installation of Torque/Maui is more functional right out of the box. Both seem to have similar functionality when setup. SGE has Sun developers actively working on it, so the newest versions have more options. eg a Flexlm link for license management. Torque/Maui is open source, and has not been modified as often as SGE has. Altho cpusets, similar to SGE cpusets, have recently been added. Torque/Maui has commercial upgrades to Torque/Moab for large sites, or people who want paid support. (and Moab supports Flexlm license management). There seem to be more installations of Torque/Maui than there are of SGE, but that was just a discussion of perceptions. 
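To make that comparison concrete, here is a minimal pair of submission scripts for the same hypothetical MPI job, one in Torque/PBS syntax and one in SGE syntax. The job name, resource requests and the SGE parallel environment name ("mpi") are placeholders that vary from site to site, so treat this as a sketch of the two dialects rather than anything presented at the meeting.

#!/bin/bash
#PBS -N hello_mpi
#PBS -l nodes=2:ppn=4
#PBS -l walltime=00:10:00
# Torque/PBS starts the job in $HOME; move to the submit directory
cd $PBS_O_WORKDIR
mpiexec ./hello_mpi

#!/bin/bash
#$ -N hello_mpi
#$ -pe mpi 8                # parallel environment name is site-specific
#$ -l h_rt=00:10:00
#$ -cwd                     # run in the submission directory
# SGE exports the granted slot count in $NSLOTS
mpiexec -n $NSLOTS ./hello_mpi

The directive syntax is close enough that porting scripts between the two is usually a matter of translating the #PBS/#$ lines and the environment variable names.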
However, the history of PBS, up through Torque, means that there are a great many PBS scripts on the internet for job submissions of HPC applications. The discussion of MPI interfaces was ongoing. Neither system seemed to have an advantage. Torque has the OSC mpiexec script and SGE has some builtin hooks for MPI. The discussions mentioned Openmpi, LAM, MPICH, GM and no obvious resolution that one system was more functional or easier than the other for MPI codes. At the end, I would call it a "draw". Torque/Maui easier to setup and lots of examples vs SGE flexibility and Flexlm license mgt. Tom Daniel.Roberts@sanofi-aventis.com wrote: Anyone have minutes or conclusions to offer from this scheduler smack down? Thanks Dan -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] If you live or work in the New York/North Jersey Metropolitan area, mark your calender for this Thursday, June 19th. The NYCA-HUG (New York City Area HPC Users Group) will be trying to answer the ultimate question Torque or Sun Grid Engine? We will be discussing the pros/cons of each scheduler for HPC clusters. Come and add your experiences, wants, and rants. Then you decide. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/768b214b/attachment.html From merc4krugger at gmail.com Tue Jul 1 06:27:44 2008 From: merc4krugger at gmail.com (Krugger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> Message-ID: Hi, Am I understanding it correctly? You want to have more than 360 mounts in a single NFS client? And you want that client to be run on a non-privileged port? What you are doing doesn't make much sense to me, but you can try adding the option "lockd.udpport=32768 lockd.tcpport=32768" to your kernel flags so that the kernel puts the daemon lockd that handles NFS locks at the port you selected in the client side. I don't understand how changing the port will help you get more mounts in. I would actually suggest you review the maximum allowed filehandles for each process. You will also need and start services manually, something like: statd -p 32765 -o 32766 mountd -p 32767 If you use modules you need to reconfigure you modules with "options lockd nlm_udpport=32768 nlm_tcpport=32768" to your /etc/modules.conf If I am misunderstanding and you are having a maximum of 360 clients for your NFS server, then maybe you are having a network problem, because with NFS3 your clients will lose connection to the server when de UDP starts losing packets due to heavy I/O from the calculations if both happen on the same network. Maybe NFS v4 might help with TCP connections or/and some sort of shaping to make sure there is enough bandwith reservered for NFS to operate properly. Notice that all have differant ports 32765,32766,32767,32768 Krugger On Tue, Jul 1, 2008 at 10:36 AM, Henning Fehrmann wrote: > Hello, > > we need to automount NFS directories on high ports to increase the number of possible mounts. > Currently, we are limited up to ca 360 mounts. > > The NFS-server exports with the option 'insecure' but the mounts still end up on ports <1024 on the client side. > > Is there a way to enable automounts on higher ports? How can it be done manually: > mount -t nfs -o ....? > > We are using autofs version 5. 
> > Thank you, > Henning > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From vernard at venger.net Tue Jul 1 08:19:15 2008 From: vernard at venger.net (Vernard Martin) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: Message-ID: <486A4AF3.9040108@venger.net> Jon Aquilina wrote: > does anyone know of any rendering software that will work with a cluster? The Big Daddy of them all, Pixar's RenderMan Pro Server is supported under Linux and is used by nearly everybody in Hollywood that does graphic rendering for movies. It ain't cheap but its pretty much the best there Check out https://renderman.pixar.com/products/techspecs/index.htm for more info. From gregory.warnes at rochester.edu Tue Jul 1 09:39:38 2008 From: gregory.warnes at rochester.edu (Gregory Warnes) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <200807010849.42415.kilian@stanford.edu> Message-ID: Or, of course, the original Mosix project. Ammon Barak is very amiable and willing to work with folks. http://www.mosix.org -Greg On 7/1/08 11:49AM , "Kilian CAVALOTTI" wrote: > Hi Jon, > > On Tuesday 01 July 2008 03:38:52 am Jon Aquilina wrote: >> > does anyone know an altenative to openmosix?? > > You may want to check out OpenSSI: http://www.openssi.org > > As its name says, that's a SSI clustering solution, with unified process > namespace, full process migration, load-balancing, single root > filesystem, etc. A complete list of features is available at: > http://wiki.openssi.org/go/Features > > Cheers, > -- > Kilian > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Gregory R. Warnes, Ph.D Program Director Center for Computational Arts, Sciences, and Engineering University of Rochester Tel: 585-273-2794 Fax: 585-276-2097 Email: gregory.warnes@rochester.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/6b37be64/attachment.html From landman at scalableinformatics.com Tue Jul 1 11:06:34 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: Message-ID: <486A722A.3000405@scalableinformatics.com> Hi Job Jon Aquilina wrote: > does anyone know an altenative to openmosix?? would it be worth reviving > the development of the kernel? OpenMOSIX was all about process migration between different independent OSes. You can still get some of that with Scyld, with OpenSSI, and a few others. If you prefer more of an SMP model (simpler programming), you should look at ScaleMP DSMs. Some on this list argue the shared memory programming is not easier than distributed memory programming, though I am not one of them who makes this argument. It has different challenges, costs and benefits than MPI. It has different limitations. Not so surprisingly, with the advent of many-core units, shared memory programming techniques are needed to get good performance within a single system. Disclosure: We are looking at these units for some of our work. 
Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From smulcahy at aplpi.com Tue Jul 1 11:11:23 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <486A6D59.7020704@scalableinformatics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486A6D59.7020704@scalableinformatics.com> Message-ID: <486A734B.3000701@aplpi.com> Joe Landman wrote: > Hmmm.... for me, its all about the kernel. Thats 90+% of the battle. > Some distros use good kernels, some do not. I won't mention who I think > is in the latter category. > .. > We generally don't care which distro is used. Only that the kernel is > reasonable, stable under load, and supports updated file systems/network > capability. This information would be most interesting to me and surely others on the list .. can you talk about of the distributions that provide "good kernels" if not about the others (and hey, theres hundreds of Linux distributions out there - http://lwn.net/Distributions/ so we couldn't infer the bad ones from your omissions ;) -stephen -- Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) From geoff at galitz.org Tue Jul 1 12:05:54 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] Re: "hobbyists" In-Reply-To: <48693A89.3080605@moene.indiv.nluug.nl> References: <485920D8.2030309@ias.edu> <6.2.5.6.2.20080618164843.02b1bd30@jpl.nasa.gov> <200806190945.21604.kilian@stanford.edu><485A9520.2080508@scalableinformatics.com> <48693A89.3080605@moene.indiv.nluug.nl> Message-ID: <128FF5A06DBD4D74B8AA8CB6E4EF4B1F@geoffPC> Ohh... I was just waiting for the conversation to back to this. For an inside perspective: http://www.spiegel.de/international/europe/0,1518,562315,00.html Does that make me on-topic? -geoff Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Toon Moene Sent: Montag, 30. Juni 2008 21:57 To: Joe Landman Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Re: "hobbyists" Joe Landman wrote: > Tactical nukes (aimed at armies) were on the table for a few of the NATO > scenarios involving responses to Soviet invasion of western Europe > (based upon some of the historical reading, though I am not sure how > serious they were). The western Europeans were understandably > un-enthusiastic about such scenarios. You bet we were. I was in the organization of the 400,000+ protest in Amsterdam in November. 1981. Cannon-fodder at a high level ... 
-- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jan.heichler at gmx.net Tue Jul 1 12:08:32 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> Message-ID: <66506789.20080701210832@gmx.net> An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/f49a393d/attachment.html From jan.heichler at gmx.net Tue Jul 1 12:09:08 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> Message-ID: <62974595.20080701210908@gmx.net> Hallo Dan, Dienstag, 1. Juli 2008, meintest Du: >>Hi Jon, >>We have our own stack which we stick on top of the customers favourite >>red hat clone. Usually Scientific Linux. >>Here is a bit more about it. >>http://www.clustervision.com/products_os.php >>We sell as a standalone product and it does quite well. I could even >>go so far to say that it is 'stack of choice' in many European >>institutions. DKqc> Every throught of getting a job in Sales and Marketing? :-) What makes you think that he hasn't that kind of job? ;-) @Andy: SCNR Regards Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/74653d4c/attachment.html From hahn at mcmaster.ca Tue Jul 1 12:19:24 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <486A6760.5010006@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: >>> does it necessarily have to be a redhat clone. can it also be a debian >>> based >>> clone? >> >> but why? is there some concrete advantage to using Debian? >> I've never understood why Debian users tend to be very True Believer, >> or what it is that hooks them. > > And the Debian users can say the same thing about Red Hat users. Or SUSE very nice! an excellent parody of the True Believer response. but I ask again: what are the reasons one might prefer using debian? really, I'm not criticizing it - I really would like to know why it would matter whether someone (such as ClusterVisionOS (tm)) would use debian or another distro. From matt at technoronin.com Tue Jul 1 12:30:05 2008 From: matt at technoronin.com (Matt Lawrence) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 53, Issue 1 In-Reply-To: References: <200807010728.m617S3Ub011226@bluewest.scyld.com> Message-ID: On Tue, 1 Jul 2008, Mark Kosmowski wrote: > As heretical as this last sounds, I'm tempted to throw in the towel on > my PhD studies because I can no longer afford the power to run my > three node cluster at home. 
Energy costs may end up being the straw > that breaks this camel's back. Perhaps you should consider getting time on someone else's cluster. For something that only requires three nodes, there should be quite a number of places to run. -- Matt It's not what I know that counts. It's what I can remember in time to use. From hahn at mcmaster.ca Tue Jul 1 12:25:48 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <486A6D59.7020704@scalableinformatics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486A6D59.7020704@scalableinformatics.com> Message-ID: > Hmmm.... for me, its all about the kernel. Thats 90+% of the battle. Some > distros use good kernels, some do not. I won't mention who I think is in the > latter category. I was hoping for some discussion of concrete issues. for instance, I have the impression debian uses something other than sysvinit - does that work out well? is it a problem getting commercial packages (pathscale/pgi/intel compilers, gaussian, etc) to run? the couple debian people I know tend to have more ideological motives (which I do NOT impugn, except that I am personally more swayed by practical, concrete reasons.) From landman at scalableinformatics.com Tue Jul 1 12:53:23 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486A6D59.7020704@scalableinformatics.com> Message-ID: <486A8B33.7020600@scalableinformatics.com> Mark Hahn wrote: >> Hmmm.... for me, its all about the kernel. Thats 90+% of the battle. >> Some distros use good kernels, some do not. I won't mention who I >> think is in the latter category. > > I was hoping for some discussion of concrete issues. for instance, > I have the impression debian uses something other than sysvinit - does > that work out well? is it a problem getting commercial packages > (pathscale/pgi/intel compilers, gaussian, etc) to run? Hi Mark: We have multiple Ubuntu servers up, and thus far, no major problems ... just a few "translational" gotchas. We have successfully run pgi, intel, gaussian, gamess, ... on our Ubuntu units as well as our RHEL/Centos, Fedora, ... > > the couple debian people I know tend to have more ideological motives Yeah ... can't escape this. I like some of the elements of Ubuntu/Debian better than I do RHEL (the network configuration in Debian is IMO sane, while in RHEL/Centos/SuSE it is not). There are some aspects that are worse (no /etc/profile.d ... so I add that back in by hand ). > (which I do NOT impugn, except that I am personally more swayed by > practical, concrete reasons.) Building and deploying updated/correct kernels with Ubuntu/Debian is far easier (the build is much easier/saner) than with SuSE, RHEL, ... From a pragmatic view, this is what why we have a slight preference for that. 
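For anyone who has not tried the Debian/Ubuntu route, the workflow being described is roughly the one below, using the kernel-package tools. This is only a sketch: the config source, version suffix and resulting package name are assumptions rather than a tested recipe.

   # inside an unpacked kernel source tree, with kernel-package installed
   cp /boot/config-$(uname -r) .config        # start from the running config
   make oldconfig
   make-kpkg clean
   fakeroot make-kpkg --initrd --append-to-version=-custom kernel_image
   # yields something like ../linux-image-2.6.x-custom_*.deb, which can then
   # be pushed to every node with dpkg -i (or served from a local repository)

The attraction is that the custom kernel ends up managed by the package system like everything else on the nodes.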
Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From lindahl at pbm.com Tue Jul 1 13:01:23 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <486A722A.3000405@scalableinformatics.com> References: <486A722A.3000405@scalableinformatics.com> Message-ID: <20080701200122.GA23583@bx9.net> On Tue, Jul 01, 2008 at 02:06:34PM -0400, Joe Landman wrote: > If you prefer more of an SMP model (simpler programming), you should > look at ScaleMP DSMs. Some on this list argue the shared memory > programming is not easier than distributed memory programming, Gee, and I thought the biggest argument about ScaleMP was that the previous 50 times the same thing was attempted, it had low performance. I'd love to see some benchmarks (other than Stream). So if you do look at it, please share. -- greg From asabigue at fing.edu.uy Tue Jul 1 13:14:26 2008 From: asabigue at fing.edu.uy (ariel sabiguero yawelak) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] Re: Beowulf Digest, Vol 53, Issue 1 In-Reply-To: References: <200807010728.m617S3Ub011226@bluewest.scyld.com> Message-ID: <486A9022.5070109@fing.edu.uy> Well Mark, don't give up! I am not sure which one is your application domain, but if you require 24x7 computation, then you should not be hosting that at home. On the other hand, if you are not doing real computation and you just have a testbed at home, maybe for debugging your parallel applications or something similar, you might be interested in a virtualized solution. Several years ago, I used to "debug" some neural networks at home, but training sessions (up to two weeks of training) happened at the university. I would suggest to do something like that. You can always scale-down your problem in several phases and save the complete data-set / problem for THE RUN. You are not being a heretic there, but suffering energy costs ;-) In more places that you may believe, useful computing nodes are being replaced just because of energy costs. Even in some application domains you can even loose computational power if you move from 4 nodes into a single quad-core (i.e. memory bandwidth problems). I know it is very nice to be able to do everything at home.. but maybe before dropping your studies or working overtime to pay the electricity bill, you might want to reconsider the fact of collapsing your phisical deploy into a single virtualized cluster. (or just dispatch several threads/processes in a single system). If you collapse into a single system you have only 1 mainboard, one HDD, one power source, one processor (physically speaking), .... and you can achieve almost the performance of 4 systems in one, consuming the power of.... well maybe even less than a single one. I don't want to go into discussions about performance gain/loose due to the variation of the hardware architecture. Invest some bucks (if you haven't done that yet) in a good power source. Efficiency of OEM unbranded power sources is realy pathetic. may be 45-50% efficiency, while a good power source might be 75-80% efficient. Use the energy for computing, not for heating your house. What I mean is that you could consider just collapsing a complete "small" cluster into single system. 
If your application is CPU-bound and not I/O bound, VMware Server could be an option, as it is free software (unfortunately not open, even tough some patches can be done on the drivers). I think it is not possible to publish benchmarking data about VMware, but I can tell you that in long timescales, the performance you get in the host OS is similar than the one of the guest OS. There are a lot of problems related to jitter, from crazy clocks to delays, but if your application is not sensitive to that, then you are Ok. Maybe this is not a solution, but you can provide more information regarding your problem before quitting... my 2 cents.... ariel Mark Kosmowski escribi?: > At some point there a cost-benefit analysis needs to be performed. If > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > single-core land still and do not yet differentiate between CPU and > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > resources and would be better off buying new machines and physically > transferring the RAM to and from them or running more jobs each > distributed across fewer CPUs. Or saving on my electricity bill and > powering down some nodes. > > As heretical as this last sounds, I'm tempted to throw in the towel on > my PhD studies because I can no longer afford the power to run my > three node cluster at home. Energy costs may end up being the straw > that breaks this camel's back. > > Mark E. Kosmowski > > >> From: "Jon Aquilina" >> > > >> not sure if this applies to all kinds of senarios that clusters are used in >> but isnt the more ram you have the better? >> >> On 6/30/08, Vincent Diepeveen wrote: >> >>> Toon, >>> >>> Can you drop a line on how important RAM is for weather forecasting in >>> latest type of calculations you're performing? >>> >>> Thanks, >>> Vincent >>> >>> >>> On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: >>> >>> Jim Lux wrote: >>> >>>> Yep. And for good reason. Even a big DoD job is still tiny in Nvidia's >>>> >>>>> scale of operations. We face this all the time with NASA work. >>>>> Semiconductor manufacturers have no real reason to produce special purpose >>>>> or customized versions of their products for space use, because they can >>>>> sell all they can make to the consumer market. More than once, I've had a >>>>> phone call along the lines of this: >>>>> "Jim: I'm interested in your new ABC321 part." >>>>> "Rep: Great. I'll just send the NDA over and we can talk about it." >>>>> "Jim: Great, you have my email and my fax # is..." >>>>> "Rep: By the way, what sort of volume are you going to be using?" >>>>> "Jim: Oh, 10-12.." >>>>> "Rep: thousand per week, excellent..." >>>>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe every >>>>> year." >>>>> "Rep: Oh..." >>>>> {Well, to be fair, it's not that bad, they don't hang up on you.. >>>>> >>>>> >>>> Since about a year, it's been clear to me that weather forecasting (i.e., >>>> running a more or less sophisticated atmospheric model to provide weather >>>> predictions) is going to be "mainstream" in the sense that every business >>>> that needs such forecasts for its operations can simply run them in-house. >>>> >>>> Case in point: I bought a $1100 HP box (the obvious target group being >>>> teenage downloaders) which performs the HIRLAM limited area model *on the >>>> grid that we used until October 2006* in December last year. >>>> >>>> It's about twice as slow as our then-operational 50-CPU Sun Fire 15K. 
>>>> >>>> I wonder what effect this will have on CPU developments ... >>>> >>>> -- >>>> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 >>>> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands >>>> At home: http://moene.indiv.nluug.nl/~toon/ >>>> Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html >>>> >>>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> >>> >> >> -- >> Jonathan Aquilina >> > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > From perry at piermont.com Tue Jul 1 13:21:55 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080701164747.GA15901@gretchen.aei.uni-hannover.de> (Henning Fehrmann's message of "Tue\, 1 Jul 2008 18\:47\:47 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> Message-ID: <87fxqtuzh8.fsf@snark.cb.piermont.com> Henning Fehrmann writes: >> Thus, your problem sounds rather odd. There is no obvious reason you >> should be limited to 360 connections. >> >> Perhaps your problem is not what you think it is at all. Could you >> explain it in more detail? > > I guess it has also something to do with the automounter. I am not able > to increase this number. > But even if the automounter would handle more we need to be able to > use higher ports: > netstat shows always ports below 1024. > > tcp 0 0 client:941 server:nfs > > We need to mount up to 1400 nfs exports. All NFS clients are connecting to a single port, not to a different port for every NFS export. You do not need 1400 listening TCP ports on a server to export 1400 different file systems. Only one port is needed, whether you are exporting one file system or one million, just as only one SMTP port is needed whether you are receiving mail from one client or from one million. The clients are connecting from ports below 1024 because Berkeley set up a hack in the original BSD stack so that only root could open ports below 1024. This way, you could "know" the process on the remote host was a root process, thus you could feel "secure" [sic]. It doesn't add any real security any more, but it is also not the cause of any problem you are experiencing. We can help you figure this out, but you will have to give a lot more detail about the problem. Please describe your network setup. How many servers do you have? How many clients? How many file systems are those servers exporting? How many is a typical client mounting, and why? Start there and we can try to move forward. -- Perry E. 
Metzger perry@piermont.com From landman at scalableinformatics.com Tue Jul 1 13:24:04 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <20080701200122.GA23583@bx9.net> References: <486A722A.3000405@scalableinformatics.com> <20080701200122.GA23583@bx9.net> Message-ID: <486A9264.5090902@scalableinformatics.com> Greg Lindahl wrote: > On Tue, Jul 01, 2008 at 02:06:34PM -0400, Joe Landman wrote: > >> If you prefer more of an SMP model (simpler programming), you should >> look at ScaleMP DSMs. Some on this list argue the shared memory >> programming is not easier than distributed memory programming, > > Gee, and I thought the biggest argument about ScaleMP was that the > previous 50 times the same thing was attempted, it had low > performance. The researchy DSMs had low performance. That is known. This one seems not to be bad over good IB nets. You always have latency. Can't escape that. > I'd love to see some benchmarks (other than Stream). So if you do look > at it, please share. If you are serious about this, I'll bug Shai as to what is shareable. He does have benchmarks. The ones I have seen (real applications, not microbenchmarks), looked pretty good. Which is why we are looking at them. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From andrew at moonet.co.uk Tue Jul 1 13:35:29 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <66506789.20080701210832@gmx.net> Message-ID: > does it necessarily have to be a redhat clone. can it also be a debian based > clone? Not at all, If there were demand or a customer with enough cash to throw at the job then we would of course accommodate his every need. Considering that it is taking several rather expensive developers quite a long time to push out the latest incarnation, ClusterVisionOS 4 through beta this cost could be considerable to ensure a stable environment. I'm no expert in the subtleties of distributions but maintaining and supporting one to a high enough standard is quite enough work thanks very much :) Ta Andy From lindahl at pbm.com Tue Jul 1 13:37:14 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <486A9264.5090902@scalableinformatics.com> References: <486A722A.3000405@scalableinformatics.com> <20080701200122.GA23583@bx9.net> <486A9264.5090902@scalableinformatics.com> Message-ID: <20080701203713.GB28024@bx9.net> On Tue, Jul 01, 2008 at 04:24:04PM -0400, Joe Landman wrote: > If you are serious about this, I'll bug Shai as to what is shareable. He > does have benchmarks. The ones I have seen (real applications, not > microbenchmarks), looked pretty good. Which is why we are looking at > them. If you look back on this mailing list, you'll see that I asked him for benchmarks, and he posted stream. Which isn't interesting, because it's embarrassingly parallel. 
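To make the objection concrete: in STREAM each thread sweeps its own chunk of a large array (copy/scale/add/triad), so there is essentially no write-sharing between threads and therefore almost nothing for a DSM's coherence layer to get wrong. A typical run is just the following (assuming McCalpin's stream.c and a gcc with OpenMP support; the thread count is arbitrary):

   gcc -O3 -fopenmp stream.c -o stream   # array size is a compile-time constant in the source
   export OMP_NUM_THREADS=16
   ./stream

Numbers from a run like that say little about what happens when threads genuinely contend for the same write-shared cache lines or pages, which is usually what decides whether a software-aggregated SMP is usable for a given code.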
-- greg From hahn at mcmaster.ca Tue Jul 1 13:44:05 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <20080701200122.GA23583@bx9.net> References: <486A722A.3000405@scalableinformatics.com> <20080701200122.GA23583@bx9.net> Message-ID: > I'd love to see some benchmarks (other than Stream). So if you do look > at it, please share. me too. in particular, I'd like to see "hot page" performance - where a multithreaded program bangs on a heavily write-shared page. From gerry.creager at tamu.edu Tue Jul 1 13:57:08 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:20 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <486A9822.7000902@moene.indiv.nluug.nl> References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <1214864562.6912.29.camel@Vigor13> <486A1F7B.9080408@tamu.edu> <486A9822.7000902@moene.indiv.nluug.nl> Message-ID: <486A9A24.9000800@tamu.edu> I was at the WRF conf. last week. A colleague from the Netherlands was lamenting that he couldn't get ECMWF data (I don't recall the annual cost/year but it was huge). NOAA/NCEP GFS data are available via FTP and regular enough to allow really simple scripting, as well as other methods. I don't understand why folks wouldn't use these data. As for competing, if our companies are not sufficiently technically astute, should we be protecting them from European companies, just because the data are free? Toon Moene wrote: > Gerry Creager wrote: > >> In the US, at least for academic institutions and hobbyists, surface >> and upper air observations of the sort you describe are generally >> available for incorporation into models for data assimilation. Models >> are generally forced and bounded using model data from other >> atmospheric models, also available. As I understand it from >> colleagues in Europe, getting similar data over there is more >> problemmatical. > > Exactly ! And what happens in Europe is that companies take the freely > available US data, use it to compete with US companies, and disregard > the (meteorological superior) ECMWF data, because it is not free. > > A colleague of mine held some very unpopular talks in Reading, England, > about this (according to his figures, 99 % of the meteorological data > used in Europe originates from the US). > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From eagles051387 at gmail.com Tue Jul 1 15:53:34 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <486A4296.4050501@hope.edu> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> Message-ID: my idea is more of for my thesis. if i am goign ot do anything like this. vernard thanks for the link. whats it like in a cluster environment? On Tue, Jul 1, 2008 at 4:43 PM, Paul Van Allsburg wrote: > I'd like to do the same, as a project for a group of students... Please > keep me in the loop? > Thanks! 
> Paul > > -- > Paul Van Allsburg Computational Science & Modeling Facilitator > Natural Sciences Division, Hope College > 35 East 12th Street > Holland, Michigan 49423 > 616-395-7292 http://www.hope.edu/academic/csm/ > > > Jon Aquilina wrote: > >> that would be greatly appreciated >> >> On 7/1/08, *Geoff Galitz* > >> wrote: >> >> >> That is out of my field of expertise. Sounds like a question for >> professional digital artists. I can put you in touch some folks >> that most likely know the answer to your questions, if you like. >> >> >> Anybody know of any current approaches to this? >> >> >> Geoff Galitz >> Blankenheim NRW, Deutschland >> http://www.galitz.org >> >> * From: * Jon Aquilina [mailto:eagles051387@gmail.com >> ] >> *Sent:* Dienstag, 1. Juli 2008 13:27 >> *To:* Geoff Galitz >> *Cc:* Beowulf Mailing List >> *Subject:* Re: [Beowulf] software for compatible with a cluster >> >> >> reason i am asking is because i would like to setup a rendering >> cluster and provide rendering services. does this also work for 3d >> animated movies that require rendering or does one need somethin >> entierly different for that? >> >> On 7/1/08, *Geoff Galitz* > > wrote: >> >> >> >> I know people who use Houdini for this: >> >> >> http://www.sidefx.com/index.php >> >> >> I cannot vouch for how well it works or what is involved, though. >> >> >> >> Geoff Galitz >> Blankenheim NRW, Deutschland >> http://www.galitz.org >> >> * From: * beowulf-bounces@beowulf.org >> >> [mailto:beowulf-bounces@beowulf.org >> ] *On Behalf Of *Jon Aquilina >> *Sent:* Dienstag, 1. Juli 2008 12:40 >> *To:* Beowulf Mailing List >> *Subject:* [Beowulf] software for compatible with a cluster >> >> >> does anyone know of any rendering software that will work with a >> cluster? >> >> -- Jonathan Aquilina >> >> >> >> >> -- Jonathan Aquilina >> >> >> >> >> -- >> Jonathan Aquilina >> > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/e3522a07/attachment.html From perry at piermont.com Tue Jul 1 16:23:10 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <486A6760.5010006@ias.edu> (Prentice Bisbal's message of "Tue\, 01 Jul 2008 13\:20\:32 -0400") References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: <87bq1hgpep.fsf@snark.cb.piermont.com> Prentice Bisbal writes: >>> does it necessarily have to be a redhat clone. can it also be a debian >>> based >>> clone? >> >> but why? is there some concrete advantage to using Debian? >> I've never understood why Debian users tend to be very True Believer, >> or what it is that hooks them. > > And the Debian users can say the same thing about Red Hat users. Or SUSE > users. And if any still exist, the Slackware users could say the same > thing about the both of them. But then the Slackware users could also > point out that the first Linux distro was Slackware, so they are using > the one true Linux distro... Precisely. It pays to allow people to use what they want. Fewer religious battles that way. Whether one distro or another has an advantage isn't the point -- people have their own tastes and it doesn't pay to tell them "no" without good reason. Perry -- Perry E. Metzger perry@piermont.com From perry at piermont.com Tue Jul 1 16:25:19 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: (Mark Hahn's message of "Tue\, 1 Jul 2008 15\:25\:48 -0400 \(EDT\)") References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486A6D59.7020704@scalableinformatics.com> Message-ID: <877ic5gpb4.fsf@snark.cb.piermont.com> Mark Hahn writes: > I was hoping for some discussion of concrete issues. for instance, > I have the impression debian uses something other than sysvinit - > does that work out well? is it a problem getting commercial packages > (pathscale/pgi/intel compilers, gaussian, etc) to run? It is trivial to port init scripts between different init systems. They're just short shell scripts, they're utterly readable, and any sysadmin worth their salt can make the needed changes in a few minutes. If you have a large cluster, you need such a person anyway. Perry -- Perry E. Metzger perry@piermont.com From perry at piermont.com Tue Jul 1 16:31:50 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: (Mark Hahn's message of "Tue\, 1 Jul 2008 15\:19\:24 -0400 \(EDT\)") References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: <873amtgp09.fsf@snark.cb.piermont.com> Mark Hahn writes: >>> but why? is there some concrete advantage to using Debian? >>> I've never understood why Debian users tend to be very True Believer, >>> or what it is that hooks them. >> >> And the Debian users can say the same thing about Red Hat users. Or SUSE > > very nice! an excellent parody of the True Believer response. Actually, he was just being reasonable. > but I ask again: what are the reasons one might prefer using debian? > really, I'm not criticizing it - I really would like to know why it > would matter whether someone (such as ClusterVisionOS (tm)) would use > debian or another distro. Often it is just a question of what the people using the system are used to. I often prefer using BSD systems, largely because of certain technical advantages, but also to a great extent because my first big Unix boxes were Vaxes running 4.2BSD in the early 1980s and after 25 years with the same flavor of Unix you get used to the way things are done. It is much the same reason I use Emacs instead of vi -- I started using Emacs on Tops-20 decades ago and I'm too used to it now. If you told me I "have" to use vi, things would get ugly, even though I don't think there is anything wrong with using vi per se. Perry -- Perry E. Metzger perry@piermont.com From perry at piermont.com Tue Jul 1 16:34:17 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: (Jon Aquilina's message of "Wed\, 2 Jul 2008 00\:53\:34 +0200") References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> Message-ID: <87y74lfabq.fsf@snark.cb.piermont.com> "Jon Aquilina" writes: > my idea is more of for my thesis. If you're trying to do 3d animation on the cheap and you want something that's already cluster capable, I'd try Blender. It is open source and it has already made some reasonable length movies. Not being an animation type, I know nothing about how nice it is compared to commercial products, but it is hard to beat the price. 
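Blender's batch mode is what makes the cluster part straightforward: each node renders a different slice of the frame range and writes to shared storage. A rough sketch, assuming Blender is installed on every node and the .blend file and output directory live on a shared filesystem (hostnames and paths here are invented):

   # split frames 1..100 over four nodes, 25 frames each
   i=0
   for host in node01 node02 node03 node04; do
       start=$(( i * 25 + 1 ))
       end=$(( (i + 1) * 25 ))
       ssh $host "blender -b /shared/scene.blend -o /shared/frames/frame_#### -s $start -e $end -a" &
       i=$(( i + 1 ))
   done
   wait

In practice each chunk would go through the batch scheduler (Torque, SGE, ...) rather than an ssh loop, but the Blender invocation is the same: -b for background, -o for the output pattern, -s/-e for the frame range, -a to render it.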
Perry -- Perry E. Metzger perry@piermont.com From hahn at mcmaster.ca Tue Jul 1 22:06:43 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: Message-ID: >> I was hoping for some discussion of concrete issues. for instance, >> I have the impression debian uses something other than sysvinit - >> does that work out well? >> > Debian uses standard sysvinit-style scripts in /etc/init.d, /etc/rc0.d, ... thanks. I guess I was assuming that mainstream debian was like ubuntu. >> is it a problem getting commercial >> packages (pathscale/pgi/intel compilers, gaussian, etc) to run? >> > I¹ve never had any major problems. Most linux vendors supply both RPM¹s and > .tar.gz installers, and I generally have better luck with the latter, even > on RPM based systems anyway. interesting - I wonder why. the main difference would be that the rpm format encodes dependencies... >> the couple debian people I know tend to have more ideological motives >> (which I do NOT impugn, except that I am personally more swayed by >> practical, concrete reasons.) >> > My Œconversion¹ to use of Debian had little to do with ideological motives, > and a lot more to do with minimizing the amount of time I had to take away > from my research to support the Linux clusters I was maintaining at the > time. again interesting, thanks. what sorts of things in rpm-based distros consumed your time? > Side note, one very nice thing about debian is the ability to upgrade a > system in-place from one O/S release to another via > > apt-get dist-upgrade > > Much nicer than reinstalling the O/S as seems to be (used to be?) the norm > with RPM-based systems I've done major version upgrades using rpm, admittedly in the pre-fedora days. it _is_ a nice capability - I'm a little surprised desktop-oriented distros don't emphasize it... From tjrc at sanger.ac.uk Tue Jul 1 22:37:19 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <486A8B33.7020600@scalableinformatics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486A6D59.7020704@scalableinformatics.com> <486A8B33.7020600@scalableinformatics.com> Message-ID: <86A30BE3-3B8E-47C8-8286-D2D7E2C74A40@sanger.ac.uk> On 1 Jul 2008, at 8:53 pm, Joe Landman wrote: >> the couple debian people I know tend to have more ideological motives > > Yeah ... can't escape this. Indeed. Ubuntu is slightly more pragmatic than Debian, as far as the ideological stuff goes. > I like some of the elements of Ubuntu/Debian better than I do RHEL > (the network configuration in Debian is IMO sane, while in RHEL/ > Centos/SuSE it is not). There are some aspects that are worse (no / > etc/profile.d ... so I add that back in by hand ). Here, our clusters all run Debian, but we also have RHAS and SLES around when support matrices demand it (Oracle, mainly). I'd agree that fundamentally it's a case of what you're used to. We stopped using Red Hat widely about four years ago, and the reasons (which are probably not valid any more) were: 1) Not all userland programs were 64-bit file aware. 2) There were certain features which we just couldn't get to work properly on RHAS - a prime example being multipath SAN access. It "just worked" on Debian. 3) Smooth upgrades from one major release to the next without having to reinstall. 
While this is probably not important for beowulf nodes, it is for more complex servers. I still prefer Debian's package management system, but that's probably because I'm used to it, rather than it inherently being superior. yast2 can do pretty much everything that aptitude does, although I think aptitude is more amenable to automation through cfengine and the like. There are some very powerful little parts of the packaging system, like dpkg-divert, which allows you to replace a file from a package with your own, in such a way that it will not be overwritten the next time the package is upgraded. For those of us that need to customise our systems that sort of thing is very useful, and saves a lot of work down the line. >> (which I do NOT impugn, except that I am personally more swayed by >> practical, concrete reasons.) > > > Building and deploying updated/correct kernels with Ubuntu/Debian is > far easier (the build is much easier/saner) than with SuSE, > RHEL, ... From a pragmatic view, this is what why we have a slight > preference for that. I'd agree with that. Using make-kpkg to build a custom kernel .deb which you can then easily deploy to all your machines is a real boon. At the end of the day, people should use what they're comfortable with. I don't necessarily buy the support argument; there are some companies (Platform, for example) who will support you whichever distro you use; all they care about is what kernel version and C library version you're running. I like this attitude and I wish it was more widespread amongst proprietary software vendors. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From eagles051387 at gmail.com Tue Jul 1 22:37:21 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <87y74lfabq.fsf@snark.cb.piermont.com> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> Message-ID: if i use blender how nicely does it work in a cluster? On Wed, Jul 2, 2008 at 1:34 AM, Perry E. Metzger wrote: > > "Jon Aquilina" writes: > > my idea is more of for my thesis. > > If you're trying to do 3d animation on the cheap and you want > something that's already cluster capable, I'd try Blender. It is open > source and it has already made some reasonable length movies. Not > being an animation type, I know nothing about how nice it is compared > to commercial products, but it is hard to beat the price. > > Perry > -- > Perry E. Metzger perry@piermont.com > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/97bf9a8b/attachment.html From carsten.aulbert at aei.mpg.de Wed Jul 2 00:26:58 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87fxqtuzh8.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> Message-ID: <486B2DC2.9010604@aei.mpg.de> Hi Perry, Perry E. 
Metzger wrote: > > All NFS clients are connecting to a single port, not to a different > port for every NFS export. You do not need 1400 listening TCP ports on > a server to export 1400 different file systems. Only one port is > needed, whether you are exporting one file system or one million, just > as only one SMTP port is needed whether you are receiving mail from > one client or from one million. > That's clear and not the problem > The clients are connecting from ports below 1024 because Berkeley set > up a hack in the original BSD stack so that only root could open ports > below 1024. This way, you could "know" the process on the remote host > was a root process, thus you could feel "secure" [sic]. It doesn't add > any real security any more, but it is also not the cause of any > problem you are experiencing. We might run out of "secure" ports. > We can help you figure this out, but you will have to give a lot more > detail about the problem. Please describe your network setup. How many > servers do you have? How many clients? How many file systems are those > servers exporting? How many is a typical client mounting, and why? > Start there and we can try to move forward. > OK, we have 1342 nodes which act as servers as well as clients. Every node exports a single local directory and all other nodes can mount this. What we do now to optimize the available bandwidth and IOs is spread millions of files according to a hash algorithm to all nodes (multiple copies as well) and then run a few 1000 jobs opening one file from one box then one file from the other box and so on. With a short autofs timeout that ought to work. Typically it is possible that a single process opens about 10-15 files per second, i.e. making 10-15 mounts per second. With 4 parallel process per node that's 40-60 mounts/second. With a timeout of 5 seconds we should roughly have 200-300 concurrent mounts (on average, no idea abut the variance). Our tests so far have shown that sometimes a node keeps a few mounts open (autofs4 problems AFAIK) and at some point is not able to mount more shares. Usually this occurs at about 350 mounts and we are not yet 100% sure if we are running out of secure ports. All our boxes export now with "insecure" option (NFSv3), but our clients all connect from a "secure" port, anyone here who might give us a hint how to force this in Linux? Thanks a lot Carsten From tjrc at sanger.ac.uk Wed Jul 2 01:19:50 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B2DC2.9010604@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> Message-ID: <66EB3DC0-B281-4869-BB8E-A55E577C44FE@sanger.ac.uk> On 2 Jul 2008, at 8:26 am, Carsten Aulbert wrote: > OK, we have 1342 nodes which act as servers as well as clients. Every > node exports a single local directory and all other nodes can mount > this. > > What we do now to optimize the available bandwidth and IOs is spread > millions of files according to a hash algorithm to all nodes (multiple > copies as well) and then run a few 1000 jobs opening one file from one > box then one file from the other box and so on. With a short autofs > timeout that ought to work. Typically it is possible that a single > process opens about 10-15 files per second, i.e. making 10-15 mounts > per > second. 
With 4 parallel process per node that's 40-60 mounts/second. > With a timeout of 5 seconds we should roughly have 200-300 concurrent > mounts (on average, no idea abut the variance). Please tell me you're not serious! The overheads of just performing the NFS mounts are going to kill you, never mind all the network traffic going all over the place. Since you've distributed the files to the local disks of the nodes, surely the right way to perform this work is to schedule the computations so that each node works on the data on its own local disk, and doesn't have to talk networked storage at all? Or don't you know in advance which files a particular job is going to need? Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From henning.fehrmann at aei.mpg.de Wed Jul 2 01:44:58 2008 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <66EB3DC0-B281-4869-BB8E-A55E577C44FE@sanger.ac.uk> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <66EB3DC0-B281-4869-BB8E-A55E577C44FE@sanger.ac.uk> Message-ID: <20080702084458.GA12879@gretchen.aei.uni-hannover.de> On Wed, Jul 02, 2008 at 09:19:50AM +0100, Tim Cutts wrote: > > On 2 Jul 2008, at 8:26 am, Carsten Aulbert wrote: > > >OK, we have 1342 nodes which act as servers as well as clients. Every > >node exports a single local directory and all other nodes can mount this. > > > >What we do now to optimize the available bandwidth and IOs is spread > >millions of files according to a hash algorithm to all nodes (multiple > >copies as well) and then run a few 1000 jobs opening one file from one > >box then one file from the other box and so on. With a short autofs > >timeout that ought to work. Typically it is possible that a single > >process opens about 10-15 files per second, i.e. making 10-15 mounts per > >second. With 4 parallel process per node that's 40-60 mounts/second. > >With a timeout of 5 seconds we should roughly have 200-300 concurrent > >mounts (on average, no idea abut the variance). > > Please tell me you're not serious! The overheads of just performing the NFS mounts are going to kill you, never mind all the network traffic going > all over the place. > > Since you've distributed the files to the local disks of the nodes, surely the right way to perform this work is to schedule the computations so that > each node works on the data on its own local disk, and doesn't have to talk networked storage at all? Or don't you know in advance which files a > particular job is going to need? Yes, this is the problem. The amount of files is too big to store it everywhere (few TByte and 50 million files). Mounting a view NFS server does not provide the bandwidth. On the other hand, the coreswitch should be able to handle the flows non blocking. We think that nfs mounts are the fastest possibility to distribute the demanded files to the nodes. 
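For reference, on the question raised earlier in this thread about getting the clients off the reserved ports: the export side already has 'insecure', so what is missing is a way to tell the client not to bind a privileged source port. The pieces below are a sketch; in particular the 'noresvport' client mount option only exists in sufficiently recent kernels/nfs-utils, so it needs to be checked against the actual client versions rather than taken as a working recipe.

   # server side, /etc/exports -- 'insecure' permits client source ports above 1023
   /data  192.168.0.0/16(ro,async,insecure)      # network and path are placeholders

   # client side, manual mount asking for a non-privileged source port
   mount -t nfs -o ro,noresvport server:/data /mnt/data

   # with autofs, the same option goes into the map entry, e.g. in an auto.data map:
   #   somekey  -fstype=nfs,ro,noresvport  server:/data

Without such an option the client-side RPC code keeps binding reserved ports no matter what the server allows, which would be consistent with the ~350-mount ceiling being a port-exhaustion symptom.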
Henning From tjrc at sanger.ac.uk Wed Jul 2 01:45:21 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: Message-ID: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> On 2 Jul 2008, at 6:06 am, Mark Hahn wrote: >>> I was hoping for some discussion of concrete issues. for instance, >>> I have the impression debian uses something other than sysvinit - >>> does that work out well? >>> >> Debian uses standard sysvinit-style scripts in /etc/init.d, /etc/ >> rc0.d, ... > > thanks. I guess I was assuming that mainstream debian was like > ubuntu. It's sort of the other way around. Remember that Ubuntu is based off a six-monthly snapshot of Debian's testing track, which is why Hardy looks a lot more like the upcoming Debian Lenny than it does like Debian Etch. > interesting - I wonder why. the main difference would be that the > rpm format encodes dependencies... The difficulty is that many ISVs tend to do a fairly terrible job of packaging their applications as RPM's or DEB's, for example creating init scripts which don't obey the distribution's policies, or making willy-nilly modifications to configuration files all over the place, even in other packages (which in the Debian world is a *big* no-no, that's why many Debian/Ubuntu packages have now moved to the conf.d type of configuration directory, so that other packages can drop in little independent snippets of configuration) I have seen, for example, .deb packages from a Large Company With Which We Are All Familiar which essentially attempted to convert your system into a Red Hat system by moving all your init scripts around and whatnot, so once you'd installed this abomination, you'd totally wrecked the ability of many of the main distro packages to be updated ever again. Oh, and of course uninstalling the package didn't put anything back the way it had been before. Like you, I tend to use tarballs if they are available, and if I want to turn them into packages I do it myself, and make sure they are policy compliant for the distro. So this, while not a statement in favour of either flavour of distro, is definitely a warning to be very wary of what packages that have come from sources other than the distro itself might do (which of course, you'd be wary of anyway for security reasons). Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From ajt at rri.sari.ac.uk Wed Jul 2 02:23:06 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: <486B48FA.7080403@rri.sari.ac.uk> Mark Hahn wrote: >[...] > but I ask again: what are the reasons one might prefer using debian? > really, I'm not criticizing it - I really would like to know why it > would matter whether someone (such as ClusterVisionOS (tm)) would use > debian or another distro. Hello, Mark. I've been on a well trodden path from trying out the 'free' version of Scyld under RH6.2, then using openMosix under all versions of RH up to RH9, Fedora up to core2, then Debian Sarge and now Ubuntu 6.06.1 LTS with an upgrade to 8.04.1 LTS imminent. 
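A concrete illustration of the conf.d pattern, for anyone who has not run into it: rather than editing a file owned by another package, a package (or an admin's configuration tooling) drops a self-contained snippet into the matching .d directory. The package name and paths below are made up:

   # hypothetical snippet shipped as /etc/logrotate.d/myclusterapp
   cat > /etc/logrotate.d/myclusterapp <<'EOF'
   /var/log/myclusterapp/*.log {
       weekly
       rotate 4
       compress
       missingok
   }
   EOF

Removing the package removes just that file; nothing else's configuration has been touched, which is exactly the property that gets lost when installers edit shared files in place.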
As I see it, this has been a developmental journey and also a learning experiencefor me. As others on this thread have admitted, I'm not blind to the ideological objectives of Debian. However, I'm now using a very good commerically supported version of Linux with the what is widely acknowledged to be the largest user and developer community. It's my own experience of trying to do my work under RH/Fedora that's put me off these distro's and I see a BIG divide between 'real' HPC communities using BIG iron, and small Beowulf clusters like mine. I've got to admit that Tim Cutts did influence my decision to try out Debian (thanks, Tim!). I also use the (UK) NERC's Bio-Linux binary deb's and I was also influenced by their decision to change from RH to Debian for Bio-Linux. I can see that other communities use RH for similar reasons, though I should mention that our Beowulf spends a lot of time running quantum chemistry simulations (GAMESS etc.). I've pout up an Ubuntu blue-print for 'biobuntu', which consolidates the work I'm doing on several projects: https://blueprints.launchpad.net/ubuntu/+spec/biobuntu I am, of course, familiar with 'other' Biolinuxen and rpm repositories of bioinformatics software: http://en.wikipedia.org/wiki/BioLinux Having tried out many of these alternatives, I remain convinced that NEBC's Bio-Linux is most appropriate for my work. In particular, the level of support in the form of documentation and training courses provided by NEBC is very good. This means I don't have to reinvent the wheel - Always a good point for any Beowulf-related activity :-) Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From Bogdan.Costescu at iwr.uni-heidelberg.de Wed Jul 2 02:35:57 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B2DC2.9010604@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> Message-ID: On Wed, 2 Jul 2008, Carsten Aulbert wrote: > OK, we have 1342 nodes which act as servers as well as clients. Every > node exports a single local directory and all other nodes can mount this. Have you considered using a parallel file system ? > What we do now to optimize the available bandwidth and IOs is spread > millions of files according to a hash algorithm to all nodes (multiple > copies as well) There have been many talks of improving performance by paying attention to the data locality on this very list. Are you not able to move the code to where the data is or move the data to where the code is ? F.e. using a simple TCP connection (nc, rsh, rsync or even http) to transfer the file to the local disk before using it is probably more efficient than the way you use NFS is you deal with small files (as they have to be written to some local storage). The setup and tear-down costs of the NFS connection (automounter, mount, unmount) simply doesn't exist in this case; the transfer of data on the wire happens the same way. Or you could even get around the limitation of storing it locally by using a ramdisk to temporarily store the files (if you have the free memory...) 
- from what I understand they are read then used immediately and not needed again in a short time frame so it makes no sense to store them for longer, a perfect application for a tmpfs. -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From Bogdan.Costescu at iwr.uni-heidelberg.de Wed Jul 2 02:59:47 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> Message-ID: On Wed, 2 Jul 2008, Tim Cutts wrote: > The difficulty is that many ISVs tend to do a fairly terrible job of > packaging their applications as RPM's or DEB's I very much agree with this. While you mentioned init scripts that don't fit the distribution, I can add init scripts that are totally missing when they should be provided - a hand-made init script would not be part of the installed package and could fail in various ways if the package is updated or... uninstalled. > Like you, I tend to use tarballs if they are available, and if I > want to turn them into packages I do it myself, and make sure they > are policy compliant for the distro. I think that's actually more important than the distribution per-se. If you are able to package something to fit the distribution (f.e. to install a missing kernel module, add an important software package, etc.) you can more efficiently use your time later on as packaging (done properly) is normally a one-time effort. This goes into the direction that the admin should use the distribution, not fight it! -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From eagles051387 at gmail.com Wed Jul 2 04:16:43 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> Message-ID: one thing must not be forgotten though. in regards to pkging stuff for the ubuntu variation once someone like you and me you upload it for someone higher up on the chain to check and upload to the servers. so basically someone is checking what someone else has packaged. On 7/2/08, Tim Cutts wrote: > > > On 2 Jul 2008, at 6:06 am, Mark Hahn wrote: > > I was hoping for some discussion of concrete issues. for instance, >>>> I have the impression debian uses something other than sysvinit - >>>> does that work out well? >>>> >>>> Debian uses standard sysvinit-style scripts in /etc/init.d, /etc/rc0.d, >>> ... >>> >> >> thanks. I guess I was assuming that mainstream debian was like ubuntu. >> > > It's sort of the other way around. Remember that Ubuntu is based off a > six-monthly snapshot of Debian's testing track, which is why Hardy looks a > lot more like the upcoming Debian Lenny than it does like Debian Etch. > > interesting - I wonder why. the main difference would be that the rpm >> format encodes dependencies... 
>> > > The difficulty is that many ISVs tend to do a fairly terrible job of > packaging their applications as RPM's or DEB's, for example creating init > scripts which don't obey the distribution's policies, or making willy-nilly > modifications to configuration files all over the place, even in other > packages (which in the Debian world is a *big* no-no, that's why many > Debian/Ubuntu packages have now moved to the conf.d type of configuration > directory, so that other packages can drop in little independent snippets of > configuration) > > I have seen, for example, .deb packages from a Large Company With Which We > Are All Familiar which essentially attempted to convert your system into a > Red Hat system by moving all your init scripts around and whatnot, so once > you'd installed this abomination, you'd totally wrecked the ability of many > of the main distro packages to be updated ever again. Oh, and of course > uninstalling the package didn't put anything back the way it had been > before. > > Like you, I tend to use tarballs if they are available, and if I want to > turn them into packages I do it myself, and make sure they are policy > compliant for the distro. > > So this, while not a statement in favour of either flavour of distro, is > definitely a warning to be very wary of what packages that have come from > sources other than the distro itself might do (which of course, you'd be > wary of anyway for security reasons). > > Tim > > > -- > The Wellcome Trust Sanger Institute is operated by Genome ResearchLimited, > a charity registered in England with number 1021457 and acompany registered > in England with number 2742969, whose registeredoffice is 215 Euston Road, > London, NW1 2BE._______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/5e5d59d0/attachment.html From eagles051387 at gmail.com Wed Jul 2 04:18:20 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> Message-ID: im also not sure what support is like in other distros but i commend the kubuntu volunteers who man that irc channel for support as well as those who help with development. are there any other distros that provide support like this? On 7/2/08, Jon Aquilina wrote: > > one thing must not be forgotten though. in regards to pkging stuff for the > ubuntu variation once someone like you and me you upload it for someone > higher up on the chain to check and upload to the servers. so basically > someone is checking what someone else has packaged. > > On 7/2/08, Tim Cutts wrote: >> >> >> On 2 Jul 2008, at 6:06 am, Mark Hahn wrote: >> >> I was hoping for some discussion of concrete issues. for instance, >>>>> I have the impression debian uses something other than sysvinit - >>>>> does that work out well? >>>>> >>>>> Debian uses standard sysvinit-style scripts in /etc/init.d, /etc/rc0.d, >>>> ... >>>> >>> >>> thanks. I guess I was assuming that mainstream debian was like ubuntu. >>> >> >> It's sort of the other way around. 
Remember that Ubuntu is based off a >> six-monthly snapshot of Debian's testing track, which is why Hardy looks a >> lot more like the upcoming Debian Lenny than it does like Debian Etch. >> >> interesting - I wonder why. the main difference would be that the rpm >>> format encodes dependencies... >>> >> >> The difficulty is that many ISVs tend to do a fairly terrible job of >> packaging their applications as RPM's or DEB's, for example creating init >> scripts which don't obey the distribution's policies, or making willy-nilly >> modifications to configuration files all over the place, even in other >> packages (which in the Debian world is a *big* no-no, that's why many >> Debian/Ubuntu packages have now moved to the conf.d type of configuration >> directory, so that other packages can drop in little independent snippets of >> configuration) >> >> I have seen, for example, .deb packages from a Large Company With Which We >> Are All Familiar which essentially attempted to convert your system into a >> Red Hat system by moving all your init scripts around and whatnot, so once >> you'd installed this abomination, you'd totally wrecked the ability of many >> of the main distro packages to be updated ever again. Oh, and of course >> uninstalling the package didn't put anything back the way it had been >> before. >> >> Like you, I tend to use tarballs if they are available, and if I want to >> turn them into packages I do it myself, and make sure they are policy >> compliant for the distro. >> >> So this, while not a statement in favour of either flavour of distro, is >> definitely a warning to be very wary of what packages that have come from >> sources other than the distro itself might do (which of course, you'd be >> wary of anyway for security reasons). >> >> Tim >> >> >> -- >> The Wellcome Trust Sanger Institute is operated by Genome ResearchLimited, >> a charity registered in England with number 1021457 and acompany registered >> in England with number 2742969, whose registeredoffice is 215 Euston Road, >> London, NW1 2BE._______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > > > -- > Jonathan Aquilina -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/56c63de1/attachment.html From carsten.aulbert at aei.mpg.de Wed Jul 2 04:22:41 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] automount on high ports In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> Message-ID: <486B6501.5000108@aei.mpg.de> Hi Bogdan, Bogdan Costescu wrote: > > Have you considered using a parallel file system ? We looked a bit into a few, but would love to get any input from anyone on that. What we found so far was not really convincing, e.g. glusterFS at that time was not really stable, lustre was too easy to crash - at l east at that time, ... > There have been many talks of improving performance by paying attention > to the data locality on this very list. Are you not able to move the > code to where the data is or move the data to where the code is ? 
In principle this *should* be possible, however then this particular user (and maybe many in the future) would need to circumvent the batch system and it's usually quite a hassle to set this up correctly beforehand. > > F.e. using a simple TCP connection (nc, rsh, rsync or even http) to > transfer the file to the local disk before using it is probably more > efficient than the way you use NFS is you deal with small files (as they > have to be written to some local storage). The setup and tear-down costs > of the NFS connection (automounter, mount, unmount) simply doesn't exist > in this case; the transfer of data on the wire happens the same way. Or > you could even get around the limitation of storing it locally by using > a ramdisk to temporarily store the files (if you have the free > memory...) - from what I understand they are read then used immediately > and not needed again in a short time frame so it makes no sense to store > them for longer, a perfect application for a tmpfs. The interesting bit is: Even with the data on a remote disk the overhead is not really that much more. The files are typically less than 100k in size, even doing an rsync or nc|tar from one box to another is REALLY slow with that many small files. tmpfs et al: The jobs usually reads the data once directly form the NFS share and processes it, it's not going back to this file again (well at least not this process). So I do think NFS would not be that bad although it won't be the optimal, but it's usually the easiest for the user to use and quite generic in the approach. Of course one could devise other and much better schemes, but you have always find a good compromise between usability and man-power needed to tailor a specific scheme. Thanks! Carsten From tjrc at sanger.ac.uk Wed Jul 2 04:29:03 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] A press release In-Reply-To: References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> Message-ID: On 2 Jul 2008, at 12:16 pm, Jon Aquilina wrote: > one thing must not be forgotten though. in regards to pkging stuff > for the > ubuntu variation once someone like you and me you upload it for > someone > higher up on the chain to check and upload to the servers. so > basically > someone is checking what someone else has packaged. For maintainers that aren't Debian Developers (or the Ubuntu equivalent), yes, that's true. In my case, I am formally a Debian Developer (have been for more than 10 years), so my GPG signature on a binary upload is considered good enough, and it's not checked further, other than for really serious failures like a failure of the package to build from source on one of the autobuilders. I do check them myself fairly thoroughly though - lintian is a very useful tool for checking that packages comply with policy. Besides, the packages I maintain for Debian are things I use heavily in my day job, so it's in my own interest to make sure they work properly! I suspect the amount of checking that goes on in the universe and multiverse parts of Ubuntu is pretty minimal - I believe the packages are basically straight rebuilds of the Debian source packages using the Ubuntu autobuilder network, so that the library dependencies are correct. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. 
From perry at piermont.com Wed Jul 2 04:32:55 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:20 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: (Jon Aquilina's message of "Wed\, 2 Jul 2008 07\:37\:21 +0200") References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> Message-ID: <87wsk4ed20.fsf@snark.cb.piermont.com> "Jon Aquilina" writes: > if i use blender how nicely does it work in a cluster? I believe it works quite well. Perry From perry at piermont.com Wed Jul 2 04:50:48 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <66EB3DC0-B281-4869-BB8E-A55E577C44FE@sanger.ac.uk> (Tim Cutts's message of "Wed\, 2 Jul 2008 09\:19\:50 +0100") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <66EB3DC0-B281-4869-BB8E-A55E577C44FE@sanger.ac.uk> Message-ID: <87od5gec87.fsf@snark.cb.piermont.com> Tim Cutts writes: > On 2 Jul 2008, at 8:26 am, Carsten Aulbert wrote: > >> OK, we have 1342 nodes which act as servers as well as clients. Every >> node exports a single local directory and all other nodes can mount >> this. >> >> What we do now to optimize the available bandwidth and IOs is spread >> millions of files according to a hash algorithm to all nodes (multiple >> copies as well) and then run a few 1000 jobs opening one file from one >> box then one file from the other box and so on. With a short autofs >> timeout that ought to work. Typically it is possible that a single >> process opens about 10-15 files per second, i.e. making 10-15 mounts >> per >> second. With 4 parallel process per node that's 40-60 mounts/second. >> With a timeout of 5 seconds we should roughly have 200-300 concurrent >> mounts (on average, no idea abut the variance). > > Please tell me you're not serious! The overheads of just performing > the NFS mounts are going to kill you, never mind all the network > traffic going all over the place. > > Since you've distributed the files to the local disks of the nodes, > surely the right way to perform this work is to schedule the > computations so that each node works on the data on its own local > disk, and doesn't have to talk networked storage at all? Or don't you > know in advance which files a particular job is going to need? Perhaps it makes sense given their job load. Perhaps it doesn't. If they need access to far more storage than a single node can hold, it might make sense. If individual nodes need lots of I/O but only on a very rare basis, so the disk bandwidth would be unused on most nodes most of the time if they were doing everything locally, perhaps it might make sense. I'll agree that it isn't an obviously good solution to most workloads, but we don't really know what their workload is like so we can't say that this is a bad move ab initio. 
Perry From atchley at myri.com Wed Jul 2 05:07:27 2008 From: atchley at myri.com (Scott Atchley) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B6501.5000108@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <486B6501.5000108@aei.mpg.de> Message-ID: <04BB8220-B185-42A2-8E34-DA61066B6D51@myri.com> On Jul 2, 2008, at 7:22 AM, Carsten Aulbert wrote: > Bogdan Costescu wrote: >> >> Have you considered using a parallel file system ? > > We looked a bit into a few, but would love to get any input from > anyone > on that. What we found so far was not really convincing, e.g. > glusterFS > at that time was not really stable, lustre was too easy to crash - > at l > east at that time, ... Hi Carsten, I have not looked at GlusterFS at all. I have worked with Lustre and PVFS2 (I wrote the shims to allow them to run on MX). Although I believe Lustre's robustness is very good these days, I do not believe that it will not work in your setting. I think that they currently do not recommend mounting a client on a node that is also working as a server as you are doing with NFS. I believe it is due to memory contention leading to deadlock. PVFS2 does, however, support your scenario where each node is a server and can be mounted locally as well. PVFS2 servers run in userspace and can be easily debugged. If you are using MPI-IO, it integrates nicely as well. Even so, keep in mind that using each node as a server will consume network resources and will compete with MPI communications. Scott From perry at piermont.com Wed Jul 2 05:28:48 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B2DC2.9010604@aei.mpg.de> (Carsten Aulbert's message of "Wed\, 02 Jul 2008 09\:26\:58 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> Message-ID: <87d4lweagv.fsf@snark.cb.piermont.com> Carsten Aulbert writes: >> The clients are connecting from ports below 1024 because Berkeley set >> up a hack in the original BSD stack so that only root could open ports >> below 1024. This way, you could "know" the process on the remote host >> was a root process, thus you could feel "secure" [sic]. It doesn't add >> any real security any more, but it is also not the cause of any >> problem you are experiencing. > > We might run out of "secure" ports. A given client would need to be forming over 1000 connections to a given server NFS port for that to be a problem. This is not going to happen. The protocol doesn't work in such a way as to cause that to occur. >> We can help you figure this out, but you will have to give a lot more >> detail about the problem. Please describe your network setup. How many >> servers do you have? How many clients? How many file systems are those >> servers exporting? How many is a typical client mounting, and why? >> Start there and we can try to move forward. > > OK, we have 1342 nodes which act as servers as well as clients. Every > node exports a single local directory and all other nodes can mount this. Okay. In this instance, you're not going to run out of ports. 
Every machine might get 1341 connections from clients, and every machine might make 1341 client connections going out to other machines. None of this should cause you to run out of ports, period. If you don't understand that, refer back to my original message. A TCP socket is a unique 4-tuple. The host:port 2-tuples are NOT unique and not an exhaustible resource. There is is no way that your case is going to even remotely exhaust the 4-tuple space. > What we do now to optimize the available bandwidth and IOs is spread > millions of files according to a hash algorithm to all nodes (multiple > copies as well) and then run a few 1000 jobs opening one file from one > box then one file from the other box and so on. With a short autofs > timeout that ought to work. I think there is no point in having a short autofs timeout, and you're likely to radically increase the overhead when you open files. > Our tests so far have shown that sometimes a node keeps a few mounts > open (autofs4 problems AFAIK) and at some point is not able to mount > more shares. Usually this occurs at about 350 mounts and we are not yet > 100% sure if we are running out of secure ports. You probably aren't running out of ports per se. You may be running out of OS resources, like file descriptors or something similar. -- Perry E. Metzger perry@piermont.com From landman at scalableinformatics.com Wed Jul 2 05:31:15 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B2DC2.9010604@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> Message-ID: <486B7513.1020509@scalableinformatics.com> Carsten Aulbert wrote: >> The clients are connecting from ports below 1024 because Berkeley set >> up a hack in the original BSD stack so that only root could open ports >> below 1024. This way, you could "know" the process on the remote host >> was a root process, thus you could feel "secure" [sic]. It doesn't add >> any real security any more, but it is also not the cause of any >> problem you are experiencing. > > We might run out of "secure" ports. But you can force NFS to connect from the ports above 1024 so this shouldn't be an issue. [...] > OK, we have 1342 nodes which act as servers as well as clients. Every There is a short writeup on this with quotes from Bruce Allen in HPCwire. Too bad you didn't opt for JackRabbits there :) > node exports a single local directory and all other nodes can mount this. Fine, nothing terrible. > > What we do now to optimize the available bandwidth and IOs is spread > millions of files according to a hash algorithm to all nodes (multiple > copies as well) and then run a few 1000 jobs opening one file from one > box then one file from the other box and so on. With a short autofs Hmmm.... So you want to "track" spatial metadata (e.g. where the file is) according to some hash function that each node can execute, and then once this is known, perform IO. So, for example (as a relatively naive/simple minded version) some quick Perl pseudo-code ... # .... my $hash = MD5SUM($filename); my $machine = $hash % $Number_of_machines; my $machine_name= $name[$machine]; my $full_path = sprintf("/%s/%s",$machine_name,$filename); open(my $fh, ">".$full_path) or die "FATAL ERROR: unable to open $full_path\n"; # .... Is this about right? 
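For concreteness, a runnable variant of that sketch (untested, using Digest::MD5 for the hash and a made-up node-naming scheme in place of the real host list) could look like:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical node list; in reality this would come from a config file
# or the batch system, and the naming scheme is made up for the example.
my @name = map { sprintf("node%04d", $_) } 1 .. 1342;

my $filename = shift @ARGV or die "usage: $0 filename\n";

# Use the first 8 hex digits of the MD5 so the modulus is a plain integer.
my $hash    = hex(substr(md5_hex($filename), 0, 8));
my $machine = $hash % scalar(@name);

# Assumes every node's export appears under an automount point named
# after the node, as in the sketch above.
my $full_path = sprintf("/%s/%s", $name[$machine], $filename);
print "$filename -> $full_path\n";

The point is only that every client computes the same file-to-node mapping from the filename alone, with no central lookup; the multiple copies mentioned earlier could be handled by, say, also trying the next index or two.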
> timeout that ought to work. Typically it is possible that a single > process opens about 10-15 files per second, i.e. making 10-15 mounts per > second. With 4 parallel process per node that's 40-60 mounts/second. Hmmm ... mount latency we have seen is ~0.1 seconds or so, so I can believe 10-14/second. Note that due to strange latency effects in larger machines, we have also seen an automount take 0.5 seconds and more. Some delays due to name resolution. Never fully traced it, but this was on a 32 node cluster. You are talking a little bigger. > With a timeout of 5 seconds we should roughly have 200-300 concurrent > mounts (on average, no idea abut the variance). 200-300 mounts across 1342 nodes, sure. 200-300 mounts of one file system on one server from 200-300 client machines? I have some doubts ... > Our tests so far have shown that sometimes a node keeps a few mounts > open (autofs4 problems AFAIK) and at some point is not able to mount > more shares. Usually this occurs at about 350 mounts and we are not yet > 100% sure if we are running out of secure ports. Older kernels couldn't do more than 256 mounts. Not sure when/if this limit has been raised. This is a different problem though. If you have N machines mounting a file system, then you get N requests on port 2049 or similar (the inbound NFS port). You don't run out of secure ports. If the issue is that you are running 200+ outgoing mount requests from one machine, you will likely have a delay issue as you cross the 256 mount number (if your kernel hasn't been patched ... not sure if/when this has/will change). > All our boxes export now with "insecure" option (NFSv3), but our clients > all connect from a "secure" port, anyone here who might give us a hint > how to force this in Linux? See if you can get less than 256 mounts working well. If so, and it only starts falling off above 256 mounts, this would be important to know. Joe > > Thanks a lot > > Carsten > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From carsten.aulbert at aei.mpg.de Wed Jul 2 05:55:21 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87d4lweagv.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> Message-ID: <486B7AB9.9050202@aei.mpg.de> Hi Perry, Perry E. Metzger wrote: > > Okay. In this instance, you're not going to run out of ports. Every > machine might get 1341 connections from clients, and every machine > might make 1341 client connections going out to other machines. None > of this should cause you to run out of ports, period. If you don't > understand that, refer back to my original message. A TCP socket is a > unique 4-tuple. The host:port 2-tuples are NOT unique and not an > exhaustible resource. There is is no way that your case is going to > even remotely exhaust the 4-tuple space. 
Well, I understand your reasoning, but that's contradicted to what we do see netstat -an|awk '/2049/ {print $4}'|sed 's/10.10.13.41://'|sort -n shows us the follwing: 665 666 667 668 669 670 671 672 673 674 675 676 677 [...] 1017 1018 1019 1020 1021 1022 1023 Which corresponds exactly to the maximum achievable mounts of 358 right now. Besides, I'm far from being an expert on TCP/IP, but is it possible for a local process to bind to a port which is already in use but to another host? I don't think so, but may be wrong. Cheers Carsten From eagles051387 at gmail.com Wed Jul 2 06:05:09 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <20080702125625.GE47386@gby2.aoes.com> References: <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> <20080702125625.GE47386@gby2.aoes.com> Message-ID: like you said in regards to maya money is a factor for me. if i do descide to setup a rendering cluster my problem is going to be finding someone who can make a small video in blender for me so i can render it. On 7/2/08, Greg Byshenk wrote: > > On Wed, Jul 02, 2008 at 07:32:55AM -0400, Perry E. Metzger wrote: > > "Jon Aquilina" writes: > > > > if i use blender how nicely does it work in a cluster? > > > I believe it works quite well. > > > The "Helmer" minicluster uses blender, and appears > to perform well. > > Also, Maya's 'muster' engine runs under Linux, and quite successfully. We > use it in a mixed environment, where the render pool consists of both > Windows workstations and Linux cluster nodes. > > Note, though, that like other commercial 3D products, Maya is expensive, > and may not be suitable for a student project. > > -- > Greg Byshenk > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/1525345d/attachment.html From mark.kosmowski at gmail.com Wed Jul 2 06:11:46 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: energy costs and poor grad students Message-ID: I'm in the US. I'm almost, but not quite ready for production runs - still learning the software / computational theory. I'm the first person in the research group (physical chemistry) to try to learn plane wave methods of solid state calculation as opposed to isolated atom-centered approximations and periodic atom centered calculations. It is turning out that the package I have spent the most time learning is perhaps not the best one for what we are doing. For a variety of reasons, many of which more off-topic than tac nukes and energy efficient washing machines ;) , I'm doing my studies part-time while working full-time in industry. I think I have come to a compromise that can keep me in business. Until I have a better understanding of the software and am ready for production runs, I'll stick to a small system that can be run on one node and leave the other two powered down. I've also applied for an adjunt instructor position at a local college for some extra cash and good experience. When I'm ready for production runs I can either just bite the bullet and pay the electricity bill or seek computer time elsewhere. Thanks for the encouragement, Mark E. Kosmowski On 7/1/08, ariel sabiguero yawelak wrote: > Well Mark, don't give up! 
> I am not sure which one is your application domain, but if you require 24x7 > computation, then you should not be hosting that at home. > On the other hand, if you are not doing real computation and you just have a > testbed at home, maybe for debugging your parallel applications or something > similar, you might be interested in a virtualized solution. Several years > ago, I used to "debug" some neural networks at home, but training sessions > (up to two weeks of training) happened at the university. > I would suggest to do something like that. > You can always scale-down your problem in several phases and save the > complete data-set / problem for THE RUN. > > You are not being a heretic there, but suffering energy costs ;-) > In more places that you may believe, useful computing nodes are being > replaced just because of energy costs. Even in some application domains you > can even loose computational power if you move from 4 nodes into a single > quad-core (i.e. memory bandwidth problems). I know it is very nice to be > able to do everything at home.. but maybe before dropping your studies or > working overtime to pay the electricity bill, you might want to reconsider > the fact of collapsing your phisical deploy into a single virtualized > cluster. (or just dispatch several threads/processes in a single system). > If you collapse into a single system you have only 1 mainboard, one HDD, one > power source, one processor (physically speaking), .... and you can achieve > almost the performance of 4 systems in one, consuming the power of.... well > maybe even less than a single one. I don't want to go into discussions about > performance gain/loose due to the variation of the hardware architecture. > Invest some bucks (if you haven't done that yet) in a good power source. > Efficiency of OEM unbranded power sources is realy pathetic. may be 45-50% > efficiency, while a good power source might be 75-80% efficient. Use the > energy for computing, not for heating your house. > What I mean is that you could consider just collapsing a complete "small" > cluster into single system. If your application is CPU-bound and not I/O > bound, VMware Server could be an option, as it is free software > (unfortunately not open, even tough some patches can be done on the > drivers). I think it is not possible to publish benchmarking data about > VMware, but I can tell you that in long timescales, the performance you get > in the host OS is similar than the one of the guest OS. There are a lot of > problems related to jitter, from crazy clocks to delays, but if your > application is not sensitive to that, then you are Ok. > Maybe this is not a solution, but you can provide more information regarding > your problem before quitting... > > my 2 cents.... > > ariel > > Mark Kosmowski escribi?: > > > At some point there a cost-benefit analysis needs to be performed. If > > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > > single-core land still and do not yet differentiate between CPU and > > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > > resources and would be better off buying new machines and physically > > transferring the RAM to and from them or running more jobs each > > distributed across fewer CPUs. Or saving on my electricity bill and > > powering down some nodes. > > > > As heretical as this last sounds, I'm tempted to throw in the towel on > > my PhD studies because I can no longer afford the power to run my > > three node cluster at home. 
Energy costs may end up being the straw > > that breaks this camel's back. > > > > Mark E. Kosmowski > > > > > > > > > From: "Jon Aquilina" > > > > > > > > > > > > > > > not sure if this applies to all kinds of senarios that clusters are used > in > > > but isnt the more ram you have the better? > > > > > > On 6/30/08, Vincent Diepeveen wrote: > > > > > > > > > > Toon, > > > > > > > > Can you drop a line on how important RAM is for weather forecasting in > > > > latest type of calculations you're performing? > > > > > > > > Thanks, > > > > Vincent > > > > > > > > > > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > > > > > > > Jim Lux wrote: > > > > > > > > > > > > > Yep. And for good reason. Even a big DoD job is still tiny in > Nvidia's > > > > > > > > > > > > > > > > scale of operations. We face this all the time with NASA work. > > > > > > Semiconductor manufacturers have no real reason to produce > special purpose > > > > > > or customized versions of their products for space use, because > they can > > > > > > sell all they can make to the consumer market. More than once, > I've had a > > > > > > phone call along the lines of this: > > > > > > "Jim: I'm interested in your new ABC321 part." > > > > > > "Rep: Great. I'll just send the NDA over and we can talk about > it." > > > > > > "Jim: Great, you have my email and my fax # is..." > > > > > > "Rep: By the way, what sort of volume are you going to be using?" > > > > > > "Jim: Oh, 10-12.." > > > > > > "Rep: thousand per week, excellent..." > > > > > > "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe > every > > > > > > year." > > > > > > "Rep: Oh..." > > > > > > {Well, to be fair, it's not that bad, they don't hang up on you.. > > > > > > > > > > > > > > > > > > > > > > > Since about a year, it's been clear to me that weather forecasting > (i.e., > > > > > running a more or less sophisticated atmospheric model to provide > weather > > > > > predictions) is going to be "mainstream" in the sense that every > business > > > > > that needs such forecasts for its operations can simply run them > in-house. > > > > > > > > > > Case in point: I bought a $1100 HP box (the obvious target group > being > > > > > teenage downloaders) which performs the HIRLAM limited area model > *on the > > > > > grid that we used until October 2006* in December last year. > > > > > > > > > > It's about twice as slow as our then-operational 50-CPU Sun Fire > 15K. > > > > > > > > > > I wonder what effect this will have on CPU developments ... 
> > > > > > > > > > -- > > > > > Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 > 214290 > > > > > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > > > > > At home: http://moene.indiv.nluug.nl/~toon/ > > > > > Progress of GNU Fortran: > http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > Beowulf mailing list, Beowulf@beowulf.org > > > > To change your subscription (digest mode or unsubscribe) visit > > > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > > > > > > > > -- > > > Jonathan Aquilina > > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > From landman at scalableinformatics.com Wed Jul 2 06:44:20 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: References: Message-ID: <486B8634.6020309@scalableinformatics.com> Hi Mark Mark Kosmowski wrote: > I'm in the US. I'm almost, but not quite ready for production runs - > still learning the software / computational theory. I'm the first > person in the research group (physical chemistry) to try to learn > plane wave methods of solid state calculation as opposed to isolated > atom-centered approximations and periodic atom centered calculations. Heh... my research group in grad school went through that transition in the mid 90s. Went from an LCAO-type simulation to CP like methods. We needed a t3e to run those (then). Love to compare notes and see which code you are using someday. On-list/off-list is fine. > It is turning out that the package I have spent the most time learning > is perhaps not the best one for what we are doing. For a variety of > reasons, many of which more off-topic than tac nukes and energy > efficient washing machines ;) , I'm doing my studies part-time while > working full-time in industry. More power to ya! I did mine that way too ... the writing was the hardest part. Just don't lose focus, or stop believing you can do it. When the light starts getting visible at the end of the process, it is quite satisfying. I have other words to describe this, but they require a beer lever to get them out of me ... > I think I have come to a compromise that can keep me in business. > Until I have a better understanding of the software and am ready for > production runs, I'll stick to a small system that can be run on one > node and leave the other two powered down. I've also applied for an > adjunt instructor position at a local college for some extra cash and > good experience. When I'm ready for production runs I can either just > bite the bullet and pay the electricity bill or seek computer time > elsewhere. Give us a shout when you want to try the time on a shared resource. Some folks here may be able to make good suggestions. RGB is a physics guy at Duke, doing lots of simulations, and might know of resources. Others here might as well. Joe > > Thanks for the encouragement, > > Mark E. Kosmowski > > On 7/1/08, ariel sabiguero yawelak wrote: >> Well Mark, don't give up! >> I am not sure which one is your application domain, but if you require 24x7 >> computation, then you should not be hosting that at home. 
>> On the other hand, if you are not doing real computation and you just have a >> testbed at home, maybe for debugging your parallel applications or something >> similar, you might be interested in a virtualized solution. Several years >> ago, I used to "debug" some neural networks at home, but training sessions >> (up to two weeks of training) happened at the university. >> I would suggest to do something like that. >> You can always scale-down your problem in several phases and save the >> complete data-set / problem for THE RUN. >> >> You are not being a heretic there, but suffering energy costs ;-) >> In more places that you may believe, useful computing nodes are being >> replaced just because of energy costs. Even in some application domains you >> can even loose computational power if you move from 4 nodes into a single >> quad-core (i.e. memory bandwidth problems). I know it is very nice to be >> able to do everything at home.. but maybe before dropping your studies or >> working overtime to pay the electricity bill, you might want to reconsider >> the fact of collapsing your phisical deploy into a single virtualized >> cluster. (or just dispatch several threads/processes in a single system). >> If you collapse into a single system you have only 1 mainboard, one HDD, one >> power source, one processor (physically speaking), .... and you can achieve >> almost the performance of 4 systems in one, consuming the power of.... well >> maybe even less than a single one. I don't want to go into discussions about >> performance gain/loose due to the variation of the hardware architecture. >> Invest some bucks (if you haven't done that yet) in a good power source. >> Efficiency of OEM unbranded power sources is realy pathetic. may be 45-50% >> efficiency, while a good power source might be 75-80% efficient. Use the >> energy for computing, not for heating your house. >> What I mean is that you could consider just collapsing a complete "small" >> cluster into single system. If your application is CPU-bound and not I/O >> bound, VMware Server could be an option, as it is free software >> (unfortunately not open, even tough some patches can be done on the >> drivers). I think it is not possible to publish benchmarking data about >> VMware, but I can tell you that in long timescales, the performance you get >> in the host OS is similar than the one of the guest OS. There are a lot of >> problems related to jitter, from crazy clocks to delays, but if your >> application is not sensitive to that, then you are Ok. >> Maybe this is not a solution, but you can provide more information regarding >> your problem before quitting... >> >> my 2 cents.... >> >> ariel >> >> Mark Kosmowski escribi?: >> >>> At some point there a cost-benefit analysis needs to be performed. If >>> my cluster at peak usage only uses 4 Gb RAM per CPU (I live in >>> single-core land still and do not yet differentiate between CPU and >>> core) and my nodes all have 16 Gb per CPU then I am wasting RAM >>> resources and would be better off buying new machines and physically >>> transferring the RAM to and from them or running more jobs each >>> distributed across fewer CPUs. Or saving on my electricity bill and >>> powering down some nodes. >>> >>> As heretical as this last sounds, I'm tempted to throw in the towel on >>> my PhD studies because I can no longer afford the power to run my >>> three node cluster at home. Energy costs may end up being the straw >>> that breaks this camel's back. >>> >>> Mark E. 
Kosmowski >>> >>> >>> >>>> From: "Jon Aquilina" >>>> >>>> >>> >>> >>>> not sure if this applies to all kinds of senarios that clusters are used >> in >>>> but isnt the more ram you have the better? >>>> >>>> On 6/30/08, Vincent Diepeveen wrote: >>>> >>>> >>>>> Toon, >>>>> >>>>> Can you drop a line on how important RAM is for weather forecasting in >>>>> latest type of calculations you're performing? >>>>> >>>>> Thanks, >>>>> Vincent >>>>> >>>>> >>>>> On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: >>>>> >>>>> Jim Lux wrote: >>>>> >>>>> >>>>>> Yep. And for good reason. Even a big DoD job is still tiny in >> Nvidia's >>>>>> >>>>>>> scale of operations. We face this all the time with NASA work. >>>>>>> Semiconductor manufacturers have no real reason to produce >> special purpose >>>>>>> or customized versions of their products for space use, because >> they can >>>>>>> sell all they can make to the consumer market. More than once, >> I've had a >>>>>>> phone call along the lines of this: >>>>>>> "Jim: I'm interested in your new ABC321 part." >>>>>>> "Rep: Great. I'll just send the NDA over and we can talk about >> it." >>>>>>> "Jim: Great, you have my email and my fax # is..." >>>>>>> "Rep: By the way, what sort of volume are you going to be using?" >>>>>>> "Jim: Oh, 10-12.." >>>>>>> "Rep: thousand per week, excellent..." >>>>>>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe >> every >>>>>>> year." >>>>>>> "Rep: Oh..." >>>>>>> {Well, to be fair, it's not that bad, they don't hang up on you.. >>>>>>> >>>>>>> >>>>>>> >>>>>> Since about a year, it's been clear to me that weather forecasting >> (i.e., >>>>>> running a more or less sophisticated atmospheric model to provide >> weather >>>>>> predictions) is going to be "mainstream" in the sense that every >> business >>>>>> that needs such forecasts for its operations can simply run them >> in-house. >>>>>> Case in point: I bought a $1100 HP box (the obvious target group >> being >>>>>> teenage downloaders) which performs the HIRLAM limited area model >> *on the >>>>>> grid that we used until October 2006* in December last year. >>>>>> >>>>>> It's about twice as slow as our then-operational 50-CPU Sun Fire >> 15K. >>>>>> I wonder what effect this will have on CPU developments ... 
>>>>>> >>>>>> -- >>>>>> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 >> 214290 >>>>>> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands >>>>>> At home: http://moene.indiv.nluug.nl/~toon/ >>>>>> Progress of GNU Fortran: >> http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Beowulf mailing list, Beowulf@beowulf.org >>>>> To change your subscription (digest mode or unsubscribe) visit >>>>> http://www.beowulf.org/mailman/listinfo/beowulf >>>>> >>>>> >>>>> >>>> -- >>>> Jonathan Aquilina >>>> >>>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >>> >>> > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From henning.fehrmann at aei.mpg.de Wed Jul 2 06:42:28 2008 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B7AB9.9050202@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: <20080702134228.GA5152@gretchen.aei.uni-hannover.de> > Which corresponds exactly to the maximum achievable mounts of 358 right 359 ;) If the number of mounts is smaller the ports are randomly used in this range. It would be convenient to enter the insecure area. Using the option insecure for the NFS exports is apparently not sufficient. Also every nfs server is connected from a distinct port on the client side. Two mounts to a single server might end up on the same port. Cheers Henning From gerry.creager at tamu.edu Wed Jul 2 07:09:34 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <04BB8220-B185-42A2-8E34-DA61066B6D51@myri.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <486B6501.5000108@aei.mpg.de> <04BB8220-B185-42A2-8E34-DA61066B6D51@myri.com> Message-ID: <486B8C1E.2090007@tamu.edu> Scott Atchley wrote: > On Jul 2, 2008, at 7:22 AM, Carsten Aulbert wrote: > >> Bogdan Costescu wrote: >>> >>> Have you considered using a parallel file system ? >> >> We looked a bit into a few, but would love to get any input from anyone >> on that. What we found so far was not really convincing, e.g. glusterFS >> at that time was not really stable, lustre was too easy to crash - at l >> east at that time, ... > > Hi Carsten, > > I have not looked at GlusterFS at all. I have worked with Lustre and > PVFS2 (I wrote the shims to allow them to run on MX). 
> > Although I believe Lustre's robustness is very good these days, I do not > believe that it will not work in your setting. I think that they > currently do not recommend mounting a client on a node that is also > working as a server as you are doing with NFS. I believe it is due to > memory contention leading to deadlock. Lustre is good enough that it's the parallel FS at TACC for the Ranger cluster. And, I've had no real problems as a user thereof. We're brining up glustre on our new cluster here ( CentOS/RHEL5, not debian ). We looked at zfs but didn't have sufficient experience to go that path. > PVFS2 does, however, support your scenario where each node is a server > and can be mounted locally as well. PVFS2 servers run in userspace and > can be easily debugged. If you are using MPI-IO, it integrates nicely as > well. Even so, keep in mind that using each node as a server will > consume network resources and will compete with MPI communications. Someone at NCAR recently suggested we review PVFS2. I'm gonna do it as soon as I get a free moment on vacation. -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From Bogdan.Costescu at iwr.uni-heidelberg.de Wed Jul 2 07:12:09 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87d4lweagv.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> Message-ID: On Wed, 2 Jul 2008, Perry E. Metzger wrote: > A given client would need to be forming over 1000 connections to a > given server NFS port for that to be a problem. Not quite. The reserved ports that are free for use (512 and up) are not all free to be taken by NFS as it pleases - there are many daemons that have to use those well-known ports. F.e. some years ago a common complaint was that the CUPS daemon (port 631) was often conflicting with NFS client mounts; I think that what was chosen by various distributions was the easy way out - make the NFS client only allocate ports starting at 650 or so. > Every machine might get 1341 connections from clients, and every > machine might make 1341 client connections going out to other > machines None of this should cause you to run out of ports, period. With all due respect, I think that you are not quite familiar with the NFS implementation on Linux (and maybe other NFS implementations). What you describe is the theoretical use of TCP connections; the way NFS on Linux uses TCP is not quite as you imagine: there is one port taken on the client for each NFS mount and that port is not reused. Also mounting 2 different mount points from the same NFS server to the same NFS client uses 2 TCP ports on the client side - at least with NFS v2 and v3; for v4 I think that there is only one connection between a client and a server independent on the number of mount points. I do encourage you to subscribe to the Linux NFS list if you want to learn more; I've been there for a long time (unfortunately not anymore...) and the people, especially the developers, were very helpful. 
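A quick way to watch how much of that reserved range a client is actually consuming is to count the local ports behind its NFS connections - a rough sketch, assuming the usual Linux netstat -tn output format:

#!/usr/bin/perl
use strict;
use warnings;

# Count established TCP connections whose remote side is the NFS port
# (2049) and how many distinct reserved local ports (< 1024) they use.
# Assumes the usual Linux "netstat -tn" column layout:
# Proto Recv-Q Send-Q Local-Address Foreign-Address State
open(my $ns, '-|', 'netstat', '-tn') or die "cannot run netstat: $!\n";

my $conns = 0;
my %local_port;
while (<$ns>) {
    my @f = split;
    next unless @f >= 5 && $f[4] =~ /:2049$/;   # remote side is an NFS server
    $conns++;
    my ($lport) = $f[3] =~ /:(\d+)$/;           # keep the local port number
    $local_port{$lport}++ if defined $lport;
}
close($ns);

my $reserved = grep { $_ < 1024 } keys %local_port;
printf "%d NFS connections, %d distinct reserved local ports in use\n",
    $conns, $reserved;

On the clients discussed here that should give roughly the same number as Carsten's netstat one-liner, and makes it easy to see how close a node is to the ceiling described above.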
-- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From ntmoore at gmail.com Wed Jul 2 07:22:37 2008 From: ntmoore at gmail.com (Nathan Moore) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: <486B8634.6020309@scalableinformatics.com> References: <486B8634.6020309@scalableinformatics.com> Message-ID: <6009416b0807020722l56f05affs878b762d285bba9d@mail.gmail.com> Does your university have public computer labs? Do the computers run some variant of Unix? At UMN, where I did my grad work in physics, there were a number of semi-public "Scientific Visualization" or "Large Data Analysis" labs that were hosted in the local supercomputer center. The center there has a number of large machines that you had to apply and give a really good rationale to use, but the smaller development labs (with 2-way to 10-way sunfires, similar sized sgi's, linux machines, etc) basically sat vacant 5-6 days per week. Some of the labs had a pbs queue, some had a condor queue, and some just required that background jobs be "nice +19 ./a.out". My graduate work required several large parametric studies which computationally looked like lots of monte-carlo-ish runs which could be done in parallel. The beauty of this was that no message passing was required, so, if there were 23 cores open one evening at 6pm, and assuming no one would be doing work overnight (for the next 14 hours), I could start 23 14 hour jobs at 6pm and have a little less than 2 weeks of cpu work done by 8am the next morning. I used (and mentioned) the technique in the paper, http://www.pnas.org/cgi/content/full/101/37/13431 (search for "computational impotence"). This only works though if your university's computer labs run a unix-ish os, and if the sysadmins are progressive. At the school where I presently teach similar endeavors have been much harder to start-up. Nathan Moore On Wed, Jul 2, 2008 at 8:44 AM, Joe Landman wrote: > Hi Mark > > Mark Kosmowski wrote: > >> I'm in the US. I'm almost, but not quite ready for production runs - >> still learning the software / computational theory. I'm the first >> person in the research group (physical chemistry) to try to learn >> plane wave methods of solid state calculation as opposed to isolated >> atom-centered approximations and periodic atom centered calculations. >> > > Heh... my research group in grad school went through that transition in the > mid 90s. Went from an LCAO-type simulation to CP like methods. We needed a > t3e to run those (then). > > Love to compare notes and see which code you are using someday. > On-list/off-list is fine. > > It is turning out that the package I have spent the most time learning >> is perhaps not the best one for what we are doing. For a variety of >> reasons, many of which more off-topic than tac nukes and energy >> efficient washing machines ;) , I'm doing my studies part-time while >> working full-time in industry. >> > > More power to ya! I did mine that way too ... the writing was the hardest > part. Just don't lose focus, or stop believing you can do it. When the > light starts getting visible at the end of the process, it is quite > satisfying. > > I have other words to describe this, but they require a beer lever to get > them out of me ... > > I think I have come to a compromise that can keep me in business. 
>> Until I have a better understanding of the software and am ready for >> production runs, I'll stick to a small system that can be run on one >> node and leave the other two powered down. I've also applied for an >> adjunt instructor position at a local college for some extra cash and >> good experience. When I'm ready for production runs I can either just >> bite the bullet and pay the electricity bill or seek computer time >> elsewhere. >> > > Give us a shout when you want to try the time on a shared resource. Some > folks here may be able to make good suggestions. RGB is a physics guy at > Duke, doing lots of simulations, and might know of resources. Others here > might as well. > > Joe > > > >> Thanks for the encouragement, >> >> Mark E. Kosmowski >> >> On 7/1/08, ariel sabiguero yawelak wrote: >> >>> Well Mark, don't give up! >>> I am not sure which one is your application domain, but if you require >>> 24x7 >>> computation, then you should not be hosting that at home. >>> On the other hand, if you are not doing real computation and you just >>> have a >>> testbed at home, maybe for debugging your parallel applications or >>> something >>> similar, you might be interested in a virtualized solution. Several years >>> ago, I used to "debug" some neural networks at home, but training >>> sessions >>> (up to two weeks of training) happened at the university. >>> I would suggest to do something like that. >>> You can always scale-down your problem in several phases and save the >>> complete data-set / problem for THE RUN. >>> >>> You are not being a heretic there, but suffering energy costs ;-) >>> In more places that you may believe, useful computing nodes are being >>> replaced just because of energy costs. Even in some application domains >>> you >>> can even loose computational power if you move from 4 nodes into a single >>> quad-core (i.e. memory bandwidth problems). I know it is very nice to be >>> able to do everything at home.. but maybe before dropping your studies or >>> working overtime to pay the electricity bill, you might want to >>> reconsider >>> the fact of collapsing your phisical deploy into a single virtualized >>> cluster. (or just dispatch several threads/processes in a single system). >>> If you collapse into a single system you have only 1 mainboard, one HDD, >>> one >>> power source, one processor (physically speaking), .... and you can >>> achieve >>> almost the performance of 4 systems in one, consuming the power of.... >>> well >>> maybe even less than a single one. I don't want to go into discussions >>> about >>> performance gain/loose due to the variation of the hardware architecture. >>> Invest some bucks (if you haven't done that yet) in a good power source. >>> Efficiency of OEM unbranded power sources is realy pathetic. may be >>> 45-50% >>> efficiency, while a good power source might be 75-80% efficient. Use the >>> energy for computing, not for heating your house. >>> What I mean is that you could consider just collapsing a complete "small" >>> cluster into single system. If your application is CPU-bound and not I/O >>> bound, VMware Server could be an option, as it is free software >>> (unfortunately not open, even tough some patches can be done on the >>> drivers). I think it is not possible to publish benchmarking data about >>> VMware, but I can tell you that in long timescales, the performance you >>> get >>> in the host OS is similar than the one of the guest OS. 
There are a lot >>> of >>> problems related to jitter, from crazy clocks to delays, but if your >>> application is not sensitive to that, then you are Ok. >>> Maybe this is not a solution, but you can provide more information >>> regarding >>> your problem before quitting... >>> >>> my 2 cents.... >>> >>> ariel >>> >>> Mark Kosmowski escribi?: >>> >>> At some point there a cost-benefit analysis needs to be performed. If >>>> my cluster at peak usage only uses 4 Gb RAM per CPU (I live in >>>> single-core land still and do not yet differentiate between CPU and >>>> core) and my nodes all have 16 Gb per CPU then I am wasting RAM >>>> resources and would be better off buying new machines and physically >>>> transferring the RAM to and from them or running more jobs each >>>> distributed across fewer CPUs. Or saving on my electricity bill and >>>> powering down some nodes. >>>> >>>> As heretical as this last sounds, I'm tempted to throw in the towel on >>>> my PhD studies because I can no longer afford the power to run my >>>> three node cluster at home. Energy costs may end up being the straw >>>> that breaks this camel's back. >>>> >>>> Mark E. Kosmowski >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/9fbf614d/attachment.html From perry at piermont.com Wed Jul 2 07:26:13 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B7AB9.9050202@aei.mpg.de> (Carsten Aulbert's message of "Wed\, 02 Jul 2008 14\:55\:21 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: <874p78e516.fsf@snark.cb.piermont.com> Skip to the bottom for advice on how to make NFS only use non-prived ports. My guess is still that it isn't priv ports that are causing trouble, but I describe at the bottom what you need to do to get rid of that issue entirely. I'd advise reading the rest, but the part about how to disable the stuff is after the --- near the bottom. Carsten Aulbert writes: > Well, I understand your reasoning, but that's contradicted to what we do see > > netstat -an|awk '/2049/ {print $4}'|sed 's/10.10.13.41://'|sort -n > > shows us the follwing: Are those all mounts to ONE HOST? Because if they are, you're going to run out of ports. If you're connecting to multiple hosts should you be okay, but you certainly could run out of ports between two hosts -- you only have 1023 prived connections from a given host to a single port on another box. Of course, one might validly ask why the other 650 odd ports aren't usable -- clearly they should be, right? The limit is 1023, not 358. It might be that there is some Linux oddness here. Anyway, this shouldn't be a problem if you're connecting to MANY servers, but maybe there's some linux weirdness here. See below. > Which corresponds exactly to the maximum achievable mounts of 358 right > now. Besides, I'm far from being an expert on TCP/IP, but is it possible > for a local process to bind to a port which is already in use but to > another host? Of course! You can use the same local port number with connections to different remote hosts. 
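(For anyone who wants to convince themselves, here is a minimal sketch of exactly that: two TCP sockets pinned to the same local port but connected to two different servers. It assumes a reasonably recent Linux kernel, since it leans on SO_REUSEPORT, which did not exist when this thread was written, and the 192.0.2.x addresses and port 22 are only placeholders for two hosts that are actually listening.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Open a TCP socket, pin it to local port 61000, and connect it to the
 * given server.  Both SO_REUSEADDR and SO_REUSEPORT are set so two
 * sockets may share the local port; the 4-tuples still differ because
 * the remote ends differ. */
static int connect_from_fixed_port(const char *server_ip, int server_port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    struct sockaddr_in local, remote;

    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(61000);          /* same local port both times */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        perror("bind");
        exit(1);
    }

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(server_port);
    inet_pton(AF_INET, server_ip, &remote.sin_addr);
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) < 0) {
        perror("connect");
        exit(1);
    }
    return fd;
}

int main(void)
{
    /* placeholder addresses: substitute two machines that are listening */
    int a = connect_from_fixed_port("192.0.2.10", 22);
    int b = connect_from_fixed_port("192.0.2.11", 22);
    printf("sockets %d and %d both use local port 61000\n", a, b);
    close(a);
    close(b);
    return 0;
}

(Run netstat -an on the client afterwards and both ESTABLISHED connections show the identical local port; only the remote half of the 4-tuple differs.)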
You can even use the same local port number with multiple connections to the same remote host provided the remote host is using different port numbers on its end. Every open socket is a 4-tuple of localip:localport:remoteip:remoteport Provided two sockets don't share that 4-tuple, you can have both. Now, a given OS may screw up how they handle this, but the *protocol* certainly permits it. Perhaps you're right and Linux isn't dealing with this gracefully. We can check that. > I don't think so, but may be wrong. Then how does an SMTP server handle thousands of simultaneous connections all coming to port 25? :) In any case, this is what the NFS FAQ says. It does mention the priv port problem, but only in a context in which makes me think it is talking about two given hosts and not one client and many hosts. However, I might be wrong. See below: >From http://nfs.sourceforge.net/ B3. Why can't I mount more than 255 NFS file systems on my client? Why is it sometimes even less than 255? A. On Linux, each mounted file system is assigned a major number, which indicates what file system type it is (eg. ext3, nfs, isofs); and a minor number, which makes it unique among the file systems of the same type. In kernels prior to 2.6, Linux major and minor numbers have only 8 bits, so they may range numerically from zero to 255. Because a minor number has only 8 bits, a system can mount only 255 file systems of the same type. So a system can mount up to 255 NFS file systems, another 255 ext3 file system, 255 more iosfs file systems, and so on. Kernels after 2.6 have 20-bit wide minor numbers, which alleviate this restriction. For the Linux NFS client, however, the problem is somewhat worse because it is an anonymous file system. Local disk-based file systems have a block device associated with them, but anonymous file systems do not. /proc, for example, is an anonymous file system, and so are other network file systems like AFS. All anonymous file systems share the same major number, so there can be a maximum of only 255 anonymous file systems mounted on a single host. Usually you won't need more than ten or twenty total NFS mounts on any given client. In some large enterprises, though, your work and users might be spread across hundreds of NFS file servers. To work around the limitation on the number of NFS file systems you can mount on a single host, we recommend that you set up and run one of the automounter daemons for Linux. An automounter finds and mounts file systems as they are needed, and unmounts any that it finds are inactive. You can find more information on Linux automounters here. You may also run into a limit on the number of privileged network ports on your system. The NFS client uses a unique socket with its own port number for each NFS mount point. Using an automounter helps address the limited number of available ports by automatically unmounting file systems that are not in use, thus freeing their network ports. NFS version 4 support in the Linux NFS client uses a single socket per client-server pair, which also helps increase the allowable number of NFS mount points on a client. Now, until you brought this up, I would have guessed that this meant you could run out of priv ports between host A and host B -- i.e. host B is the client, is connecting to one port on host A, and is trying to mount more than 1023 file systems on host A and fails because it runs out of priv ports. 
However, if your test is not between two hosts but is rather between multiple hosts, perhaps for whatever reason Linux is braindead and is not allowing you to re-use the same local socket ports. We can diagnose that later. --- So, here are the things you need to do to totally remove the priv ports thing from the situation: 1) On the server, in your exports file you have to put the "insecure" option onto every exported file system. Otherwise the mountd will demand that the remote side use a "secure" mount. You've already done this according to the initial mail message. However, that only tells the server not to care if the client comes in from a port above 1024 2) The client side is where the action is -- the client picks the port it opens after all. Unfortunately, Linux DOES NOT have an option to do this. BSD, Solaris, etc. do, but not Linux. You need to hack the source to make it happen. On a reasonably current source tree, go to: /usr/src/linux/fs/nfs/mount_clnt.c and look for the argument structure being built for rpc_create. You need to or-in RPC_CLNT_CREATE_NONPRIVPORT to the .flags member, as in (for example, depending on your version, this is 2.6.24): .flags = RPC_CLNT_CREATE_INTR, to .flags = RPC_CLNT_CREATE_INTR | RPC_CLNT_CREATE_NONPRIVPORT, This is a bloody ugly hack that will make ALL connections unprived, so you might have trouble with "normal" mounts. This can be done more cleanly, but it would require more than a one line patch. However, it would get you through testing. If it works for you and you really need it, a clean mount option could be added. My guess is that this is not your problem! However, can check and see if I'm wrong, and if I am, then we can move on to fixing it better. Perry -- Perry E. Metzger perry@piermont.com From perry at piermont.com Wed Jul 2 07:30:14 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080702134228.GA5152@gretchen.aei.uni-hannover.de> (Henning Fehrmann's message of "Wed\, 2 Jul 2008 15\:42\:28 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <20080702134228.GA5152@gretchen.aei.uni-hannover.de> Message-ID: <87wsk4cqa1.fsf@snark.cb.piermont.com> Henning Fehrmann writes: >> Which corresponds exactly to the maximum achievable mounts of 358 right > > 359 ;) > > If the number of mounts is smaller the ports are randomly used in this range. > It would be convenient to enter the insecure area. > Using the option insecure for the NFS exports is apparently not > sufficient. Well, no, it isn't. The server doesn't control what the client does. The "insecure" option only says the server will accept such connections -- you have to tell the client to make them. On BSD and Solaris that's easy, but on Linux you need to hack the kernel. I have just sent a message explaining how do do that. Note that I still don't think this is your problem, but you might as well check. -- Perry E. Metzger perry@piermont.com From perry at piermont.com Wed Jul 2 07:35:45 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: (Bogdan Costescu's message of "Wed\, 2 Jul 2008 16\:12\:09 +0200 \(CEST\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> Message-ID: <87skuscq0u.fsf@snark.cb.piermont.com> Bogdan Costescu writes: >> Every machine might get 1341 connections from clients, and every >> machine might make 1341 client connections going out to other >> machines None of this should cause you to run out of ports, period. > > With all due respect, I think that you are not quite familiar with the > NFS implementation on Linux (and maybe other NFS > implementations). I'm plenty familiar with the implementations on other OSes. I only looked at the code on Linux this morning for the first time (never had call before)... > What you describe is the theoretical use of TCP > connections; the way NFS on Linux uses TCP is not quite as you > imagine: there is one port taken on the client for each NFS mount and > that port is not reused. That's not an NFS implementation issue. It is a TCP implementation issue. (Actually, I'm currently looking at the code and it may be an issue in the rpc code, but never mind that.) In general, the OS should let you use a given port to connect to as many remote hosts as you like. The only thing it should prevent is having you talk to a single remote host/port combination from one local port (because you can't -- that would be the same 4-tuple.) > Also mounting 2 different mount points from > the same NFS server to the same NFS client uses 2 TCP ports on the > client side - at least with NFS v2 and v3; for v4 I think that there > is only one connection between a client and a server independent on > the number of mount points. That is indeed correct. (Actually, linux can burn more than 2 ports, depending.) -- Perry E. Metzger perry@piermont.com From rgb at phy.duke.edu Wed Jul 2 07:50:40 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] A press release In-Reply-To: <87bq1hgpep.fsf@snark.cb.piermont.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <87bq1hgpep.fsf@snark.cb.piermont.com> Message-ID: On Tue, 1 Jul 2008, Perry E. Metzger wrote: > > Prentice Bisbal writes: >>>> does it necessarily have to be a redhat clone. can it also be a debian >>>> based >>>> clone? >>> >>> but why? is there some concrete advantage to using Debian? >>> I've never understood why Debian users tend to be very True Believer, >>> or what it is that hooks them. >> >> And the Debian users can say the same thing about Red Hat users. Or SUSE >> users. And if any still exist, the Slackware users could say the same >> thing about the both of them. But then the Slackware users could also >> point out that the first Linux distro was Slackware, so they are using >> the one true Linux distro... Or rather, one of two or three contemporary "firsts", in the guise of SLS which became Slackware. I actually started with SLS and then transitioned to Slackware, all 20 or 30 little floppies of it. The problem (for me) was getting an install on a 4 MB system, which is all that I had at the time. > Precisely. It pays to allow people to use what they want. 
Fewer > religious battles that way. Whether one distro or another has an > advantage isn't the point -- people have their own tastes and it > doesn't pay to tell them "no" without good reason. It isn't all about religion. There are two "real" problems with Slackware. One is its packaging system, the other (related) is maintenance. It's packaging system doesn't really manage dependences or automated updates, and dependence resolution is a major pain in the ass when one is installing a large sheaf of applications all at once. I was once a passionate, fervent, nay, religious user -- it has/had a very SunOS/BSD-like etc layout that was quite painless for me to work, moving over from administrating a mostly-SunOS network, where RH had a much more SysV-like interface that I had to learn. The sources for most of its apps were visibly ports of of the same software I regularly built for the Suns -- remember that right up to linux, Sun workstations were "the" unix boxes for people that wrote and adopted Linux. Maintaining all the open source packages was "easy" on Suns because that is what the open source writers were using and was usually the makefile default, but it was a PITA (or more practically, "expensive" in human time and duplicated effort) there as well. Beyond automated install/updates and dependencies (that now can be sort-of-managed with add-ons basically derived from apt tools or rpm tools) Slackware's other major problem is simply its up-to-dateness. I don't know numbers, but I think it is way, way behind in number of users these days to both Debian and RH-derived distros, not to mention all the rest. I'd be surprised if it were as high as fifth in user base. This basically means that there is a time lag between package developments and releases in the other distros where the user (and hence DEVELOPER) base reside. Then there is a further delay in getting builds in that work with the existing dependencies, because there is no dependency system to speak of. Time lags of this sort are windows of opportunity when security exploits are discovered. They also annoy users, who ask "why is X available in distro Y but not here?" I think of Slackware as being a great hacker distro, a good distro for somebody who wants to work close to the metal (and very hard) to manage their sources, but not the best distro for trouble-free, scalable maintenance of a large network of systems OR for individual users installing a personal standalone workstation. These two points aren't (I think) "religion" -- they are practical costs associated with using the distro for clusters or workstation LANs or personal workstations that need to be considered when picking a distro for any of those purposes. When I considered them, I switched. The human costs are real; people pay money for them or they come out of a fixed opportunity cost time budget. One person can manage a staggeringly large, surprisingly heterogeneous network of RH-derived systems with kickstart with very little effort -- what effort one expends scales up to the entire network. Debian is reportedly similarly manageable at scale, although I have less experience there. I have never heard anyone say "Yeah, Slackware, that's the best distro to use if you have just one person and she has to manage four hundred systems in a mix of cluster, lab and desktop LAN settings. rgb > > Perry > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From perry at piermont.com Wed Jul 2 07:57:02 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] A press release In-Reply-To: (Robert G. Brown's message of "Wed\, 2 Jul 2008 10\:50\:40 -0400 \(EDT\)") References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <87bq1hgpep.fsf@snark.cb.piermont.com> Message-ID: <87skusbagx.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: >> Precisely. It pays to allow people to use what they want. Fewer >> religious battles that way. Whether one distro or another has an >> advantage isn't the point -- people have their own tastes and it >> doesn't pay to tell them "no" without good reason. > > It isn't all about religion. There are two "real" problems with > Slackware. One is its packaging system, the other (related) is > maintenance. I wasn't mentioning Slackware. The "major" distros are all pretty similar in features, but I wouldn't count Slackware that way. Perry From rgb at phy.duke.edu Wed Jul 2 08:12:05 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B7AB9.9050202@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: On Wed, 2 Jul 2008, Carsten Aulbert wrote: > Which corresponds exactly to the maximum achievable mounts of 358 right > now. Besides, I'm far from being an expert on TCP/IP, but is it possible > for a local process to bind to a port which is already in use but to > another host? I don't think so, but may be wrong. AFAIK, no they don't. The way TCP daemons that listen on a well-known/privileged port work is that they accept a connection on that port, then fork a connection on a higher unprivileged (>1023) port on both ends so that the daemon can listen once again. You can see this by running e.g. netstat -a. Many daemons have a limit that can be set on the number of simultaneous connections they can manage. However, this is for TCP ports that maintain a persistent connection. UDP ports are "connectionless" and hence somewhat different. They tend to make a connection, receive a command/request for some service, immediately deliver the result, and end the connection. NFS used to be built on top of UDP, and honestly I don't know what it does and how it (NFSv3) does it on TCP and am too lazy to look it up, but the RFCs are there to be read. rgb > > Cheers > > Carsten > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From mark.kosmowski at gmail.com Wed Jul 2 08:19:42 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: <486B8634.6020309@scalableinformatics.com> References: <486B8634.6020309@scalableinformatics.com> Message-ID: On 7/2/08, Joe Landman wrote: > Hi Mark > > Mark Kosmowski wrote: > > I'm in the US. I'm almost, but not quite ready for production runs - > > still learning the software / computational theory. I'm the first > > person in the research group (physical chemistry) to try to learn > > plane wave methods of solid state calculation as opposed to isolated > > atom-centered approximations and periodic atom centered calculations. > > > > Heh... my research group in grad school went through that transition in the > mid 90s. Went from an LCAO-type simulation to CP like methods. We needed a > t3e to run those (then). > > Love to compare notes and see which code you are using someday. > On-list/off-list is fine. Right now I'm using CPMD. This is the first package I've looked at and wrestled with the 32-bit limitations of memory allocation prior to the debut of the Opterons. I was at the cusp of buying UltraSparc hardware at student pricing to go forward when the Opterons were released to market, so I decided to go with the PC hardware I was already familiar with. We're comparing calculations to inelastic neutron scattering experiments and it looks like abinit or quantum espresso might be a better choice for this to do vibrational analysis at q-space other than the gamma point. Speaking of, I only have an eighth of a clue about understanding k-points (and, by extension, q-space). If anyone can suggest some reading for this topic that even a part-time chemistry student can understand it would be greatly appreciated. > > > It is turning out that the package I have spent the most time learning > > is perhaps not the best one for what we are doing. For a variety of > > reasons, many of which more off-topic than tac nukes and energy > > efficient washing machines ;) , I'm doing my studies part-time while > > working full-time in industry. > > > > More power to ya! I did mine that way too ... the writing was the hardest > part. Just don't lose focus, or stop believing you can do it. When the > light starts getting visible at the end of the process, it is quite > satisfying. > > I have other words to describe this, but they require a beer lever to get > them out of me ... I make mead on occaision - if you're ever in central NY (Syracuse - Rome - Utica area)... Speaking of satisfaction, I did teach myself enough Fortran to add to the CPMD code to give an output format natively readable by aClimax (used to calculate harmonics from fundamental frequencies for INS). This is/will be included in the recently/soon to be released version of CPMD. Heck, there's one or two pages of dissertation right there. :) > > > I think I have come to a compromise that can keep me in business. > > Until I have a better understanding of the software and am ready for > > production runs, I'll stick to a small system that can be run on one > > node and leave the other two powered down. I've also applied for an > > adjunt instructor position at a local college for some extra cash and > > good experience. 
When I'm ready for production runs I can either just > > bite the bullet and pay the electricity bill or seek computer time > > elsewhere. > > > > Give us a shout when you want to try the time on a shared resource. Some > folks here may be able to make good suggestions. RGB is a physics guy at > Duke, doing lots of simulations, and might know of resources. Others here > might as well. > > Joe > > Sounds good. The big thing is getting a bit better understanding of the theory, especially DFT dispersion correction to account for hydrogen bonding. I'm thinking that I will learn about DFT dispersion correction with CPMD to at least get a reasonable understanding and then consider learning one of the other packages to do q-space calculations. > > > > Thanks for the encouragement, > > > > Mark E. Kosmowski > > > > On 7/1/08, ariel sabiguero yawelak wrote: > > > > > Well Mark, don't give up! > > > I am not sure which one is your application domain, but if you require > 24x7 > > > computation, then you should not be hosting that at home. > > > On the other hand, if you are not doing real computation and you just > have a > > > testbed at home, maybe for debugging your parallel applications or > something > > > similar, you might be interested in a virtualized solution. Several > years > > > ago, I used to "debug" some neural networks at home, but training > sessions > > > (up to two weeks of training) happened at the university. > > > I would suggest to do something like that. > > > You can always scale-down your problem in several phases and save the > > > complete data-set / problem for THE RUN. > > > > > > You are not being a heretic there, but suffering energy costs ;-) > > > In more places that you may believe, useful computing nodes are being > > > replaced just because of energy costs. Even in some application domains > you > > > can even loose computational power if you move from 4 nodes into a > single > > > quad-core (i.e. memory bandwidth problems). I know it is very nice to be > > > able to do everything at home.. but maybe before dropping your studies > or > > > working overtime to pay the electricity bill, you might want to > reconsider > > > the fact of collapsing your phisical deploy into a single virtualized > > > cluster. (or just dispatch several threads/processes in a single > system). > > > If you collapse into a single system you have only 1 mainboard, one HDD, > one > > > power source, one processor (physically speaking), .... and you can > achieve > > > almost the performance of 4 systems in one, consuming the power of.... > well > > > maybe even less than a single one. I don't want to go into discussions > about > > > performance gain/loose due to the variation of the hardware > architecture. > > > Invest some bucks (if you haven't done that yet) in a good power source. > > > Efficiency of OEM unbranded power sources is realy pathetic. may be > 45-50% > > > efficiency, while a good power source might be 75-80% efficient. Use the > > > energy for computing, not for heating your house. > > > What I mean is that you could consider just collapsing a complete > "small" > > > cluster into single system. If your application is CPU-bound and not I/O > > > bound, VMware Server could be an option, as it is free software > > > (unfortunately not open, even tough some patches can be done on the > > > drivers). 
I think it is not possible to publish benchmarking data about > > > VMware, but I can tell you that in long timescales, the performance you > get > > > in the host OS is similar than the one of the guest OS. There are a lot > of > > > problems related to jitter, from crazy clocks to delays, but if your > > > application is not sensitive to that, then you are Ok. > > > Maybe this is not a solution, but you can provide more information > regarding > > > your problem before quitting... > > > > > > my 2 cents.... > > > > > > ariel > > > > > > Mark Kosmowski escribi?: > > > > > > > > > > At some point there a cost-benefit analysis needs to be performed. If > > > > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > > > > single-core land still and do not yet differentiate between CPU and > > > > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > > > > resources and would be better off buying new machines and physically > > > > transferring the RAM to and from them or running more jobs each > > > > distributed across fewer CPUs. Or saving on my electricity bill and > > > > powering down some nodes. > > > > > > > > As heretical as this last sounds, I'm tempted to throw in the towel on > > > > my PhD studies because I can no longer afford the power to run my > > > > three node cluster at home. Energy costs may end up being the straw > > > > that breaks this camel's back. > > > > > > > > Mark E. Kosmowski > > > > > > > > > > > > > > > > > > > > > From: "Jon Aquilina" > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > not sure if this applies to all kinds of senarios that clusters are > used > > > > > > > > > > > > in > > > > > > > > > > > > but isnt the more ram you have the better? > > > > > > > > > > On 6/30/08, Vincent Diepeveen wrote: > > > > > > > > > > > > > > > > > > > > > Toon, > > > > > > > > > > > > Can you drop a line on how important RAM is for weather > forecasting in > > > > > > latest type of calculations you're performing? > > > > > > > > > > > > Thanks, > > > > > > Vincent > > > > > > > > > > > > > > > > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > > > > > > > > > > > Jim Lux wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Yep. And for good reason. Even a big DoD job is still tiny in > > > > > > > > > > > > > > > > > > > > > > > > > Nvidia's > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > scale of operations. We face this all the time with NASA work. > > > > > > > > Semiconductor manufacturers have no real reason to produce > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > special purpose > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > or customized versions of their products for space use, > because > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > they can > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > sell all they can make to the consumer market. More than once, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've had a > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > phone call along the lines of this: > > > > > > > > "Jim: I'm interested in your new ABC321 part." > > > > > > > > "Rep: Great. I'll just send the NDA over and we can talk about > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > it." > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > "Jim: Great, you have my email and my fax # is..." 
> > > > > > > > "Rep: By the way, what sort of volume are you going to be using?"
> > > > > > > > "Jim: Oh, 10-12.."
> > > > > > > > "Rep: thousand per week, excellent..."
> > > > > > > > "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe every year."
> > > > > > > > "Rep: Oh..."
> > > > > > > > {Well, to be fair, it's not that bad, they don't hang up on you..
> > > > > > >
> > > > > > > Since about a year, it's been clear to me that weather forecasting (i.e., running a more or less sophisticated atmospheric model to provide weather predictions) is going to be "mainstream" in the sense that every business that needs such forecasts for its operations can simply run them in-house.
> > > > > > >
> > > > > > > Case in point: I bought a $1100 HP box (the obvious target group being teenage downloaders) which performs the HIRLAM limited area model *on the grid that we used until October 2006* in December last year.
> > > > > > >
> > > > > > > It's about twice as slow as our then-operational 50-CPU Sun Fire 15K.
> > > > > > >
> > > > > > > I wonder what effect this will have on CPU developments ...
> > > > > > > > > > > > > > -- > > > > > > > Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 > > > > > > > > > > > > > > > > > > > > > > > > > 214290 > > > > > > > > > > > > > > > > > > > > > > > > > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > > > > > > > At home: http://moene.indiv.nluug.nl/~toon/ > > > > > > > Progress of GNU Fortran: > > > > > > > > > > > > > > > > > > > > > > > > > http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > Beowulf mailing list, Beowulf@beowulf.org > > > > > > To change your subscription (digest mode or unsubscribe) visit > > > > > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Jonathan Aquilina > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > Beowulf mailing list, Beowulf@beowulf.org > > > > To change your subscription (digest mode or unsubscribe) visit > > > > > > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics LLC, > email: landman@scalableinformatics.com > web : http://www.scalableinformatics.com > http://jackrabbit.scalableinformatics.com > phone: +1 734 786 8423 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > From prentice at ias.edu Wed Jul 2 08:22:54 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: <486B9D4E.80405@ias.edu> Mark Hahn wrote: >>>> does it necessarily have to be a redhat clone. can it also be a debian >>>> based >>>> clone? >>> >>> but why? is there some concrete advantage to using Debian? >>> I've never understood why Debian users tend to be very True Believer, >>> or what it is that hooks them. >> >> And the Debian users can say the same thing about Red Hat users. Or SUSE > > very nice! an excellent parody of the True Believer response. > > but I ask again: what are the reasons one might prefer using debian? > really, I'm not criticizing it - I really would like to know why it > would matter whether someone (such as ClusterVisionOS (tm)) would use > debian or another distro. > >From my interactions with others re: Debian, it's usually about true opensourceness, since Debian claims that every package distributed by them is GPLed, or some how meets some open source legal criteria. Also, I don't think there's any plan for Debian to go corporate, release and enterprise version, and effectively bite the had that feeds it, like Red Hat and SUSE did. Those are not technical issues, but philosophical/legal/political issues. Me? I use RH and it's derivatives for a couple of reasons. Here they are in historical order: 1. 
When I started learning Linux on my own, all the Linux authorities (websites, LJ, etc) recommended RH b/c RPM made it easy to install software, and if you bought a boxed version, you got the Metro-X X-server, which supported much more video hardware than XFree86 did at the time, and had an easy to use GUI to configure X. 2. Now that I'm a professional system admin who often has to support commercial apps, I find I have to use a RH-based distro for two reasons: A. Most commercial software "supports" only Red Hat. Some go so far as to refuse to install if RH is not detected. The most extreme case of this is EMC PowerPath, whose kernel modules won't install if it's not a RH (or SUSE) kernel. B. Red Hat has done such a good job of spreading FUD about the other Linux distros, management has a cow if you tell them you're installing something other than RH. This is why I consider Red Hat the Microsoft of Linux. None of those are technical issues, either. Since the term "Linux" applies to the kernel only in the strictest sense, there should be no technical reasons to choose one distro over another. Issues like nice GUI management tools are human issues not technical issues. -- Prentice From perry at piermont.com Wed Jul 2 08:23:27 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: (Robert G. Brown's message of "Wed\, 2 Jul 2008 11\:12\:05 -0400 \(EDT\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: <87hcb8b98w.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: > On Wed, 2 Jul 2008, Carsten Aulbert wrote: >> Which corresponds exactly to the maximum achievable mounts of 358 right >> now. Besides, I'm far from being an expert on TCP/IP, but is it possible >> for a local process to bind to a port which is already in use but to >> another host? I don't think so, but may be wrong. > > AFAIK, no they don't. The way TCP daemons that listen on a > well-known/privileged port work is that they accept a connection on that > port, then fork a connection on a higher unprivileged (>1023) port on > both ends so that the daemon can listen once again. Try netstat on a heavily loaded SMTP box. You'll see all these connections from some random foreign port to port 25 locally -- lots of connections to port 25 at the same time. You don't switch to a different port number after the connection comes in, you stay on it. You can in theory talk to up to (nearly) 2^48 different foreign host/port combos off of local port 25, because every remote host/remote port pair makes for a different 4-tuple. > Many daemons have a limit that can be set on the number of > simultaneous connections they can manage. That's a resource issue, not a TCP architecture issue per se. You might not have enough memory, CPU, etc. to handle more than a certain number of connections. By the way, you can now design daemons to handle tens of thousands of simultaneous connections with clean event driven design on a modern multiprocessor with plenty of memory. This is way off topic, though. > However, this is for TCP ports that maintain a persistent connection. > UDP ports are "connectionless" and hence somewhat different. I'm assuming they're doing NFS over TCP. 
If they're using UDP, things are somewhat different because of the existence of "connectionless" UDP. However, they *should* use TCP for performance. (I know people used to claim the opposite, but it turns out you really want TCP so you get proper congestion control.) Perry -- Perry E. Metzger perry@piermont.com From peter.st.john at gmail.com Wed Jul 2 08:25:19 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: References: Message-ID: Mark, Would it be feasible to downclock your three nodes? All you physicists know better than I, that the power draw and heat production are not linear in GHz. A 1 GHz processor is less than half the cost per tick than a 2GHz, so if power budget is more urgent for you than time to completion then that might help; continue running all of your nodes, but slower. But I've never done this myself. OTOH as a mathematician I don't have to :-) See http://xkcd.com/435/ ("Purity") Peter On 7/2/08, Mark Kosmowski wrote: > > I'm in the US. I'm almost, but not quite ready for production runs - > still learning the software / computational theory. I'm the first > person in the research group (physical chemistry) to try to learn > plane wave methods of solid state calculation as opposed to isolated > atom-centered approximations and periodic atom centered calculations. > > It is turning out that the package I have spent the most time learning > is perhaps not the best one for what we are doing. For a variety of > reasons, many of which more off-topic than tac nukes and energy > efficient washing machines ;) , I'm doing my studies part-time while > working full-time in industry. > > I think I have come to a compromise that can keep me in business. > Until I have a better understanding of the software and am ready for > production runs, I'll stick to a small system that can be run on one > node and leave the other two powered down. I've also applied for an > adjunt instructor position at a local college for some extra cash and > good experience. When I'm ready for production runs I can either just > bite the bullet and pay the electricity bill or seek computer time > elsewhere. > > Thanks for the encouragement, > > Mark E. Kosmowski > > On 7/1/08, ariel sabiguero yawelak wrote: > > Well Mark, don't give up! > > I am not sure which one is your application domain, but if you require > 24x7 > > computation, then you should not be hosting that at home. > > On the other hand, if you are not doing real computation and you just > have a > > testbed at home, maybe for debugging your parallel applications or > something > > similar, you might be interested in a virtualized solution. Several years > > ago, I used to "debug" some neural networks at home, but training > sessions > > (up to two weeks of training) happened at the university. > > I would suggest to do something like that. > > You can always scale-down your problem in several phases and save the > > complete data-set / problem for THE RUN. > > > > You are not being a heretic there, but suffering energy costs ;-) > > In more places that you may believe, useful computing nodes are being > > replaced just because of energy costs. Even in some application domains > you > > can even loose computational power if you move from 4 nodes into a single > > quad-core (i.e. memory bandwidth problems). I know it is very nice to be > > able to do everything at home.. 
but maybe before dropping your studies or > > working overtime to pay the electricity bill, you might want to > reconsider > > the fact of collapsing your phisical deploy into a single virtualized > > cluster. (or just dispatch several threads/processes in a single system). > > If you collapse into a single system you have only 1 mainboard, one HDD, > one > > power source, one processor (physically speaking), .... and you can > achieve > > almost the performance of 4 systems in one, consuming the power of.... > well > > maybe even less than a single one. I don't want to go into discussions > about > > performance gain/loose due to the variation of the hardware architecture. > > Invest some bucks (if you haven't done that yet) in a good power source. > > Efficiency of OEM unbranded power sources is realy pathetic. may be > 45-50% > > efficiency, while a good power source might be 75-80% efficient. Use the > > energy for computing, not for heating your house. > > What I mean is that you could consider just collapsing a complete "small" > > cluster into single system. If your application is CPU-bound and not I/O > > bound, VMware Server could be an option, as it is free software > > (unfortunately not open, even tough some patches can be done on the > > drivers). I think it is not possible to publish benchmarking data about > > VMware, but I can tell you that in long timescales, the performance you > get > > in the host OS is similar than the one of the guest OS. There are a lot > of > > problems related to jitter, from crazy clocks to delays, but if your > > application is not sensitive to that, then you are Ok. > > Maybe this is not a solution, but you can provide more information > regarding > > your problem before quitting... > > > > my 2 cents.... > > > > ariel > > > > Mark Kosmowski escribi?: > > > > > At some point there a cost-benefit analysis needs to be performed. If > > > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > > > single-core land still and do not yet differentiate between CPU and > > > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > > > resources and would be better off buying new machines and physically > > > transferring the RAM to and from them or running more jobs each > > > distributed across fewer CPUs. Or saving on my electricity bill and > > > powering down some nodes. > > > > > > As heretical as this last sounds, I'm tempted to throw in the towel on > > > my PhD studies because I can no longer afford the power to run my > > > three node cluster at home. Energy costs may end up being the straw > > > that breaks this camel's back. > > > > > > Mark E. Kosmowski > > > > > > > > > > > > > From: "Jon Aquilina" > > > > > > > > > > > > > > > > > > > > > not sure if this applies to all kinds of senarios that clusters are > used > > in > > > > but isnt the more ram you have the better? > > > > > > > > On 6/30/08, Vincent Diepeveen wrote: > > > > > > > > > > > > > Toon, > > > > > > > > > > Can you drop a line on how important RAM is for weather forecasting > in > > > > > latest type of calculations you're performing? > > > > > > > > > > Thanks, > > > > > Vincent > > > > > > > > > > > > > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > > > > > > > > > Jim Lux wrote: > > > > > > > > > > > > > > > > Yep. And for good reason. Even a big DoD job is still tiny in > > Nvidia's > > > > > > > > > > > > > > > > > > > scale of operations. We face this all the time with NASA work. 
> > > > > > > Semiconductor manufacturers have no real reason to produce > > special purpose > > > > > > > or customized versions of their products for space use, because > > they can > > > > > > > sell all they can make to the consumer market. More than once, > > I've had a > > > > > > > phone call along the lines of this: > > > > > > > "Jim: I'm interested in your new ABC321 part." > > > > > > > "Rep: Great. I'll just send the NDA over and we can talk about > > it." > > > > > > > "Jim: Great, you have my email and my fax # is..." > > > > > > > "Rep: By the way, what sort of volume are you going to be > using?" > > > > > > > "Jim: Oh, 10-12.." > > > > > > > "Rep: thousand per week, excellent..." > > > > > > > "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe > > every > > > > > > > year." > > > > > > > "Rep: Oh..." > > > > > > > {Well, to be fair, it's not that bad, they don't hang up on > you.. > > > > > > > > > > > > > > > > > > > > > > > > > > > Since about a year, it's been clear to me that weather > forecasting > > (i.e., > > > > > > running a more or less sophisticated atmospheric model to provide > > weather > > > > > > predictions) is going to be "mainstream" in the sense that every > > business > > > > > > that needs such forecasts for its operations can simply run them > > in-house. > > > > > > > > > > > > Case in point: I bought a $1100 HP box (the obvious target group > > being > > > > > > teenage downloaders) which performs the HIRLAM limited area model > > *on the > > > > > > grid that we used until October 2006* in December last year. > > > > > > > > > > > > It's about twice as slow as our then-operational 50-CPU Sun Fire > > 15K. > > > > > > > > > > > > I wonder what effect this will have on CPU developments ... > > > > > > > > > > > > -- > > > > > > Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 > > 214290 > > > > > > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > > > > > > At home: http://moene.indiv.nluug.nl/~toon/ > > > > > > Progress of GNU Fortran: > > http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Beowulf mailing list, Beowulf@beowulf.org > > > > > To change your subscription (digest mode or unsubscribe) visit > > > > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Jonathan Aquilina > > > > > > > > > > > _______________________________________________ > > > Beowulf mailing list, Beowulf@beowulf.org > > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/ec8b64c0/attachment.html From prentice at ias.edu Wed Jul 2 08:28:53 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486A6D59.7020704@scalableinformatics.com> Message-ID: <486B9EB5.6020906@ias.edu> Mark Hahn wrote: >> Hmmm.... 
for me, its all about the kernel. Thats 90+% of the battle. >> Some distros use good kernels, some do not. I won't mention who I >> think is in the latter category. > > I was hoping for some discussion of concrete issues. for instance, > I have the impression debian uses something other than sysvinit - does > that work out well? is it a problem getting commercial packages > (pathscale/pgi/intel compilers, gaussian, etc) to run? > > the couple debian people I know tend to have more ideological motives > (which I do NOT impugn, except that I am personally more swayed by > practical, concrete reasons.) I agree. I follow the same pragmatic rational paradigm. -- Prentice From atchley at myri.com Wed Jul 2 08:32:54 2008 From: atchley at myri.com (Scott Atchley) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486B8C1E.2090007@tamu.edu> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <486B6501.5000108@aei.mpg.de> <04BB8220-B185-42A2-8E34-DA61066B6D51@myri.com> <486B8C1E.2090007@tamu.edu> Message-ID: On Jul 2, 2008, at 10:09 AM, Gerry Creager wrote: >> Although I believe Lustre's robustness is very good these days, I >> do not believe that it will not work in your setting. I think that >> they currently do not recommend mounting a client on a node that is >> also working as a server as you are doing with NFS. I believe it is >> due to memory contention leading to deadlock. > > Lustre is good enough that it's the parallel FS at TACC for the > Ranger cluster. And, I've had no real problems as a user thereof. > We're brining up glustre on our new cluster here ( > CentOS/RHEL5, not debian ). We looked at zfs but didn't > have sufficient experience to go that path. I believe that all the large DOE labs are using Lustre and would not if it were not reliable. My only concern was Carsten not having dedicated server nodes and mounting directly on those nodes. I may be off-base and hopefully one of the Lustre/SUN people might correct me if so. :-) Scott From rgb at phy.duke.edu Wed Jul 2 08:46:51 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87hcb8b98w.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> Message-ID: On Wed, 2 Jul 2008, Perry E. Metzger wrote: > You don't switch to a different port number after the connection comes > in, you stay on it. You can in theory talk to up to (nearly) 2^48 > different foreign host/port combos off of local port 25, because every > remote host/remote port pair makes for a different 4-tuple. Ah. I should have known that. >> Many daemons have a limit that can be set on the number of >> simultaneous connections they can manage. > > That's a resource issue, not a TCP architecture issue per se. You > might not have enough memory, CPU, etc. to handle more than a certain > number of connections. 
> > By the way, you can now design daemons to handle tens of thousands of > simultaneous connections with clean event driven design on a modern > multiprocessor with plenty of memory. This is way off topic, though. Not on a cluster list. Networking in a very real sense IS the topic. I've written forking daemons (which is why I should have known, or remembered, about the four-tuple thing:-) because they are an essential component of IPCs in a network-based cluster or cluster distributed apps. Even though PVM and MPI make it easy to write portable code (and may well provide you with better performance than you can easily get on your own) there may well be occasions for cluster software writers to need to write their own networking, in band or out of band. >> However, this is for TCP ports that maintain a persistent connection. >> UDP ports are "connectionless" and hence somewhat different. > > I'm assuming they're doing NFS over TCP. If they're using UDP, things > are somewhat different because of the existence of "connectionless" > UDP. However, they *should* use TCP for performance. (I know people > used to claim the opposite, but it turns out you really want TCP so > you get proper congestion control.) Yah. To make UDP reliable, you have to load it down with most of the stuff in TCP anyway; it isn't clear that it was ever a great choice. IIRC PVM was originally built on UDP for similar reasons, but I think -- am not sure but think -- it is TCP today because it wasn't worth the hassle. I'm too lazy to crank up a PVM app to find out, though...;-) rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From Bogdan.Costescu at iwr.uni-heidelberg.de Wed Jul 2 08:53:31 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: On Wed, 2 Jul 2008, Robert G. Brown wrote: > The way TCP daemons that listen on a well-known/privileged port work > is that they accept a connection on that port, then fork a > connection on a higher unprivileged (>1023) port on both ends so > that the daemon can listen once again. 'man 7 socket' and look up SO_REUSEADDR. I don't quite know what you mean by 'forking a connection'; when the daemon encounters a fork() all open file descriptors (including sockets) are being kept in both the parent and the child. The child (usually the part of the daemon that processes the content that comes on that connection) gets the same 4-tuple as the parent. The parent closes its file handle so that only the child is then active on that connection. > You can see this by running e.g. netstat -a. I seriously doubt that you have seen such a behaviour. Empirical evidence which might pass easier than theoretical one: on the e-mail server that I admin, there is an iptable rule to only allow incoming connections to port 25 - if connections would suddenly be migrated to different ports they would be blocked and I would not receive any e-mails from this list. 
But I do, especially during the past few days... (not that I complain :-)) -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From perry at piermont.com Wed Jul 2 09:33:06 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: (Robert G. Brown's message of "Wed\, 2 Jul 2008 11\:46\:51 -0400 \(EDT\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> Message-ID: <87vdzo9rgd.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: > On Wed, 2 Jul 2008, Perry E. Metzger wrote: >> By the way, you can now design daemons to handle tens of thousands of >> simultaneous connections with clean event driven design on a modern >> multiprocessor with plenty of memory. This is way off topic, though. > > Not on a cluster list. Well, it actually kind of is. Typically, a box in an HPC cluster is running stuff that's compute bound and who's primary job isn't serving vast numbers of teeny high latency requests. That's much more what a web server does. However... > I've written forking daemons (which is why I should have known, or > remembered, about the four-tuple thing:-) because they are an essential > component of IPCs in a network-based cluster or cluster distributed > apps. One is best off *not* forking, actually. There's a good site on concurrency management for high performance servers. It is a bit old now but covers the topic well: http://www.kegel.com/c10k.html Myself, I'm a believer in event driven code. One thread, one core. All other concurrency management should be handled by events, not by multiple threads. Thread context switching is very very expensive, and threads are very expensive. Doing event driven programming wins overwhelmingly in such contexts. It is hard to impossible, on a modern machine, to handle tens of thousands of connections with forking or threads, but it is easy with events. I'm a fan of Niels Provos' "libevent" for such purposes. There are a lot of other libraries that plug in to it well, too. -- Perry E. Metzger perry@piermont.com From perry at piermont.com Wed Jul 2 09:37:55 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: (Bogdan Costescu's message of "Wed\, 2 Jul 2008 17\:53\:31 +0200 \(CEST\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: <87iqvo9r8c.fsf@snark.cb.piermont.com> Bogdan Costescu writes: > 'man 7 socket' and look up SO_REUSEADDR. Incidently, I believe this may be part of the problem for the NFS client code in Linux. -- Perry E. Metzger perry@piermont.com From rgb at phy.duke.edu Wed Jul 2 10:54:37 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: On Wed, 2 Jul 2008, Bogdan Costescu wrote: > On Wed, 2 Jul 2008, Robert G. Brown wrote: > >> The way TCP daemons that listen on a well-known/privileged port work is >> that they accept a connection on that port, then fork a connection on a >> higher unprivileged (>1023) port on both ends so that the daemon can listen >> once again. > > 'man 7 socket' and look up SO_REUSEADDR. I don't quite know what you mean by > 'forking a connection'; when the daemon encounters a fork() all open file > descriptors (including sockets) are being kept in both the parent and the > child. The child (usually the part of the daemon that processes the content > that comes on that connection) gets the same 4-tuple as the parent. The > parent closes its file handle so that only the child is then active on that > connection. I'm stating it badly and incorrectly, confusing port with socket. See the following code. Server listens, bound to a specific port. When a connection is initiated by a (possibly remote) client, it accepts it (creating a socket with its own FD), leaving the original server socket FD unaffected. It then forks and the child CLOSES the original socket lest there be trouble. The server/parent similarly closes the client fd. The client typically got a "random" (kernel chosen) port on ITS side from the list of available unprotected ports when it formed its original socket, and it forms one side of the stream connection, with the server "accept" socket being the other. What I was trying to convey remarkably poorly is that once you've created a daemon and bound it to a port, if you try to start up a second daemon on that port you'll get a EADDRINUSE on the bind (and fail the loop that checks below), and so if you DON'T fork off the sockets with listen/accept you'll usually block the port indefinitely while handling each connection. I haven't tried (at least, not deliberately:-) not going through the asymmetric close so that the two processes both have all the FDs, but I'd guess bad things would happen if I did, a crap shoot race condition as to which process gets the data or worse. OTOH, some applications (esp nfsd and httpd) DO fork several child processes with the original open socket fd so that if incoming requests for a connection come while one of them is "busy" with the creation of a child of its own to handle the connection, another will pick it up round robin. Unless I'm misunderstanding how they work this or why. mail is even more interesting, as imapd has to stick around to manage each persistent imap connection, so an imapd server has umpty zillion instances of imapd. I don't know exactly what smtp daemons do -- postfix or sendmail. Anyway, some generic forking daemon code, adopted IIRC from Stevens originally and hacked around some to avoid TIME_WAIT and so on: server_fd = socket(AF_INET,SOCK_STREAM,0); if (server_fd < 0){ fprintf(stderr,"socket: %.100s", strerror(errno)); exit(1); } /* * Set socket options. We try to make the port reusable and have it * close as fast as possible without waiting in unnecessary wait states * on close. 
*/
setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, (void *)&on, sizeof(on));
linger.l_onoff = 1;   /* Linger for just a bit */
linger.l_linger = 0;  /* do NOT linger -- exit and discard data. */
setsockopt(server_fd, SOL_SOCKET, SO_LINGER, (void *)&linger, sizeof(linger));

serverlen = sizeof(serverINETaddress);
bzero( (char*) &serverINETaddress,serverlen);           /* clear structure */
serverINETaddress.sin_family = AF_INET;                 /* Internet domain */
serverINETaddress.sin_addr.s_addr = htonl(INADDR_ANY);  /* Accept all */
serverINETaddress.sin_port = htons(port);               /* Server port number */
serverSockAddrPtr = (struct sockaddr*) &serverINETaddress;

/*
 * Bind the socket to the desired port.  Try up to six times (30sec) IF the
 * port is in use
 */
retries = 6;
errno = 0;   /* To zero any possible garbage value */
while(retries--){
  if(bind(server_fd,serverSockAddrPtr,serverlen) < 0) {
    if(errno != EADDRINUSE){
      close(server_fd);
      fprintf(stderr,"bind: %.100s\n", strerror(errno));
      fprintf(stderr,"socket bind to port %d failed: %d.\n", port,errno);
      exit(255);
    }
  } else break;
  /* printf("Got no port: %s\n",strerror(errno)); */
  sleep(5);
}
if(errno){
  if(errno == EADDRINUSE){
    fprintf(stderr,"Timeout (tried to bind six times five seconds apart)\n");
  }
  close(server_fd);
  fprintf(stderr,"bind to port %d failed: %.100s\n",port,strerror(errno));
  exit(0);
}

/*
 * Socket exists.  Service it.  Queue up to n_connxns incoming connections
 * or die.  Default 10 matches the limits in the default xinetd.
 */
if(listen(server_fd,nconnxns) < 0){
  fprintf(stderr,"listen: %.100s", strerror(errno));
  exit(255);
}

/* Arrange SIGCHLD to be caught. */
signal(SIGCHLD, sigchld_handler);

/*
 * Initialize client structures.
 */
clientlen = sizeof(clientINETaddress);
clientSockAddrPtr = (struct sockaddr*) &clientINETaddress;

/*
 * Loop "forever", or until daemon crashes or is killed with a signal.
 */
while(1){

  /* Accept a client connection */
  if((verbose == D_ALL) || (verbose == D_DAEMON)){
    printf("D_DAEMON: Accepting Client connection...\n");
  }

  /*
   * Wait in select until there is a connection.  Presumably this is
   * more efficient than just blocking on the accept
   */
  FD_ZERO(&fdset);
  FD_SET(server_fd, &fdset);
  ret = select(server_fd + 1, &fdset, NULL, NULL, NULL);
  if (ret < 0 || !FD_ISSET(server_fd, &fdset)) {
    if (errno == EINTR) continue;
    fprintf(stderr,"select: %.100s", strerror(errno));
    continue;
  }

  /*
   * A call is waiting.  Accept it.
   */
  client_fd = accept(server_fd,clientSockAddrPtr,&clientlen);
  if (client_fd < 0){
    if (errno == EINTR) continue;
    fprintf(stderr,"accept: %.100s", strerror(errno));
    continue;
  }
  if((verbose == D_ALL) || (verbose == D_DAEMON)){
    printf("D_DAEMON: ...client connection made.\n");
  }

  /*
   * IF I GET HERE...
   * ...I'm a real daemon.  I therefore fork and have the child process
   * the connection.  The parent continues listening and can service
   * multiple connections in parallel.
   */

  /*
   * CHILD.  Close the listening (server) socket, and start using the
   * accepted (client) socket.  We break out of the (infinite) loop to
   * handle the connection.
   */
  if ((pid = fork()) == 0){
    close(server_fd);
    break;
  }

  /*
   * PARENT.  Stay in the loop.  Close the client socket (it's the child's)
   * but leave the server socket open.
*/ if (pid < 0) fprintf(stderr,"fork: %.100s", strerror(errno)); else if((verbose == D_ALL) || (verbose == D_DAEMON)){ printf("D_DAEMON: Forked child %d to handle socket %d.\n", pid,client_fd); } close(client_fd); } /* No need to wait for children -- I'm the child */ signal(SIGCHLD, SIG_DFL); /* Dissociate from calling process group and control terminal */ setsid(); > >> You can see this by running e.g. netstat -a. > > I seriously doubt that you have seen such a behaviour. Empirical evidence > which might pass easier than theoretical one: on the e-mail server that I > admin, there is an iptable rule to only allow incoming connections to port 25 > - if connections would suddenly be migrated to different ports they would be > blocked and I would not receive any e-mails from this list. But I do, > especially during the past few days... (not that I complain :-)) > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From rgb at phy.duke.edu Wed Jul 2 11:03:31 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87vdzo9rgd.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> Message-ID: On Wed, 2 Jul 2008, Perry E. Metzger wrote: > > "Robert G. Brown" writes: >> On Wed, 2 Jul 2008, Perry E. Metzger wrote: >>> By the way, you can now design daemons to handle tens of thousands of >>> simultaneous connections with clean event driven design on a modern >>> multiprocessor with plenty of memory. This is way off topic, though. >> >> Not on a cluster list. > > Well, it actually kind of is. Typically, a box in an HPC cluster is > running stuff that's compute bound and who's primary job isn't serving > vast numbers of teeny high latency requests. That's much more what a > web server does. However... I'd have to disagree. On some clusters, that is quite true. On others, it is very much not true, and whole markets of specialized network hardware that can manage vast numbers of teeny communications requests with acceptably low latency have come into being. And in between, there is, well, between, and TCP/IP at gigabit speeds is at least a contender for ways to fill it. >> I've written forking daemons (which is why I should have known, or >> remembered, about the four-tuple thing:-) because they are an essential >> component of IPCs in a network-based cluster or cluster distributed >> apps. > > One is best off *not* forking, actually. There's a good site on > concurrency management for high performance servers. It is a bit old > now but covers the topic well: http://www.kegel.com/c10k.html > > Myself, I'm a believer in event driven code. One thread, one core. All > other concurrency management should be handled by events, not by > multiple threads. Thread context switching is very very expensive, and > threads are very expensive. Doing event driven programming wins > overwhelmingly in such contexts. 
It is hard to impossible, on a > modern machine, to handle tens of thousands of connections with > forking or threads, but it is easy with events. > > I'm a fan of Niels Provos' "libevent" for such purposes. There are a > lot of other libraries that plug in to it well, too. Interesting. Makes sense, but a lot of boilerplate code for daemons has always used the fork approach. Of course, things were "smaller" back when the approach was dominant. The forking approach is easy to program and reminiscent of pipe code and so on. rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From perry at piermont.com Wed Jul 2 11:37:41 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: dealing with lots of sockets (was Re: [Beowulf] automount on high ports) In-Reply-To: (Robert G. Brown's message of "Wed\, 2 Jul 2008 14\:03\:31 -0400 \(EDT\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> Message-ID: <87skus874a.fsf_-_@snark.cb.piermont.com> "Robert G. Brown" writes: >> Well, it actually kind of is. Typically, a box in an HPC cluster is >> running stuff that's compute bound and who's primary job isn't serving >> vast numbers of teeny high latency requests. That's much more what a >> web server does. However... > > I'd have to disagree. On some clusters, that is quite true. On others, > it is very much not true, and whole markets of specialized network > hardware that can manage vast numbers of teeny communications requests > with acceptably low latency have come into being. And in between, there > is, well, between, and TCP/IP at gigabit speeds is at least a contender > for ways to fill it. I have to admit my experience here is limited. I'll take your word for it that there are systems where huge numbers of small, high latency requests are processed. (I thought that teeny stuff in HPC land was almost always where you brought in the low latency fabric and used specialized protocols, but...) >> Myself, I'm a believer in event driven code. One thread, one core. All >> other concurrency management should be handled by events, not by >> multiple threads.[....] > Interesting. Makes sense, but a lot of boilerplate code for daemons has > always used the fork approach. Of course, things were "smaller" back > when the approach was dominant. The forking approach is easy to program > and reminiscent of pipe code and so on. Sure, but it is way inefficient. Every single process you fork means another data segment, another stack segment, which means lots of memory. Every process you fork also means that concurrency is achieved only by context switching, which means loads of expense on changing MMU state and more. Even thread switching is orders of magnitude worse than a procedure call. Invoking an event is essentially just a procedure call, so that wins big time. Event driven systems can also avoid locking if you keep global data structures to a minimum, in a way you really can't manage well with threaded systems. 
That makes it easier to write correct code. The price you pay is that you have to think in terms of events, and few programmers have been trained that way. Perry -- Perry E. Metzger perry@piermont.com From cousins at umit.maine.edu Wed Jul 2 11:50:17 2008 From: cousins at umit.maine.edu (Steve Cousins) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <200806281735.m5SHZ8vS025843@bluewest.scyld.com> References: <200806281735.m5SHZ8vS025843@bluewest.scyld.com> Message-ID: > Just under 60MB/sec seems to be the maximum tape transport read/write > limit. Pretty reliably the first write from the beginning of tape was a > bit slower than writes started further into the tape. I believe LTO-3 is rated at 80 MB/sec without compression. Testing it on our HP unit in an Overland library I get: WRITE: dd if=/dev/zero of=/dev/nst0 bs=512k count=10k 10240+0 records in 10240+0 records out 5368709120 bytes (5.4 GB) copied, 71.8723 seconds, 74.7 MB/s READ: dd of=/dev/null if=/dev/nst0 bs=512k count=10k 10240+0 records in 10240+0 records out 5368709120 bytes (5.4 GB) copied, 69.2487 seconds, 77.5 MB/s I used a 512K block size because that is what I use with our backups and it has given optimal performance since the DLT-7000 days. Good luck, Steve ______________________________________________________________________ Steve Cousins, Ocean Modeling Group Email: cousins@umit.maine.edu Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302 From bernard at vanhpc.org Wed Jul 2 12:24:03 2008 From: bernard at vanhpc.org (Bernard Li) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <87wsk4ed20.fsf@snark.cb.piermont.com> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> Message-ID: On Wed, Jul 2, 2008 at 4:32 AM, Perry E. Metzger wrote: > "Jon Aquilina" writes: >> if i use blender how nicely does it work in a cluster? > > I believe it works quite well. As far as I know blender does not have any built-in "clustering" capabilities. But what you do is render different frames on different cores (embarrassingly parallel) using a queuing/scheduling system. DrQueue seems to be quite popular with the rendering folks: http://drqueue.org/cwebsite/ Cheers, Bernard From coutinho at dcc.ufmg.br Wed Jul 2 12:34:48 2008 From: coutinho at dcc.ufmg.br (Bruno Coutinho) Date: Wed Nov 25 01:07:21 2009 Subject: dealing with lots of sockets (was Re: [Beowulf] automount on high ports) In-Reply-To: <87skus874a.fsf_-_@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> Message-ID: 2008/7/2 Perry E. Metzger : > > "Robert G. Brown" writes: > >> Well, it actually kind of is. Typically, a box in an HPC cluster is > >> running stuff that's compute bound and who's primary job isn't serving > >> vast numbers of teeny high latency requests. That's much more what a > >> web server does. However... > > > > I'd have to disagree. On some clusters, that is quite true. 
On others, > > it is very much not true, and whole markets of specialized network > > hardware that can manage vast numbers of teeny communications requests > > with acceptably low latency have come into being. And in between, there > > is, well, between, and TCP/IP at gigabit speeds is at least a contender > > for ways to fill it. > > I have to admit my experience here is limited. I'll take your word for > it that there are systems where huge numbers of small, high latency > requests are processed. (I thought that teeny stuff in HPC land was > almost always where you brought in the low latency fabric and used > specialized protocols, but...) > > >> Myself, I'm a believer in event driven code. One thread, one core. All > >> other concurrency management should be handled by events, not by > >> multiple threads.[....] > libevent can be used for event-based servers. http://www.monkey.org/~provos/libevent/ > > > Interesting. Makes sense, but a lot of boilerplate code for daemons has > > always used the fork approach. Of course, things were "smaller" back > > when the approach was dominant. The forking approach is easy to program > > and reminiscent of pipe code and so on. This site describe several approaches to solve this problem: http://www.kegel.com/c10k.html Look for Chromium's X15. It can handle thousands of simultaneous conections and can saturate gigabit networks even with lots of slow clients. > > Sure, but it is way inefficient. Every single process you fork means > another data segment, another stack segment, which means lots of > memory. Every process you fork also means that concurrency is achieved > only by context switching, which means loads of expense on changing > MMU state and more. Even thread switching is orders of magnitude worse > than a procedure call. Invoking an event is essentially just a > procedure call, so that wins big time. As fas I know, process creation can take up to 1,000,000 cycles. > > > Event driven systems can also avoid locking if you keep global data > structures to a minimum, in a way you really can't manage well with > threaded systems. That makes it easier to write correct code. > > The price you pay is that you have to think in terms of events, and > few programmers have been trained that way. > > Perry > -- > Perry E. Metzger perry@piermont.com > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/9a4a154d/attachment.html From lindahl at pbm.com Wed Jul 2 13:04:49 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87d4lweagv.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> Message-ID: <20080702200448.GA17424@bx9.net> On Wed, Jul 02, 2008 at 08:28:48AM -0400, Perry E. Metzger wrote: > None > of this should cause you to run out of ports, period. If you don't > understand that, refer back to my original message. A TCP socket is a > unique 4-tuple. The host:port 2-tuples are NOT unique and not an > exhaustible resource. 
There is is no way that your case is going to > even remotely exhaust the 4-tuple space. Perry, Go look at code that actually uses priv ports to connect out. Normally the port is picked in the connect() call, and that means you can have all the 4-tuples. But for priv ports, you have to loop trying specific candidate ports under 1024 until you get one, and then connect out from it. (Here's where Linux doesn't try all 1024, because it doesn't want to use ports that are someone else's fixed port.) The kernel doesn't know at assignment time who you are connecting out to. In the end, this means that the port numbers are reused slowly, and you have to wait a TIME_WAIT time before reusing them. Now I'm week on the details today, but this was an issue that I dealt with long ago with PBS, which insists on using priv ports. So I ended up hacking the kernel on the PBS master to have a reduced TIME_WAIT time. Problem solved, yukko. -- greg From mathog at caltech.edu Wed Jul 2 13:54:29 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? Message-ID: Steve Cousins wrote > David Mathog wrote: > > Just under 60MB/sec seems to be the maximum tape transport read/write > > limit. Pretty reliably the first write from the beginning of tape was a > > bit slower than writes started further into the tape. > > I believe LTO-3 is rated at 80 MB/sec without compression. Testing it on > our HP unit in an Overland library I get: > > WRITE: > > dd if=/dev/zero of=/dev/nst0 bs=512k count=10k > 10240+0 records in > 10240+0 records out > 5368709120 bytes (5.4 GB) copied, 71.8723 seconds, 74.7 MB/s > > READ: > > dd of=/dev/null if=/dev/nst0 bs=512k count=10k > 10240+0 records in > 10240+0 records out > 5368709120 bytes (5.4 GB) copied, 69.2487 seconds, 77.5 MB/s Rats. I wonder what the difference is now? If you don't already have it, please grab a copy of Exabyte's ltoTool from here: http://www.exabyte.com/support/online/downloads/downloads.cfm?did=1344&prod_id=581 % /usr/local/src/ltotool/ltoTool -C 1 /dev/nst0 ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. Tape Drive identified as LTO3(HP) Enabling compression...OK Done % /usr/local/src/ltotool/ltoTool -i /dev/nst0 ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. Tape Drive identified as LTO3(HP) /dev/nst0 - Vendor : HP /dev/nst0 - Product ID: Ultrium 3-SCSI /dev/nst0 - Firmware : D21D /dev/nst0 - Serialnum : HU10708TGG % dd if=/dev/zero of=/dev/nst0 bs=512k count=10k 10240+0 records in 10240+0 records out 5368709120 bytes (5.4 GB) copied, 38.6474 s, 139 MB/s % /usr/local/src/ltotool/ltoTool -C 0 /dev/nst0 ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. Tape Drive identified as LTO3(HP) Disabling compression...OK Done % dd if=/dev/zero of=/dev/nst0 bs=512k count=10k 10240+0 records in 10240+0 records out 5368709120 bytes (5.4 GB) copied, 91.9329 s, 58.4 MB/s Done So mine is not as fast as yours in the exact same test. HP's LTT tool shows an LTO 3 cartridge in the drive. (Does this drive even work with an LTO 2 or LTO 4?) % ulimit unlimited % uname -a 2.6.24-19-generic #1 SMP Wed Jun 4 15:10:52 UTC 2008 x86_64 GNU/Linux % cat /etc/issue Ubuntu 8.04 \n \l The system has 24 GB of RAM, dual Opteron 2218, and no cpufreq adjustment running (the BIOS on this one does not support power adjustment). 
The relevant SCSI messages from the last boot in /var/log/messages are: scsi6 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0 aic7901: Ultra320 Wide Channel A, SCSI Id=7,PCI-X 101-133Mhz, 512 SCBs scsi 6:0:4:0: Sequential-Access HP Ultrium 3-SCSI D21D PQ: 0 ANSI: 3 target6:0:4: asynchronous scsi6:A:4:0: Tagged Queuing enabled. Depth 32 target6:0:4: Beginning Domain Validation target6:0:4: wide asynchronous target6:0:4: FAST-160 WIDE SCSI 320.0 MB/s DT IU RTI PCOMP (6.25 ns, offset 64) target6:0:4: Domain Validation skipping write tests target6:0:4: Ending Domain Validation The module driving the Adaptec is "aic79xx", apparently with no special options configured anywhere for when it loads. Not sure which kernel parameters are relevant (if any). This is really unlikely to be relevant, but... % dd --version dd (coreutils) 6.10 Copyright (etc.) % ldd `which dd` linux-vdso.so.1 => (0x00007fff08bfe000) librt.so.1 => /lib/librt.so.1 (0x00007fdd00779000) libc.so.6 => /lib/libc.so.6 (0x00007fdd00417000) libpthread.so.0 => /lib/libpthread.so.0 (0x00007fdd001fb000) /lib64/ld-linux-x86-64.so.2 (0x00007fdd00982000) Thanks, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From mathog at caltech.edu Wed Jul 2 14:13:53 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? (Steve Cousins) Message-ID: Steve Cousins wrote: > > Just under 60MB/sec seems to be the maximum tape transport read/write > > limit. Pretty reliably the first write from the beginning of tape was a > > bit slower than writes started further into the tape. > > I believe LTO-3 is rated at 80 MB/sec without compression. I just checked that. The spec page here: http://h18006.www1.hp.com/products/storageworks/ultrium920/index.html says: Higher performance with dynamic data rate matching - Ultrium 920 Tape Drive 120MB/sec compressed data transfer using 2:1 compression, Or with compression off 60MB/sec, assuming that the 120MB/s was rate limited by the physical tape write speed and not at all by the compression. On the other hand, the specs for the CARTRIDGE here: http://www.hboutlet.com.au/catalog/product_info.php?products_id=181385 list 80MB/s uncompressed, which is the number you cited. Also here: http://en.wikipedia.org/wiki/Linear_Tape-Open they cite 80MB/s. Do different LTO-3 drives have different maximum tape write speeds? Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From perry at piermont.com Wed Jul 2 15:31:51 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080702200448.GA17424@bx9.net> (Greg Lindahl's message of "Wed\, 2 Jul 2008 13\:04\:49 -0700") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> Message-ID: <87hcb76hpk.fsf@snark.cb.piermont.com> Greg Lindahl writes: > Go look at code that actually uses priv ports to connect out. Normally > the port is picked in the connect() call, and that means you can have > all the 4-tuples. But for priv ports, you have to loop trying specific > candidate ports under 1024 until you get one, and then connect out > from it. 
(Here's where Linux doesn't try all 1024, because it doesn't > want to use ports that are someone else's fixed port.) The kernel > doesn't know at assignment time who you are connecting out to. In the > end, this means that the port numbers are reused slowly, and you have > to wait a TIME_WAIT time before reusing them. It isn't quite that bad. You can use one of the SO_REUSE* calls in the code to make things less dire. Apparently the kernel doesn't do that for NFS client connection establishment, though. There is probably some code to fix here. Anyway, you may notice that I handed the original poster a hacky patch that will let him use unprivileged ports. I still don't know if it is necessary, but it may make his life less bad, we'll see. > Now I'm week on the details today, but this was an issue that I dealt > with long ago with PBS, which insists on using priv ports. So I ended > up hacking the kernel on the PBS master to have a reduced TIME_WAIT > time. Problem solved, yukko. -- Perry E. Metzger perry@piermont.com From lindahl at pbm.com Wed Jul 2 15:37:14 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87hcb76hpk.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> Message-ID: <20080702223714.GA5908@bx9.net> On Wed, Jul 02, 2008 at 06:31:51PM -0400, Perry E. Metzger wrote: > It isn't quite that bad. You can use one of the SO_REUSE* calls in the > code to make things less dire. Apparently the kernel doesn't do that > for NFS client connection establishment, though. There is probably > some code to fix here. That's what I thought at first, too. But since you only have a 2-tuple and not a 4-tuple when it comes time to pick the port number, SO_REUSEADDR doesn't do anything. -- greg From cousins at umit.maine.edu Wed Jul 2 15:37:14 2008 From: cousins at umit.maine.edu (Steve Cousins) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: References: Message-ID: On Wed, 2 Jul 2008, David Mathog wrote: > Rats. > > I wonder what the difference is now? If you don't already have it, > please grab a copy of Exabyte's ltoTool from here: > > http://www.exabyte.com/support/online/downloads/downloads.cfm?did=1344&prod_id=581 > > % /usr/local/src/ltotool/ltoTool -C 1 /dev/nst0 > ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. > > Tape Drive identified as LTO3(HP) > Enabling compression...OK > > Done > > % /usr/local/src/ltotool/ltoTool -i /dev/nst0 > ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. > > Tape Drive identified as LTO3(HP) > /dev/nst0 - Vendor : HP > /dev/nst0 - Product ID: Ultrium 3-SCSI > /dev/nst0 - Firmware : D21D > /dev/nst0 - Serialnum : HU10708TGG Hi David, I get: ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. Tape Drive identified as LTO3(HP) /dev/nst0 - Vendor : HP /dev/nst0 - Product ID: Ultrium 3-SCSI /dev/nst0 - Firmware : G24H /dev/nst0 - Serialnum : HU105278YC So, based on the serial number it looks like yours is newer than mine but mine possibly has a newer firmware. It's hard to tell though since there are probably different firmwares depending on what vendor/library it is in. 
For the other information I have: ulimit: unlimited sh-3.1# uname -a Linux triton 2.6.20-1.2320.fc5.asl.1 #1 SMP Thu Aug 9 13:21:16 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux (yes it is an old distribution (FC5) and kernel but it is stable. uptime of 295 days until boot drive had a problem this week and I had to switch it out) dmesg: scsi4 : ioc0: LSI53C1030, FwRev=01030a00h, Ports=1, MaxQ=222, IRQ=28 scsi 4:0:1:0: Sequential-Access HP Ultrium 3-SCSI G24H PQ: 0 ANSI: 3 target4:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU RTI PCOMP (6.25 ns, offset 64) scsi 4:0:6:0: Medium Changer OVERLAND NEO Series 0507 PQ: 0 ANSI: 2 target4:0:6: FAST-10 WIDE SCSI 20.0 MB/s ST (100 ns, offset 15) scsi5 : ioc1: LSI53C1030, FwRev=01030a00h, Ports=1, MaxQ=222, IRQ=29 It is a dual Opteron 252 machine with 4GB of RAM and an LSI PCI-X two channel controller. So we are both running with 2.6 Ghz Opterons but you have twice as many cores and probably higher bandwidth. My motherboard is a Tyan S2882 I believe. dd --version shows: dd (coreutils) 5.97 Copyright (C) 2006 Free Software Foundation, Inc. This is free software. You may redistribute copies of it under the terms of the GNU General Public License . There is NO WARRANTY, to the extent permitted by law. Written by Paul Rubin, David MacKenzie, and Stuart Kemp. What happens if you turn off CTQ. I don't think CTQ will get you anything on a tape drive. Am I mistaken? Steve > % dd if=/dev/zero of=/dev/nst0 bs=512k count=10k > 10240+0 records in > 10240+0 records out > 5368709120 bytes (5.4 GB) copied, 38.6474 s, 139 MB/s > > % /usr/local/src/ltotool/ltoTool -C 0 /dev/nst0 > ltoTool V4.63 -- Copyright (c) 1996-2006, Exabyte Corp. > > Tape Drive identified as LTO3(HP) > Disabling compression...OK > > Done > % dd if=/dev/zero of=/dev/nst0 bs=512k count=10k > 10240+0 records in > 10240+0 records out > 5368709120 bytes (5.4 GB) copied, 91.9329 s, 58.4 MB/s > > Done > > > So mine is not as fast as yours in the exact same test. HP's LTT > tool shows an LTO 3 cartridge in the drive. (Does this drive even > work with an LTO 2 or LTO 4?) > > % ulimit > unlimited > % uname -a > 2.6.24-19-generic #1 SMP Wed Jun 4 15:10:52 UTC 2008 x86_64 GNU/Linux > % cat /etc/issue > Ubuntu 8.04 \n \l > > The system has 24 GB of RAM, dual Opteron 2218, and no cpufreq > adjustment running (the BIOS on this one does not support power > adjustment). The relevant SCSI messages from the last boot > in /var/log/messages are: > > scsi6 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0 > > aic7901: Ultra320 Wide Channel A, SCSI Id=7,PCI-X 101-133Mhz, 512 SCBs > scsi 6:0:4:0: Sequential-Access HP Ultrium 3-SCSI D21D PQ: 0 ANSI: 3 > target6:0:4: asynchronous > scsi6:A:4:0: Tagged Queuing enabled. Depth 32 > target6:0:4: Beginning Domain Validation > target6:0:4: wide asynchronous > target6:0:4: FAST-160 WIDE SCSI 320.0 MB/s DT IU RTI PCOMP (6.25 ns, > offset 64) > target6:0:4: Domain Validation skipping write tests > target6:0:4: Ending Domain Validation > > The module driving the Adaptec is "aic79xx", apparently with no > special options configured anywhere for when it loads. > > Not sure which kernel parameters are relevant (if any). > > This is really unlikely to be relevant, but... > > % dd --version > dd (coreutils) 6.10 > Copyright (etc.) 
> % ldd `which dd` > linux-vdso.so.1 => (0x00007fff08bfe000) > librt.so.1 => /lib/librt.so.1 (0x00007fdd00779000) > libc.so.6 => /lib/libc.so.6 (0x00007fdd00417000) > libpthread.so.0 => /lib/libpthread.so.0 (0x00007fdd001fb000) > /lib64/ld-linux-x86-64.so.2 (0x00007fdd00982000) > > Thanks, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > From cousins at umit.maine.edu Wed Jul 2 15:43:17 2008 From: cousins at umit.maine.edu (Steve Cousins) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? (Steve Cousins) In-Reply-To: References: Message-ID: On Wed, 2 Jul 2008, David Mathog wrote: > Steve Cousins wrote: > > Do different LTO-3 drives have different maximum tape write speeds? I don't know. I've always heard 80 MB/sec. lto.org shows: http://www.lto.org/technology/ugen.php?section=0&subsec=ugen for lto-3 "up to 160 MB/sec" which of course is with 2:1 compression and therefore 80 MB/sec uncompressed. Steve From rgb at phy.duke.edu Wed Jul 2 16:44:58 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:21 2009 Subject: dealing with lots of sockets (was Re: [Beowulf] automount on high ports) In-Reply-To: <87skus874a.fsf_-_@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> Message-ID: On Wed, 2 Jul 2008, Perry E. Metzger wrote: > > "Robert G. Brown" writes: >>> Well, it actually kind of is. Typically, a box in an HPC cluster is >>> running stuff that's compute bound and who's primary job isn't serving >>> vast numbers of teeny high latency requests. That's much more what a >>> web server does. However... >> >> I'd have to disagree. On some clusters, that is quite true. On others, >> it is very much not true, and whole markets of specialized network >> hardware that can manage vast numbers of teeny communications requests >> with acceptably low latency have come into being. And in between, there >> is, well, between, and TCP/IP at gigabit speeds is at least a contender >> for ways to fill it. > > I have to admit my experience here is limited. I'll take your word for > it that there are systems where huge numbers of small, high latency > requests are processed. (I thought that teeny stuff in HPC land was > almost always where you brought in the low latency fabric and used > specialized protocols, but...) Not so much high latency, but there are many different messaging patterns. Some are BW dominated with large messages, some are many small latency dominated messages and use specialized networks, and some are in between -- medium sized messages, medium sized amounts. YMMV is a standard watchword. Some people do fine with TCP/IP over ethernet for whatever message size. I'm not quite sure what you mean by "vast numbers of teeny high latency requests" so I'm not sure if we really are disagreeing or agreeing in different words. If you mean that the problem of HA computing on a human timescale is different than the typical HPC problem then we agree very much, but then I don't see the point in the context of the current discussion. > Sure, but it is way inefficient. 
Every single process you fork means > another data segment, another stack segment, which means lots of > memory. Every process you fork also means that concurrency is achieved > only by context switching, which means loads of expense on changing > MMU state and more. Even thread switching is orders of magnitude worse > than a procedure call. Invoking an event is essentially just a > procedure call, so that wins big time. Sure, but for a lot of applications, one doesn't have a single server with umpty zillion connections -- which may be what you may mean with your "high latency teensy message" point above. If the connection is persistent, the overhead associated with task switching is just part of the normal multitasking of the OS. In cluster computing, one may have only a small set of these connections to any particular host, or one may have lots -- many to many communications, master-slave communications. Similarly, many daemon-driven tasks tend to be quite bounded. If a server load average is down under 0.1 nearly all the time, nobody cares, if the overhead of communication in a parallel application is down in the sub-1% range, people don't care much. But then, few cluster applications are built on forking daemons...;-) Still, it is important to understand why there are a lot of applications that are. In the old days, there were limits on how many processes, and open connections, and open files, and nearly any other related thing you could have at the same time, because memory was limited. Kernel resources (if nothing else) have to be allocated for each one, and kernel overhead associated with all of the connections, files, etc could scale up to where it more or less shut down a system. Nowadays, with my LAPTOP having 4 GB, multiple cores, far more scalable MP kernels, the limits are a lot more flexible, and it may well be better to maintain many persistent connections within a single application and make it essentially an extension of the kernel with the kernel managing the "multitasking" overhead of message reception per connection and then avoiding the additional multitasking associated with farming the information out per connection to a forked copy of a server process. As I said, very interesting and a good idea -- I'm learning from you -- but a good idea for certain applications, possibly more trouble than it's worth for others? Or maybe not. If you make writing event driven network code as easy, and as well documented, as writing standard socket code and standard daemon code, the forking daemon may become obsolete. Maybe it IS obsolete. So, what do you think? Should one "never" write a forking daemon, or inetd? [Incidentally, does this mean that you are similarly negative about forking applications in general, since similar resource constraints apply to ALL forks, right? Or should one use event driven servers only for big servers with no particular hurry on returning messages for any given connection? I'm guessing that when writing such a server, one has to do some of the work that the kernel would do for you for forked processes -- ensure that no connection is starved for timeslices or network slices, manage priorities if necessary, smoothly multitask any underlying computation associated with providing the data. After all, the MOST efficient server is one with the server code built into the kernel -- DOS plus an application, as it were. 
Why bother with the overhead of a general purpose multitasking operating system when you can handle all the multitasking native within your one monolithic application? Ditto networking -- why replicate general purpose features of the network stack in the kernel and network structs when you'll never need them for your ONE application? Usually one trades off the ease of programming and use in a general purpose environment against some penalty, as general purpose environments require more state information and overhead to maintain and operate. So are you arguing that there are no tradeoffs, and one should "always" write server network code (or code in a suitably segmented application) on an event model, or that it is a better one for some class of client applications, some pattern of use? This still is (I think) OT, as master-slave parallel applications are fairly common, with a toplevel master doling out units of work to the slaves and then collecting the results. I think that it is probably more usual to write the code for this as a non-forking application anyway, but I can still imagine exceptions. IIRC, some of these things are the motivation for e.g. Scyld and bproc. If anyone else on list is bored with this, let me know we can take it offline. > Event driven systems can also avoid locking if you keep global data > structures to a minimum, in a way you really can't manage well with > threaded systems. That makes it easier to write correct code. > > The price you pay is that you have to think in terms of events, and > few programmers have been trained that way. What do you mean by events? Things picked out with a select statement, e.g. I/O waiting to happen on a file descriptor? Signals? I think the bigger problem is that a lot of the events in question are basically (fundamentally) kernel interrupts, I/O being driven by one or more asynchronous processes, and you're right, a lot of programmers never learn to manage this because it is actually pretty difficult. One has to handle blocking vs non-blocking issues, raw I/O (in many cases), a scheduler of sorts to ensure that connections aren't starved (unless you are content to process events in FIFO order, letting an event piggy or buggy/crashed process hang the entire pending queue). Forking provides you with a certain amount of "automatic" robust parallelism. Without it, one has to make the code a lot more robust; if a forked connection crashes, it crashes just one connection, not the server or any of the rest of the existing connections. The kernel DOES do a lot of things for you on a forked process that you have to do for yourself in event driven code, and it isn't exactly trivial to provide it either well, efficiently, or robustly (where the kernel is perforce all three, within the limits imposed by its general purpose design). As I said, people wrote lots of applications on UDP because they thought "hmmm, I don't need ALL the overhead associated with making TCP robust, I'll use lightweight UDP instead and write my own packet sequencer, my own retransmit, etc." Then they discovered that by the time they ended up with something that was reliable, they hadn't really saved much -- or may well have ended up with something even slower than TCP. People work(ed) HARD on making TCP fairly efficient and making it handle edge cases. Doing it on your own is unlikely to match either one, unless you are an uberprogrammer. 
You sound like you probably are, but I'm not sure everyone is...;-) I'm not arguing, mind you -- I already believe that writing an event driven server (or client, or both in a more symmetric model) makes sense for a certain class of applications, including many/most of the ones relevant to cluster computing. I'm asking if one should NEVER write a forking daemon because the libraries you mention above provide schedulers and can manage dropped connections or hung resources or because you think that the programmer should always be able to add them as needed, or if there is a problem scale and server type for which it makes sense, and others for which it is overkill or for which the services provided by the kernel for forked processes (or threads, a rose by any other name...) are worth their cost. An event driven application IS basically a kernel, in a manner of speaking. Should every daemon be a kernel, or can some use the existing kernel for kernel-like functionality and focus on just provisioning a single connection well? rgb > > Perry > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From larry.stewart at sicortex.com Wed Jul 2 17:48:32 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Wed Nov 25 01:07:21 2009 Subject: dealing with lots of sockets (was Re: [Beowulf] automount on high ports) In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> Message-ID: <127E7A7E-F115-4349-8DB8-568FC882EB7B@sicortex.com> > >> Sure, but it is way inefficient. Every single process you fork means >> another data segment, another stack segment, which means lots of >> memory. Every process you fork also means that concurrency is >> achieved >> only by context switching, which means loads of expense on changing >> MMU state and more. Even thread switching is orders of magnitude >> worse >> than a procedure call. Invoking an event is essentially just a >> procedure call, so that wins big time. My experience is likely (a) dated or (b) inapplicable, but what's the point of a group if you can't toss it out? Back in 1994, with 90 MHz pentiums, NCSA's httpd was the leading webserver with a design that forked a new process for every request. This works, and provides nice isolation for those cases where your application is buggy. It is also a poor-man's threading system in that it lets the application not worry about blocking behavior of network sockets and so forth. It was a trifle slow however, being limited to 40 or so requests per second. My obligatory internet startup wrote a new single-threaded single- process web server based on select(2) with careful attention to the blocking or not nature of the kernel calls and were able to handle some hundreds of connections per second on the same hardware and over 1000 open connections before breaking the stack. Alas it was never made open source and the company is gone. 
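The core of that style of server is not much more than the following kind of
loop -- a stripped-down sketch, with handle_client() standing in for whatever
non-blocking request processing you do, and with error handling and the
FD_SETSIZE ceiling ignored:

#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

int handle_client(int fd);   /* assumed: non-blocking read/reply, <0 when done */

void serve(int listen_fd)
{
  fd_set active, readable;
  int maxfd = listen_fd;

  FD_ZERO(&active);
  FD_SET(listen_fd, &active);

  for (;;) {
    readable = active;                  /* select() modifies the set it is given */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
      continue;                         /* EINTR and friends */
    for (int fd = 0; fd <= maxfd; fd++) {
      if (!FD_ISSET(fd, &readable)) continue;
      if (fd == listen_fd) {            /* new connection: start watching it */
        int c = accept(listen_fd, NULL, NULL);
        if (c >= 0) {
          FD_SET(c, &active);
          if (c > maxfd) maxfd = c;
        }
      } else if (handle_client(fd) < 0) { /* client finished or errored: drop it */
        close(fd);
        FD_CLR(fd, &active);
      }
    }
  }
}

One process, no context switches between clients, and the whole per-connection
"thread state" is a bit in an fd_set plus whatever handle_client() keeps.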
More recently at SiCortex, We've been using libevent to write single threaded applications that do multithreaded things. On our 16 megabyte 70 MHz freescale embedded boot processors, this is very handy for reducing the memory footprint. On the x86 front end, a single process has no difficulty multiplexing 1000 streams of console data this way. I'd hate to have a process for each one of those! We're also using conserver for console access and that is also written with a single linux process multiplexing 50 or so consoles. I don't know whether conserver's internals are threads or events. So if anyone wants to try an easy to use event library, I can recommend libevent. The learning curve is modest. It does require a little turning inside out to do things like have a tftp client as a libevent task but its not bad. -Larry From lindahl at pbm.com Wed Jul 2 18:05:37 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:21 2009 Subject: dealing with lots of sockets (was Re: [Beowulf] automount on high ports) In-Reply-To: <127E7A7E-F115-4349-8DB8-568FC882EB7B@sicortex.com> References: <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> <127E7A7E-F115-4349-8DB8-568FC882EB7B@sicortex.com> Message-ID: <20080703010537.GA14390@bx9.net> On Wed, Jul 02, 2008 at 08:48:32PM -0400, Lawrence Stewart wrote: > Back in 1994, with 90 MHz pentiums, NCSA's httpd was the leading > webserver with a design that forked a new process for every request. Apache eventually moved to a model where forked processes handled several requests serially before eventually dying and being re-forked. This reduces the fork overhead per request to something reasonable. Recently there's a threaded version, but that's not the default. Our web crawler at Blekko is event-driven: the work is divided up into short subroutines which do non-blocking things, and when blocking is needed, you return to the "system" indicating what code to execute when the answer you're waiting for comes back. This is just event-driven programming inside-out. Works great, too, because the code is prettier than your typical event-driven code. Now Legion had pretty code, but the fact that all of the contexts shared a single stack meant that only the guy at the top of the stack could execute. But I digress. -- greg From perry at piermont.com Wed Jul 2 18:06:55 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:21 2009 Subject: [Beowulf] Re: dealing with lots of sockets In-Reply-To: (Robert G. Brown's message of "Wed\, 2 Jul 2008 19\:44\:58 -0400 \(EDT\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> Message-ID: <87mykz4vyo.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: > I'm not quite sure what you mean by "vast numbers of teeny high > latency requests" so I'm not sure if we really are disagreeing or > agreeing in different words. I mostly have worried about such schemes in the case of, say, 10,000 people connecting to a web server, sending an 80 byte request, and getting back a few k several hundred ms later. 
(I've also dealt a bit with transaction systems with more stringent throughput requirements, but rarely with things that require an ack really, really fast.) That said, I'm pretty sure event systems win over threads if you're coordinating pretty much anything... >> Sure, but it is way inefficient. Every single process you fork means >> another data segment, another stack segment, which means lots of >> memory. Every process you fork also means that concurrency is achieved >> only by context switching, which means loads of expense on changing >> MMU state and more. Even thread switching is orders of magnitude worse >> than a procedure call. Invoking an event is essentially just a >> procedure call, so that wins big time. > > Sure, but for a lot of applications, one doesn't have a single server > with umpty zillion connections Well, often one doesn't build things that way, but that's sort of a choice, isn't it. Your machine has only one or two or eight processors, and any other processes/threads above that which you create are not actually operating in parallel but are just a programming abstraction. It is perfectly possible to structure almost any application so there is just the one thread per core and you otherwise handle the programming abstraction with events instead of additional threads, processes or what have you. > If the connection is persistent, the overhead associated with task > switching is just part of the normal multitasking of the OS. That overhead is VERY high. Incredibly high. Most people don't really understand how high it is. If you compare the performance of an http server that manages 10,000 simultaneous connections with events, versus one that handles it with threads, you'll see there is no comparison -- events always beat threads into the ground, because you can't get away from threads requiring a new stack for each thread, and you can't get away from the fact that context switching is far more expensive than a procedure dispatch. > Similarly, many daemon-driven tasks tend to be quite bounded. If a > server load average is down under 0.1 nearly all the time, nobody cares, That implies almost nothing is in the run queue. For an HPC system, one hopes that the load is hovering around 1. Less means you're wasting processor, more means you're spending too much time context switching. But I digress.. > Still, it is important to understand why there are a lot of applications > that are. In the old days, there were limits on how many processes, and > open connections, and open files, and nearly any other related thing you > could have at the same time, because memory was limited. Believe it or not, memory is still limited, and context switch time is still pretty bad. Changing MMU contexts is unpleasant. Even if you don't have to do that, because you're using another thread in the same MMU context rather than a process, the overhead is still quite painful. Seeing is believing. There are lots of good papers out there on concurrency strategies for systems with vast numbers of sockets to manage, and there is no doubt what the answer is -- threads suck compared to events, full stop. Event systems scale linearly for far longer. > Or maybe not. If you make writing event driven network code as easy, > and as well documented, as writing standard socket code and standard > daemon code, the forking daemon may become obsolete. Maybe it IS > obsolete. It is pretty easy. The only problem is getting your mind wrapped around it and getting experience with it. 
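For concreteness, here is a rough sketch of that callback style against the
libevent 1.x API. It is a toy of my own, not code from any of the systems
mentioned in this thread: the conn struct and watch_fd() are invented, the
accept path is elided, and error handling is skipped.

/* Callback-driven I/O with libevent 1.x: register an event per descriptor,
 * then hand control to the central event loop. Callbacks must never block. */
#include <event.h>
#include <unistd.h>
#include <stdlib.h>

struct conn {
    struct event ev;
    int fd;
};

/* invoked by the event loop whenever fd becomes readable */
static void on_readable(int fd, short which, void *arg)
{
    struct conn *c = arg;
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));

    if (n <= 0) {                 /* peer closed: unregister and clean up */
        event_del(&c->ev);
        close(fd);
        free(c);
        return;
    }
    /* Handle what arrived, then simply return to the loop. If the work
     * can't finish now, park the state in *c and wait for the next
     * callback instead of blocking here. */
    write(fd, buf, n);
}

/* register a descriptor with the loop; assumes fd is already non-blocking */
static void watch_fd(int fd)
{
    struct conn *c = malloc(sizeof(*c));
    c->fd = fd;
    event_set(&c->ev, fd, EV_READ | EV_PERSIST, on_readable, c);
    event_add(&c->ev, NULL);
}

int main(void)
{
    event_init();
    /* ... create a listening socket, accept new connections in another
     * callback, and call watch_fd() on each accepted descriptor ... */
    event_dispatch();             /* the central event loop; never returns */
    return 0;
}

The whole design hangs on on_readable() returning quickly: anything slow gets
split into further callbacks, which is exactly the "turning inside out" that
takes getting used to.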
Most people have been writing fully linear programs for a whole career. If you tell them to try events, or try functional programming, or other things they're not used to, they almost always scream in agony for weeks until they get used to it. "Weeks" is often more overhead than people are willing to suffer. That said, I am comfortable with both of those paradigms... > So, what do you think? Should one "never" write a forking daemon, or > inetd? It depends. If you're doing something where there is going to be one socket talking to the system a tiny percentage of the time, why would you bother building an event driven server? If you're building something to serve files to 20,000 client machines over persistent TCP connections and the network interface is going to be saturated, hell yes, you should never use 20,000 threads for that, write the thing event driven or you'll die. It is all about the right tool for the job. Apps that are all about massive concurrent communication need events. Apps that are about very little concurrent communication probably don't need them. >> Event driven systems can also avoid locking if you keep global data >> structures to a minimum, in a way you really can't manage well with >> threaded systems. That makes it easier to write correct code. >> >> The price you pay is that you have to think in terms of events, and >> few programmers have been trained that way. > > What do you mean by events? Things picked out with a select statement, > e.g. I/O waiting to happen on a file descriptor? Signals? More the former, not the latter. Event driven programming typically uses registered callbacks that are triggered by a central "Event Loop" when events happen. In such a system, one never blocks for anything -- all activity is performed in callbacks, and one simply returns from a callback if one can't proceed further. The programming paradigm is quite alien to most people. I'd read the libevent man page to get a vague introduction. -- Perry E. Metzger perry@piermont.com From carsten.aulbert at aei.mpg.de Wed Jul 2 22:40:48 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080702223714.GA5908@bx9.net> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> Message-ID: <486C6660.5070705@aei.mpg.de> Hi all, Greg Lindahl wrote: > On Wed, Jul 02, 2008 at 06:31:51PM -0400, Perry E. Metzger wrote: > >> It isn't quite that bad. You can use one of the SO_REUSE* calls in the >> code to make things less dire. Apparently the kernel doesn't do that >> for NFS client connection establishment, though. There is probably >> some code to fix here. > > That's what I thought at first, too. But since you only have a 2-tuple > and not a 4-tuple when it comes time to pick the port number, > SO_REUSEADDR doesn't do anything. > A solution proposed by the nfs guys is pretty simple: Change the values of /proc/sys/sunrpc/{min,max}_resvport appropriately. But they don't know which ceiling will be next. But we will test it. Thanks for now and I'll read through the other side thread about forking vs. threading vs. 
serialization :) Carsten From tjrc at sanger.ac.uk Thu Jul 3 01:34:19 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <486B9D4E.80405@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> Message-ID: <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> On 2 Jul 2008, at 4:22 pm, Prentice Bisbal wrote: > 2. Now that I'm a professional system admin who often has to support > commercial apps, I find I have to use a RH-based distro for two > reasons: > A. Most commercial software "supports" only Red Hat. Some go so far as > to refuse to install if RH is not detected. The most extreme case of > this is EMC PowerPath, whose kernel modules won't install if it's > not a > RH (or SUSE) kernel. We have that problem as well, with HP SFS. The way we get around is simply that we run our older Lustre clients using Debian Sarge with a SuSE 9 kernel, which is perfectly possible, if a bit icky. > > B. Red Hat has done such a good job of spreading FUD about the other > Linux distros, management has a cow if you tell them you're installing > something other than RH. Fortunately, our management doesn't have that fear. In fact their fear is usually the other way around: Following the reams of broken promises from certain large UNIX vendors in particular about the future of certain products and features, which required us to copy petabytes of storage to new filesystems, which took more than six months. Our IT management now has a completely rational fear of buying commercial UNIX products, because the company might be bought out, change its focus, charge you a fortune for continued support, or all of the above. Consequently, we go for open source stuff whenever possible, and as far as the distribution was concerned, Debian was the obvious choice -- and pretty much the only choice at the time, since Fedora did not exist, and I'm still not sure how separate from Red Hat Fedora and CentOS really are, but that's probably just my ignorance. The fact that we could easily demonstrate that Debian did everything we needed of it on a technical level made the decision a very comfortable one for the management. Another aspect to our choice of Debian is that pretty much all of the bioinformatics software written here is itself open-sourced and given away, and if you, as a small genomics lab, want to run mirrors of various chunks of our stuff, you can, without worrying that any part of it might have licensing issues. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From eagles051387 at gmail.com Thu Jul 3 04:26:13 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> Message-ID: now what happens if someone comes to me for rendering services and they used maya will the maya file be able to use the software mentioned above or would i need some other software for that? On 7/2/08, Bernard Li wrote: > > On Wed, Jul 2, 2008 at 4:32 AM, Perry E. 
Metzger > wrote: > > > "Jon Aquilina" writes: > >> if i use blender how nicely does it work in a cluster? > > > > I believe it works quite well. > > As far as I know blender does not have any built-in "clustering" > capabilities. But what you do is render different frames on different > cores (embarrassingly parallel) using a queuing/scheduling system. > DrQueue seems to be quite popular with the rendering folks: > > http://drqueue.org/cwebsite/ > > Cheers, > > Bernard > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080703/e76e7673/attachment.html From eagles051387 at gmail.com Thu Jul 3 04:32:22 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:22 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] In-Reply-To: References: Message-ID: dont throw in the towel just for that. try and see if you can get research funding through the university you are attending On 7/1/08, Mark Kosmowski wrote: > > And I forgot to change the subject. Apologies. > > On 7/1/08, Mark Kosmowski wrote: > > At some point there a cost-benefit analysis needs to be performed. If > > my cluster at peak usage only uses 4 Gb RAM per CPU (I live in > > single-core land still and do not yet differentiate between CPU and > > core) and my nodes all have 16 Gb per CPU then I am wasting RAM > > resources and would be better off buying new machines and physically > > transferring the RAM to and from them or running more jobs each > > distributed across fewer CPUs. Or saving on my electricity bill and > > powering down some nodes. > > > > As heretical as this last sounds, I'm tempted to throw in the towel on > > my PhD studies because I can no longer afford the power to run my > > three node cluster at home. Energy costs may end up being the straw > > that breaks this camel's back. > > > > Mark E. Kosmowski > > > > > From: "Jon Aquilina" > > > > > > > > not sure if this applies to all kinds of senarios that clusters are > used in > > > but isnt the more ram you have the better? > > > > > > On 6/30/08, Vincent Diepeveen wrote: > > > > > > > > Toon, > > > > > > > > Can you drop a line on how important RAM is for weather forecasting > in > > > > latest type of calculations you're performing? > > > > > > > > Thanks, > > > > Vincent > > > > > > > > > > > > On Jun 30, 2008, at 8:20 PM, Toon Moene wrote: > > > > > > > > Jim Lux wrote: > > > >> > > > >> Yep. And for good reason. Even a big DoD job is still tiny in > Nvidia's > > > >>> scale of operations. We face this all the time with NASA work. > > > >>> Semiconductor manufacturers have no real reason to produce special > purpose > > > >>> or customized versions of their products for space use, because > they can > > > >>> sell all they can make to the consumer market. More than once, I've > had a > > > >>> phone call along the lines of this: > > > >>> "Jim: I'm interested in your new ABC321 part." > > > >>> "Rep: Great. I'll just send the NDA over and we can talk about it." > > > >>> "Jim: Great, you have my email and my fax # is..." > > > >>> "Rep: By the way, what sort of volume are you going to be using?" > > > >>> "Jim: Oh, 10-12.." > > > >>> "Rep: thousand per week, excellent..." > > > >>> "Jim: No, a dozen pieces, total, lifetime buy, or at best maybe > every > > > >>> year." > > > >>> "Rep: Oh..." > > > >>> {Well, to be fair, it's not that bad, they don't hang up on you.. 
> > > >>> > > > >> > > > >> Since about a year, it's been clear to me that weather forecasting > (i.e., > > > >> running a more or less sophisticated atmospheric model to provide > weather > > > >> predictions) is going to be "mainstream" in the sense that every > business > > > >> that needs such forecasts for its operations can simply run them > in-house. > > > >> > > > >> Case in point: I bought a $1100 HP box (the obvious target group > being > > > >> teenage downloaders) which performs the HIRLAM limited area model > *on the > > > >> grid that we used until October 2006* in December last year. > > > >> > > > >> It's about twice as slow as our then-operational 50-CPU Sun Fire > 15K. > > > >> > > > >> I wonder what effect this will have on CPU developments ... > > > >> > > > >> -- > > > >> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 > 214290 > > > >> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > > > >> At home: http://moene.indiv.nluug.nl/~toon/ > > > >> Progress of GNU Fortran: > http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html > > > >> > > > > > > > > _______________________________________________ > > > > Beowulf mailing list, Beowulf@beowulf.org > > > > To change your subscription (digest mode or unsubscribe) visit > > > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > > > > > > > > -- > > > Jonathan Aquilina > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080703/28ed5052/attachment.html From hvidal at tesseract-tech.com Thu Jul 3 05:43:35 2008 From: hvidal at tesseract-tech.com (H.Vidal, Jr.) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> Message-ID: <486CC977.3000800@tesseract-tech.com> In my experience with 3D, the renderer is of particular note. For example, it is entirely possible to model with one piece of software, export geometry, and render with another piece of software. And the production team will probably have very specific ideas of which renderer is what they want, especially since tools such as RenderMan are themselves programmable, and cannot just be 'switched' with another rendering tool. And so, based on your questions, it sounds like you really need to study, understand, and educate yourself on your proposed product or service before thinking about batch rendering. You are, as Americans would say, putting the cart before the horse. So, as suggested, go study the 'front end' of the problem in much greater detail, talk to potential customers, understand more what you are intending to do, then come back and find out about clusters. The Beowulf mailing list has lots of archived comments and questions on this topic. That's a hint..... If you are just acting as a hobbyist or moving along a learning curve, go look at open source rendering tools. If you are going to sell services on open source rendering tools, see comments above. In any case, google and plain reading are your friends.... Good luck. 
hv Jon Aquilina wrote: > now what happens if someone comes to me for rendering services and they > used maya will the maya file be able to use the software mentioned above > or would i need some other software for that? > > On 7/2/08, *Bernard Li* > > wrote: > > On Wed, Jul 2, 2008 at 4:32 AM, Perry E. Metzger > wrote: > > > "Jon Aquilina" > writes: > >> if i use blender how nicely does it work in a cluster? > > > > I believe it works quite well. > > As far as I know blender does not have any built-in "clustering" > capabilities. But what you do is render different frames on different > cores (embarrassingly parallel) using a queuing/scheduling system. > DrQueue seems to be quite popular with the rendering folks: > > http://drqueue.org/cwebsite/ > > Cheers, > > Bernard > > > > > -- > Jonathan Aquilina > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From prentice at ias.edu Thu Jul 3 06:09:27 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <87y74lfabq.fsf@snark.cb.piermont.com> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> Message-ID: <486CCF87.70206@ias.edu> Perry E. Metzger wrote: > "Jon Aquilina" writes: >> my idea is more of for my thesis. > > If you're trying to do 3d animation on the cheap and you want > something that's already cluster capable, I'd try Blender. It is open > source and it has already made some reasonable length movies. Not > being an animation type, I know nothing about how nice it is compared > to commercial products, but it is hard to beat the price. > > Perry And it's been around for a while, so it should be very mature. I don't know anything about rendering, but I downloaded blender after reading an article in Linux Journal in 1999, and it was mature back then. I only knew enough to run the demo. It took HOURS to render on my 486! I guess a link would be helpful, so here's one: http://www.blender.org/ -- Prentice From prentice at ias.edu Thu Jul 3 06:19:31 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <20080702084458.GA12879@gretchen.aei.uni-hannover.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <66EB3DC0-B281-4869-BB8E-A55E577C44FE@sanger.ac.uk> <20080702084458.GA12879@gretchen.aei.uni-hannover.de> Message-ID: <486CD1E3.4080204@ias.edu> Henning Fehrmann wrote: > On Wed, Jul 02, 2008 at 09:19:50AM +0100, Tim Cutts wrote: >> On 2 Jul 2008, at 8:26 am, Carsten Aulbert wrote: >> >>> OK, we have 1342 nodes which act as servers as well as clients. Every >>> node exports a single local directory and all other nodes can mount this. >>> >>> What we do now to optimize the available bandwidth and IOs is spread >>> millions of files according to a hash algorithm to all nodes (multiple >>> copies as well) and then run a few 1000 jobs opening one file from one >>> box then one file from the other box and so on. 
With a short autofs >>> timeout that ought to work. Typically it is possible that a single >>> process opens about 10-15 files per second, i.e. making 10-15 mounts per >>> second. With 4 parallel process per node that's 40-60 mounts/second. >>> With a timeout of 5 seconds we should roughly have 200-300 concurrent >>> mounts (on average, no idea abut the variance). >> Please tell me you're not serious! The overheads of just performing the NFS mounts are going to kill you, never mind all the network traffic going >> all over the place. >> >> Since you've distributed the files to the local disks of the nodes, surely the right way to perform this work is to schedule the computations so that >> each node works on the data on its own local disk, and doesn't have to talk networked storage at all? Or don't you know in advance which files a >> particular job is going to need? > > Yes, this is the problem. The amount of files is too big to store it > everywhere (few TByte and 50 million files). Mounting a view NFS server does not provide > the bandwidth. > On the other hand, the coreswitch should be able to handle the flows non > blocking. We think that nfs mounts are the fastest possibility to > distribute the demanded files to the nodes. > > Henning Sounds like you need a parallel filesystem of some sort. Have you looked at that option? I know, they cost $$$$. -- Prentice From prentice at ias.edu Thu Jul 3 06:38:11 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> Message-ID: <486CD643.1050904@ias.edu> Tim Cutts wrote: > > On 2 Jul 2008, at 6:06 am, Mark Hahn wrote: > >>>> I was hoping for some discussion of concrete issues. for instance, >>>> I have the impression debian uses something other than sysvinit - >>>> does that work out well? >>>> >>> Debian uses standard sysvinit-style scripts in /etc/init.d, >>> /etc/rc0.d, ... >> >> thanks. I guess I was assuming that mainstream debian was like ubuntu. > > It's sort of the other way around. Remember that Ubuntu is based off a > six-monthly snapshot of Debian's testing track, which is why Hardy looks > a lot more like the upcoming Debian Lenny than it does like Debian Etch. > >> interesting - I wonder why. the main difference would be that the rpm >> format encodes dependencies... > > The difficulty is that many ISVs tend to do a fairly terrible job of > packaging their applications as RPM's or DEB's, for example creating > init scripts which don't obey the distribution's policies, or making > willy-nilly modifications to configuration files all over the place, > even in other packages (which in the Debian world is a *big* no-no, > that's why many Debian/Ubuntu packages have now moved to the conf.d type > of configuration directory, so that other packages can drop in little > independent snippets of configuration) > > I have seen, for example, .deb packages from a Large Company With Which > We Are All Familiar which essentially attempted to convert your system > into a Red Hat system by moving all your init scripts around and > whatnot, so once you'd installed this abomination, you'd totally wrecked > the ability of many of the main distro packages to be updated ever > again. Oh, and of course uninstalling the package didn't put anything > back the way it had been before. 
> > Like you, I tend to use tarballs if they are available, and if I want to > turn them into packages I do it myself, and make sure they are policy > compliant for the distro. > > So this, while not a statement in favour of either flavour of distro, is > definitely a warning to be very wary of what packages that have come > from sources other than the distro itself might do (which of course, > you'd be wary of anyway for security reasons). > > Tim > > Here's another reason to use tarballs: I have /usr/local shared to all my systems with with NFS. If want to install the lastest version of firefox, you can just do this: cd /usr/local tar zxvf firefox-x.xxx.tar.gz cd /usr/local/bin ln -s ../firefox-x.xxx/firefox . Now all users can use the latest version of firefox (/usr/local/bin is in their path, and comes before /usr/bin, usr/X11R6/bin, etc.) With RPM, deb, or whatever, I'd have to use func or ssh and a shell script w/ a loop to install on all systems (assuming nightly 'yum update' cron job won't work in this case) This is incredibly helpful with Python, Perl, R, and other languages which have additional modules or libraries. Installing additional modules can be very easy (CPAN module for perl, for example). These modules aren't included in RPM format (that I know of), and when you upgrade, perl or python, the RPMs clobber whatever modules you installed in /usr. Compiling Perl can be time consuming vs. just installing the RPM, but once installed, if I run '/usr/local/bin/perl -MCPAN -e shell' as root, I can install all the perl modules needed just once, and they won't be clobbered by an RPM update. In the end, this is much more efficient. And CPAN manages dependencies automatically, too. We use RT where I work, which requires a few Perl modules. On Friday, June 13 (Yes, Friday the 13! - it figures. ). I had to stop and restart our web server that provides RT. The perl packages had recently been updated. When apache restarted, it couldn't find the necessary perl modules, and RT wouldn't function. It took me HOURS to track the problem down to a couple missing perl modules. -- Prentice From prentice at ias.edu Thu Jul 3 07:01:04 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> <20080702125625.GE47386@gby2.aoes.com> Message-ID: <486CDBA0.7000403@ias.edu> Jon Aquilina wrote: > like you said in regards to maya money is a factor for me. if i do > descide to setup a rendering cluster my problem is going to be finding > someone who can make a small video in blender for me so i can render it. Blender should come with a few small scene files you can render. It did about 10 years a go when I tinkered with it. If not, I'm sure someone in the Blender community would be willing to share with you. -- Prentice From perry at piermont.com Thu Jul 3 07:04:19 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: dealing with lots of sockets In-Reply-To: <127E7A7E-F115-4349-8DB8-568FC882EB7B@sicortex.com> (Lawrence Stewart's message of "Wed\, 2 Jul 2008 20\:48\:32 -0400") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> <127E7A7E-F115-4349-8DB8-568FC882EB7B@sicortex.com> Message-ID: <87vdzncbdo.fsf@snark.cb.piermont.com> Lawrence Stewart writes: > My obligatory internet startup wrote a new single-threaded single- > process web server based on select(2) with careful attention to the > blocking or not nature of the kernel calls and were able to handle > some hundreds of connections per second on the same hardware and > over 1000 open connections before breaking the stack. Alas it was > never made open source and the company is gone. There are others out there now, so the one can find a reasonable event driven http server pretty easily. > More recently at SiCortex, We've been using libevent to write single > threaded applications that do multithreaded things. On our 16 > megabyte 70 MHz freescale embedded boot processors, this is very > handy for reducing the memory footprint. On the x86 front end, a > single process has no difficulty multiplexing 1000 streams of > console data this way. I'd hate to have a process for each one of > those! You couldn't possibly manage a process for every one of those -- the only way to get the sort of performance you're talking about is with events. You picked right with events. > So if anyone wants to try an easy to use event library, I can > recommend libevent. The learning curve is modest. It does require > a little turning inside out to do things like have a tftp client as > a libevent task but its not bad. Libevent was the result of my describing to Niels Provos the way we wrote ticker plants and trading software at a particular hedge fund I was at in the early 1990s. We used libXt, the X toolkit library, as our event driven programming environment on SERVERS. It turned out to work rather well. I was at the Atlanta Linux Showcase many years ago when Niels was a grad student and he presented a paper showing how much better events were than threads or other methods for managing large loads. I explained to him afterwards what we had done, and he proceeded to write a pretty amazing piece of open source software. I don't take any credit for it at all, but I am happy that something came out of my experiences, because the software we built at that hedge fund also got lost to history, just as your http server was. It is good that the ideas survived, at least. Perry -- Perry E. Metzger perry@piermont.com From prentice at ias.edu Thu Jul 3 07:05:48 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: References: Message-ID: <486CDCBC.8030706@ias.edu> Mark Kosmowski wrote: > I think I have come to a compromise that can keep me in business. > Until I have a better understanding of the software and am ready for > production runs, I'll stick to a small system that can be run on one > node and leave the other two powered down. 
I've also applied for an > adjunt instructor position at a local college for some extra cash and > good experience. When I'm ready for production runs I can either just > bite the bullet and pay the electricity bill or seek computer time > elsewhere. Mark, For MPI testing/debugging, you can create a few virtual machine on one node using VWware or Xen. VMWare is free, unless you want all the bells and whistles. This would be lousy performance for production runs, but would be great for debugging MPI problems in your code, and save you energy. Of course, this wouldn't help with hardware optimizations. -- Prentice From landman at scalableinformatics.com Thu Jul 3 07:20:45 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <486CD643.1050904@ias.edu> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> Message-ID: <486CE03D.40901@scalableinformatics.com> Prentice Bisbal wrote: > Here's another reason to use tarballs: I have /usr/local shared to all eeek!! something named local is shared??? FWIW: we do the same thing, but put everything into /apps, and all nodes have mounted /apps ... requires a little ./configure -prefix=/apps/... magic, but it works well. > my systems with with NFS. If want to install the lastest version of > firefox, you can just do this: > > cd /usr/local > > tar zxvf firefox-x.xxx.tar.gz > > cd /usr/local/bin > > ln -s ../firefox-x.xxx/firefox . > > Now all users can use the latest version of firefox (/usr/local/bin is > in their path, and comes before /usr/bin, usr/X11R6/bin, etc.) Oddly enough, I am not a huge fan of dumping lots of binaries into one path. Part of the reason is the package management one ... all you need is one renegade package and a packager that things [s]he is smart, and ... > With RPM, deb, or whatever, I'd have to use func or ssh and a shell > script w/ a loop to install on all systems (assuming nightly 'yum > update' cron job won't work in this case) If you don't know about pdsh ... it will increase your karma. We install it on every cluster we build. Makes life soooo much easier. > This is incredibly helpful with Python, Perl, R, and other languages > which have additional modules or libraries. Installing additional > modules can be very easy (CPAN module for perl, for example). These > modules aren't included in RPM format (that I know of), and when you > upgrade, perl or python, the RPMs clobber whatever modules you installed > in /usr. Yup. Bioperl is a great example. Of course to install that shared, you need perl/modules installed shared. > > Compiling Perl can be time consuming vs. just installing the RPM, but > once installed, if I run '/usr/local/bin/perl -MCPAN -e shell' as root, > I can install all the perl modules needed just once, and they won't be > clobbered by an RPM update. In the end, this is much more efficient. And > CPAN manages dependencies automatically, too. We usually build our own Perl these days. Having had some interesting ... experiences ... with vendor compiled versions, we decided to forgo their "assistance" and do it ourselves. Seems to work much better. And we have the process (including the module builds) automated quite nicely now. We use it for SICE ... er ... DragonFly. > > We use RT where I work, which requires a few Perl modules. On Friday, > June 13 (Yes, Friday the 13! - it figures. ). I had to stop and restart > our web server that provides RT. 
The perl packages had recently been > updated. When apache restarted, it couldn't find the necessary perl > modules, and RT wouldn't function. It took me HOURS to track the problem > down to a couple missing perl modules. Yup. This is why we use a different tree than the vendor supplied ones. We can tie back into the vendor supplied web server .... or use our own (usually do the latter). Upgrades can be a crap shoot, even on the best of intentioned systems. Right now we have run head first into a Ubuntu problem with OpenVPN on a system, where we upgraded the server after the OpenSSL fiasco, and suddenly CRLs no longer worked. Fixed a config file by hand. Next upgrade? Same bug. Sigh. Package management is good ... for ... um ... er ... what was that again? Making sure your stuff doesn't break when you update/upgrade? Oh. Maybe they will get around to making sure that actually is the case? FWIW: we have seen the *same* sorts of problem with RPM, apt, yum, suse's monstrosities (zmd and others), ... they are all broken in subtle ways that most folks don't run into. Its only when you have a system you needed to make specific changes to, that these changes get lost on the next "up"grade. We need more of a change management system. Mercurial for systems config. Grrr.... Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From laytonjb at charter.net Thu Jul 3 07:26:28 2008 From: laytonjb at charter.net (Jeffrey B. Layton) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: <486CDCBC.8030706@ias.edu> References: <486CDCBC.8030706@ias.edu> Message-ID: <486CE194.2060503@charter.net> Prentice Bisbal wrote: > Mark Kosmowski wrote: > > >> I think I have come to a compromise that can keep me in business. >> Until I have a better understanding of the software and am ready for >> production runs, I'll stick to a small system that can be run on one >> node and leave the other two powered down. I've also applied for an >> adjunt instructor position at a local college for some extra cash and >> good experience. When I'm ready for production runs I can either just >> bite the bullet and pay the electricity bill or seek computer time >> elsewhere. >> > > Mark, > > For MPI testing/debugging, you can create a few virtual machine on one > node using VWware or Xen. VMWare is free, unless you want all the bells > and whistles. > You don't need to go this far. Just set up the hostfile to use the same host name several times. Just make sure you don't start swapping :) Jeff From tjrc at sanger.ac.uk Thu Jul 3 07:31:53 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <486CD643.1050904@ias.edu> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> Message-ID: <3BC700F2-A6AA-4659-B8E9-5E53398FCB55@sanger.ac.uk> On 3 Jul 2008, at 2:38 pm, Prentice Bisbal wrote: > Here's another reason to use tarballs: I have /usr/local shared to all > my systems with with NFS. Heh. Your view of local is different from mine. On my systems /usr/ local is local to the individual system. We do have NFS mounted software of the kind you describe, but we stopped putting it in /usr/ local because users got confused thinking it was really local to the machine. 
We now have a separate automounted /software directory for all that stuff. > With RPM, deb, or whatever, I'd have to use func or ssh and a shell > script w/ a loop to install on all systems (assuming nightly 'yum > update' cron job won't work in this case) Well, you don't, actually. You can maintain a local repository of your custom packages, and then use something like cfengine or puppet to make sure everything is kept up to date. I need the cfengine stuff anyway to keep various configuration files in sync, so extending it to package management was a no-brainer. > > This is incredibly helpful with Python, Perl, R, and other languages > which have additional modules or libraries. Installing additional > modules can be very easy (CPAN module for perl, for example). These > modules aren't included in RPM format (that I know of), and when you > upgrade, perl or python, the RPMs clobber whatever modules you > installed > in /usr. Yes, I agree, and that's what we do here /software/bin/perl is our supported perl version. But you're not correct about the CPAN modules being clobbered, at least on Debian. Debian's perl packages are configured such that locally installed CPAN modules go into a different tree from the package's own versions of the modules, so yours don't get clobbered on upgrade. And if you really do insist on changing a file which belongs to a package, you can still tell Debian to leave it alone on package upgrade by marking it as diverted with 'dpkg-divert'. The debian guys really did put a lot of thought into how dpkg works. > > Compiling Perl can be time consuming vs. just installing the RPM, but > once installed, if I run '/usr/local/bin/perl -MCPAN -e shell' as > root, > I can install all the perl modules needed just once, and they won't be > clobbered by an RPM update. In the end, this is much more efficient. > And > CPAN manages dependencies automatically, too. I agree. In the case of perl, you're absolutely right. > > We use RT where I work, which requires a few Perl modules. On Friday, > June 13 (Yes, Friday the 13! - it figures. ). I had to stop and > restart > our web server that provides RT. The perl packages had recently been > updated. When apache restarted, it couldn't find the necessary perl > modules, and RT wouldn't function. It took me HOURS to track the > problem > down to a couple missing perl modules. *shrug* I use RT as well, but it's pre-packaged for Debian, so I just use their version and don't have to worry about the dependencies. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From prentice at ias.edu Thu Jul 3 07:55:39 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] /usr/local over NFS is okay, Joe In-Reply-To: <486CE03D.40901@scalableinformatics.com> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <486CE03D.40901@scalableinformatics.com> Message-ID: <486CE86B.90104@ias.edu> Joe Landman wrote: > > > Prentice Bisbal wrote: > >> Here's another reason to use tarballs: I have /usr/local shared to all > > eeek!! something named local is shared??? Nothing wrong with that. "local" doesn't necessarily mean local to the physical machine. It can mean for the local site. I do everything according to standards, and adhere to them strictly. 
Sharing /usr/local is actually codified in the FHS:

4.8.2. /usr/local : Local hierarchy

4.8.2.1. Purpose

The /usr/local hierarchy is for use by the system administrator when
installing software locally. It needs to be safe from being overwritten
when the system software is updated. It may be used for programs and
data that are shareable amongst a group of hosts, but not found in /usr.

Locally installed software must be placed within /usr/local rather than
/usr unless it is being installed to replace or upgrade software in /usr.

You can share out /opt, too, but I use /opt for software that's installed
locally on a machine and not exported to others. I find this is easier
than putting it in /usr/local, since most packaged 3rd party software
insists on going in /opt, anyway.

>
> FWIW: we do the same thing, but put everything into /apps, and all nodes
> have mounted /apps ...
>
> requires a little ./configure -prefix=/apps/...

My configure kung-fu is very strong. I usually do this, so I can install
multiple versions of the same software:

./configure --prefix=/usr/local/foo-xx.yy --exec-prefix=/usr/local/foo-xx.yy/x86

If compiling for x86_64, then --exec-prefix=/usr/local/foo-xx.yy/x86_64.

I have dozens of applications compiled for 32-bit and 64-bit on the same
/usr/local. I just put 64-bit binaries (actually symlinks) in
/usr/local/bin64, and make sure that comes first in the path on 64-bit
systems (ditto for lib64, etc.). I do lots of other hocus-pocus, but I'm
digressing enough already.

> magic, but it works well.
>
>> my systems with with NFS. If want to install the lastest version of
>> firefox, you can just do this:
>>
>> cd /usr/local
>>
>> tar zxvf firefox-x.xxx.tar.gz
>>
>> cd /usr/local/bin
>>
>> ln -s ../firefox-x.xxx/firefox .
>>
>> Now all users can use the latest version of firefox (/usr/local/bin is
>> in their path, and comes before /usr/bin, usr/X11R6/bin, etc.)
>
> Oddly enough, I am not a huge fan of dumping lots of binaries into one
> path. Part of the reason is the package management one ... all you need
> is one renegade package and a packager that things [s]he is smart, and ...

I don't dump the binaries into one path. I put symlinks into
/usr/local/bin{,64}. All the binaries go into /usr/local/foo-xx.yy and
stay there:

./configure --prefix=/usr/local/foo-xx.yy
make
make install
cd /usr/local
ln -s foo-xx.yy foo   # this makes the non-versioned dir the default
                      # just follow me here, okay?
cd /usr/local/foo/bin
for file in *; do ln -s ../foo/bin/$file /usr/local/bin/$file; done
# do the same for lib, include, man,...

# If I want to have multiple versions of foo available:
cd /usr/local/foo-xx.yy/bin
for file in *; do ln -s ../foo-xx.yy/bin/${file} \
    /usr/local/bin/${file}-xx.yy; done

Users can call the latest or default version by calling 'foo'. If they
want an earlier version, they call foo-xx.yy. If I want to delete
foo-xx.yy, I just do rm -r /usr/local/foo-xx.yy, and then delete the
broken links in /usr/local/{bin,lib,include,man}. This can easily be done
before deleting the install dir with scripts. If the links are left
around, they take up little disk space, since they are only inodes. If I
keep an earlier version of foo around, I change /usr/local/foo to point
to it.
From prentice at ias.edu Thu Jul 3 07:57:52 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: <486CE194.2060503@charter.net> References: <486CDCBC.8030706@ias.edu> <486CE194.2060503@charter.net> Message-ID: <486CE8F0.7010602@ias.edu> > You don't need to go this far. Just set up the hostfile to use the same > host name several times. Just make sure you don't start swapping :) > > Jeff > Unless the problem is configuring interhost communications correctly. -- Prentice From prentice at ias.edu Thu Jul 3 08:10:50 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <3BC700F2-A6AA-4659-B8E9-5E53398FCB55@sanger.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <3BC700F2-A6AA-4659-B8E9-5E53398FCB55@sanger.ac.uk> Message-ID: <486CEBFA.7080902@ias.edu> Tim Cutts wrote: > > On 3 Jul 2008, at 2:38 pm, Prentice Bisbal wrote: > >> Here's another reason to use tarballs: I have /usr/local shared to all >> my systems with with NFS. > > Heh. Your view of local is different from mine. On my systems > /usr/local is local to the individual system. We do have NFS mounted > software of the kind you describe, but we stopped putting it in > /usr/local because users got confused thinking it was really local to > the machine. We now have a separate automounted /software directory for > all that stuff. See my other post. The FHS says it's okay for both /opt and /usr/local to be shared over NFS, but I wouldn't do both. For me /usr/local = NFS share, /opt = local to machine. Why do users need to know what's local and what isn't? All that matters is they need to know the path to a file. (It's logical location, and not it's physical location). That's the beauty of the Unix filesystem hierarchy: everything is arranged logically, not physically. No drive letters, etc. In a properly configured environment, things should just work for the users. I'm speaking in general terms, for HPC where disk or network I/O can be significant factors physical location is important. But that's usually for *data*, not the binary running, which is usually read once, and stays in memory for the remainder of it's execution. -- Prentice From landman at scalableinformatics.com Thu Jul 3 08:24:23 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] /usr/local over NFS is okay, Joe In-Reply-To: <486CE86B.90104@ias.edu> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <486CE03D.40901@scalableinformatics.com> <486CE86B.90104@ias.edu> Message-ID: <486CEF27.8090507@scalableinformatics.com> Prentice Bisbal wrote: > Joe Landman wrote: >> >> Prentice Bisbal wrote: >> >>> Here's another reason to use tarballs: I have /usr/local shared to all >> eeek!! something named local is shared??? > > Nothing wrong with that. "local" doesn't necessarily mean local to the > physical machine. It can mean for the local site. I do everything Yeah, it is ambiguous to a degree, but I figure that something named /local is actually going to be physically local. It helps tremendously when a user calls up with a problem, say that they can't see a file they placed in /local/... on all nodes. Usually they get quiet for a moment after saying that aloud, and then say "oh, never mind". :) [...] > I don't dump the binaries into one path. 
I put symlinks into > /usr/local/bin{,64}. All the binaries go into /usr/local/foo-xx.yy and > stay there: We used to do this, but things kept getting overwritten by zealous package management tools. So we started using modules and showing people how to add paths by hand if they were adamant about not using modules ... -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From eagles051387 at gmail.com Thu Jul 3 08:32:16 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <486CC977.3000800@tesseract-tech.com> References: <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> <486CC977.3000800@tesseract-tech.com> Message-ID: i have a little bit of clustering experience.if anything im still contemplating making a kubuntu derivative that is geared towards rendering clusters with the software and what not. i just dont have any experience with whats available for the rendering cluster market and linux that is why im asking all these questions. i did do some googling the other day only to find a list of commercial products. On Thu, Jul 3, 2008 at 2:43 PM, H.Vidal, Jr. wrote: > In my experience with 3D, the renderer is of particular note. > For example, it is entirely possible to model with one piece > of software, export geometry, and render with another piece > of software. And the production team will probably have very > specific ideas of which renderer is what they want, especially > since tools such as RenderMan are themselves programmable, and > cannot just be 'switched' with another rendering tool. > > And so, based on your questions, it sounds like you really need > to study, understand, and educate yourself on your proposed product > or service before thinking about batch rendering. You are, > as Americans would say, putting the cart before the horse. > > So, as suggested, go study the 'front end' of the problem in much > greater detail, talk to potential customers, understand more > what you are intending to do, then come back and find out about > clusters. The Beowulf mailing list has lots of archived comments > and questions on this topic. That's a hint..... > > If you are just acting as a hobbyist or moving along a learning > curve, go look at open source rendering tools. If you are going > to sell services on open source rendering tools, see comments > above. In any case, google and plain reading are your friends.... > > Good luck. > > hv > > Jon Aquilina wrote: > >> now what happens if someone comes to me for rendering services and they >> used maya will the maya file be able to use the software mentioned above or >> would i need some other software for that? >> >> On 7/2/08, *Bernard Li* > >> wrote: >> >> On Wed, Jul 2, 2008 at 4:32 AM, Perry E. Metzger > > wrote: >> >> > "Jon Aquilina" > > writes: >> >> if i use blender how nicely does it work in a cluster? >> > >> > I believe it works quite well. >> >> As far as I know blender does not have any built-in "clustering" >> capabilities. But what you do is render different frames on different >> cores (embarrassingly parallel) using a queuing/scheduling system. 
>> DrQueue seems to be quite popular with the rendering folks: >> >> http://drqueue.org/cwebsite/ >> >> Cheers, >> >> Bernard >> >> >> >> >> -- >> Jonathan Aquilina >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080703/ff172f51/attachment.html From kus at free.net Thu Jul 3 08:53:03 2008 From: kus at free.net (Mikhail Kuzminsky) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] MPI: over OFED and over IBGD Message-ID: Is there some MPI realization/versions which may be installed one some nodes - to work over Mellanox IBGD 1.8.0 (Gold Distribution) IB stack and on other nodes - for work w/OFED-1.2 ? Mikhail Kuzminsky Computer Assistance to Chemical Research Center Zelinsky Institute of Organic Chemistry Moscow From jlb17 at duke.edu Thu Jul 3 09:09:49 2008 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> Message-ID: On Thu, 3 Jul 2008 at 9:34am, Tim Cutts wrote > On 2 Jul 2008, at 4:22 pm, Prentice Bisbal wrote: >> B. Red Hat has done such a good job of spreading FUD about the other >> Linux distros, management has a cow if you tell them you're installing >> something other than RH. Erm, do you have any examples of that? All I see is RH a) trying to sell their product (nothing wrong with that) and b) in general, being a pretty good member of the OSS community. > Fedora did not exist, and I'm still not sure how separate from Red Hat Fedora > and CentOS really are, but that's probably just my ignorance. The fact that CentOS is in no way officially associated with Red Hat. At all. They use the freely available RHEL SRPMs to build the distribution, and they report bugs upstream when they find them. But that's it. As for Fedora, see . -- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF From Shainer at mellanox.com Thu Jul 3 09:41:01 2008 From: Shainer at mellanox.com (Gilad Shainer) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] MPI: over OFED and over IBGD In-Reply-To: Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F013422B0@mtiexch01.mti.com> Mikhail Kuzminsky wrote: > Is there some MPI realization/versions which may be installed > one some nodes - to work over Mellanox IBGD 1.8.0 (Gold > Distribution) IB stack and on other nodes - for work w/OFED-1.2 ? IBGD is out of date, and AFAIK none of the latest versions of the various MPI were tested against it. I would recommend to update the install to OFED from IBGD, and if you need some help let me know. If you must keep it, than MVAPICH 0.9.6 might work. Gilad. 
From Bogdan.Costescu at iwr.uni-heidelberg.de Thu Jul 3 09:45:31 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: On Wed, 2 Jul 2008, Robert G. Brown wrote: > if you try to start up a second daemon on that port you'll get a > EADDRINUSE on the bind While we talk about theoretical possibilities, this statement is not always true. You could specify something else than INADDR_ANY here: > serverINETaddress.sin_addr.s_addr = htonl(INADDR_ANY); /* Accept all */ or bind it to a specific network interface (SO_BINDTODEVICE). Then you can bind a second daemon to the same port, but with a different (and again not INADDR_ANY) local address or network interface. Many daemons can do this nowadays (named, ntpd, etc.). -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From rgb at phy.duke.edu Thu Jul 3 09:44:41 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: dealing with lots of sockets In-Reply-To: <87mykz4vyo.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> <87mykz4vyo.fsf@snark.cb.piermont.com> Message-ID: On Wed, 2 Jul 2008, Perry E. Metzger wrote: > Seeing is believing. There are lots of good papers out there on > concurrency strategies for systems with vast numbers of sockets to > manage, and there is no doubt what the answer is -- threads suck > compared to events, full stop. Event systems scale linearly for far > longer. Sure, but: > It depends. If you're doing something where there is going to be one > socket talking to the system a tiny percentage of the time, why would > you bother building an event driven server? If you're building Or many sockets, but with a task granularity that makes your possibly megaclock/millisecond task switching overhead irrelevant. I'd have to rebuild and rerun lmbench to get an accurate measure of the current context switch time, but milliseconds seems far too long. That sounds like the inverse of the timeslice, not the actual CS time, which I'm pretty sure has been MICROseconds, not milliseconds, since back in the 2.0 kernel on e.g. 200 MHz hardware. My laptop is clocking over 2000 CS's/second sitting nearly idle -- that's just the "noise" of the system's normal interactive single user function, and this when its clock is at its idling 800 MHz (out of 2.5 GHz). On the physics department home directory (NFS) server I'm clocking only 7000-7500 CS/sec at a load average of 0.3 (dual core opteron at 2.8 GHz). 
Since nfsstat doesn't seem to do rates yet (and has int counters instead of long long uints, grrr) it is a bit difficult to see exactly what this is derived from in real time as far as load goes, but almost all of it seems to be the processing of interrupts, as in the interrupt count and the context switch count go in close parallel. Now, I'm trying to understand the advantage you describe, so bear with me. See what you think of the following: The kernel processes interrupts as efficiently as possible, with upper half and lower half handlers, but either one requires that the CPU stop userspace tasks and load up the kernel's interrupt handler, which requires moving in and out of kernel mode. We aren't going to quibble about a factor of two here and I think that this is well within a factor of two of a context switch time as most of the state of the CPU still has to be saved so I'm calling it a CS even if it is maybe 80% of a CS as far as time goes, depending on how badly one is thrashing the caches. Network based requests are associated with packets from different sources; packets require interrupts to process, interrupts from distinct sources require context switches to process (do they not? -- I'm not really sure here but I recall seeing context switch counts generally rise roughly linearly with interrupt rates even on single threaded network apps) so I would EXPECT the context switch load imposed by a network app to be within a LINEAR factor of two to four independent of whether it was run as a fork or run via events with one exception, described below. I'm estimating the load as packet arrives (I,CS), app gets CPU (CS), select on FD triggers it's "second stage interrupt/packet handler" which computes a result and writes to the network (I,CS), causes the process to block on select, kernel does next task (CS). So I count something like two interrupts and four context switches per network transaction for a single, single threaded network server application handling a single, single packet request with a single, single packet reply. If it has to go to disk, add at least one interrupt/context switch. So it is four or five context switches, some of which may be lighter weight than others, and presumes that the process handling the connection already exists. If the process was created by a fork or as a thread, there is technical stuff about whether it is a kernel thread (expensive) or user thread (much lighter weight, in fact, much more like a procedure call since the forked processes share a memory space and presumably do NOT need to actually save state when switching between them). They still might require a large chunk of a CS time to switch between them because I don't know how the kernel task switcher manages select statements on the entire cluster of children associated with the toplevel parent -- if it sweeps through them without changing context there is one level of overhead, if it fully changes context there is another, but again I think we're talking 10-20% differences. Now, if it is a SINGLE process with umpty open file descriptors and an event loop that uses SOME system call -- and I don't quite understand how a userspace library can do anything but use the same systems calls I could implement in my own code to check for data to be read on a FD, e.g. select or some sort of non-blocking IO poll -- each preexisting, persistent connection (open FD) requires at least 2-3 of the I/CS pairs in order to handle a request. 
In other words, it saves the CS associated with switching the toplevel task (and permits the kernel to allocate timeslices to its task list with a larger granularity, saving the cost of going in/out of kernel mode to initiate the task/context switch). This saves around two CS out of four or five, a factor of around two improvement. Not that exciting, really. The place where you get really nailed by forks or threads is in their creation. Creating a process is very expensive, a couple of orders of magnitude more expensive than switching processes, and yeah, even order ms. So if you are handling NON-persistent connections -- typical webserver behavior: make a connection, send a request for a file, receive the file requested, break the connection -- handling this with a unique fork or thread per request is absolute suicide. So there a sensible strategy is to pre-initiate enough threads to be able to handle the incoming request stream round-robin, so that as each thread takes a request, processes it, and resumes listening for the next request. This requires the overhead of creating a FD (inevitable for each connection), dealing with the interrupt/CSs required to process the request and deliver the data, then close/free the FD. If the number of daemons required to process connections at incoming saturation is small enough that the overhead associated with processing the task queue doesn't get out of hand, this should scale very nearly as well as event processing, especially if the daemons are a common fork and share an address space. The last question is just how efficiently the kernel processes many blocked processes. Here I don't know the answer, and before looking it up I'll post the question here where probably, somebody does;-) If the connections are PERSISTENT -- e.g. imap connections forked by a mail server for mail clients that connect and stay connected for hours or days -- then as a general rule there will be no I/O waiting on the connections because humans type or click slowly and erratically, mostly a poissonian load distribution. If the kernel has a way of flagging applications that are blocked on a select on an FD without doing an actual context switch into the application, the scheduler can rip through all the blocked tasks without a CS per task, at an overhead rate within a CS or two per ACTIVE task (one where there IS I/O waiting) of the hyperefficient event-driven server that basically stays on CPU except for when the CPU goes back to the kernel anyway to handle the packet stream during interrupts and to advance the timer and so on. I don't KNOW if the kernel manages runnable or blocked at quite this level -- it does seem that there are fields in the task process table that flag it, though, so I'd guess that it does. It seems pretty natural to skip blocked processes without an actual CS in and out just to determine that they are blocked, since many processes spend a lot of time waiting on I/O that can take ms to unblock (e.g. non-cached disk I/O). So I'm not certain that having a large number of idle, blocked processes (waiting on I/O on a FD with a select, for example) is a problem with context switches per se. > something to serve files to 20,000 client machines over persistent TCP > connections and the network interface is going to be saturated, hell > yes, you should never use 20,000 threads for that, write the thing > event driven or you'll die. Here there are a couple of things. 
One is that 20K processes MIGHT take 20K context switches just to see if they are blocked on I/O. If they do, then you are definitely dead. 20K processes also at the very least require 20K entries in the kernel process table, and even looping over them in the scheduler to check for an I/O flag with no CS is going to start to take time and eat a rather large block of memory and maybe even thrash the cache. So I absolutely agree, 20K mostly-idle processes on a running system -- even a multicore with lots of memory -- is a bad idea even if they are NOT processing network requests. Fortunately, this is so obvious that I don't think anybody sane would ever try to do this. Second, 20K NON-persistent connections on an e.g. webserver would be absolute insanity, as it adds the thread creation/destruction overhead to the cost of processing the single-message-per-connection interrupts. It just wouldn't work, so people wouldn't do that. IIRC there were a few daemons that did that back in the 80's (when I was managing Suns) and there were rules of thumb on running them. "Don't" is the one I recall, at least if one had more than a handful of hosts connecting. Running 10-20 parallel daemons might work, and people do that -- httpd, nfsd. Running an event driven server daemon (or parallel/network application) would work, and people do that -- pvmd does that, I believe. Which one works the best? I'm perfectly happy to believe that the event driven server could manage roughly twice as many make/break single message connections as a pile of daemons, if the processes aren't bottlenecked somewhere other than at interrupt/context switches. If we assume that at CS takes order of 1-10 usec on a modern system, and it takes a SMALLER amount of time to do the processing associated with a request, then you'll get the advantage. If each request takes (say) order of 100 usec to handle, then you'll be bottlenecked at less than 10,000 requests per second anyway, and I don't think that you'd see any advantage at all, although this depends strongly on whether or not one can block all the daemons somewhere OTHER than the network. The question then is -- what kind of traffic is e.g. an NFS server or a mail server as opposed to a web server? NFS service requires (typically) at least an fstat per file, and may or may not require physical disk access with millisecond scale latencies. Caching reduces this by orders of magnitude, but some patterns of access (especially write access or a nasty mix of many small requests -- latency bound disk accesses) don't necessarily cache well. It is not at all clear, then, that an event driven NFS server would ultimately scale out better than a small pile of NFS daemons as the bottleneck could easily end up being the disk, not the context switch or interrupt burden associated with the network. Mail servers ditto, as they too are basically file servers, but ones for which caching is of little or no advantage. Event driven servers might get you the ability to support as much as a factor of two more connections without dying, but it is more likely that other bottlenecks would kill your performance at about the same number of connections either way. To bring the whole thing around OT again, a very reasonable question is what kind of application one is likely to encounter in parallel computing and which of the three models discussed (forking per connection, forking a pile of daemons to handle connections round robin, single server/daemon handling a table of FDs) is likely to be best. 
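For concreteness, the third of those models -- a single daemon holding a table of FDs and sitting in a select() loop -- can be sketched in a few dozen lines of C. This is an illustration only (an echo-style service on an arbitrary port, with error handling mostly omitted), not code from any of the daemons discussed in this thread:

  #include <string.h>
  #include <unistd.h>
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/select.h>
  #include <sys/socket.h>

  int main(void)
  {
      int listener, maxfd, fd, newfd, n;
      fd_set master, readable;
      struct sockaddr_in addr;
      char buf[4096];

      /* One listening socket; every client FD lives in the "master" set. */
      listener = socket(AF_INET, SOCK_STREAM, 0);
      memset(&addr, 0, sizeof(addr));
      addr.sin_family = AF_INET;
      addr.sin_addr.s_addr = htonl(INADDR_ANY);
      addr.sin_port = htons(8000);          /* arbitrary example port */
      bind(listener, (struct sockaddr *)&addr, sizeof(addr));
      listen(listener, 128);

      FD_ZERO(&master);
      FD_SET(listener, &master);
      maxfd = listener;

      for (;;) {
          readable = master;                /* select() modifies its argument */
          if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
              continue;
          for (fd = 0; fd <= maxfd; fd++) {
              if (!FD_ISSET(fd, &readable))
                  continue;
              if (fd == listener) {         /* new connection: add it to the table */
                  newfd = accept(listener, NULL, NULL);
                  if (newfd >= 0) {
                      FD_SET(newfd, &master);
                      if (newfd > maxfd)
                          maxfd = newfd;
                  }
              } else {                      /* data (or EOF) on an existing connection */
                  n = read(fd, buf, sizeof(buf));
                  if (n <= 0) {             /* client went away */
                      close(fd);
                      FD_CLR(fd, &master);
                  } else {
                      write(fd, buf, n);    /* the "service": echo the request back */
                  }
              }
          }
      }
  }

The point to notice is that an idle client costs one bit in an fd_set rather than a process, which is the per-connection overhead argument above in its simplest form.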
I'd argue that in the case of parallel computing it is ALMOST completely irrelevant -- all three would work well. If one starts up a single e.g. pvmd or lamd, which forks off connected parallel applications on request, then typically there will a) only be roughly 1 such fork per core per system, because the system will run maximally efficiently if it can just keep streaming memory streaming in and out of L1 and L2; b) they will have a long lifetime, so the cost of the fork itself is irrelevant -- a ms out of hours to days of computing; c) internally the applications are already written to be event driven, in the sense that they maintain their own tables of FDs and manage I/O either at the level of the toplevel daemons (who then provide it as streams to the applications) or within the applications themselves via library calls and structures. I THINK PVM is more the former model and MPI the latter, but there are many MPIs. For other associated cluster stuff -- a scheduler daemon, an information daemon such as xmlsysd in wulfstat -- forking vs non-forking for persistent connections (ones likely to last longer than minutes) is likely to be irrelevantly different. Again, pay the ms to create the fork, pay 6 interrupt/context switches instead of 4 or 5 per requested service with a marginal cost of maybe 10 usec, and unless one is absolutely hammering the daemon and the work done by the daemon has absolutely terrible granularity (so it is only DOING order of 10 or 100 usec of work per return) it is pretty ignorable, especially on a system that is PRESUMABLY spending 99% of its time computing and the daemon is basically handling out of band task monitoring or control services. > It is all about the right tool for the job. Apps that are all about > massive concurrent communication need events. Apps that are about very > little concurrent communication probably don't need them. Absolutely, but do they need libevents, or do they simply need to be sensibly written to manage a table of fds and selects or nonblocking polls? I've grabbed the source for libevents and am looking through it, but again, it seems to me that it is limited to using the systems calls the kernel provides to handle I/O on open FDs, and if so the main reason to use a library rather than the calls directly is ease of coding, not necessarily efficiency. Usually the code would be more efficient if you did the same thing(s) inline, would it not? The one thing I completely agree with is that one absolutely must remain aware of the high cost of creating and destroying threads/processes. Forking is expensive, and forking to handle a high-volume stream of transient connections is dumb. So dumb that it doesn't work, so nobody does this, I think. At least, not for long. > More the former, not the latter. Event driven programming typically > uses registered callbacks that are triggered by a central "Event Loop" > when events happen. In such a system, one never blocks for anything -- > all activity is performed in callbacks, and one simply returns from a > callback if one can't proceed further. The programming paradigm is > quite alien to most people. Fair enough, because most people don't write heavily parallel applications (which includes applications with many parallel I/O streams, not just HPC). But people who do fairly quickly learn to work out the scaling and overhead, do they not, at least "well enough" to achieve some level of performance? Otherwise the applications just fail to work and people don't use them. 
Evolution in action...;-) This has been a very informative discussion so far, at least for me. Even if my estimates above are all completely out of line and ignore some key thing, all that means is I'll learn even more. The one thing that I wish were written with some sort of internal scheduler/kernel and event mechanism from the beginning is X. It has its own event loop, but event-driven callbacks all block -- there is no internal task parallelism. It is a complete PITA to write an X application that runs a thread continuously but doesn't block the operation of the GUI -- one has to handle state information, use a separate thread, or invert all sorts of things from the normal X paradigm of click and callback. That is, most X apps are written to be serial and X itself is designed to support serial operation, but INTERESTING X apps are parallel, where the UI-linked I/O channels have to be processed "independently" within the X event loop while a separate thread is doing a task loop of interesting work. AFAIK, X only supports its own internal event loop and has horrible kludges to get the illusion of task parallelism unless one just forks a separate thread for the running "work" process and establishes shared state stuctures and so on so that the UI callbacks can safely affect work going on in the work-loop thread without blocking it. > I'd read the libevent man page to get a vague introduction. There doesn't seem to be one in the source tarball I downloaded. Only event.3 and evdns.3, neither of which are terribly informative. In fact, the documentation sucks. There is more space on the website devoted to pictures of good vs terrible scaling with/without libevent than there is documentation of how it works or how to use it, and of course it is difficult to know if the figures are straw men or fair comparisons. There are a few chunks of sample code in samples. I'll take a look and see what I can see when I have time. I'm working on an X-based GUI-controlled application and do have a forking daemon (xmlsysd) that so far seems to work fine at the level of traffic it was designed for and is likely bottlenecked someplace other than CSs long before they become a problem, but this conversation has convinced me that I could rewrite at least the latter in a way that is more efficient even if I do leave it forking per connection (or using xinetd, as it usually does now:-). It is a monitoring daemon, and is fairly lightweight now because one doesn't want to spend resources watching a cluster spend resources. If I redesigned it along lines suggested by the analysis above, I could permit it to manage many more connections with one part of its work accomplished with roughly constant overhead, where now the overhead associated with that work scales linearly with the number of connections. rgb > > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From landman at scalableinformatics.com Thu Jul 3 09:49:00 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> Message-ID: <486D02FC.9040109@scalableinformatics.com> Joshua Baker-LePain wrote: > On Thu, 3 Jul 2008 at 9:34am, Tim Cutts wrote >> On 2 Jul 2008, at 4:22 pm, Prentice Bisbal wrote: > >>> B. Red Hat has done such a good job of spreading FUD about the other >>> Linux distros, management has a cow if you tell them you're installing >>> something other than RH. > > Erm, do you have any examples of that? All I see is RH a) trying to > sell their product (nothing wrong with that) and b) in general, being a > pretty good member of the OSS community. The only bad things I have seen RH do are their refusal to support good file systems (as the size of disks hit 2GB at the end of the year, this is going to bite them harder than it is now), and some of the choices they have made in their kernel. Other than that, they have been a good OSS citizen. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From kus at free.net Thu Jul 3 10:01:43 2008 From: kus at free.net (Mikhail Kuzminsky) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] MPI: over OFED and over IBGD In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F013422B0@mtiexch01.mti.com> Message-ID: In message from "Gilad Shainer" (Thu, 3 Jul 2008 09:41:01 -0700): >Mikhail Kuzminsky wrote: > >> Is there some MPI realization/versions which may be installed >> one some nodes - to work over Mellanox IBGD 1.8.0 (Gold >> Distribution) IB stack and on other nodes - for work w/OFED-1.2 ? >IBGD is out of date, and AFAIK none of the latest versions of the >various MPI were tested against it. It's clear, but I didn't ask about *LATEST* MPI versions ;-) >I would recommend to update the >install to OFED from IBGD, and if you need some help let me know. Thanks you very much for your help ! > If you >must keep it Yes. There is some russian romance w/the words : "You can't understand, you can't understand, you can't understand my sorrows" :-)) >, than MVAPICH 0.9.6 might work. Eh, I used 0.9.5 and 0.9.9 :-) Now will see mvapich archives. Thanks ! Mikhail > >Gilad. > >_______________________________________________ >Beowulf mailing list, Beowulf@beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf From mark.kosmowski at gmail.com Thu Jul 3 10:13:12 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: energy costs and poor grad students Message-ID: > Prentice Bisbal wrote: > > Mark Kosmowski wrote: > > > > > >> I think I have come to a compromise that can keep me in business. > >> Until I have a better understanding of the software and am ready for > >> production runs, I'll stick to a small system that can be run on one > >> node and leave the other two powered down. 
I've also applied for an > >> adjunt instructor position at a local college for some extra cash and > >> good experience. When I'm ready for production runs I can either just > >> bite the bullet and pay the electricity bill or seek computer time > >> elsewhere. > >> > > > > Mark, > > > > For MPI testing/debugging, you can create a few virtual machine on one > > node using VWware or Xen. VMWare is free, unless you want all the bells > > and whistles. > > > > You don't need to go this far. Just set up the hostfile to use the same > host name several times. Just make sure you don't start swapping :) > > Jeff > My problem is RAM. I'm using stable codes and not doing much programming of my own, other than to tweak output formats to suit my needs. I've come up with some solutions. First, I'll spend some time this weekend moving files around and physically swapping DIMMs (I'm gonna have sore thumbs again :( ) to get one machine with somewhere between 8 and 16 Gb. After I do the file transferring I can then run just one workstation with a big amount of RAM. This amount of RAM should keep me in business for even most of my production runs until I get to a certain size of system to be studied. Next, tomorrow I am going to install some laminate flooring for my parents and will endeavor to extort a new laptop out of them - mine no longer communicated to the LCD screen - tearing it apart to see what is wrong was going to be my first unemployment project but the went and extended my employment contract another 6 months. Step three - time to upgrade the entertainment machine to a 64-bit dual core system from the 32-bit ancient chip it has. I'll try to get this to 6 or 8 Gb RAM - then if needed I can use it as a half-node as needed to supplement the workstation without firing up more machines. This will leave two machines powered down and hopefully half the computing power usage. Maybe it would be a better idea to just buy more RAM for the two HDAMA opteron systems and use one of those as a part-time entertainment / vmware windows machine by just getting a PCI-X video card (I'm asking on the OpenSUSE forum whether the HDAMA PCI-X slots will run a PCI-X video card - feel free to comment on this here too). I'll make a RAM inventory this weekend and post the results. If I can get one of these systems to 12 - 16 Gb and the other to 8 - 12 Gb, this may be the best choice. Time to learn vmware and wine. Right now, my CoW (cluster of workstations from RGB's book) uses the oldest 64-bit machine as a "head node". This machine is the slowest and only has 4 DIMM slots, one of which I'm having difficulties with, so this machine is definitely going down. The other two nodes use an HDAMA mother board, each with 8 DIMM slots - I have 2 Gb and 1 Gb DIMMs on hand. I'm thinking that perhaps the best thing to do is just physically move the data drive to the machine slated to be a full-time calculator (the one with the most RAM) and then fix paths as needed. Someone suggested downclocking - would downclocking a step or two in BIOS find a sweet spot as far as speed per energy unit similar to driving in the 45 - 55 mile per hour range? I hope no one is too upset I'm doing this planning on list. 
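To make Jeff's hostfile suggestion above concrete: for MPI testing on a single box, you just list the same hostname several times in the machine file and launch as usual -- e.g. a file "machines" containing the host's name on four lines and, depending on the MPI, something like "mpiexec -np 4 -machinefile machines ./ring" (the exact flag differs between MPICH/MVAPICH and Open MPI, and the file and program names here are only examples). Any small test code will do; a minimal ring sketch:

  /* ring.c -- minimal sketch for exercising MPI point-to-point traffic
     on one box; intended to be run with at least two ranks. */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank, size, token, next, prev;
      MPI_Status status;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      next = (rank + 1) % size;
      prev = (rank + size - 1) % size;

      if (rank == 0) {
          token = 42;                                   /* arbitrary payload */
          MPI_Send(&token, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
          MPI_Recv(&token, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, &status);
          printf("token made it around %d ranks\n", size);
      } else {
          MPI_Recv(&token, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, &status);
          MPI_Send(&token, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
      }

      MPI_Finalize();
      return 0;
  }

As long as the ranks plus the working set fit in RAM, this exercises the MPI plumbing without powering up the other nodes -- and without swapping, per Jeff's caveat.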
From Bogdan.Costescu at iwr.uni-heidelberg.de Thu Jul 3 10:13:33 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: dealing with lots of sockets In-Reply-To: <87mykz4vyo.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> <87mykz4vyo.fsf@snark.cb.piermont.com> Message-ID: On Wed, 2 Jul 2008, Perry E. Metzger wrote: > Event driven programming typically uses registered callbacks that > are triggered by a central "Event Loop" when events happen. In such > a system, one never blocks for anything -- all activity is performed > in callbacks, and one simply returns from a callback if one can't > proceed further. And here is one of the problems that event driven programming can't really solve: separation between the central event loop and the code to run when events happen. fork() allows the newly created process to proceed at its own will and possibly doing its own mistakes (like buffer overflows) in its own address space - the parent process is not affected in any way and this allows f.e. daemons to run their core loop with administrative priviledges while the real work can be done as a dumb user. -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From shaeffer at neuralscape.com Thu Jul 3 10:21:04 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> Message-ID: <20080703172104.GC21836@synapse.neuralscape.com> On Thu, Jul 03, 2008 at 12:09:49PM -0400, Joshua Baker-LePain wrote: > On Thu, 3 Jul 2008 at 9:34am, Tim Cutts wrote > >On 2 Jul 2008, at 4:22 pm, Prentice Bisbal wrote: > > >>B. Red Hat has done such a good job of spreading FUD about the other > >>Linux distros, management has a cow if you tell them you're installing > >>something other than RH. > > Erm, do you have any examples of that? All I see is RH a) trying to sell > their product (nothing wrong with that) and b) in general, being a pretty > good member of the OSS community. > > >Fedora did not exist, and I'm still not sure how separate from Red Hat > >Fedora and CentOS really are, but that's probably just my ignorance. The > >fact that > > CentOS is in no way officially associated with Red Hat. At all. They use > the freely available RHEL SRPMs to build the distribution, and they report > bugs upstream when they find them. But that's it. > > As for Fedora, see . Hi, Red Hat is indeed an exemplary member of the OSS community. They never violate licenses. What they do do, is they take every advantage they can to differentiate themselves. This results in very unfriendly distributions to modify and customize -- Red Hat wants service contracts that are not really compatible with such activity anyway. Its by design. And quite effective. Fedora Core is dominated by Red Hat employees. 
It is for all intents and purposes a hybrid distribution that is semi open to the public. It is definitely a beta distribution for RHEL. CentOS is not related to Red Hat in any way. These folks just use the GPL to produce a free clone of RHEL releases. And I might add, it is very well maintained. I highly recommend it to folks who are going to modify and customize RHEL, because your RHEL service contract won't permit that anyway. If you are interested in such issues, you might want to pay attention to the recent and ongoing discussion about systemtap (A Red Hat managed project.) And the consternation of other folks in the OSS community at the difficulty in working with the project independent of Red Hat. In this case, we are talking about the kernel development community. Just follow the thread from here: https://lists.linux-foundation.org/pipermail/ksummit-2008-discuss/2008-June/000149.html Actually a very interesting thread, dealing with more than systemtap. Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From rgb at phy.duke.edu Thu Jul 3 10:45:44 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: On Thu, 3 Jul 2008, Bogdan Costescu wrote: > On Wed, 2 Jul 2008, Robert G. Brown wrote: > >> if you try to start up a second daemon on that port you'll get a EADDRINUSE >> on the bind > > While we talk about theoretical possibilities, this statement is not always > true. You could specify something else than INADDR_ANY here: > >> serverINETaddress.sin_addr.s_addr = htonl(INADDR_ANY); /* Accept all */ > > or bind it to a specific network interface (SO_BINDTODEVICE). Then you can > bind a second daemon to the same port, but with a different (and again not > INADDR_ANY) local address or network interface. Many daemons can do this > nowadays (named, ntpd, etc.). Sure. I meant on a single wire, single IP number (and included sample code), not that you couldn't offer network services on more than one network from a single machine. Ultimately, raw networking is really difficult, and I'll freely admit that even though I've WRITTEN some network apps, I'm far from expert. I code with Stevens in one hand and examples in the other, typing with my nose, and pray. So any of y'all that have written a lot of networking code will have direct experience of edges I have not yet explored. This is one reason that people use PVM and MPI and so on. It can be argued (and has been argued on this list IIRC) that raw networking code will always result in a faster parallel program, all things being equal, because encapsulating it in higher level abstractions always comes at a cost (even though in many practical cases the people who wrote those abstractions were better parallel coders than the person trying to write the code anyway, so things are not equal, so the resulting code will be MORE efficient than what one would get unless one worked really hard and learned all the tricks used in the library to where one could go beyond them). It is pretty easy to write a single task server. There is template code for it. 
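As an illustration of such a template -- and of Bogdan's point quoted above -- here is a minimal sketch of a single-task (iterative) server that binds one specific local address rather than INADDR_ANY; the address and port are placeholders and most error handling is omitted:

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/socket.h>

  int main(void)
  {
      int listener, conn;
      struct sockaddr_in addr;
      char buf[1024];
      ssize_t n;

      listener = socket(AF_INET, SOCK_STREAM, 0);

      memset(&addr, 0, sizeof(addr));
      addr.sin_family = AF_INET;
      addr.sin_port = htons(5000);                        /* placeholder port */
      /* Bind one specific local address instead of INADDR_ANY; a second
         daemon can then bind the same port on a different local address. */
      inet_pton(AF_INET, "192.168.1.10", &addr.sin_addr);  /* placeholder address */

      if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
          perror("bind");                                  /* EADDRINUSE lands here */
          return 1;
      }
      listen(listener, 16);

      /* Single-task (iterative) server: handle one connection at a time. */
      for (;;) {
          conn = accept(listener, NULL, NULL);
          if (conn < 0)
              continue;
          while ((n = read(conn, buf, sizeof(buf))) > 0)
              write(conn, buf, n);                         /* echo the data back */
          close(conn);
      }
  }

Start a second copy with a different address in the inet_pton() call and both bind() calls succeed on the same port; start it with the same address and the second one fails with the EADDRINUSE mentioned above.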
It isn't horribly difficult to write a forking server. There is template code for it. As you go up in complexity and expected load beyond where these will work, you have to learn, and you will find it harder and harder to find good, simple, templated code to start your project with. Such is life. And it isn't easy to learn. Few people teach it because few people know it. There aren't a lot of good books on it that I know of (outside of Stevens). One has to learn the hardest of ways; on the job, by doing, by making mistakes. I certainly have learned the modest bit that I know on my own, and haven't had any need to go beyond it (yet) to the next level. And if God is good to me, I will never have to design a 20 Kconnection webserver, and can die in peace in my state of relative ignorance...;-) rgb > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From perry at piermont.com Thu Jul 3 10:59:06 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: dealing with lots of sockets In-Reply-To: (Bogdan Costescu's message of "Thu\, 3 Jul 2008 19\:13\:33 +0200 \(CEST\)") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> <87mykz4vyo.fsf@snark.cb.piermont.com> Message-ID: <8763rmn91x.fsf@snark.cb.piermont.com> Bogdan Costescu writes: > On Wed, 2 Jul 2008, Perry E. Metzger wrote: > >> Event driven programming typically uses registered callbacks that >> are triggered by a central "Event Loop" when events happen. In such >> a system, one never blocks for anything -- all activity is performed >> in callbacks, and one simply returns from a callback if one can't >> proceed further. > > And here is one of the problems that event driven programming can't > really solve: separation between the central event loop and the code > to run when events happen. I don't understand what you mean. > fork() allows the newly created process to proceed at its own will > and possibly doing its own mistakes (like buffer overflows) in its > own address space - the parent process is not affected in any way > and this allows f.e. daemons to run their core loop with > administrative priviledges while the real work can be done as a dumb > user. Oh, that's not an issue at all. For example, say you wanted to run an SMTP daemon as a pure event app but you don't want it to run as root. So, you're screwed because you can't open port 25 as a normal user, right? Well, you can either change privs after opening 25, or you can use fd passing to pass open file descriptors between a small rootly process and the mail processing event driven process. Anyway, yah, bugs are a problem. If you have a bug in an event driven system you bring down 10,000 connections at once instead of 1. You do indeed have to be confident your code doesn't suck. -- Perry E. Metzger perry@piermont.com From perry at piermont.com Thu Jul 3 11:01:07 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486C6660.5070705@aei.mpg.de> (Carsten Aulbert's message of "Thu\, 03 Jul 2008 07\:40\:48 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> Message-ID: <87wsk2lue4.fsf@snark.cb.piermont.com> Carsten Aulbert writes: > A solution proposed by the nfs guys is pretty simple: > > Change the values of > /proc/sys/sunrpc/{min,max}_resvport > appropriately. But they don't know which ceiling will be next. But we > will test it. What about my kernel patch to use unprived ports? Did you try it? Perry From jlb17 at duke.edu Thu Jul 3 11:09:00 2008 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <486D02FC.9040109@scalableinformatics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> <486D02FC.9040109@scalableinformatics.com> Message-ID: On Thu, 3 Jul 2008 at 12:49pm, Joe Landman wrote > The only bad things I have seen RH do are their refusal to support good file > systems (as the size of disks hit 2GB at the end of the year, this is going > to bite them harder than it is now), and some of the choices they have made > in their kernel. Other than that, they have been a good OSS citizen. I definitely agree that some of their kernel decisions are... odd. I'm wondering, though, why you think 2TB drives in specific are going to bite them harder than the 1TB models out now. ext3 goes up to 16TB these days, and 2 marketing TB < 2TiB, so you'll still be able to boot off those drives. -- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF From prentice at ias.edu Thu Jul 3 11:09:27 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> Message-ID: <486D15D7.40006@ias.edu> Joshua Baker-LePain wrote: > On Thu, 3 Jul 2008 at 9:34am, Tim Cutts wrote >> On 2 Jul 2008, at 4:22 pm, Prentice Bisbal wrote: > >>> B. Red Hat has done such a good job of spreading FUD about the other >>> Linux distros, management has a cow if you tell them you're installing >>> something other than RH. > > Erm, do you have any examples of that? All I see is RH a) trying to > sell their product (nothing wrong with that) and b) in general, being a > pretty good member of the OSS community. You say tomato, I say, uhhh, tomato. > >> Fedora did not exist, and I'm still not sure how separate from Red Hat >> Fedora and CentOS really are, but that's probably just my ignorance. >> The fact that > > CentOS is in no way officially associated with Red Hat. At all. They > use the freely available RHEL SRPMs to build the distribution, and they > report bugs upstream when they find them. But that's it. True, but CentOS is irrefutably tied to RH. 
If RHEL disappears, so will CentOS, -- Prentice From prentice at ias.edu Thu Jul 3 11:13:46 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: <486CFF55.7080705@charter.net> References: <486CDCBC.8030706@ias.edu> <486CE194.2060503@charter.net> <486CE8F0.7010602@ias.edu> <486CFF55.7080705@charter.net> Message-ID: <486D16DA.90202@ias.edu> Jeffrey B. Layton wrote: > Prentice Bisbal wrote: >>> You don't need to go this far. Just set up the hostfile to use the same >>> host name several times. Just make sure you don't start swapping :) >>> >>> Jeff >>> >>> >> >> Unless the problem is configuring interhost communications correctly. >> > > Then how does using VM's fix this problem? I'm not sure I understand you > comment. > I was thinking like a SysAdmin, not a developer. I've had plenty of experiences where nodes aren't communicating b/c someone hosed up the machines file, .rhosts file, and other stuff like that. I guess it's not really relevant if your focusing only on developing an MPI app, not not administering a cluster. -- Prentice PS - I just realized who I was talking to. Thanks for the articles on parallel filesystems. Very good. I read them all. From prentice at ias.edu Thu Jul 3 11:22:11 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <486B9D4E.80405@ias.edu> <05A873CF-6D66-4B3B-9E63-B74B7D36D10B@sanger.ac.uk> Message-ID: <486D18D3.6090207@ias.edu> Tim Cutts wrote: > > On 3 Jul 2008, at 5:09 pm, Joshua Baker-LePain wrote: > >> On Thu, 3 Jul 2008 at 9:34am, Tim Cutts wrote >>> On 2 Jul 2008, at 4:22 pm, Prentice Bisbal wrote: >> >>>> B. Red Hat has done such a good job of spreading FUD about the other >>>> Linux distros, management has a cow if you tell them you're installing >>>> something other than RH. >> >> Erm, do you have any examples of that? All I see is RH a) trying to >> sell their product (nothing wrong with that) and b) in general, being >> a pretty good member of the OSS community. > > I agree - I've never seen FUD from Red Hat, but then I don't have much > to do with them. > A) I remember reading propaganda (you might call it advertising) saying that RHEL was the only Linux stable and robust enough for enterprise applications. I don't keep all my old Linux Journals or archives of www.redhat.com, so I can't provide concrete examples. Surely someone else must have read stuff like that. Anyone? Anyone... Beuller? I started on Red Hat, and still use RH and CentOS and other RH derivatives, I'm not a fanatic of some other distro (or any distro, for that matter) B) Is making a name for yourself by giving away Red Hat Linux, then suddenly pulling it off the market and trying to force people to pay hundreds of dollars for your "enterprise" version being a good member? I know they give away the SRPMS (because they HAVE to), but have you ever tried rebuilding them on your own? It's not trivial. And Fedora, well... it has it's detractors. 
-- Prentice From lindahl at pbm.com Thu Jul 3 13:40:13 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <486CD643.1050904@ias.edu> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> Message-ID: <20080703204012.GA29534@bx9.net> On Thu, Jul 03, 2008 at 09:38:11AM -0400, Prentice Bisbal wrote: > Here's another reason to use tarballs: I have /usr/local shared to all > my systems with with NFS. If want to install the lastest version of > firefox, you can just do this: FWIW, the "rpm way" to do this is (ok, there's more than one way): * throw the rpm into your local repo, run createrepo * pdsh yum -y update Given that 99% of your software is RPMs, having 1% different can be a pain. As long as you can get rpms, of course. And you can avoid yet another NFS filesystem -- I have none at my company, which reduces the monitoring and fixing that I need to do. I'll also note that properly-configured local perl packages are not installed in a place where RPMs smash them (/usr/lib/perl5/vendor_perl vs. /usr/lib/perl5/site_perl). And you can find many perl rpms at rpmforge and atrpms, with accurate dependency info. -- greg From lindahl at pbm.com Thu Jul 3 13:50:09 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: References: <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> Message-ID: <20080703205008.GB29534@bx9.net> On Thu, Jul 03, 2008 at 01:45:44PM -0400, Robert G. Brown wrote: > This is one reason that people use PVM and MPI and so on. It can be > argued (and has been argued on this list IIRC) that raw networking code > will always result in a faster parallel program, all things being equal, > because encapsulating it in higher level abstractions always comes at a > cost Yes, but there is plenty of proof that MPI can be extremely low overhead -- see InfiniPath MPI. Now if you insist on using TCP for your MPI and your "bare metal" code, you may beat your MPI because it's not so good at TCP. But it's probably cheaper to fix the MPI than re-invent the wheel many times. There's plenty of good code you can study for writing good network servers -- the original irc server is a pretty good event-driven non-blocking program. strace is your friend, too, that's the way you can see that MPICH-1's TCP driver isn't so hot. -- greg From gdjacobs at gmail.com Thu Jul 3 13:55:14 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: dealing with lots of sockets In-Reply-To: References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <486B7AB9.9050202@aei.mpg.de> <87hcb8b98w.fsf@snark.cb.piermont.com> <87vdzo9rgd.fsf@snark.cb.piermont.com> <87skus874a.fsf_-_@snark.cb.piermont.com> <87mykz4vyo.fsf@snark.cb.piermont.com> Message-ID: <486D3CB2.1090903@gmail.com> Bogdan Costescu wrote: > On Wed, 2 Jul 2008, Perry E. Metzger wrote: > >> Event driven programming typically uses registered callbacks that are >> triggered by a central "Event Loop" when events happen. 
In such a >> system, one never blocks for anything -- all activity is performed in >> callbacks, and one simply returns from a callback if one can't proceed >> further. > > And here is one of the problems that event driven programming can't > really solve: separation between the central event loop and the code to > run when events happen. fork() allows the newly created process to > proceed at its own will and possibly doing its own mistakes (like buffer > overflows) in its own address space - the parent process is not affected > in any way and this allows f.e. daemons to run their core loop with > administrative priviledges while the real work can be done as a dumb user. Which is the reason why djbdns, qmail and postfix do things the way they do. Sendmail X will be going this way. From rgb at phy.duke.edu Thu Jul 3 14:06:03 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <20080703204012.GA29534@bx9.net> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <20080703204012.GA29534@bx9.net> Message-ID: On Thu, 3 Jul 2008, Greg Lindahl wrote: > On Thu, Jul 03, 2008 at 09:38:11AM -0400, Prentice Bisbal wrote: > >> Here's another reason to use tarballs: I have /usr/local shared to all >> my systems with with NFS. If want to install the lastest version of >> firefox, you can just do this: > > FWIW, the "rpm way" to do this is (ok, there's more than one way): > > * throw the rpm into your local repo, run createrepo > * pdsh yum -y update > > Given that 99% of your software is RPMs, having 1% different can be a > pain. As long as you can get rpms, of course. And you can avoid yet > another NFS filesystem -- I have none at my company, which reduces > the monitoring and fixing that I need to do. ...and it can break the hell out of the elaborate dependency system if you go installing random libraries in e.g. /usr/local, or worse, overwrite an installed rpm in /usr with a different version. Entropy is a serious enemy to scalable sysadmin. The point of package management is to avoid it, and stay on the thin edge of optimally scalable LAN administration. > I'll also note that properly-configured local perl packages are not > installed in a place where RPMs smash them (/usr/lib/perl5/vendor_perl > vs. /usr/lib/perl5/site_perl). And you can find many perl rpms at > rpmforge and atrpms, with accurate dependency info. We TRY to install "everything" in rpm format. It is pretty easy to wrap up even scripts and third party stuff as an rpm, and the extra work is repaid N times over when you drop the rpm into a repo and it is installed/updated N times automagically. rgb > > -- greg > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From raysonlogin at gmail.com Thu Jul 3 09:57:29 2008 From: raysonlogin at gmail.com (Rayson Ho) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: Message-ID: <73a01bf20807030957m2c1f6c6dm4869317395dc2a06@mail.gmail.com> The whole Big Buck Bunny movie was rendering on a Grid Engine cluster (aka. network.com). Big Buck Bunny is open content, and the software used to create the film is opensource. http://en.wikipedia.org/wiki/Big_Buck_Bunny http://www.bigbuckbunny.org/ Rayson On Tue, Jul 1, 2008 at 5:39 AM, Jon Aquilina wrote: > does anyone know of any rendering software that will work with a cluster? > > -- > Jonathan Aquilina > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > From toon at moene.indiv.nluug.nl Tue Jul 1 12:38:36 2008 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Re: "hobbyists" still OT In-Reply-To: <1953527533.79861214451795299.JavaMail.root@zimbra.vpac.org> References: <1953527533.79861214451795299.JavaMail.root@zimbra.vpac.org> Message-ID: <486A87BC.8040202@moene.indiv.nluug.nl> Chris Samuel wrote: > ----- "Prentice Bisbal" wrote: > >> The United States alone produces enough grain to feed the entire >> world. > > It is probably worth pointing out that, as a recent > New Scientist article mentioned, a major part for the > rise in grain prices is due the rising demand for meat > from around the world. One marketeer to another: "New is an old word; find a new word for it." > This is, of course, a very inefficient conversion method > of solar power into human food. The 70s called - they want their argument back. Sigh, I am too old for this world. -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html From toon at moene.indiv.nluug.nl Tue Jul 1 13:48:34 2008 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Wed Nov 25 01:07:22 2009 Subject: Commodity supercomputing, was: Re: NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point? In-Reply-To: <486A1F7B.9080408@tamu.edu> References: <1210016466.4924.1.camel@Vigor13> <48551E70.7070507@scalableinformatics.com> <4AF41375-3A13-4691-A2A1-D5B853FEC3A4@xs4all.nl> <20080615154227.u8fwdpn08ww4c40k@webmail.jpl.nasa.gov> <6.2.5.6.2.20080616084554.02e4dd18@jpl.nasa.gov> <486923D6.8070907@moene.indiv.nluug.nl> <1214864562.6912.29.camel@Vigor13> <486A1F7B.9080408@tamu.edu> Message-ID: <486A9822.7000902@moene.indiv.nluug.nl> Gerry Creager wrote: > In the US, at least for academic institutions and hobbyists, surface and > upper air observations of the sort you describe are generally available > for incorporation into models for data assimilation. Models are > generally forced and bounded using model data from other atmospheric > models, also available. As I understand it from colleagues in Europe, > getting similar data over there is more problemmatical. Exactly ! 
And what happens in Europe is that companies take the freely available US data, use it to compete with US companies, and disregard the (meteorological superior) ECMWF data, because it is not free. A colleague of mine held some very unpopular talks in Reading, England, about this (according to his figures, 99 % of the meteorological data used in Europe originates from the US). -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html From gregory.warnes at rochester.edu Tue Jul 1 17:50:10 2008 From: gregory.warnes at rochester.edu (Gregory Warnes) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <20080701193721.B6843B5404D@mx2.its.rochester.edu> Message-ID: On 7/1/08 3:25PM , "Mark Hahn" wrote: >> > Hmmm.... for me, its all about the kernel. Thats 90+% of the battle. Some >> > distros use good kernels, some do not. I won't mention who I think is in >> the >> > latter category. > > I was hoping for some discussion of concrete issues. for instance, > I have the impression debian uses something other than sysvinit - > does that work out well? > Debian uses standard sysvinit-style scripts in /etc/init.d, /etc/rc0.d, ... > is it a problem getting commercial > packages (pathscale/pgi/intel compilers, gaussian, etc) to run? > I?ve never had any major problems. Most linux vendors supply both RPM?s and .tar.gz installers, and I generally have better luck with the latter, even on RPM based systems anyway. > > the couple debian people I know tend to have more ideological motives > (which I do NOT impugn, except that I am personally more swayed by > practical, concrete reasons.) > My ?conversion? to use of Debian had little to do with ideological motives, and a lot more to do with minimizing the amount of time I had to take away from my research to support the Linux clusters I was maintaining at the time. Side note, one very nice thing about debian is the ability to upgrade a system in-place from one O/S release to another via apt-get dist-upgrade Much nicer than reinstalling the O/S as seems to be (used to be?) the norm with RPM-based systems -Greg > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Gregory R. Warnes, Ph.D Program Director Center for Computational Arts, Sciences, and Engineering University of Rochester Tel: 585-273-2794 Fax: 585-276-2097 Email: gregory.warnes@rochester.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080701/140e0e18/attachment.html From gregory.warnes at rochester.edu Tue Jul 1 22:29:39 2008 From: gregory.warnes at rochester.edu (Gregory Warnes) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <20080702050726.81FAA7343C2@mx6.its.rochester.edu> Message-ID: On 7/2/08 1:06AM , "Mark Hahn" wrote: >> > I?ve never had any major problems. Most linux vendors supply both RPM?s >> and >> > .tar.gz installers, and I generally have better luck with the latter, even >> > on RPM based systems anyway. > > interesting - I wonder why. the main difference would be that the rpm > format encodes dependencies... 
> The basic problem is that when folks build the .tar.gz files, they usually do a good job of explaining the dependencies and how to resolve them, while the equivalent RPM installer simply lists the dependencies with no hints about what packages are needed and where to get them. > > >>> >> the couple debian people I know tend to have more ideological motives >>> >> (which I do NOT impugn, except that I am personally more swayed by >>> >> practical, concrete reasons.) >>> >> >> > My 'conversion' to use of Debian had little to do with ideological motives, >> > and a lot more to do with minimizing the amount of time I had to take away >> > from my research to support the Linux clusters I was maintaining at the >> > time. > > again interesting, thanks. what sorts of things in rpm-based distros > consumed your time? > Well, a key component was obtaining, installing, and keeping open-source software components of the system up to date. Most other tasks were pretty equivalent, although things are organized somewhat differently between linux variants. In addition to automatically resolving dependencies for new packages, it keeps track of the dependencies of existing packages. Thus, if one asks for package X that depends on library Y version N, but library Y version M is what is currently installed (and other packages depend on it), the conflict is flagged up front rather than silently breaking those packages. >> > Side note, one very nice thing about debian is the ability to upgrade a >> > system in-place from one O/S release to another via >> > >> > apt-get dist-upgrade >> > >> > Much nicer than reinstalling the O/S as seems to be (used to be?) the norm >> > with RPM-based systems > > I've done major version upgrades using rpm, admittedly in the pre-fedora > days. it _is_ a nice capability - I'm a little surprised desktop-oriented > distros don't emphasize it... > One fundamental difference in philosophy explains both the fundamental differences between RPM and debian packages, and the reason for the lack of emphasis of in-place upgrades of desktop distros: vendor income. It is not in Red Hat's financial interest to make it easy to upgrade a system in-place by an automated tool. They make money by selling new O/S versions. Consequently, Red Hat explicitly designed the RPM format to discourage in-place upgrades. The debian community, on the other hand, was and is run fundamentally by system administrators, whose best interest centers around minimizing the amount of time necessary to keep systems up to date. Consequently, debian's package system was designed explicitly to make installation and updating of packages as painless as possible for the admin. Of course, other pressures have forced deviations from these fundamental viewpoints, but the patterns are still clearly visible. -Greg -- Gregory R. Warnes, Ph.D Program Director Center for Computational Arts, Sciences, and Engineering University of Rochester Tel: 585-273-2794 Fax: 585-276-2097 Email: gregory.warnes@rochester.edu -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080702/ba67c773/attachment.html From steffen.grunewald at aei.mpg.de Wed Jul 2 00:01:13 2008 From: steffen.grunewald at aei.mpg.de (Steffen Grunewald) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87fxqtuzh8.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> Message-ID: <20080702070113.GU11428@casco.aei.mpg.de> On Tue, Jul 01, 2008 at 04:21:55PM -0400, Perry E. Metzger wrote: > > Henning Fehrmann writes: > >> Thus, your problem sounds rather odd. There is no obvious reason you > >> should be limited to 360 connections. > >> > >> Perhaps your problem is not what you think it is at all. Could you > >> explain it in more detail? > > > > I guess it has also something to do with the automounter. I am not able > > to increase this number. > > But even if the automounter would handle more we need to be able to > > use higher ports: > > netstat shows always ports below 1024. > > > > tcp 0 0 client:941 server:nfs > > > > We need to mount up to 1400 nfs exports. > > All NFS clients are connecting to a single port, not to a different > port for every NFS export. You do not need 1400 listening TCP ports on > a server to export 1400 different file systems. Only one port is > needed, whether you are exporting one file system or one million, just > as only one SMTP port is needed whether you are receiving mail from > one client or from one million. That's true for the server side, but not for the client side. Each client- server connection uses another (privileged) port *on the client* which is where the problem shows up. This particular setup comprises 1400 cluster nodes which all act as distributed storage. Files would be spread over all of them, and an application would sequentially access files (time series) which are located on different servers. (Call it NUSA, non-uniform storage architecture.) I guess it's time to go ahead and try a real cluster filesystem, or wait for NFS v4.1 to settle down. I understand that with several tens of TB a re-organisation of all data into a completely new tree would be tricky if not impossible. OTOH such things like glusterfs allow for building cluster fs's without moving data - gluster would just add a set of additional layers ("translators") on top of already existing physical fs's. I have followed glusterfs development for more than a year now, and while they are still working on their redundancy features, it should be useable for "quasi read-only" access. (Note that the underlying fs would be still accessible, for feeding data in; clients could have r/o access to the glusterfs namespace.) Version 1.4 is to be out in a couple of days. See www.gluster.org BTW, since I'm facing the same issue on a somewhat smaller scale, any other suggestion is appreciated. 
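One workaround that gets used for exactly this situation is to let the NFS client bind outside the privileged range and tell the servers to accept that; this is only a sketch (the paths, the network and the exact upper bound are assumptions, and not every kernel allows values above 1023):

   # on each client: widen the port range the RPC layer may bind to
   sysctl -w sunrpc.min_resvport=665
   sysctl -w sunrpc.max_resvport=65535    # default is 1023

   # on each server: accept requests from source ports >= 1024
   # /etc/exports
   /export/data  192.168.0.0/255.255.255.0(ro,insecure)

   exportfs -ra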
Cheers, Steffen (same institute, different location :) -- Steffen Grunewald * MPI Grav.Phys.(AEI) * Am M?hlenberg 1, D-14476 Potsdam Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/ * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298} No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html From steffen.grunewald at aei.mpg.de Wed Jul 2 00:27:09 2008 From: steffen.grunewald at aei.mpg.de (Steffen Grunewald) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] A press release In-Reply-To: <486A6760.5010006@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> Message-ID: <20080702072709.GX11428@casco.aei.mpg.de> On Tue, Jul 01, 2008 at 01:20:32PM -0400, Prentice Bisbal wrote: > And the Debian users can say the same thing about Red Hat users. Or SUSE > users. And if any still exist, the Slackware users could say the same > thing about the both of them. But then the Slackware users could also > point out that the first Linux distro was Slackware, so they are using > the one true Linux distro... Which isn't true. Don't you remember MCC Interim Linux, back in the old days of 0.95[abc] kernels? It didn't consist of tens of floppies (yet), but it *was* a distro. > If you want to have a religious war about which distro to use, go > somewhere else. I'm sure there are plenty of mailing lists and > newsgroups where I'm sure that happens every day. :-) > This is a mailing list about beowulf clusters, and the last time I > checked, you can create clusters using any Linux distribution you like, > or even non-Linux operating systems, such as IRIX, Solaris, etc. Even Windows... (duck & run) Steffen From softy.lofty.ilp at btinternet.com Wed Jul 2 02:20:09 2008 From: softy.lofty.ilp at btinternet.com (Ian Pascoe) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] Small Distributed Clusters In-Reply-To: Message-ID: Hi all, Firstly before getting into the nitty gritty of my question, a bit of background. Myself and a friend are looking to set up initially two small clusters of 4 boxes each, using old surplus commodity hardware. The main purpose of the cluster is to hold data and perform calculations upon it - the data coming in from external sources. So far we've decided on a Ubuntu Server base with NFS linking the nodes together, and we're looking currently at how to perform the calculations - ie write our own software or adapt existing. However, the question I have relates to linking the two clusters together. For the majority of the time, they will be run automonously, but on occasions we believe they'll need to be run as a cohesive unit with jobs being passed between them, because we don't plan to duplicate the data across the clusters, but back up locally. Both will be connected to the Internet using ADSL and the limitation will be the upload speed of a maximum of 512Kbs. How would people suggest linking the two clusters together using a secure connection? Performance at this point is not in the equation, just the ability to securely connect. BTW Any thoughts too on a SQL Server that would cope well in this scenario? 
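One low-tech way to get the secure-link part (a sketch only; host names, ports and the choice of MySQL are assumptions) is to run the traffic over an SSH tunnel between the two head nodes, using keys for password-less authentication:

   # forward a local port to the other cluster's MySQL server
   ssh -f -N -L 3307:127.0.0.1:3306 user@cluster2.example.org

   # then talk to the remote database as though it were local
   mysql --host=127.0.0.1 --port=3307 -u dbuser -p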
Thanks Ian From greg.byshenk at aoes.com Wed Jul 2 05:56:25 2008 From: greg.byshenk at aoes.com (Greg Byshenk) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <87wsk4ed20.fsf@snark.cb.piermont.com> References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> <87y74lfabq.fsf@snark.cb.piermont.com> <87wsk4ed20.fsf@snark.cb.piermont.com> Message-ID: <20080702125625.GE47386@gby2.aoes.com> On Wed, Jul 02, 2008 at 07:32:55AM -0400, Perry E. Metzger wrote: > "Jon Aquilina" writes: > > if i use blender how nicely does it work in a cluster? > I believe it works quite well. The "Helmer" minicluster uses blender, and appears to perform well. Also, Maya's 'muster' engine runs under Linux, and quite successfully. We use it in a mixed environment, where the render pool consists of both Windows workstations and Linux cluster nodes. Note, though, that like other commercial 3D products, Maya is expensive, and may not be suitable for a student project. -- Greg Byshenk From vernard at venger.net Wed Jul 2 07:57:43 2008 From: vernard at venger.net (Vernard Martin) Date: Wed Nov 25 01:07:22 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: References: <7B82EE52C0DC4879AD6AE8FCA937C63C@geoffPC> <3CB66E9F377C4961B5457896137EAD1B@geoffPC> <486A4296.4050501@hope.edu> Message-ID: <486B9767.1090205@venger.net> Jon Aquilina wrote: > my idea is more of for my thesis. if i am goign ot do anything like > this. vernard thanks for the link. whats it like in a cluster > environment? Ah. you are doing it for thesis work. Then money is probably very much a limiting factor. If you just need a renderer then you can try the PVMPOV which is a PVM enabled version of POVray (located at http://pvmpov.sourceforge.net) its a ray-tracer which is very computationally intensive but has one of the largest communities for help out there. It also has support for doing animation. There are many free animations that you can use to test it and possibly for your thesis as long as you give attribution there. There is also PVMegPOV and MPI-Povray as well although i'm not as personally familiar with them. Vernard Martin vcmarti@sph.emory.edu From eagles051387 at gmail.com Thu Jul 3 22:51:45 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] software for compatible with a cluster In-Reply-To: <73a01bf20807030957m2c1f6c6dm4869317395dc2a06@mail.gmail.com> References: <73a01bf20807030957m2c1f6c6dm4869317395dc2a06@mail.gmail.com> Message-ID: kool ill have to try this stuff out on my old laptop which is now my testing machine. On Thu, Jul 3, 2008 at 6:57 PM, Rayson Ho wrote: > The whole Big Buck Bunny movie was rendering on a Grid Engine cluster > (aka. network.com). Big Buck Bunny is open content, and the software > used to create the film is opensource. > > http://en.wikipedia.org/wiki/Big_Buck_Bunny > http://www.bigbuckbunny.org/ > > Rayson > > > > On Tue, Jul 1, 2008 at 5:39 AM, Jon Aquilina > wrote: > > does anyone know of any rendering software that will work with a cluster? 
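Since rendering is embarrassingly parallel, the usual approach is simply to split an animation by frame range and let each node chew on its own subset; a rough sketch with plain POV-Ray (the scene name, resolution and frame counts are invented):

   # node 1 renders frames 1-250 of a 1000-frame animation
   povray +Iscene.pov +W640 +H480 +KFI1 +KFF1000 +SF1 +EF250

   # node 2 takes +SF251 +EF500, and so on; the finished frames are
   # collected afterwards and assembled into the final movie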
> > > > -- > > Jonathan Aquilina > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080704/fd094b5f/attachment.html From eagles051387 at gmail.com Thu Jul 3 23:58:31 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: References: <200806281735.m5SHZ8vS025843@bluewest.scyld.com> Message-ID: this is slightly off topic but im just wondering why spend thousands of dollars when u can just setup another server and backup everything to a raided hard drive array? On 7/2/08, Steve Cousins wrote: > > > > Just under 60MB/sec seems to be the maximum tape transport read/write >> limit. Pretty reliably the first write from the beginning of tape was a >> bit slower than writes started further into the tape. >> > > I believe LTO-3 is rated at 80 MB/sec without compression. Testing it on > our HP unit in an Overland library I get: > > WRITE: > > dd if=/dev/zero of=/dev/nst0 bs=512k count=10k > 10240+0 records in > 10240+0 records out > 5368709120 bytes (5.4 GB) copied, 71.8723 seconds, 74.7 MB/s > > READ: > > dd of=/dev/null if=/dev/nst0 bs=512k count=10k > 10240+0 records in > 10240+0 records out > 5368709120 bytes (5.4 GB) copied, 69.2487 seconds, 77.5 MB/s > > I used a 512K block size because that is what I use with our backups and it > has given optimal performance since the DLT-7000 days. > > Good luck, > > Steve > ______________________________________________________________________ > Steve Cousins, Ocean Modeling Group Email: cousins@umit.maine.edu > Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu > Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080704/a7d510a8/attachment.html From geoff at galitz.org Fri Jul 4 00:10:36 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: References: <200806281735.m5SHZ8vS025843@bluewest.scyld.com> Message-ID: <2BADCF250DCC4A6CAB307A8E4B0A4C32@geoffPC> Backing up to tape allows you to go back to a specific point in history. Particularly useful if you need to recover a file that has become corrupted or you need to rollback to a specific stage and you are unaware of that fact for a few days. Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org _____ From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Jon Aquilina Sent: Freitag, 4. Juli 2008 08:59 To: cousins@umit.maine.edu Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Re: OT: LTO Ultrium (3) throughput? 
this is slightly off topic but im just wondering why spend thousands of dollars when u can just setup another server and backup everything to a raided hard drive array? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080704/8349f6ff/attachment.html From geoff at galitz.org Fri Jul 4 00:23:04 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: References: <20080701193721.B6843B5404D@mx2.its.rochester.edu> Message-ID: <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> Just a nit: Most RPM based distros allow in-place upgrades between minor point releases using "yum update" or "yum upgrade" (they follow different rules on how to resolve obsolete packages). However, moving between major releases is still recommended via a CD or other non-in-place media, though there are people that have done it in-place you seriously risk inflicting harm to your system in this manner. Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org _____ From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Gregory Warnes Sent: Mittwoch, 2. Juli 2008 02:50 To: Mark Hahn Cc: Beowulf Subject: Re: [Beowulf] A press release [stuff snipped] Side note, one very nice thing about debian is the ability to upgrade a system in-place from one O/S release to another via apt-get dist-upgrade Much nicer than reinstalling the O/S as seems to be (used to be?) the norm with RPM-based systems -Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080704/24e26468/attachment.html From carsten.aulbert at aei.mpg.de Fri Jul 4 00:26:04 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87wsk2lue4.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> Message-ID: <486DD08C.9070602@aei.mpg.de> Hi Perry, Perry E. Metzger wrote: > What about my kernel patch to use unprived ports? Did you try it? No sorry, this approach with just setting the limits seems much easier than installing 1300 new kernels ;) Sorry Carsten PS: With the new limits it *just* works. -- Dr. Carsten Aulbert - Max Planck Institut f?r Gravitationsphysik Callinstra?e 38, 30167 Hannover, Germany Fon: +49 511 762 17185, Fax: +49 511 762 17193 http://www.top500.org/system/9234 | http://www.top500.org/connfam/6/list/31 From eagles051387 at gmail.com Fri Jul 4 00:28:35 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <2BADCF250DCC4A6CAB307A8E4B0A4C32@geoffPC> References: <200806281735.m5SHZ8vS025843@bluewest.scyld.com> <2BADCF250DCC4A6CAB307A8E4B0A4C32@geoffPC> Message-ID: would it be possible to back up to tape as well as raided hdd array? On 7/4/08, Geoff Galitz wrote: > > Backing up to tape allows you to go back to a specific point in history. 
> Particularly useful if you need to recover a file that has become corrupted > or you need to rollback to a specific stage and you are unaware of that fact > for a few days. > > > > > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > ------------------------------ > > *From:* beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] *On > Behalf Of *Jon Aquilina > *Sent:* Freitag, 4. Juli 2008 08:59 > *To:* cousins@umit.maine.edu > *Cc:* beowulf@beowulf.org > *Subject:* Re: [Beowulf] Re: OT: LTO Ultrium (3) throughput? > > > > this is slightly off topic but im just wondering why spend thousands of > dollars when u can just setup another server and backup everything to a > raided hard drive array? > > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080704/d97a1b98/attachment.html From tjrc at sanger.ac.uk Fri Jul 4 01:17:17 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> References: <20080701193721.B6843B5404D@mx2.its.rochester.edu> <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> Message-ID: <2C7CE2AF-E2DB-444E-8F91-559062363FF7@sanger.ac.uk> On 4 Jul 2008, at 8:23 am, Geoff Galitz wrote: > > > Just a nit: > > > > Most RPM based distros allow in-place upgrades between minor point > releases > using "yum update" or "yum upgrade" (they follow different rules on > how to > resolve obsolete packages). However, moving between major releases > is still > recommended via a CD or other non-in-place media, though there are > people > that have done it in-place you seriously risk inflicting harm to > your system > in this manner. > But that's the whole point - why is that the case? It shouldn't be. If upgrading packages wrecks the system, then the package installation scripts are broken. They should spot the upgrade in progress and take appropriate action, depending on the previously installed version. This can be quite a detailed process for Debian packages, which is probably why they have fewer problems than Red Hat in this regard. See http://www.debian.org/doc/debian-policy/ch-maintainerscripts.html#s-mscriptsinstact if you're interested in how it works for Debian packages. OB clustering: For cluster nodes, we never do dist-upgrades, though. A reinstall from scratch is actually faster, so in the context of this list, the ability to upgrade in place isn't terribly important. A FAI install of a basic debian 4.0 image on a cluster node takes about two minutes, so there's not much point in going through the upgrade process, which takes considerably longer. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From ajt at rri.sari.ac.uk Fri Jul 4 04:10:26 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Small Distributed Clusters In-Reply-To: References: Message-ID: <486E0522.2090600@rri.sari.ac.uk> Ian Pascoe wrote: > Hi all, > > Firstly before getting into the nitty gritty of my question, a bit of > background. > > Myself and a friend are looking to set up initially two small clusters of 4 > boxes each, using old surplus commodity hardware. 
The main purpose of the > cluster is to hold data and perform calculations upon it - the data coming > in from external sources. > > So far we've decided on a Ubuntu Server base with NFS linking the nodes > together, and we're looking currently at how to perform the calculations - > ie write our own software or adapt existing. Hello, Ian. Sharing files locally via NFS using UDP is fine, but although you can do NFS via TCP it's not recommended because it's an insecure protocol. You can tunnel it, but you might as well use "sshfs", which is what I do. > However, the question I have relates to linking the two clusters together. > For the majority of the time, they will be run automonously, but on > occasions we believe they'll need to be run as a cohesive unit with jobs > being passed between them, because we don't plan to duplicate the data > across the clusters, but back up locally. I suggest you have a look at "dsh" (Dancer's distributed shell) as a simple way to run programs across local and geographically separate nodes in your cluster. This is very simple, but works remarkably well, especially if you use SSH keys for password-less authentication. > Both will be connected to the Internet using ADSL and the limitation will be > the upload speed of a maximum of 512Kbs. Another issue, apart from the 'A' (Assymetric speed) if you're ADSL is that of setting up your routers to permit incoming connections on port 22, and having static IP addresses. This is straight forward, but does need to be done before your clusters can communicate. > How would people suggest linking the two clusters together using a secure > connection? Performance at this point is not in the equation, just the > ability to securely connect. I suggest linking them together via "sshfs" and "ssh/dsh", because doing things like SSI (Single System Image) or MPI (message Passing Interface) over ADSL will require many ports to be open/tunneled and will be slow. > BTW Any thoughts too on a SQL Server that would cope well in this scenario? I've tunneled MySQL via SSH and it works fine. You would be unwise to expose the ports used by SQL Server or MySQL to the Internet because they are very insecure. Of course, you could always use openVPN :-( Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From ajt at rri.sari.ac.uk Fri Jul 4 04:44:41 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] /usr/local over NFS is okay, Joe In-Reply-To: <486CEF27.8090507@scalableinformatics.com> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <486CE03D.40901@scalableinformatics.com> <486CE86B.90104@ias.edu> <486CEF27.8090507@scalableinformatics.com> Message-ID: <486E0D29.8060605@rri.sari.ac.uk> Joe Landman wrote: [...] > Yeah, it is ambiguous to a degree, but I figure that something named > /local is actually going to be physically local. It helps tremendously > when a user calls up with a problem, say that they can't see a file they > placed in /local/... on all nodes. Usually they get quiet for a moment > after saying that aloud, and then say "oh, never mind". :) > [...] Hello, Joe. Looks like some SunOS/Solaris veterans (like me) showing their colours here! 
Sun, as in the network is the computer, developed quite a lot of very good strategies for sharing files via NFS on diskless/dataless clients and this has been inherited in different ways by Linux distributions. In particular, Sun went out of their way to move a lot of things from /bin into /usr/bin precisely so it could be shared by NFS. I also agree with the widely used convention that '/usr/local' means local to the site, not the particular machine. It seems intuitively obvious to me that /local (i.e. in the root filesystem) is intended to be both local to a specific machine, and local to the site, whereas /usr/local is local to the site and may be shared via NFS but is not required to be. I have a bit of a problem with /opt, which is where 'optional' software is supposed to be installed. In the same way, that it is intuitively obvious to me that /opt is where optional software is installed on a specific machine, and /usr/opt may be shared via NFS but is not required to be shared. However, I have rather contradicted myself and done this on all our servers: /usr/local -> /opt/local I did this so I could use the /opt as a mount point in an NFS automounter map: It's not possible to automount /usr/local on /usr because, if you do, you hide the rest of /usr unless you use e.g. "unionfs" and that's a bit too much like hard work for me! Another reason I did this is to keep /usr/local out of the 'system' hierarchy, which makes upgrades easier because you don't need to worry about overwriting /usr/local during an upgrade installation. One thing that I value from my BSD/SunOS/Solaris days is /export, which is where ALL shared (exported) filesystems should be placed on NFS servers. I'm a real supporter of Debian/Ubuntu, but it drives me bonkers that Debian policy is to put home directories in /home. I put them in: /export/home And use /home as a mount point in an automounter map. This way machines can, in the well known BSD/Sun inspired way, share home directories: /home/hostname/username -> hostname:/export/home/username On a stand-alone host, I make a symbolic link: /home -> /export/home If, in future, this host needs to share home directories and mount other host's home directories, I then remove the symbolic link, install the automounter and use /home as the NFS mount point in the automounter map. Naturally, I don't always practice what I preach and recently I've been trying to work out to use the automounter the 'Debian' way ;-) So far I've not come up with anything that beats using /export/home! Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From rgb at phy.duke.edu Fri Jul 4 05:13:06 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: References: Message-ID: On Wed, 2 Jul 2008, Gregory Warnes wrote: >> interesting - I wonder why. the main difference would be that the rpm >> format encodes dependencies... >> > The basic problem is that when folks build the .tar.gz files, they usually > do a good job of explaining the dependencies and how to resolve them, while > the equivalent RPM installer simply lists the dependencies with no hints > about what packages are needed and where to get them. Unless your RPM installer is yum, in which case it all simply works (or is no more trouble than it ever is to build a package so that it will simply work). 
> On fundimental difference in philospohy explains both the fundimental > differences between RPM and debian packages, and the reason for the lack of > emphasis of in-place upgrades of desktop distros: vendor income. It is not > in Red Hat?s financial interest to make it easy to upgrade a system in-place > by an automated tool. They make money by selling new O/S versions. > Consequently, Red Hat explicitly designed the RPM format to discourage > in-place upgrades. ???? Having been around when the founders (who live down the street, so to speak:-) gave talks at some of the old linux expos and on campus and so on, and recalling the early RH books and free distribution system, I think that this last statement is just nonsense. They didn't design it to discourage in place upgrades or encourage it -- they designed it to facilitate in place updates and the creation of a consistent and tested collection of packages, one that could be automatically installed. Kickstart rocked, and continues to rock. Dependency resolution for a la carte package installation sucked, and I do mean sucked, with RPMs until first yellow dog invented yup, and then Seth took over yup, hit a wall of sorts, and transmogrified it into yum. Yum, OTOH, rocks. You want a package, you say yum install package. How hard is that? rgb > > The debian community, on the other hand, was and is run fundimentally by > system administrators, whose best interest centers around minimizing the > amount of time necessary to keep systems up to date. Consequently, debian?s > package system was designed explicitly to make installation and updating of > packages as painless as possible for the admin. > > Of course, other pressures have forced deviations from these fundimental > viewpoints, but the patterns are still clearly visible. > > -Greg > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From tjrc at sanger.ac.uk Fri Jul 4 05:40:30 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] /usr/local over NFS is okay, Joe In-Reply-To: <486E0D29.8060605@rri.sari.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <486CE03D.40901@scalableinformatics.com> <486CE86B.90104@ias.edu> <486CEF27.8090507@scalableinformatics.com> <486E0D29.8060605@rri.sari.ac.uk> Message-ID: <2B343F81-02F2-4285-997E-C94500D23FEB@sanger.ac.uk> On 4 Jul 2008, at 12:44 pm, Tony Travis wrote: > One thing that I value from my BSD/SunOS/Solaris days is /export, > which is where ALL shared (exported) filesystems should be placed on > NFS servers. I'm a real supporter of Debian/Ubuntu, but it drives me > bonkers that Debian policy is to put home directories in /home. I > put them in: > > /export/home If you want a Debian system to do that, just: sed -i -e 's:^DHOME=/home:DHOME=/export/home:' /etc/adduser.conf Job done. All users created after that will be in /export/home > Naturally, I don't always practice what I preach and recently I've > been trying to work out to use the automounter the 'Debian' way ;-) There is no automounter "Debian way", at least not in my view, and I maintain one of the automounter packages for them. :-) You're free to do whatever you like. 
am-utils does have an example configuration it can set up, but the package does not assume you're using it that way, and makes no demands on what you have automounted and where. I have two automount intercept points on my machines; /nfs for home directories and general data directories, and /software for the sort of common software that we've been discussing here. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From tjrc at sanger.ac.uk Fri Jul 4 05:45:18 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Small Distributed Clusters In-Reply-To: <486E0522.2090600@rri.sari.ac.uk> References: <486E0522.2090600@rri.sari.ac.uk> Message-ID: On 4 Jul 2008, at 12:10 pm, Tony Travis wrote: > I suggest you have a look at "dsh" (Dancer's distributed shell) as a > simple way to run programs across local and geographically separate > nodes in your cluster. This is very simple, but works remarkably > well, especially if you use SSH keys for password-less authentication. dsh is very good, I agree, as is clusterssh. They have overlapping, but distinct, purposes. clusterssh gives you multiple xterms but with a single input window so you type into all xterms simultaneously (although you can still put the focus in an individual xterm, and just type commands to that one machine alone). clusterssh is useful for those tasks which are more interactive. dsh is better for rapid parallel running of non-interactive commands on very large numbers of machines... clusterssh gets a little unwieldy with more than 30 or so machines at a time, even if you set the xterm font to eye-wateringly small and have a monitor the size of a football pitch. Both are available as packages in Ubuntu/Debian. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From ajt at rri.sari.ac.uk Fri Jul 4 06:05:15 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] /usr/local over NFS is okay, Joe In-Reply-To: <2B343F81-02F2-4285-997E-C94500D23FEB@sanger.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <486CE03D.40901@scalableinformatics.com> <486CE86B.90104@ias.edu> <486CEF27.8090507@scalableinformatics.com> <486E0D29.8060605@rri.sari.ac.uk> <2B343F81-02F2-4285-997E-C94500D23FEB@sanger.ac.uk> Message-ID: <486E200B.4060403@rri.sari.ac.uk> Tim Cutts wrote: > > On 4 Jul 2008, at 12:44 pm, Tony Travis wrote: > >> One thing that I value from my BSD/SunOS/Solaris days is /export, >> which is where ALL shared (exported) filesystems should be placed on >> NFS servers. I'm a real supporter of Debian/Ubuntu, but it drives me >> bonkers that Debian policy is to put home directories in /home. I put >> them in: >> >> /export/home > > If you want a Debian system to do that, just: > > sed -i -e 's:^DHOME=/home:DHOME=/export/home:' /etc/adduser.conf > > Job done. All users created after that will be in /export/home Hello, Tim. Thanks for the jungle tip, which I already know about... 
My point was that if you want peer2peer sharing of home directories in the timeless tradition of 4.xBSD and SunOS/Solaris, you need to do more than just decide that home directories should go in /export/home instead of /home. The convention I adopt is that *anything* exported via NFS goes in: /export However, this is not specified in the LFH (although "/export/usr" does appear in an example: http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/ The reason I'm posting this is to remind everyone that Sun worked out some good strategies for doing this sort of thing already and it's not a bad idea to put what you export/share via NFS into /export. >> Naturally, I don't always practice what I preach and recently I've >> been trying to work out to use the automounter the 'Debian' way ;-) > > There is no automounter "Debian way", at least not in my view, and I > maintain one of the automounter packages for them. :-) You're free to > do whatever you like. am-utils does have an example configuration it > can set up, but the package does not assume you're using it that way, > and makes no demands on what you have automounted and where. I have two > automount intercept points on my machines; /nfs for home directories and > general data directories, and /software for the sort of common software > that we've been discussing here. Yes, of course, we are free to put anything anywhere we want in Linux, but if you want other people to understand your conventions without long explanations then BSD/Sun have already set a pretty good example of how to go about it using /export and mount points like /home in automount maps. I'm just a little surprised that the LFH doesn't mention it :-) Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From eagles051387 at gmail.com Fri Jul 4 06:47:31 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> References: <20080701193721.B6843B5404D@mx2.its.rochester.edu> <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> Message-ID: that also applies to the k/ubuntu as well it used to be you can edit the source list and do a complete dist upgrade. now that has change and requires the alternate installation cd. the first way was not worth the time because it broke stuff more then it was worth. the time spent on that could have been used for a totally clean install. the new method i have yet to try but from what i gather the wiki on it nereds some improvements since its rather ambiguous as to how to go about upgrading On 7/4/08, Geoff Galitz wrote: > > > > Just a nit: > > > > Most RPM based distros allow in-place upgrades between minor point releases > using "yum update" or "yum upgrade" (they follow different rules on how to > resolve obsolete packages). However, moving between major releases is still > recommended via a CD or other non-in-place media, though there are people > that have done it in-place you seriously risk inflicting harm to your system > in this manner. > > > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > ------------------------------ > > *From:* beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] *On > Behalf Of *Gregory Warnes > *Sent:* Mittwoch, 2. 
Juli 2008 02:50 > *To:* Mark Hahn > *Cc:* Beowulf > *Subject:* Re: [Beowulf] A press release > > > > [stuff snipped] > > > > Side note, one very nice thing about debian is the ability to upgrade a > system in-place from one O/S release to another via > > apt-get dist-upgrade > > Much nicer than reinstalling the O/S as seems to be (used to be?) the norm > with RPM-based systems > > -Greg > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080704/a848ba32/attachment.html From perry at piermont.com Fri Jul 4 06:54:39 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486DD08C.9070602@aei.mpg.de> (Carsten Aulbert's message of "Fri\, 04 Jul 2008 09\:26\:04 +0200") References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> <486DD08C.9070602@aei.mpg.de> Message-ID: <87zloxiwkg.fsf@snark.cb.piermont.com> Carsten Aulbert writes: >> What about my kernel patch to use unprived ports? Did you try it? > > No sorry, this approach with just setting the limits seems much easier > than installing 1300 new kernels ;) Testing would be done with one machine. It would be foolish to test such a thing on your production network. What if it crashed everything in sight? Once you know you want something, though, you should be able to install it quickly. If installing kernels on your entire cluster is difficult, you are not managing you cluster properly. What if you really need to install new kernels? You should be able to replace arbitrary software on thousands of machines with a single easy command. If you can't, you aren't spending enough time on system automation. Perry From carsten.aulbert at aei.mpg.de Fri Jul 4 07:11:51 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <87zloxiwkg.fsf@snark.cb.piermont.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> <486DD08C.9070602@aei.mpg.de> <87zloxiwkg.fsf@snark.cb.piermont.com> Message-ID: <486E2FA7.4070100@aei.mpg.de> Perry E. Metzger wrote: > Testing would be done with one machine. It would be foolish to test > such a thing on your production network. What if it crashed everything > in sight? > Sure, testing always needs to start at the count of 1, then 2, 10, .... > Once you know you want something, though, you should be able to > install it quickly. 
If installing kernels on your entire cluster is > difficult, you are not managing you cluster properly. What if you > really need to install new kernels? > Did it yesterday, if I had pressed it it would have been done within ~10-15 minutes (along with other updates) > You should be able to replace arbitrary software on thousands of > machines with a single easy command. If you can't, you aren't spending > enough time on system automation. easy with dsh and fai softupdate :) Cheers Carsten From landman at scalableinformatics.com Fri Jul 4 07:41:26 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486E2FA7.4070100@aei.mpg.de> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> <486DD08C.9070602@aei.mpg.de> <87zloxiwkg.fsf@snark.cb.piermont.com> <486E2FA7.4070100@aei.mpg.de> Message-ID: <486E3696.7030309@scalableinformatics.com> Carsten Aulbert wrote: > easy with dsh and fai softupdate :) trivial with pdsh pdsh apt-get install package or pdsh yum install package Clusters/systems of arbitrary size. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From tjrc at sanger.ac.uk Fri Jul 4 08:30:39 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <486E3696.7030309@scalableinformatics.com> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> <486DD08C.9070602@aei.mpg.de> <87zloxiwkg.fsf@snark.cb.piermont.com> <486E2FA7.4070100@aei.mpg.de> <486E3696.7030309@scalableinformatics.com> Message-ID: <7CCC45A5-FF25-41A8-BD10-E52A0DA9DAD5@sanger.ac.uk> On 4 Jul 2008, at 3:41 pm, Joe Landman wrote: > Carsten Aulbert wrote: > >> easy with dsh and fai softupdate :) > > trivial with pdsh > > pdsh apt-get install package Actually, that one could get you in a mess if the package is going to to ask questions. You might want to shut apt-get up. I actually use a small wrapper script for apt-get -- actually, I use aptitude these days, because its dependency handling is better -- which we call niagi: #!/bin/sh # # niagi - noninteractive aptitude install # DEBIAN_FRONTEND=noninteractive \ /usr/bin/aptitude -R -y \ -o Dpkg::Options::="--force-confdef" \ -o Dpkg::Options::="--force-confold" \ install "$@" This forces aptitude not to ask you what to do with the configuration files, if they've been locally modified, but also forces it to be conservative, always use your existing configuration file, if it's already present. Otherwise it configures the package defaults. 
Combine this little script with cfengine, dsh or whatever, and you have a winner. You can even use it to remove things, because aptitude install accepts suffixes to tell it to do other things. For example, say you wanted to replaced lprng with cups, you can do that in one fell swoop with: aptitude install lprng- cupsys (this is another reason I switched from apt-get to aptitude, although I'd caution against using aptitude like this in sarge - the version in etch and later is fine, but the old sarge one can bite occasionally) Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From landman at scalableinformatics.com Fri Jul 4 08:34:54 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <7CCC45A5-FF25-41A8-BD10-E52A0DA9DAD5@sanger.ac.uk> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> <486DD08C.9070602@aei.mpg.de> <87zloxiwkg.fsf@snark.cb.piermont.com> <486E2FA7.4070100@aei.mpg.de> <486E3696.7030309@scalableinformatics.com> <7CCC45A5-FF25-41A8-BD10-E52A0DA9DAD5@sanger.ac.uk> Message-ID: <486E431E.7040807@scalableinformatics.com> Tim Cutts wrote: > > On 4 Jul 2008, at 3:41 pm, Joe Landman wrote: > >> Carsten Aulbert wrote: >> >>> easy with dsh and fai softupdate :) >> >> trivial with pdsh >> >> pdsh apt-get install package > > Actually, that one could get you in a mess if the package is going to to > ask questions. You might want to shut apt-get up. I actually use a > small wrapper script for apt-get -- actually, I use aptitude these days, > because its dependency handling is better -- which we call niagi: oooohhhh apt-foo kung-fu ... wisdom received and greatly appreciated! 
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From carsten.aulbert at aei.mpg.de Fri Jul 4 09:53:36 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] automount on high ports In-Reply-To: <7CCC45A5-FF25-41A8-BD10-E52A0DA9DAD5@sanger.ac.uk> References: <20080701093643.GA17845@gretchen.aei.uni-hannover.de> <878wwlk65p.fsf@snark.cb.piermont.com> <20080701164747.GA15901@gretchen.aei.uni-hannover.de> <87fxqtuzh8.fsf@snark.cb.piermont.com> <486B2DC2.9010604@aei.mpg.de> <87d4lweagv.fsf@snark.cb.piermont.com> <20080702200448.GA17424@bx9.net> <87hcb76hpk.fsf@snark.cb.piermont.com> <20080702223714.GA5908@bx9.net> <486C6660.5070705@aei.mpg.de> <87wsk2lue4.fsf@snark.cb.piermont.com> <486DD08C.9070602@aei.mpg.de> <87zloxiwkg.fsf@snark.cb.piermont.com> <486E2FA7.4070100@aei.mpg.de> <486E3696.7030309@scalableinformatics.com> <7CCC45A5-FF25-41A8-BD10-E52A0DA9DAD5@sanger.ac.uk> Message-ID: <486E5590.7060401@aei.mpg.de> Hi Tim, Tim Cutts wrote: > >> trivial with pdsh >> >> pdsh apt-get install package > Well with dsh it's the same, but "our" way ensures that the nodes will have the exactly same set-up after a reinstallation ;) > DEBIAN_FRONTEND=noninteractive \ > /usr/bin/aptitude -R -y \ > -o Dpkg::Options::="--force-confdef" \ > -o Dpkg::Options::="--force-confold" \ > install "$@" > > This forces aptitude not to ask you what to do with the configuration > files, if they've been locally modified, but also forces it to be > conservative, always use your existing configuration file, if it's > already present. Otherwise it configures the package defaults. Combine > this little script with cfengine, dsh or whatever, and you have a winner. > Brutally you could also use "yes yes" ;) > You can even use it to remove things, because aptitude install accepts > suffixes to tell it to do other things. For example, say you wanted to > replaced lprng with cups, you can do that in one fell swoop with: > > aptitude install lprng- cupsys > > (this is another reason I switched from apt-get to aptitude, although > I'd caution against using aptitude like this in sarge - the version in > etch and later is fine, but the old sarge one can bite occasionally) yes, that's also my reason why I prefer aptitude over apt For completeness: http://www.informatik.uni-koeln.de/fai/ Cheers Carsten From rgb at phy.duke.edu Fri Jul 4 10:24:51 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> References: <20080701193721.B6843B5404D@mx2.its.rochester.edu> <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> Message-ID: On Fri, 4 Jul 2008, Geoff Galitz wrote: > > > Just a nit: > > > > Most RPM based distros allow in-place upgrades between minor point releases > using "yum update" or "yum upgrade" (they follow different rules on how to > resolve obsolete packages). However, moving between major releases is still > recommended via a CD or other non-in-place media, though there are people > that have done it in-place you seriously risk inflicting harm to your system > in this manner. Sure, although I've done it (actually, there are a LOT of people that have done it) and I've never heard of anybody actually screwing everything up. 
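For the record, the recipe people typically follow for that kind of in-place jump on Fedora-ish systems is roughly the sketch below (release numbers are placeholders, and the officially blessed route is still anaconda/kickstart):

   # install the release package for the next version
   rpm -Uvh fedora-release-9-*.noarch.rpm

   # then let yum drag the whole distribution forward in place
   yum clean all
   yum -y upgrade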
To some extent it depends on how the system was managed and how serious the changes are between major releases. If you installed a "standard" system and used only yum to install and update from a standard set of repos, then you have almost certainly avoided RPM hell and have a very high chance of succeeding with an upgrade, with of course some work likely to be required deciding what to do when packages disappear or major libraries move. That work is required for ANY system -- independent of packaging or manager -- when major libraries change and packages disappear and new tools appear. Installing from scratch simply ensures that the tools that are installed are consistent, but it still leaves one dealing with the your favorite one that has disappeared or the new one that you have to figure out or your favorite personal program that has to be rebuilt and maybe even hacked first to accomodate an new library interface. If, on the other hand, you installed your system, then built eighteen pieces of software on your own and installed them, overwriting libraries and configuration files that were installed from RPM, do a couple of rpm --force's, and manage in the process to move yourself deep into RPM hell, well, what is going to be able to safely upgrade that? I tend to reinstall upgrades most of the time instead of upgrade, but that's only because kickstart makes that so easy that it is actually faster AND safer than screwing around with a local upgrade, and sure, there is the possibility of trouble if you do it otherwise, and who likes trouble (even if you've never heard of anybody who has actually HAD trouble). rgb > > > > > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > > _____ > > From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On > Behalf Of Gregory Warnes > Sent: Mittwoch, 2. Juli 2008 02:50 > To: Mark Hahn > Cc: Beowulf > Subject: Re: [Beowulf] A press release > > > > [stuff snipped] > > > > Side note, one very nice thing about debian is the ability to upgrade a > system in-place from one O/S release to another via > > apt-get dist-upgrade > > Much nicer than reinstalling the O/S as seems to be (used to be?) the norm > with RPM-based systems > > -Greg > > > > > > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From lindahl at pbm.com Fri Jul 4 14:08:15 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: References: <20080702050726.81FAA7343C2@mx6.its.rochester.edu> Message-ID: <20080704210815.GA28048@bx9.net> On Wed, Jul 02, 2008 at 01:29:39AM -0400, Gregory Warnes wrote: > On fundimental difference in philospohy explains both the fundimental > differences between RPM and debian packages, and the reason for the lack of > emphasis of in-place upgrades of desktop distros: vendor income. It is not > in Red Hat?s financial interest to make it easy to upgrade a system in-place > by an automated tool. They make money by selling new O/S versions. > Consequently, Red Hat explicitly designed the RPM format to discourage > in-place upgrades. Please take off your tin hat. Red Hat sells by subscription, so, it doesn't matter which version of RHEL you are running, just the count of servers. 
See: https://www.redhat.com/apps/store/server/ and note that there are no version numbers mentioned. -- greg From gdjacobs at gmail.com Fri Jul 4 16:19:54 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: References: Message-ID: <486EB01A.5070209@gmail.com> Gregory Warnes wrote: > > > > On 7/1/08 3:25PM , "Mark Hahn" wrote: > > > Hmmm.... for me, its all about the kernel. Thats 90+% of the > battle. Some > > distros use good kernels, some do not. I won't mention who I think > is in the > > latter category. > > I was hoping for some discussion of concrete issues. for instance, > I have the impression debian uses something other than sysvinit - > does that work out well? > > Debian uses standard sysvinit-style scripts in /etc/init.d, /etc/rc0.d, ... > > is it a problem getting commercial > packages (pathscale/pgi/intel compilers, gaussian, etc) to run? > > I?ve never had any major problems. Most linux vendors supply both RPM?s > and .tar.gz installers, and I generally have better luck with the > latter, even on RPM based systems anyway. > > > the couple debian people I know tend to have more ideological motives > (which I do NOT impugn, except that I am personally more swayed by > practical, concrete reasons.) > > My ?conversion? to use of Debian had little to do with ideological > motives, and a lot more to do with minimizing the amount of time I had > to take away from my research to support the Linux clusters I was > maintaining at the time. > > Side note, one very nice thing about debian is the ability to upgrade a > system in-place from one O/S release to another via > > apt-get dist-upgrade > > Much nicer than reinstalling the O/S as seems to be (used to be?) the > norm with RPM-based systems > > -Greg I did in place upgrades for RH8 machines to RH9 using APT-RPM, back in the day. I'm not sure about Yum, as I just haven't had cause to use RH/FC in some time. -- Geoffrey D. Jacobs From csamuel at vpac.org Sat Jul 5 02:49:36 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <827522005.154971215251080315.JavaMail.root@zimbra.vpac.org> Message-ID: <152656872.154991215251376682.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > that also applies to the k/ubuntu as well it used to > be you can edit the source list and do a complete > dist upgrade. now that has change and requires the > alternate installation cd. Er, no it doesn't. https://help.ubuntu.com/community/HardyUpgrades/ The supported way for servers is with do-release-upgrade (from the update-manager-core package). cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Jul 5 02:54:36 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: Message-ID: <842288399.155021215251676766.JavaMail.root@zimbra.vpac.org> ----- "Robert G. Brown" wrote: > ...and it can break the hell out of the elaborate > dependency system if you go installing random libraries > in e.g. /usr/local Oh indeed, our current method is to use: /usr/local/$package/$version and then use Modules to let people set up the appropriate environment for them. 
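In day-to-day use that looks something like the following (the package name and version are just examples):

   # software lives in versioned trees, e.g.
   #   /usr/local/openmpi/1.2.6/{bin,lib,man}
   # with a matching modulefile, so users pick what they want per
   # login session or per job script:
   module avail
   module load openmpi/1.2.6
   module list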
We have also reduced our customisations of users init scripts to just adding: module load vpac and then having the vpac modules load what we recommend as a default environment. When we change those settings we put a new (dated) vpac module in and so let users go back should they so wish. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Jul 5 03:01:45 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <362688751.155051215251832582.JavaMail.root@zimbra.vpac.org> Message-ID: <2045038041.155071215252105917.JavaMail.root@zimbra.vpac.org> ----- "Joe Landman" wrote: > eeek!! something named local is shared??? No, /usr/local is local to the cluster, the compute nodes are just drones in the Borg collective. ;-) -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From eagles051387 at gmail.com Sat Jul 5 03:12:20 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <2045038041.155071215252105917.JavaMail.root@zimbra.vpac.org> References: <362688751.155051215251832582.JavaMail.root@zimbra.vpac.org> <2045038041.155071215252105917.JavaMail.root@zimbra.vpac.org> Message-ID: resistance is futile :p On Sat, Jul 5, 2008 at 12:01 PM, Chris Samuel wrote: > > ----- "Joe Landman" wrote: > > > eeek!! something named local is shared??? > > No, /usr/local is local to the cluster, the compute > nodes are just drones in the Borg collective. ;-) > > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080705/7c4a76c1/attachment.html From csamuel at vpac.org Sat Jul 5 03:30:50 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <1501882596.155101215252266373.JavaMail.root@zimbra.vpac.org> Message-ID: <1124767276.155141215253850248.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > one thing must not be forgotten though. in regards to pkging stuff for > the ubuntu variation once someone like you and me you upload it for > someone higher up on the chain to check and upload to the servers. so > basically someone is checking what someone else has packaged. Ubuntu has the concept of a Personal Package Archive (PPA) which will build x86 and AMD64 packages from a source package that you provide and builds a repo from them that you (and others) can apt-get from. https://help.launchpad.net/PPAQuickStart -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. 
Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Jul 5 03:34:52 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: Message-ID: <721042444.155171215254092247.JavaMail.root@zimbra.vpac.org> ----- "Mark Hahn" wrote: > >> I was hoping for some discussion of concrete issues. for > instance, > >> I have the impression debian uses something other than sysvinit - > >> does that work out well? > >> > > Debian uses standard sysvinit-style scripts in /etc/init.d, > /etc/rc0.d, ... > > thanks. I guess I was assuming that mainstream debian was like > ubuntu. Fedora has also adopted Upstart, and given that RHEL is said to be based off Fedora it'll be interesting to see whether this gets adopted there too.. http://fedoraproject.org/wiki/Features/Upstart cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Jul 5 03:36:45 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <486A8B33.7020600@scalableinformatics.com> Message-ID: <660319876.155201215254205322.JavaMail.root@zimbra.vpac.org> ----- "Joe Landman" wrote: > Yeah ... can't escape this. I like some of the elements of > Ubuntu/Debian better than I do RHEL (the network configuration > in Debian is IMO sane, while in RHEL/Centos/SuSE it is not). > There are some aspects that are worse (no /etc/profile.d ... > so I add that back in by hand ). Shouldn't be necessary on Ubuntu these days: chris@quad:~$ cat /etc/issue Ubuntu 8.04.1 \n \l chris@quad:~$ dlocate /etc/profile.d base-files: /etc/profile.d cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Jul 5 03:43:35 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Small Distributed Clusters In-Reply-To: Message-ID: <405946662.155231215254615483.JavaMail.root@zimbra.vpac.org> ----- "Tim Cutts" wrote: > clusterssh gets a little unwieldy with more than 30 or so > machines at a time, even if you set the xterm font to eye-wateringly > small and have a monitor the size of a football pitch. This was pretty much the conclusion of the folks who were using it to admin the 45 Linksys AP's running OpenWRT for LCA 2008, the auto-sized tiling of Windows was sub-optimal for their purposes (though it worked very well). cheers! Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Sat Jul 5 03:46:39 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: Message-ID: <2019408218.155261215254799082.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > would it be possible to back up to tape as well as raided hdd array? Of course, this has been a feature of various backup systems (free and proprietary) for many years. cheers! 
Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From john.hearns at streamline-computing.com Sat Jul 5 04:16:24 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <2BADCF250DCC4A6CAB307A8E4B0A4C32@geoffPC> References: <200806281735.m5SHZ8vS025843@bluewest.scyld.com> <2BADCF250DCC4A6CAB307A8E4B0A4C32@geoffPC> Message-ID: <1215256594.5035.3.camel@Vigor13> On Fri, 2008-07-04 at 09:10 +0200, Geoff Galitz wrote: > Backing up to tape allows you to go back to a specific point in > history. Particularly useful if you need to recover a file that has > become corrupted or you need to rollback to a specific stage and you > are unaware of that fact for a few days. dirvish allows you to do exactly this on a RAID based backup system: http://www.dirvish.org/ Dirvish has the concept of a "vault" which is defined to have a cerain lifetime (weeks, months, years...) You make backup copies to your vault - the smart thing being that any files which are unchanged since the last backup are links to the first copy of the file. So your vault size does not grow and grow endlessly. You can roll back to any given date. From csamuel at vpac.org Sat Jul 5 05:27:56 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <1215256594.5035.3.camel@Vigor13> Message-ID: <2128671214.155311215260876971.JavaMail.root@zimbra.vpac.org> ----- "John Hearns" wrote: > - the smart thing being that any files which are unchanged since the > last backup are links to the first copy of the file. So your vault > size does not grow and grow endlessly. You can roll back to any given > date. FWIW BackupPC claims to do the same, extending that to duplicate copies across multiple machines. Of course then you want to be sure that the single copy you have on disk doesn't go bad.. http://backuppc.sourceforge.net/info.html cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From eagles051387 at gmail.com Sat Jul 5 07:22:09 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <2128671214.155311215260876971.JavaMail.root@zimbra.vpac.org> References: <1215256594.5035.3.camel@Vigor13> <2128671214.155311215260876971.JavaMail.root@zimbra.vpac.org> Message-ID: what i dont understand is why someone would want to invest in something that is already quite expensive instead of using a method which is not expensive and in a way provides double redundency. On Sat, Jul 5, 2008 at 2:27 PM, Chris Samuel wrote: > > ----- "John Hearns" wrote: > > > - the smart thing being that any files which are unchanged since the > > last backup are links to the first copy of the file. So your vault > > size does not grow and grow endlessly. You can roll back to any given > > date. > > FWIW BackupPC claims to do the same, extending that to > duplicate copies across multiple machines. Of course > then you want to be sure that the single copy you have > on disk doesn't go bad.. 
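The hard-link trick that dirvish (and BackupPC) rely on can be sketched with plain rsync; the paths and hostname below are made up, and this is only a minimal illustration of the technique, not how either tool is actually configured:

    # today's snapshot is a full tree, but any file unchanged since
    # yesterday becomes a hard link into yesterday's snapshot, so it
    # costs no extra space
    rsync -a --delete \
          --link-dest=/backup/host1/2008-07-04 \
          host1:/home/ /backup/host1/2008-07-05/

    # rolling back to a given date is just reading that dated directory
    ls /backup/host1/

Expiring old snapshots is then a matter of deleting dated directories; the data itself only disappears when the last link to it goes.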
> > http://backuppc.sourceforge.net/info.html > > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080705/fa331247/attachment.html From gerry.creager at tamu.edu Sat Jul 5 08:03:26 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: References: <1215256594.5035.3.camel@Vigor13> <2128671214.155311215260876971.JavaMail.root@zimbra.vpac.org> Message-ID: <486F8D3E.5060003@tamu.edu> In my data management exploits, I'm inclined to have first-tier (iSCSI) disk, second-tier (AoE) disk, and third-tier (remote site) storage. If I can manage the remote site as another storage server farm, with rotating media, great. If I can manage it with robotic tape, great. I still duplicate Tier 2 data to Tier 3 for disaster recovery. A lot of this depends on how serious you are about being able to get your data back. Even though I can tap the ultimate archival site for the meteorological data I retain, translating it from netcdf to database is time-consuming and requires a human to babysit at times. Being able to respond nearly immediately to user requests for data from the Tier 1 data makes our services more valuable (and makes my work with data assimilation for weather models easier/faster). I retain some 90 days of data on Tier 1. Requests for data floated off to Tier 2 take longer to fill but the data holdings are, for all intents and purposes, permanent. Takes longer to get the data off but users know and understand that, and a simple e-mail tells 'em it's ready. Permanent, less-volatile Tier 3 storage is disaster-recovery stuff. Similarly, for hurricanes making US landfall, we also store data away on DVD to make its retrieval a (little) bit easier to locate. We use a database to maintain an inventory of where things are on disk, with significant file metadata, but sometimes it's easier to go to the DVD storage case to retrieve that stuff. If you're not as worried about how you'll recover your data after the inevitable storage failure (ask me about burning a RAID shelf down, some day), then not worrying about diversity in data storage/management isn't as big an issue. gerry Jon Aquilina wrote: > what i dont understand is why someone would want to invest in something > that is already quite expensive instead of using a method which is not > expensive and in a way provides double redundency. > > On Sat, Jul 5, 2008 at 2:27 PM, Chris Samuel > wrote: > > > ----- "John Hearns" > wrote: > > > - the smart thing being that any files which are unchanged since the > > last backup are links to the first copy of the file. So your vault > > size does not grow and grow endlessly. You can roll back to any given > > date. > > FWIW BackupPC claims to do the same, extending that to > duplicate copies across multiple machines. Of course > then you want to be sure that the single copy you have > on disk doesn't go bad.. 
> > http://backuppc.sourceforge.net/info.html > > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > -- > Jonathan Aquilina > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From landman at scalableinformatics.com Sat Jul 5 09:19:38 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? Message-ID: <486F9F1A.4070201@scalableinformatics.com> Hi folks: Investigating zfs on a Solaris 10 5/08 loaded JackRabbit for a customer. zfs performance isn't that good relative to Linux on this same hardware (literally a reboot between the two environments) I am looking for ways to tune zfs, or even Solaris so we can hopefully get to parity with Linux (less than 50% of Linux performance now c.f. http://scalability.org/?p=640 ). What I have found online has been http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide and a number of others. Even the dangerous tuning methods mentioned in this document (turning off Zil), don't really help all that much. This is for an IO intensive application with multiple threads doing 1-100 GB streaming reads. Offline from others I have heard similar issues, so if someone knows how to tweak the OS or FS to get good performance, please fire me over a pointer ... I would appreciate it ! Thanks. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From tjrc at sanger.ac.uk Sat Jul 5 10:41:42 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: A press release In-Reply-To: References: <20080701193721.B6843B5404D@mx2.its.rochester.edu> <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> <2C7CE2AF-E2DB-444E-8F91-559062363FF7@sanger.ac.uk> Message-ID: <753AC097-661C-4C2A-8782-A1962090A262@sanger.ac.uk> On 4 Jul 2008, at 10:11 pm, R P Herrold wrote: > On Fri, 4 Jul 2008, Tim Cutts wrote: > >> If upgrading packages wrecks the system, then the package >> installation scripts are broken. They should spot the upgrade in >> progress and take appropriate action, depending on the previously >> installed version. This can be quite a detailed process for Debian >> packages, which is probably why they have fewer problems than Red >> Hat in this regard. See http://www.debian.org/doc/debian-policy/ch-maintainerscripts.html#s-mscriptsinstact >> if you're interested in how it works for Debian packages. > > well, no so much. 
The 2.4 to 2.6 kernel transition was handled no > better by Debian 'testing', than CentOS/Red Hat derived. ;) Well, yes. That's, er, why it's called "testing". :-) You're comparing apples to oranges. If you upgraded a 2.4 kernel sarge system (which was a stable release) to etch (which uses 2.6.18) you'll probably find it goes quite well. If you're running production services off the testing track, you can probably expect to get burned occasionally. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From csamuel at vpac.org Sat Jul 5 17:42:16 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <1223811406.155511215304253502.JavaMail.root@zimbra.vpac.org> Message-ID: <773796376.155531215304936105.JavaMail.root@zimbra.vpac.org> ----- "Joe Landman" wrote: > Hi folks: Hi Joe, > Investigating zfs on a Solaris 10 5/08 loaded JackRabbit for a > customer. [...] > > I am looking for ways to tune zfs, or even Solaris so we can > hopefully get to parity with Linux (less than 50% of Linux performance > now c.f. http://scalability.org/?p=640 ). Don't know if you realise this, but you have to get written permission from Sun before being able to publish any Solaris 10 benchmarks.. http://www.sun.com/software/solaris/licensing/sla.xml # (f) You may not publish or provide the results of any benchmark # or comparison tests run on Software to any third party without # the prior written consent of Sun. The Linux NFSv4 folks had to get this before being able to post interoperability results with Solaris. http://linux-nfs.org/pipermail/nfsv4/2005-October/002647.html I don't believe that this restriction applies to OpenSolaris (IANAL, YMMV, IIRC). FWIW my own testing of ZFS under OpenSolaris, compared with other Linux filesystems, showed the same issue for write's but much faster read performance. This was about a year ago though! http://www.csamuel.org/articles/emerging-filesystems-200709/#id2538499 cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From landman at scalableinformatics.com Sat Jul 5 18:31:07 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <773796376.155531215304936105.JavaMail.root@zimbra.vpac.org> References: <773796376.155531215304936105.JavaMail.root@zimbra.vpac.org> Message-ID: <4870205B.5070202@scalableinformatics.com> Chris Samuel wrote: > Don't know if you realise this, but you have to get written > permission from Sun before being able to publish any Solaris > 10 benchmarks.. > > http://www.sun.com/software/solaris/licensing/sla.xml Ugh .. no I didn't. Thanks. Posting of results removed from blog. I think I understand why they don't want benchmarks published. [...] > I don't believe that this restriction applies to OpenSolaris > (IANAL, YMMV, IIRC). Sort of says something about the "Open Source" nature of the OS. > FWIW my own testing of ZFS under OpenSolaris, compared > with other Linux filesystems, showed the same issue for > write's but much faster read performance. This was about > a year ago though! 
> > http://www.csamuel.org/articles/emerging-filesystems-200709/#id2538499 What limited information I have been able to find suggests that "thar be dragons" (e.g. don't tune, that it is smarter than you are, and it auto tunes). Yeah. Right. I just want this file system to be about on par with its Linux counterpart on the same hardware. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From csamuel at vpac.org Sat Jul 5 19:24:19 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <197681176.155621215310963297.JavaMail.root@zimbra.vpac.org> Message-ID: <86407002.155691215311059272.JavaMail.root@zimbra.vpac.org> ----- "Joe Landman" wrote: > I think I understand why they don't want benchmarks published. :-) Don't forget that FreeBSD 7 includes ZFS support, so there's another option for you there. http://wiki.freebsd.org/ZFS https://www.ish.com.au/solutions/articles/freebsdzfs > I just want this file system to be about on par with > its Linux counterpart on the same hardware. I reckon if you're going to compare apples to apples then you probably want to test it against a Linux checksumming filesystem, the only real contender I'm aware of being btrfs (still pre-alpha). http://btrfs.wiki.kernel.org/ Alternatively you can disable checksumming on a ZFS volume and retest it against XFS, etc. zfs set checksum=off foo/bar cheers! Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From landman at scalableinformatics.com Sat Jul 5 19:50:21 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <86407002.155691215311059272.JavaMail.root@zimbra.vpac.org> References: <86407002.155691215311059272.JavaMail.root@zimbra.vpac.org> Message-ID: <487032ED.7030103@scalableinformatics.com> Chris Samuel wrote: > ----- "Joe Landman" wrote: > >> I think I understand why they don't want benchmarks published. > > :-) > > Don't forget that FreeBSD 7 includes ZFS support, so > there's another option for you there. Not sure we can do this, as the user is looking at Linux and Solaris. Introducing a third element might not fly (and the commercial software they want to run is restricted to Linux, Windows, and Solaris). > > http://wiki.freebsd.org/ZFS > > https://www.ish.com.au/solutions/articles/freebsdzfs Maybe we will try benchmarking with that. >> I just want this file system to be about on par with >> its Linux counterpart on the same hardware. > > I reckon if you're going to compare apples to apples > then you probably want to test it against a Linux > checksumming filesystem, the only real contender > I'm aware of being btrfs (still pre-alpha). > > http://btrfs.wiki.kernel.org/ > > Alternatively you can disable checksumming on a ZFS > volume and retest it against XFS, etc. > > zfs set checksum=off foo/bar I did turn off checksum, zil, and other things. No dice. Zfs does not appear to do well with hardware accelerated RAID. 
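For readers following along, the knobs being discussed look roughly like this. The pool and filesystem names are placeholders, and the ZIL line is the /etc/system method described in the Evil Tuning Guide of that era, so treat the exact syntax as an assumption about that particular Solaris 10 release:

    # per-filesystem properties, changeable on the fly
    zfs set checksum=off tank/scratch
    zfs set compression=off tank/scratch
    zfs set atime=off tank/scratch
    zfs set recordsize=128K tank/scratch
    zfs get checksum,compression,atime,recordsize tank/scratch

    # the ZIL, by contrast, could only be disabled system-wide via
    # /etc/system (reboot required), e.g.:
    #   set zfs:zil_disable = 1

None of these are recommended for production data; they are purely for isolating where the performance gap comes from.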
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From eagles051387 at gmail.com Sun Jul 6 04:32:45 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <1124767276.155141215253850248.JavaMail.root@zimbra.vpac.org> References: <1501882596.155101215252266373.JavaMail.root@zimbra.vpac.org> <1124767276.155141215253850248.JavaMail.root@zimbra.vpac.org> Message-ID: the only thing i have see ppa used for is obtaining amarok 2 nightly builds. On Sat, Jul 5, 2008 at 12:30 PM, Chris Samuel wrote: > > ----- "Jon Aquilina" wrote: > > > one thing must not be forgotten though. in regards to pkging stuff for > > the ubuntu variation once someone like you and me you upload it for > > someone higher up on the chain to check and upload to the servers. so > > basically someone is checking what someone else has packaged. > > Ubuntu has the concept of a Personal Package Archive (PPA) > which will build x86 and AMD64 packages from a source package > that you provide and builds a repo from them that you (and others) > can apt-get from. > > https://help.launchpad.net/PPAQuickStart > > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080706/a9d5cf86/attachment.html From tortay at cc.in2p3.fr Sun Jul 6 12:17:12 2008 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <486F9F1A.4070201@scalableinformatics.com> References: <486F9F1A.4070201@scalableinformatics.com> Message-ID: <48711A38.8090807@cc.in2p3.fr> Joe Landman wrote: > > Investigating zfs on a Solaris 10 5/08 loaded JackRabbit for a > customer. zfs performance isn't that good relative to Linux on this > same hardware (literally a reboot between the two environments) > > I am looking for ways to tune zfs, or even Solaris so we can hopefully > get to parity with Linux (less than 50% of Linux performance now c.f. > http://scalability.org/?p=640 ). What I have found online has been > http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide > > and a number of others. Even the dangerous tuning methods mentioned > in this document (turning off Zil), don't really help all that much. > > This is for an IO intensive application with multiple threads doing > 1-100 GB streaming reads. > > Offline from others I have heard similar issues, so if someone knows > how to tweak the OS or FS to get good performance, please fire me over a > pointer ... I would appreciate it ! > We have seen the same issue on (non Sun) high density storage servers which performed correctly with RHEL5 & XFS but comparatively poorly with Solaris 10 & ZFS. 
ZFS seems to be extremely sensitive to the quality/behaviour of the driver for the HBA or RAID/disk controller, especially with SATA disks (for NCQ support). Having a driver is not enough, a good one is required. Another point is that ZFS requires a different configuration "mindset" than "ordinary" RAID. Have you noticed the "small vdev" advice on the Solaris Internals Wiki ? This is probably the single most important hint for ZFS configuration. IOW, most of the time you can't just use the same underlying configuration with ZFS as the one you (would) use with Linux. This means that you may need to trade usable space for performance, sometimes in more drastic ways than with ordinary RAID. Finally, like it or not, ZFS is often more happy/efficient when it does the RAID itself (no "hardware" RAID controller or LVM involved). Lo?c. PS: regarding your other message in this thread (and your blog), you seem confused: the "open source" OS is OpenSolaris, not Solaris 10. The benchmark publishing restriction only applies to Solaris 10 (see ). PPS: while I dislike Sun's policy, I specifically remember being told by someone from a DOE lab (who did actually evaluate your product about 18 months ago) that you didn't want their unfavorable benchmarks results to be published. You can't have it both ways. -- | Lo?c Tortay - IN2P3 Computing Centre | From landman at scalableinformatics.com Sun Jul 6 13:47:59 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <48711A38.8090807@cc.in2p3.fr> References: <486F9F1A.4070201@scalableinformatics.com> <48711A38.8090807@cc.in2p3.fr> Message-ID: <48712F7F.1040002@scalableinformatics.com> Loic Tortay wrote: > Joe Landman wrote: [...] > We have seen the same issue on (non Sun) high density storage servers > which performed correctly with RHEL5 & XFS but comparatively poorly with > Solaris 10 & ZFS. > > ZFS seems to be extremely sensitive to the quality/behaviour of the > driver for the HBA or RAID/disk controller, especially with SATA disks > (for NCQ support). Having a driver is not enough, a good one is required. > > Another point is that ZFS requires a different configuration "mindset" than > "ordinary" RAID. Hmmmm.... Here is what I like. Setting up a raid is painless. Really painless. Here is what I don't like. I can't tune that raid. Well, I can, by tearing it down and starting again. I tried turning off checksum, compression, even zil. The thing I wanted to do was to put the log onto another device, and following the man pages on this resulted in errors. zpool would not have it. > Have you noticed the "small vdev" advice on the Solaris Internals Wiki ? Yeah, they mention 10 drives or less. I tried it with two 8-drive vdevs, 1x 16-drive vdev, and a few other things. > This is probably the single most important hint for ZFS configuration. > IOW, most of the time you can't just use the same underlying > configuration with ZFS as the one you (would) use with Linux. > This means that you may need to trade usable space for performance, > sometimes in more drastic ways than with ordinary RAID. Tried a few methods. Understand, we have a preference to show the fastest possible speed on our units. So we want to figure out how to tune/tweak zfs for these systems. > > Finally, like it or not, ZFS is often more happy/efficient when it does > the RAID itself (no "hardware" RAID controller or LVM involved). 
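To make the small-vdev point concrete, the two layouts being compared are roughly the following; the device names are placeholders and this is a sketch of the general advice, not of the actual JackRabbit configuration:

    # one wide 16-disk raidz2 vdev: best usable capacity, but every
    # record is spread over all 16 spindles
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
                             c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 c1t15d0

    # two 8-disk raidz2 vdevs in one pool: less usable space, but ZFS
    # dynamically stripes across the two vdevs, which generally helps
    # streaming and concurrent I/O
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
                      raidz2 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 c1t15d0

    zpool status tank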
The performance on pure zfs sw-only raid was lower (significantly) than the hardware RAID running solaris. I tried several variations on this. That and the crashing (driver related I believe) concern me. I would like to be able to get the performance that some imply I can get out of it. I certainly would like to be able to tune it. > Lo?c. > > PS: regarding your other message in this thread (and your blog), you > seem confused: the "open source" OS is OpenSolaris, not Solaris 10. Hmmm .... we keep hearing that "Solaris is open source" without providing any distinction between Sun Solaris and Open Solaris. Maybe it is marketing not being precise on this. Ask your Sun sales rep if Solaris is open source, without specifying which one. The answer will be "yes". Ambiguity? Yes. On purpose? I dunno. > The benchmark publishing restriction only applies to Solaris 10 (see > ). Yup. Will eventually try OpenSolaris on this gear. > PPS: while I dislike Sun's policy, I specifically remember being told by > someone from a DOE lab (who did actually evaluate your product about 18 > months ago) that you didn't want their unfavorable benchmarks results to > be published. You can't have it both ways. Owie ... no one is having it "both ways" Luc. Everything we are doing in testing is in the open, and we invite both comment and criticism ... like "Hey buddy, turn up read-ahead" or "luser, turn off compression." Our tests and results are open. Others can run them, and report back results. If they give me permission to publish them, I will. If they publish them, I may critique them (we reserve the right to respond). As a note also, you just dragged an external group into this discussion, and I am guessing that they really didn't want to be. So I am going to tread carefully here. We published a critique of the published "evaluation", pointing to the faults, and doing a thorough job of analyzing the same. We didn't deny them the right to publish their results. As a result of this, we got in return, a rather nasty email/blog post trail. I still have it in my mail archives, and it is hidden in the blog archives. I won't rehash it, other than to point out that some on this list would take issue with the results. I removed my critique after they asked me to, with them promising in return to amend and address my criticisms. As far as I can tell, they withdrew their report, and did not amend or address my criticisms. More curious are the reports that the group responsible for this report, has run away from their (formerly) preferred platform towards a BlueArc platform. There was a nice quote from the principal author of the report to this effect (moving forward with BlueArc) last year in HPCWire, for what they were considering the other unit (thumper) for. This said, they were free to use the unit and publish benchmark results, which they did. We criticized the benchmark they did for its flaws in analysis, in execution, and setup, as we were free to do. Nobody is having it "both ways" Luc. We reserve the right to respond, and we did. We did not ask them to take down the report. They did ask us to take our criticisms of their report down. FWIW: I will not name or divulge the group's name in public or private. I ask that anyone with knowledge of this group also keep their names/affiliation out of the discussion. Luc dragged them in here, and I would like to accord them some measure of privacy, no matter whether I agree or disagree with them. 
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From landman at scalableinformatics.com Sun Jul 6 13:53:51 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <48712F7F.1040002@scalableinformatics.com> References: <486F9F1A.4070201@scalableinformatics.com> <48711A38.8090807@cc.in2p3.fr> <48712F7F.1040002@scalableinformatics.com> Message-ID: <487130DF.6060305@scalableinformatics.com> Joe Landman wrote: > Loic Tortay wrote: >> Joe Landman wrote: [sigh] s/Luc/Loic/g My bad. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From csamuel at vpac.org Sun Jul 6 16:53:17 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <1315344918.159831215388073858.JavaMail.root@zimbra.vpac.org> Message-ID: <1633990448.160121215388397365.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > the only thing i have see ppa used for is obtaining amarok 2 nightly > builds. https://launchpad.net/ubuntu/+ppas - 3157 registered PPAs - 861 active PPAs - 4607 published sources - 21243 published binaries My point was more that people can use this as a way to publish packages without going through a MOTU, etc.. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From cousins at umit.maine.edu Sun Jul 6 20:05:40 2008 From: cousins at umit.maine.edu (Steve Cousins) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <200807040726.m647QkCr016092@bluewest.scyld.com> References: <200807040726.m647QkCr016092@bluewest.scyld.com> Message-ID: > > From: "Jon Aquilina" > > this is slightly off topic but im just wondering why spend thousands of > dollars when u can just setup another server and backup everything to a > raided hard drive array? Another RAID system helps but only if it is located somewhere else. The main reason we backup is for disaster recovery. One nice thing about tape is that you can take the tapes to another location easily or put them in a fire safe. Another reason is that RAID systems don't scale up as easily as a tape system. Our library has two 15 tape magazines that can be removed and replaced. It costs about $750 to buy 15 new tapes plus a magazine. That's not too bad for 6 TB of storage (uncompressed, with HW compression we get about 9 TB). Plus it takes practically no time to start using it. The library wasn't really that expensive when we bought it either. Somewhere around $7500. At the time we bought that we were using 400 GB drives in our RAID systems at $300 each. To build a server with 5 TiB (usable) of RAID storage at the time was about $7000. The tapes were more expensive then (about $100 each) but for about $10,500 we got 12 TB of tape storage (library plus 30 tapes). To get roughly the same of disk storage would have been about $14K. So right off the bat tape was cheaper. 
Plus it is so much easier to manage. I like the idea of snapshots and using rsync plus links is a crafty idea but I sleep better knowing that I have a "real" (one that I can carry around) backup of our data in our safe. Steve From eagles051387 at gmail.com Sun Jul 6 22:34:40 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] A press release In-Reply-To: <1633990448.160121215388397365.JavaMail.root@zimbra.vpac.org> References: <1315344918.159831215388073858.JavaMail.root@zimbra.vpac.org> <1633990448.160121215388397365.JavaMail.root@zimbra.vpac.org> Message-ID: with amarok 2 nightly i get the impression this is for beta or alpha testing. On Mon, Jul 7, 2008 at 1:53 AM, Chris Samuel wrote: > > ----- "Jon Aquilina" wrote: > > > the only thing i have see ppa used for is obtaining amarok 2 nightly > > builds. > > https://launchpad.net/ubuntu/+ppas > > - 3157 registered PPAs > - 861 active PPAs > - 4607 published sources > - 21243 published binaries > > My point was more that people can use this as a way to > publish packages without going through a MOTU, etc.. > > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/bab14470/attachment.html From eagles051387 at gmail.com Sun Jul 6 22:42:15 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:23 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: References: <200807040726.m647QkCr016092@bluewest.scyld.com> Message-ID: in my case where money isnt an issue wouldnt it be better for me to build a raid backup array? i understand your reasoning. im still studying and fairly new in the higher education of IT so when i start working ill keep what you mentioned to heart. only problem is that where i am located in europe things are more expensive here. another random idea why not create a raided backup array backed up to tape? is it possible to do a tape back up of data thata being written to disk instantly On Mon, Jul 7, 2008 at 5:05 AM, Steve Cousins wrote: > >> From: "Jon Aquilina" >> this is slightly off topic but im just wondering why spend thousands of >> dollars when u can just setup another server and backup everything to a >> raided hard drive array? >> > > Another RAID system helps but only if it is located somewhere else. The > main reason we backup is for disaster recovery. One nice thing about tape is > that you can take the tapes to another location easily or put them in a fire > safe. > > Another reason is that RAID systems don't scale up as easily as a tape > system. Our library has two 15 tape magazines that can be removed and > replaced. It costs about $750 to buy 15 new tapes plus a magazine. That's > not too bad for 6 TB of storage (uncompressed, with HW compression we get > about 9 TB). Plus it takes practically no time to start using it. > > The library wasn't really that expensive when we bought it either. > Somewhere around $7500. 
At the time we bought that we were using 400 GB > drives in our RAID systems at $300 each. To build a server with 5 TiB > (usable) of RAID storage at the time was about $7000. The tapes were more > expensive then (about $100 each) but for about $10,500 we got 12 TB of tape > storage (library plus 30 tapes). To get roughly the same of disk storage > would have been about $14K. So right off the bat tape was cheaper. Plus it > is so much easier to manage. I like the idea of snapshots and using rsync > plus links is a crafty idea but I sleep better knowing that I have a "real" > (one that I can carry around) backup of our data in our safe. > > Steve > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/777573f5/attachment.html From eagles051387 at gmail.com Mon Jul 7 00:22:12 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] which version of the gpl Message-ID: im getting ready to start the development of my own clustering distro. im going to be registering it on lanuchpad.net to help me keep track of bugs and for users to post suggestions. how do i determine which version of the gpl to use? -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/79c5d45f/attachment.html From eagles051387 at gmail.com Mon Jul 7 03:17:50 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe Message-ID: how does 1 determine if pkgs installed are 32 or 64? im running kubuntu and use alot of the pre pkged stuff. -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/1d3ba555/attachment.html From carsten.aulbert at aei.mpg.de Mon Jul 7 03:21:56 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: References: Message-ID: <4871EE44.50904@aei.mpg.de> Jon Aquilina wrote: > how does 1 determine if pkgs installed are 32 or 64? im running kubuntu > and use alot of the pre pkged stuff. Blind shot: $ file /usr/bin/file /usr/bin/file: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.0, dynamically linked (uses shared libs), for GNU/Linux 2.6.0, stripped But of course there are other ways, but this it fast Cheers Carsten From eagles051387 at gmail.com Mon Jul 7 03:24:52 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: <4871EE44.50904@aei.mpg.de> References: <4871EE44.50904@aei.mpg.de> Message-ID: so what your telling me to make sure im running 64bit bins is to check the /usr/bin. if im not im best compiling from source? On 7/7/08, Carsten Aulbert wrote: > > > > Jon Aquilina wrote: > > how does 1 determine if pkgs installed are 32 or 64? im running kubuntu > > and use alot of the pre pkged stuff. 
> > Blind shot: > $ file /usr/bin/file > > /usr/bin/file: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), > for GNU/Linux 2.6.0, dynamically linked (uses shared libs), for > GNU/Linux 2.6.0, stripped > > But of course there are other ways, but this it fast > > Cheers > > Carsten > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/72ffd6cd/attachment.html From carsten.aulbert at aei.mpg.de Mon Jul 7 03:29:58 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: References: <4871EE44.50904@aei.mpg.de> Message-ID: <4871F026.2060802@aei.mpg.de> Jon Aquilina wrote: > so what your telling me to make sure im running 64bit bins is to check > the /usr/bin. if im not im best compiling from source? Well other possiblities are: look at the output of uname -a if you get i686 you are running a 32bit kernel if the output contains x64_64 you are running a 64 bit kernel (sometimes uname -m tells you the same). For Debian/*buntu: see if you can install ia32 packages. These are compatibility packages to run 32bit binaries under 64bit systems. Obviously those are not needed on 32bit systems and hence not available - as far as I know. HTH Carsten From eagles051387 at gmail.com Mon Jul 7 03:31:13 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: <4871F026.2060802@aei.mpg.de> References: <4871EE44.50904@aei.mpg.de> <4871F026.2060802@aei.mpg.de> Message-ID: do u happen to have gtalk by any chance. i downloaded the 64bit version of kubuntu. i think the kernel is a 64bit kernel but i suspect that the pre pkged debians arent or might not be? On 7/7/08, Carsten Aulbert wrote: > > > > Jon Aquilina wrote: > > so what your telling me to make sure im running 64bit bins is to check > > the /usr/bin. if im not im best compiling from source? > > Well other possiblities are: > > look at the output of uname -a > > if you get i686 you are running a 32bit kernel if the output contains > x64_64 you are running a 64 bit kernel (sometimes uname -m tells you the > same). > > For Debian/*buntu: see if you can install ia32 packages. These are > compatibility packages to run 32bit binaries under 64bit systems. > Obviously those are not needed on 32bit systems and hence not available > - as far as I know. > > HTH > > Carsten > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/17e5baa3/attachment.html From gerry.creager at tamu.edu Mon Jul 7 04:29:48 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: References: <200807040726.m647QkCr016092@bluewest.scyld.com> Message-ID: <4871FE2C.4090605@tamu.edu> Jon Aquilina wrote: > in my case where money isnt an issue wouldnt it be better for me to > build a raid backup array? i understand your reasoning. im still > studying and fairly new in the higher education of IT so when i start > working ill keep what you mentioned to heart. only problem is that where > i am located in europe things are more expensive here. another random > idea why not create a raided backup array backed up to tape? 
is it > possible to do a tape back up of data thata being written to disk instantly It's absolutely possible to do a mirrored write to RAID spinning media and also to tape. In a perfect world, where I don't have budget constraints, that's how I'd achieve my third tier of backup. The real reason we bother with tiered storage and multiple copies, however, remains "disaster recovery". One theory says that simply having two copies in the data center is enough. Experience teaches that, for true disaster recovery, one needs a pretty recent off-site copy, that is unlikely to be disrupted by an event in one locale. I know of one company that mirrors disks over 100 miles from their r&d/corporate offices via multiple 10gigabit paths, with two feeds for power, a diesel generator, and a battery plant to keep things running. In their main site, they have disk and tape. Offsite, they have another disk copy. And last week's tapes. In higher-education IT, one tends to have a lot of budget constraints. Funding agencies want accountability and don't seem to just give us hardware dollars for the asking, although it often seems that way when someone who's not seeking said funding, watches the process. Therefore, money IS a problem and we have to determine the best way to keep things going while optimizing expenses. Different approaches don't mean we're disagreeing with you, however. MY primary backup is spinning (RAID) disk. I'd like to expand to LTO tape with robotics but my funding agencies have not yet seen the wisdom of this, and think my use of disk is just fine. Until we have a problem (and problems are almost guaranteed) and get in trouble for not having incorporated tape (or another, different, technology) in our backup plan, I don't expect to see funds for it. In fact, when we do get in trouble, I see us redirecting already allocated funds rather than getting new funds, to accomplish this. Just understand that redirecting funding for a new hardware implementation requires sponsor approval, and if they don't understand "Why?" it can get messy. gerry > On Mon, Jul 7, 2008 at 5:05 AM, Steve Cousins > wrote: > > > From: "Jon Aquilina" > this is slightly off topic but im just wondering why spend > thousands of > dollars when u can just setup another server and backup > everything to a > raided hard drive array? > > > Another RAID system helps but only if it is located somewhere else. > The main reason we backup is for disaster recovery. One nice thing > about tape is that you can take the tapes to another location easily > or put them in a fire safe. > > Another reason is that RAID systems don't scale up as easily as a > tape system. Our library has two 15 tape magazines that can be > removed and replaced. It costs about $750 to buy 15 new tapes plus a > magazine. That's not too bad for 6 TB of storage (uncompressed, with > HW compression we get about 9 TB). Plus it takes practically no time > to start using it. > > The library wasn't really that expensive when we bought it either. > Somewhere around $7500. At the time we bought that we were using 400 > GB drives in our RAID systems at $300 each. To build a server with 5 > TiB (usable) of RAID storage at the time was about $7000. The tapes > were more expensive then (about $100 each) but for about $10,500 we > got 12 TB of tape storage (library plus 30 tapes). To get roughly > the same of disk storage would have been about $14K. So right off > the bat tape was cheaper. Plus it is so much easier to manage. 
I > like the idea of snapshots and using rsync plus links is a crafty > idea but I sleep better knowing that I have a "real" (one that I can > carry around) backup of our data in our safe. > > Steve > > > > > -- > Jonathan Aquilina > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From rgb at phy.duke.edu Mon Jul 7 04:29:20 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] which version of the gpl In-Reply-To: References: Message-ID: On Mon, 7 Jul 2008, Jon Aquilina wrote: > im getting ready to start the development of my own clustering distro. im > going to be registering it on lanuchpad.net to help me keep track of bugs > and for users to post suggestions. how do i determine which version of the > gpl to use? The only place you have a choice is in software you write. A distribution is almost entirely software written by other people, and you will obviously inherit (and be redistributing under) the license each item already has, which "should" be clearly included in the source packages. If you don't find it, you'll have to try to contact the authors and get them to license it -- there is stuff out there with no overt copyright (leaving the ware covered by a passive one -- new work is copyrighted to the author(s) no matter what, basically) and with no license at all, or with vague statements about who can use it and when. You can opt to include software only with v2, v3, BSD-like, or include shareware or even commercial -- it is your distro, you choose. If you write a packaging system, an installation system, or any NEW clusterware then you can choose your license, and GPL v2 or v3 are fine choices. rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From tjrc at sanger.ac.uk Mon Jul 7 05:44:01 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: <4871F026.2060802@aei.mpg.de> References: <4871EE44.50904@aei.mpg.de> <4871F026.2060802@aei.mpg.de> Message-ID: On 7 Jul 2008, at 11:29 am, Carsten Aulbert wrote: > > > Jon Aquilina wrote: >> so what your telling me to make sure im running 64bit bins is to >> check >> the /usr/bin. if im not im best compiling from source? > > Well other possiblities are: > > look at the output of uname -a > > if you get i686 you are running a 32bit kernel if the output contains > x64_64 you are running a 64 bit kernel (sometimes uname -m tells you > the > same). That's unreliable. It's possible to run a 64-bit kernel with a 32-bit userland, so just because uname returns x86_64 that doesn't mean you can definitely run 64-bit userland binaries. > For Debian/*buntu: see if you can install ia32 packages. These are > compatibility packages to run 32bit binaries under 64bit systems. 
> Obviously those are not needed on 32bit systems and hence not > available > - as far as I know. The cast-iron way to do it on Debian/Ubuntu is the output of: dpkg --print-architecture That will print "i386" for 32-bit Intel, and "amd64" for x86_64. Obviously for other supported architectures it prints other things. It will print the correct thing regardless of whether the kernel is 32- or 64-bit. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From eagles051387 at gmail.com Mon Jul 7 06:03:49 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: References: <4871EE44.50904@aei.mpg.de> <4871F026.2060802@aei.mpg.de> Message-ID: im would like at some point in time to migrate to a fully 64bit os and leave 32 bit behind at least for the time being. On 7/7/08, Tim Cutts wrote: > > > On 7 Jul 2008, at 11:29 am, Carsten Aulbert wrote: > > >> >> Jon Aquilina wrote: >> >>> so what your telling me to make sure im running 64bit bins is to check >>> the /usr/bin. if im not im best compiling from source? >>> >> >> Well other possiblities are: >> >> look at the output of uname -a >> >> if you get i686 you are running a 32bit kernel if the output contains >> x64_64 you are running a 64 bit kernel (sometimes uname -m tells you the >> same). >> > > That's unreliable. It's possible to run a 64-bit kernel with a 32-bit > userland, so just because uname returns x86_64 that doesn't mean you can > definitely run 64-bit userland binaries. > > For Debian/*buntu: see if you can install ia32 packages. These are >> compatibility packages to run 32bit binaries under 64bit systems. >> Obviously those are not needed on 32bit systems and hence not available >> - as far as I know. >> > > The cast-iron way to do it on Debian/Ubuntu is the output of: > > dpkg --print-architecture > > That will print "i386" for 32-bit Intel, and "amd64" for x86_64. Obviously > for other supported architectures it prints other things. It will print the > correct thing regardless of whether the kernel is 32- or 64-bit. > > Tim > > > -- > The Wellcome Trust Sanger Institute is operated by Genome ResearchLimited, > a charity registered in England with number 1021457 and acompany registered > in England with number 2742969, whose registeredoffice is 215 Euston Road, > London, NW1 2BE. > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/3c4f6750/attachment.html From eagles051387 at gmail.com Mon Jul 7 06:04:53 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] which version of the gpl In-Reply-To: References: Message-ID: what is the difference between version 2 and 3? On 7/7/08, Robert G. Brown wrote: > > On Mon, 7 Jul 2008, Jon Aquilina wrote: > > im getting ready to start the development of my own clustering distro. im >> going to be registering it on lanuchpad.net to help me keep track of bugs >> and for users to post suggestions. how do i determine which version of >> the >> gpl to use? >> > > The only place you have a choice is in software you write. 
A > distribution is almost entirely software written by other people, and > you will obviously inherit (and be redistributing under) the license > each item already has, which "should" be clearly included in the source > packages. If you don't find it, you'll have to try to contact the > authors and get them to license it -- there is stuff out there with no > overt copyright (leaving the ware covered by a passive one -- new work > is copyrighted to the author(s) no matter what, basically) and with no > license at all, or with vague statements about who can use it and when. > You can opt to include software only with v2, v3, BSD-like, or include > shareware or even commercial -- it is your distro, you choose. > > If you write a packaging system, an installation system, or any > NEW clusterware then you can choose your license, and GPL v2 or v3 are > fine choices. > > rgb > > -- > Robert G. Brown Phone(cell): 1-919-280-8443 > Duke University Physics Dept, Box 90305 > Durham, N.C. 27708-0305 > Web: http://www.phy.duke.edu/~rgb > Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php > Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/2799304d/attachment.html From rgb at phy.duke.edu Mon Jul 7 06:33:34 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] which version of the gpl In-Reply-To: References: Message-ID: On Mon, 7 Jul 2008, Jon Aquilina wrote: > what is the difference between version 2 and 3? In a nutshell: 2 is/was on everything in the gnu universe until about a year ago, but the gnu lawyers were concerned about a few technical points so they invented 3 which is marginally stronger. If you want to read the whole story GIYF, as always. I think there is an explanation on both the FSF site and the Fedora site (at least) and there was even a discussion on this a year or so ago here that is probably in the list archives. rgb > > On 7/7/08, Robert G. Brown wrote: >> >> On Mon, 7 Jul 2008, Jon Aquilina wrote: >> >> im getting ready to start the development of my own clustering distro. im >>> going to be registering it on lanuchpad.net to help me keep track of bugs >>> and for users to post suggestions. how do i determine which version of >>> the >>> gpl to use? >>> >> >> The only place you have a choice is in software you write. A >> distribution is almost entirely software written by other people, and >> you will obviously inherit (and be redistributing under) the license >> each item already has, which "should" be clearly included in the source >> packages. If you don't find it, you'll have to try to contact the >> authors and get them to license it -- there is stuff out there with no >> overt copyright (leaving the ware covered by a passive one -- new work >> is copyrighted to the author(s) no matter what, basically) and with no >> license at all, or with vague statements about who can use it and when. >> You can opt to include software only with v2, v3, BSD-like, or include >> shareware or even commercial -- it is your distro, you choose. >> >> If you write a packaging system, an installation system, or any >> NEW clusterware then you can choose your license, and GPL v2 or v3 are >> fine choices. >> >> rgb >> >> -- >> Robert G. Brown Phone(cell): 1-919-280-8443 >> Duke University Physics Dept, Box 90305 >> Durham, N.C. 
27708-0305 >> Web: http://www.phy.duke.edu/~rgb >> Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php >> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 >> > > > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From eagles051387 at gmail.com Mon Jul 7 06:59:57 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] mdns Message-ID: is mdns strictly for the mac os or can it be incorporated into any linux cluster?? -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/60c744d8/attachment.html From apittman at concurrent-thinking.com Mon Jul 7 07:20:40 2008 From: apittman at concurrent-thinking.com (Ashley Pittman) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] mdns In-Reply-To: References: Message-ID: <1215440440.6740.7.camel@bruce.priv.wark.uk.streamline-computing.com> On Mon, 2008-07-07 at 15:59 +0200, Jon Aquilina wrote: > is mdns strictly for the mac os or can it be incorporated into any > linux cluster?? It works under Linux, my sound server at home and the printers at work use this quite satisfactorily. I would caution against using it in a cluster however, it's design-goal and benefit are to handle changing network environments where devices are being added to and removed from the network frequently. This is the polar opposite of what you should try and aim for in a cluster where the hardware configuration is known in advance and for the most part constant. In addition it used to be the case there were performance issues associated with using zeroconf on large networks and the last thing you want in a cluster is additional network traffic clogging up the system. Ashley Pittman. From eagles051387 at gmail.com Mon Jul 7 07:24:25 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] Re: OT: LTO Ultrium (3) throughput? In-Reply-To: <4871FE2C.4090605@tamu.edu> References: <200807040726.m647QkCr016092@bluewest.scyld.com> <4871FE2C.4090605@tamu.edu> Message-ID: well right now i have no funding what so ever im trying to scrape together a few machines. being from houston and going to a private institute there i have had my ins and outs with higher education IT. i agree with what you are saying and now it makes totally perfect sense. also in addition to off siting the tapes from last week couldnt you take a drive out of your raid array and store them in an off site location then reuse again for a back up down the road? On 7/7/08, Gerry Creager wrote: > > Jon Aquilina wrote: > >> in my case where money isnt an issue wouldnt it be better for me to build >> a raid backup array? i understand your reasoning. im still studying and >> fairly new in the higher education of IT so when i start working ill keep >> what you mentioned to heart. only problem is that where i am located in >> europe things are more expensive here. another random idea why not create a >> raided backup array backed up to tape? is it possible to do a tape back up >> of data thata being written to disk instantly >> > > It's absolutely possible to do a mirrored write to RAID spinning media and > also to tape. 
In a perfect world, where I don't have budget constraints, > that's how I'd achieve my third tier of backup. The real reason we bother > with tiered storage and multiple copies, however, remains "disaster > recovery". One theory says that simply having two copies in the data center > is enough. Experience teaches that, for true disaster recovery, one needs a > pretty recent off-site copy, that is unlikely to be disrupted by an event in > one locale. I know of one company that mirrors disks over 100 miles from > their r&d/corporate offices via multiple 10gigabit paths, with two feeds for > power, a diesel generator, and a battery plant to keep things running. In > their main site, they have disk and tape. Offsite, they have another disk > copy. And last week's tapes. > > In higher-education IT, one tends to have a lot of budget constraints. > Funding agencies want accountability and don't seem to just give us hardware > dollars for the asking, although it often seems that way when someone who's > not seeking said funding, watches the process. Therefore, money IS a > problem and we have to determine the best way to keep things going while > optimizing expenses. > > Different approaches don't mean we're disagreeing with you, however. MY > primary backup is spinning (RAID) disk. I'd like to expand to LTO tape with > robotics but my funding agencies have not yet seen the wisdom of this, and > think my use of disk is just fine. Until we have a problem (and problems > are almost guaranteed) and get in trouble for not having incorporated tape > (or another, different, technology) in our backup plan, I don't expect to > see funds for it. In fact, when we do get in trouble, I see us redirecting > already allocated funds rather than getting new funds, to accomplish this. > Just understand that redirecting funding for a new hardware implementation > requires sponsor approval, and if they don't understand "Why?" it can get > messy. > > gerry > > On Mon, Jul 7, 2008 at 5:05 AM, Steve Cousins > cousins@umit.maine.edu>> wrote: >> >> >> From: "Jon Aquilina" >> this is slightly off topic but im just wondering why spend >> thousands of >> dollars when u can just setup another server and backup >> everything to a >> raided hard drive array? >> >> >> Another RAID system helps but only if it is located somewhere else. >> The main reason we backup is for disaster recovery. One nice thing >> about tape is that you can take the tapes to another location easily >> or put them in a fire safe. >> >> Another reason is that RAID systems don't scale up as easily as a >> tape system. Our library has two 15 tape magazines that can be >> removed and replaced. It costs about $750 to buy 15 new tapes plus a >> magazine. That's not too bad for 6 TB of storage (uncompressed, with >> HW compression we get about 9 TB). Plus it takes practically no time >> to start using it. >> >> The library wasn't really that expensive when we bought it either. >> Somewhere around $7500. At the time we bought that we were using 400 >> GB drives in our RAID systems at $300 each. To build a server with 5 >> TiB (usable) of RAID storage at the time was about $7000. The tapes >> were more expensive then (about $100 each) but for about $10,500 we >> got 12 TB of tape storage (library plus 30 tapes). To get roughly >> the same of disk storage would have been about $14K. So right off >> the bat tape was cheaper. Plus it is so much easier to manage. 
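The "rsync plus links" snapshot approach quoted in this thread is usually done with rsync's --link-dest option, so that unchanged files are hard-linked against the previous snapshot instead of being copied again; a minimal sketch, with hypothetical paths and dates:

  # yesterday's snapshot is the hard-link reference for today's
  rsync -a --delete --link-dest=/backup/2008-07-07 /data/ /backup/2008-07-08/

  # every snapshot directory looks complete, but unchanged files are just
  # hard links into the previous one, so only changed files consume space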
I >> like the idea of snapshots and using rsync plus links is a crafty >> idea but I sleep better knowing that I have a "real" (one that I can >> carry around) backup of our data in our safe. >> >> Steve >> >> >> >> >> -- >> Jonathan Aquilina >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > -- > Gerry Creager -- gerry.creager@tamu.edu > Texas Mesonet -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 > Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/c18cf3dd/attachment.html From eagles051387 at gmail.com Mon Jul 7 07:26:15 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] mdns In-Reply-To: <1215440440.6740.7.camel@bruce.priv.wark.uk.streamline-computing.com> References: <1215440440.6740.7.camel@bruce.priv.wark.uk.streamline-computing.com> Message-ID: can u clarify what you mean by sound server. so basically what you are telling me if there is a windows dns server (active directory in server 2k3) mdns can replace the active directory server? also is there a way to curtail the network bottle necks? On 7/7/08, Ashley Pittman wrote: > > On Mon, 2008-07-07 at 15:59 +0200, Jon Aquilina wrote: > > is mdns strictly for the mac os or can it be incorporated into any > > linux cluster?? > > It works under Linux, my sound server at home and the printers at work > use this quite satisfactorily. > > I would caution against using it in a cluster however, it's design-goal > and benefit are to handle changing network environments where devices > are being added to and removed from the network frequently. This is the > polar opposite of what you should try and aim for in a cluster where the > hardware configuration is known in advance and for the most part > constant. In addition it used to be the case there were performance > issues associated with using zeroconf on large networks and the last > thing you want in a cluster is additional network traffic clogging up > the system. > > Ashley Pittman. > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/75ab1024/attachment.html From apittman at concurrent-thinking.com Mon Jul 7 07:58:52 2008 From: apittman at concurrent-thinking.com (Ashley Pittman) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] mdns In-Reply-To: References: <1215440440.6740.7.camel@bruce.priv.wark.uk.streamline-computing.com> Message-ID: <1215442732.6740.20.camel@bruce.priv.wark.uk.streamline-computing.com> There are two parts to mdns, automatic address configuration and then the advertising of services on top of those addresses. I'm not sure which of these you are asking about, I realised after I'd hit send that my answer only applied to the second of these. At home I use what according to Wikipedia is called DNS-SD to enable rythmbox on my desktop to automatically discover the daap servers on either my Mac (iTunes) or on another Linux machine (firefly media server). 
I'm (just) young enough never to have used a Windows desktop so I can't comment on what active directory offers. Unfortunately with Multicast I think network bottle necks are a fact of life and on network with static hardware configuration it really is better to have a static software configuration as well. What problem are you trying to solve? Ashley Pittman. On Mon, 2008-07-07 at 16:26 +0200, Jon Aquilina wrote: > can u clarify what you mean by sound server. so basically what you are > telling me if there is a windows dns server (active directory in > server 2k3) mdns can replace the active directory server? also is > there a way to curtail the network bottle necks? > > > On 7/7/08, Ashley Pittman wrote: > On Mon, 2008-07-07 at 15:59 +0200, Jon Aquilina wrote: > > is mdns strictly for the mac os or can it be incorporated > into any > > linux cluster?? > > It works under Linux, my sound server at home and the printers > at work > use this quite satisfactorily. > > I would caution against using it in a cluster however, it's > design-goal > and benefit are to handle changing network environments where > devices > are being added to and removed from the network > frequently. This is the > polar opposite of what you should try and aim for in a cluster > where the > hardware configuration is known in advance and for the most > part > constant. In addition it used to be the case there were > performance > issues associated with using zeroconf on large networks and > the last > thing you want in a cluster is additional network traffic > clogging up > the system. > > Ashley Pittman. > > > > > -- > Jonathan Aquilina From jlforrest at berkeley.edu Mon Jul 7 09:38:28 2008 From: jlforrest at berkeley.edu (Jon Forrest) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] /usr/local over NFS is okay, Joe In-Reply-To: <486E0D29.8060605@rri.sari.ac.uk> References: <011E261F-94D7-4F1C-AA69-4A008A1DA1E2@sanger.ac.uk> <486CD643.1050904@ias.edu> <486CE03D.40901@scalableinformatics.com> <486CE86B.90104@ias.edu> <486CEF27.8090507@scalableinformatics.com> <486E0D29.8060605@rri.sari.ac.uk> Message-ID: <48724684.8030507@berkeley.edu> Tony Travis wrote: > In particular, Sun went out of their way to move a lot of things from > /bin into /usr/bin precisely so it could be shared by NFS. I also agree > with the widely used convention that '/usr/local' means local to the > site, not the particular machine. The way we used to this about this in the CS Department at Berkeley was that /usr/local holds locally built stuff. It had nothing to do with where the bits are stored. Using tricks that other people have mentioned, we created a "Software Warehouse" which held software that we built. It contained the standard BSD, GNU, and other open source stuff. This was stored on an Auspex file server. We built mostly the same stuff for ~6 architectures. The file systems on the Auspex were named in ways that made it clear which architecture they were for, but they were always mounted as /usr/sww on the desktop machines. (We could have called this /usr/local but we wanted to make it clear where the file systems were coming from.) Long ago I wrote what became the standard document for how to create a "dataless" environment for the DEC OSF operating system. (Talk about a non-intuitive use of words - "dataless" is perfect example. All it means is everything except the root file system is remotely mounted). This allowed me to use the same method for /usr that we used for /usr/sww. 
This was a long time ago when disks were small, slow, and expensive, and before rsync was born. I'm not sure the same architecture would make sense now, even with fast networks. Cordially, -- Jon Forrest Research Computing Support College of Chemistry 173 Tan Hall University of California Berkeley Berkeley, CA 94720-1460 510-643-1032 jlforrest@berkeley.edu From eagles051387 at gmail.com Mon Jul 7 11:19:20 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:24 2009 Subject: [Beowulf] mdns In-Reply-To: <1215442732.6740.20.camel@bruce.priv.wark.uk.streamline-computing.com> References: <1215440440.6740.7.camel@bruce.priv.wark.uk.streamline-computing.com> <1215442732.6740.20.camel@bruce.priv.wark.uk.streamline-computing.com> Message-ID: i have used both but i prefer linux. server 2k3 is to much of a resource hog. i would love to try this out in a small server environment thing is i dont have any machiens with me at the moment to test mdns out on they r all back in the states On Mon, Jul 7, 2008 at 4:58 PM, Ashley Pittman < apittman@concurrent-thinking.com> wrote: > > There are two parts to mdns, automatic address configuration and then > the advertising of services on top of those addresses. I'm not sure > which of these you are asking about, I realised after I'd hit send that > my answer only applied to the second of these. > > At home I use what according to Wikipedia is called DNS-SD to enable > rythmbox on my desktop to automatically discover the daap servers on > either my Mac (iTunes) or on another Linux machine (firefly media > server). I'm (just) young enough never to have used a Windows desktop > so I can't comment on what active directory offers. > > Unfortunately with Multicast I think network bottle necks are a fact of > life and on network with static hardware configuration it really is > better to have a static software configuration as well. > > What problem are you trying to solve? > > Ashley Pittman. > > On Mon, 2008-07-07 at 16:26 +0200, Jon Aquilina wrote: > > can u clarify what you mean by sound server. so basically what you are > > telling me if there is a windows dns server (active directory in > > server 2k3) mdns can replace the active directory server? also is > > there a way to curtail the network bottle necks? > > > > > > On 7/7/08, Ashley Pittman wrote: > > On Mon, 2008-07-07 at 15:59 +0200, Jon Aquilina wrote: > > > is mdns strictly for the mac os or can it be incorporated > > into any > > > linux cluster?? > > > > It works under Linux, my sound server at home and the printers > > at work > > use this quite satisfactorily. > > > > I would caution against using it in a cluster however, it's > > design-goal > > and benefit are to handle changing network environments where > > devices > > are being added to and removed from the network > > frequently. This is the > > polar opposite of what you should try and aim for in a cluster > > where the > > hardware configuration is known in advance and for the most > > part > > constant. In addition it used to be the case there were > > performance > > issues associated with using zeroconf on large networks and > > the last > > thing you want in a cluster is additional network traffic > > clogging up > > the system. > > > > Ashley Pittman. > > > > > > > > > > -- > > Jonathan Aquilina > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/6ce011bb/attachment.html From lindahl at pbm.com Mon Jul 7 11:40:14 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A simple cluster In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A04884DA294@quadbrsex1.quadrics.com> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884DA294@quadbrsex1.quadrics.com> Message-ID: <20080707184013.GC27386@bx9.net> On Sat, Jun 28, 2008 at 08:59:28PM +0100, Dan.Kidger@quadrics.com wrote: > Greg wrote: > > > Many-core chips that look like a big x86 SMP don't look anything like > > a GPU. With the addition of a few commuications primitives, MPI > > will run even better on big x86 SMPs. All of the programming > > approaches for GPUs and Clearspeed and historical array processors > > are yucky compared to "high level language + MPI". > > Are you daring to suggest say that using MPI is not yucky ? There are degrees of yuckyness. Rewriting every line of my code to get it to work with an accellerator is yuckier than parallelizing a code with MPI. Especially if my code is already parallelized with MPI, with all the communication hidden in library routines... -- greg From K.D.Strouts at sms.ed.ac.uk Fri Jul 4 01:39:26 2008 From: K.D.Strouts at sms.ed.ac.uk (Kenneth Duncan Strouts) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <486A4754.900@rri.sari.ac.uk> References: <486A4754.900@rri.sari.ac.uk> Message-ID: <20080704093926.vzj0jtz1cg4owgow@www.sms.ed.ac.uk> Hi Jon, > Quoting Tony Travis : > Although Kerrighed looks very promising, it is also quite fragile in > our hands. If one node crashes, you lose the entire cluster. That > said, the Kerrighed project is extremely well supported and I > believe it will be a good alternative in the near future. We found that with Kerrighed, one node crashing sees the whole cluster go down. The following is output to kern.log before the cluster dies. Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Resetting link <1.1.2:eth1-1.1.3:eth1>, peer not responding Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Lost link <1.1.2:eth1-1.1.3:eth1> on network plane B Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Lost contact with <1.1.3> From the Kerrighed mailing list (Louis Rilling); "Indeed, Kerrighed does not tolerate node failures yet. We have no precise date for this, and giving a date right now would be meaningless. The first step for us is to support dynamic cluster resizing (IOW live node additions and removals), and we've just started working on it. We will work on node failures in a second step." It seems they are working on this, and on a new framework for configurable process scheduling. Probably Kerrighed will provide a good alternative in future. Kenneth -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From herrold at owlriver.com Fri Jul 4 14:11:58 2008 From: herrold at owlriver.com (R P Herrold) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <2C7CE2AF-E2DB-444E-8F91-559062363FF7@sanger.ac.uk> References: <20080701193721.B6843B5404D@mx2.its.rochester.edu> <5753D20A3C7E4233B4FAF0D8670A0423@geoffPC> <2C7CE2AF-E2DB-444E-8F91-559062363FF7@sanger.ac.uk> Message-ID: On Fri, 4 Jul 2008, Tim Cutts wrote: > If upgrading packages wrecks the system, then the package installation > scripts are broken. They should spot the upgrade in progress and take > appropriate action, depending on the previously installed version. 
This can > be quite a detailed process for Debian packages, which is probably why they > have fewer problems than Red Hat in this regard. See > http://www.debian.org/doc/debian-policy/ch-maintainerscripts.html#s-mscriptsinstact > if you're interested in how it works for Debian packages. well, no so much. The 2.4 to 2.6 kernel transition was handled no better by Debian 'testing', than CentOS/Red Hat derived. ;) -- Russ herrold From spambox at emboss.co.nz Sun Jul 6 00:12:51 2008 From: spambox at emboss.co.nz (Michael Brown) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] zfs tuning for HJPC/cluster workloads? In-Reply-To: <486F9F1A.4070201@scalableinformatics.com> References: <486F9F1A.4070201@scalableinformatics.com> Message-ID: Joe Landman wrote: > Hi folks: > > Investigating zfs on a Solaris 10 5/08 loaded JackRabbit for a customer. > zfs performance isn't that good relative to Linux on this same hardware > (literally a reboot between the two environments) [...] > This is for an IO intensive application with multiple threads doing > 1-100 GB streaming reads. Here's your problem. One of the problems with ZFS is it's performace for streaming workloads (reads or writes). It's designed for, and does much better at, large quantities of smallish unrelated I/O operations. I even vaugely remember something in the Solaris documentation recommending you use UFS if you're going to be streaming a lot. The only suggestion I have for improving ZFS performance in this case is to turn off the prefetching if you haven't already. Also, streaming the files on to the disk one at a time does help a bit. Also, regarding ZFS on FreeBSD, from what I've read I wouldn't recommend it on a production server just yet. While it has improved dramatically over the last year (I haven't seen any recent reports of filesystems being eaten, for example), it still has the tendency to crash or lock up machines under certain workloads. -- Michael Brown Add michael@ to emboss.co.nz ---+--- My inbox is always open From kspaans at student.math.uwaterloo.ca Mon Jul 7 06:27:06 2008 From: kspaans at student.math.uwaterloo.ca (Kyle Spaans) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] which version of the gpl In-Reply-To: References: Message-ID: <20080707132706.GA13114@student.math> On Mon, Jul 07, 2008 at 03:04:53PM +0200, Jon Aquilina wrote: > what is the difference between version 2 and 3? IIRC, the GPLv3 contains a bunch of modernization "upgrades" over the GPLv2. The only one I can think of off of the top of my head is the "Tivoisation" provision, meaning that hardware manufacturers can't lock-in a specific version of the software to the hardware (meaning you can get the source, but you can't modify and run it on the hardware). There are probably a bunch of provisions for DRM too. But I would imagine that for a desktop developer, there is little to stop you from opting for v3 over v2. If you are familiar with v2, a useful link: http://www.groklaw.net/articlebasic.php?story=20060118155841115 Else, gplv3.fsf.org and wikipedia.org are good starts. From csamuel at vpac.org Mon Jul 7 20:53:23 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: Message-ID: <1709195798.174881215489203175.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > do u happen to have gtalk by any chance. i downloaded the 64bit > version of kubuntu. 
i think the kernel is a 64bit kernel but i > suspect that the pre pkged debians arent or might not be? The Debian/Ubuntu way is to be a proper 64-bit distro on AMD64, so userland is 64-bit. This is why some people find it hard to run legacy 32-bit closed source apps there, such as Flash, etc. Ubuntu's support for these has improved greatly, though. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Mon Jul 7 20:55:08 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: <4871F026.2060802@aei.mpg.de> Message-ID: <649495057.174971215489308742.JavaMail.root@zimbra.vpac.org> ----- "Carsten Aulbert" wrote: > Well other possiblities are: > > look at the output of uname -a > > if you get i686 you are running a 32bit kernel if the output contains > x64_64 you are running a 64 bit kernel (sometimes uname -m tells you > the same). Not universal, for instance with both SLES and RHEL on PPC64 you get a 64-bit kernel and a 32-bit userland with 64-bit compatibility. That causes all sorts of fun issues.. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Mon Jul 7 21:19:35 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] which version of the gpl In-Reply-To: Message-ID: <1540083202.175811215490775680.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > what is the difference between version 2 and 3? GPLv3 is longer ? :-) There is an excellent side by side comparison of the two licenses at Groklaw here: http://www.groklaw.net/articlebasic.php?story=20060118155841115 #include IANAL, YMMV, consult a lawyer for real legal advice.. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From eagles051387 at gmail.com Mon Jul 7 22:49:50 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: <20080704093926.vzj0jtz1cg4owgow@www.sms.ed.ac.uk> References: <486A4754.900@rri.sari.ac.uk> <20080704093926.vzj0jtz1cg4owgow@www.sms.ed.ac.uk> Message-ID: cant that be used in conjunction wiht other packages taht will power down a node which fails? On Fri, Jul 4, 2008 at 10:39 AM, Kenneth Duncan Strouts < K.D.Strouts@sms.ed.ac.uk> wrote: > Hi Jon, > > > Quoting Tony Travis : >> Although Kerrighed looks very promising, it is also quite fragile in our >> hands. If one node crashes, you lose the entire cluster. That said, the >> Kerrighed project is extremely well supported and I believe it will be a >> good alternative in the near future. >> > > > We found that with Kerrighed, one node crashing sees the whole cluster go > down. The following is output to kern.log before the cluster dies. 
> > Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Resetting link > <1.1.2:eth1-1.1.3:eth1>, peer not responding > Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Lost link > <1.1.2:eth1-1.1.3:eth1> on network plane B > Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Lost contact with <1.1.3> > > From the Kerrighed mailing list (Louis Rilling); > > "Indeed, Kerrighed does not tolerate node failures yet. We have no precise > date > for this, and giving a date right now would be meaningless. The first step > for > us is to support dynamic cluster resizing (IOW live node additions and > removals), and we've just started working on it. We will work on node > failures in a second step." > > It seems they are working on this, and on a new framework for configurable > process scheduling. Probably Kerrighed will provide a good alternative in > future. > > Kenneth > > > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080708/0afc24d4/attachment.html From eagles051387 at gmail.com Mon Jul 7 22:51:25 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: <1709195798.174881215489203175.JavaMail.root@zimbra.vpac.org> References: <1709195798.174881215489203175.JavaMail.root@zimbra.vpac.org> Message-ID: in regards to flash i dont have any issues with it. On Tue, Jul 8, 2008 at 5:53 AM, Chris Samuel wrote: > > ----- "Jon Aquilina" wrote: > > > do u happen to have gtalk by any chance. i downloaded the 64bit > > version of kubuntu. i think the kernel is a 64bit kernel but i > > suspect that the pre pkged debians arent or might not be? > > The Debian/Ubuntu way is to be a proper 64-bit distro > on AMD64, so userland is 64-bit. > > This is why some people find it hard to run legacy 32-bit > closed source apps there, such as Flash, etc. > > Ubuntu's support for these has improved greatly, though. > > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080708/7d7aa5c4/attachment.html From eugen at leitl.org Tue Jul 8 02:01:18 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:07:25 2009 Subject: META: posting etiquette Re: [Beowulf] 32 bit or 64bit pkgs re /usr/local over NFS is okay, Joe In-Reply-To: References: <1709195798.174881215489203175.JavaMail.root@zimbra.vpac.org> Message-ID: <20080708090118.GU9875@leitl.org> On Tue, Jul 08, 2008 at 07:51:25AM +0200, Jon Aquilina wrote: > > in regards to flash i dont have any issues with it. This to the mailing list, since you're not answering to private mail. 1) You're overposting. 
2) Also, http://en.wikipedia.org/wiki/Posting_style (message unchanged below). > On Tue, Jul 8, 2008 at 5:53 AM, Chris Samuel <[1]csamuel@vpac.org> > wrote: > > ----- "Jon Aquilina" <[2]eagles051387@gmail.com> wrote: > > do u happen to have gtalk by any chance. i downloaded the 64bit > > version of kubuntu. i think the kernel is a 64bit kernel but i > > suspect that the pre pkged debians arent or might not be? > > The Debian/Ubuntu way is to be a proper 64-bit distro > on AMD64, so userland is 64-bit. > This is why some people find it hard to run legacy 32-bit > closed source apps there, such as Flash, etc. > Ubuntu's support for these has improved greatly, though. > cheers, > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > _______________________________________________ > Beowulf mailing list, [3]Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > [4]http://www.beowulf.org/mailman/listinfo/beowulf > > -- > Jonathan Aquilina > > References > > 1. mailto:csamuel@vpac.org > 2. mailto:eagles051387@gmail.com > 3. mailto:Beowulf@beowulf.org > 4. http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From prentice at ias.edu Tue Jul 8 07:09:18 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <20080702072709.GX11428@casco.aei.mpg.de> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> Message-ID: <4873750E.1020407@ias.edu> Steffen Grunewald wrote: > > Which isn't true. Don't you remember MCC Interim Linux, back in the old > days of 0.95[abc] kernels? It didn't consist of tens of floppies (yet), > but it *was* a distro. > Actually, no, I don't remember MCC Interim Linux. It was before my time. My experience with Linux started in December 1996 or January 1997 with Red Hat Linux 4. -- Prentice From smulcahy at aplpi.com Tue Jul 8 07:20:50 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <4873750E.1020407@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> Message-ID: <487377C2.2070503@aplpi.com> Prentice Bisbal wrote: > Actually, no, I don't remember MCC Interim Linux. It was before my time. > My experience with Linux started in December 1996 or January 1997 with > Red Hat Linux 4. I also have vague memories of SLS and Ygdrassil (sp?) - not sure if they came before or after MCC. -stephen -- Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com Registered in Ireland, no. 
289353 (5 Woodlands Avenue, Renmore, Galway) From hahn at mcmaster.ca Tue Jul 8 07:58:14 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <487377C2.2070503@aplpi.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> Message-ID: > I also have vague memories of SLS and Ygdrassil (sp?) - not sure if they came > before or after MCC. well after - gotta be around 93-94. (I wish I'd kept my MCC floppies, at least to be bronzed or something.) From deadline at eadline.org Tue Jul 8 08:30:06 2008 From: deadline at eadline.org (Douglas Eadline) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <487377C2.2070503@aplpi.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> Message-ID: <50761.69.139.186.42.1215531006.squirrel@mail.eadline.org> A blast from the past. I have a copy of the Yggdrasil "Linux Bible". A phone book of Linux How-To's and other docs from around 1995. Quite useful before Google became the help desk. -- Doug > > > Prentice Bisbal wrote: >> Actually, no, I don't remember MCC Interim Linux. It was before my time. >> My experience with Linux started in December 1996 or January 1997 with >> Red Hat Linux 4. > > I also have vague memories of SLS and Ygdrassil (sp?) - not sure if they > came before or after MCC. > > -stephen > > -- > Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, > GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com > Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Doug From Dan.Kidger at quadrics.com Tue Jul 8 08:41:29 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <4873750E.1020407@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A0488491946@quadbrsex1.quadrics.com> >-----Original Message----- >From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Prentice Bisbal >Sent: 08 July 2008 15:09 >Cc: beowulf@beowulf.org >Subject: Re: [Beowulf] A press release > >Steffen Grunewald wrote: >> >> Which isn't true. Don't you remember MCC Interim Linux, back in the old >> days of 0.95[abc] kernels? It didn't consist of tens of floppies (yet), >> but it *was* a distro. >> > >Actually, no, I don't remember MCC Interim Linux. It was before my time. > My experience with Linux started in December 1996 or January 1997 with >Red Hat Linux 4. a Linux newbie then? 
Daniel From landman at scalableinformatics.com Tue Jul 8 09:01:29 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A0488491946@quadbrsex1.quadrics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <0D49B15ACFDF2F46BF90B6E08C90048A0488491946@quadbrsex1.quadrics.com> Message-ID: <48738F59.1050102@scalableinformatics.com> Dan.Kidger@quadrics.com wrote: >> -----Original Message----- >> From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Prentice Bisbal >> Sent: 08 July 2008 15:09 >> Cc: beowulf@beowulf.org >> Subject: Re: [Beowulf] A press release >> >> Steffen Grunewald wrote: >>> Which isn't true. Don't you remember MCC Interim Linux, back in the old >>> days of 0.95[abc] kernels? It didn't consist of tens of floppies (yet), >>> but it *was* a distro. >>> >> Actually, no, I don't remember MCC Interim Linux. It was before my time. >> My experience with Linux started in December 1996 or January 1997 with >> Red Hat Linux 4. > > > a Linux newbie then? Shades of the Monty Python skit with the people sitting around, talking about what happened to them when they were young, and how unappreciative the youth is today ... "Right. We would wake up at 9:30, half an hour before we went to sleep ..." -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From prentice at ias.edu Tue Jul 8 10:20:20 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <0D49B15ACFDF2F46BF90B6E08C90048A0488491946@quadbrsex1.quadrics.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <0D49B15ACFDF2F46BF90B6E08C90048A0488491946@quadbrsex1.quadrics.com> Message-ID: <4873A1D4.9080001@ias.edu> Dan.Kidger@quadrics.com wrote: >> -----Original Message----- >> From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Prentice Bisbal >> Sent: 08 July 2008 15:09 >> Cc: beowulf@beowulf.org >> Subject: Re: [Beowulf] A press release >> >> Steffen Grunewald wrote: >>> Which isn't true. Don't you remember MCC Interim Linux, back in the old >>> days of 0.95[abc] kernels? It didn't consist of tens of floppies (yet), >>> but it *was* a distro. >>> >> Actually, no, I don't remember MCC Interim Linux. It was before my time. >> My experience with Linux started in December 1996 or January 1997 with >> Red Hat Linux 4. > > > a Linux newbie then? Yes, but I have youth on my side. 
-- Prentice From dnlombar at ichips.intel.com Tue Jul 8 10:58:44 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <487377C2.2070503@aplpi.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> Message-ID: <20080708175844.GA5087@nlxdcldnl2.cl.intel.com> On Tue, Jul 08, 2008 at 07:20:50AM -0700, stephen mulcahy wrote: > > > Prentice Bisbal wrote: > > Actually, no, I don't remember MCC Interim Linux. It was before my time. > > My experience with Linux started in December 1996 or January 1997 with > > Red Hat Linux 4. > > I also have vague memories of SLS and Ygdrassil (sp?) - not sure if they > came before or after MCC. All were after. MCC dates from '91; vintage '92 files are still downloadable from MCC. SLS (the parent of Slackware) is mid '92 Yggdrasil is late '92 Slackware and Debian are mid '93. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From linuxmercedes at gmail.com Mon Jul 7 15:26:17 2008 From: linuxmercedes at gmail.com (Linux Mercedes) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] Small Distributed Clusters In-Reply-To: <486E0522.2090600@rri.sari.ac.uk> References: <486E0522.2090600@rri.sari.ac.uk> Message-ID: On Fri, Jul 4, 2008 at 6:10 AM, Tony Travis wrote: > Ian Pascoe wrote: > >> Hi all, >> > > > > > Both will be connected to the Internet using ADSL and the limitation will >> be >> the upload speed of a maximum of 512Kbs. >> > > Another issue, apart from the 'A' (Assymetric speed) if you're ADSL is that > of setting up your routers to permit incoming connections on port 22, and > having static IP addresses. This is straight forward, but does need to be > done before your clusters can communicate. Or, if you don't want to pay for static ip addresses, use something like DynDNS.com to get subdomains for each of your internet connections and use the domain name -> ip address translation to keep track of dynamic ip's without having to keep reconfiguring the software. I do this with my web server so I can always type ssh linuxmercedes.homelinux.com into a terminal and log into it. Should work for any regular server. If you're wondering, DynDNS keeps track of your dynamic IP by having you run a little perl script called ddclient that automatically updates your address in their records when it changes. Nifty. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080707/fd986dbf/attachment.html From kstrouts at fastmail.fm Tue Jul 8 04:15:31 2008 From: kstrouts at fastmail.fm (Kenneth Strouts) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] open mosix alternative In-Reply-To: References: <486A4754.900@rri.sari.ac.uk> <20080704093926.vzj0jtz1cg4owgow@www.sms.ed.ac.uk> Message-ID: <1215515731.10297.1262391567@webmail.messagingengine.com> No since the cluster freezes when a node crashes, so there's no chance to start such a package. Kerrighed doesn't support addition and removal of nodes from the cluster yet, and there isn't much scope for dealing with node failure until it does. On Tue, 8 Jul 2008 07:49:50 +0200, "Jon Aquilina" said: > cant that be used in conjunction wiht other packages taht will power down > a > node which fails? 
> > On Fri, Jul 4, 2008 at 10:39 AM, Kenneth Duncan Strouts < > K.D.Strouts@sms.ed.ac.uk> wrote: > > > Hi Jon, > > > > > > Quoting Tony Travis : > >> Although Kerrighed looks very promising, it is also quite fragile in our > >> hands. If one node crashes, you lose the entire cluster. That said, the > >> Kerrighed project is extremely well supported and I believe it will be a > >> good alternative in the near future. > >> > > > > > > We found that with Kerrighed, one node crashing sees the whole cluster go > > down. The following is output to kern.log before the cluster dies. > > > > Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Resetting link > > <1.1.2:eth1-1.1.3:eth1>, peer not responding > > Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Lost link > > <1.1.2:eth1-1.1.3:eth1> on network plane B > > Jul 2 13:57:03 nodeC@kghed kernel: TIPC: Lost contact with <1.1.3> > > > > From the Kerrighed mailing list (Louis Rilling); > > > > "Indeed, Kerrighed does not tolerate node failures yet. We have no precise > > date > > for this, and giving a date right now would be meaningless. The first step > > for > > us is to support dynamic cluster resizing (IOW live node additions and > > removals), and we've just started working on it. We will work on node > > failures in a second step." > > > > It seems they are working on this, and on a new framework for configurable > > process scheduling. Probably Kerrighed will provide a good alternative in > > future. > > > > Kenneth > > > > > > > > -- > > The University of Edinburgh is a charitable body, registered in > > Scotland, with registration number SC005336. > > > > > > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > -- > Jonathan Aquilina -- Kenneth Strouts kstrouts@fastmail.fm -- http://www.fastmail.fm - The way an email service should be From steffen.grunewald at aei.mpg.de Tue Jul 8 08:21:32 2008 From: steffen.grunewald at aei.mpg.de (Steffen Grunewald) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> Message-ID: <20080708152132.GI11428@casco.aei.mpg.de> On Tue, Jul 08, 2008 at 10:58:14AM -0400, Mark Hahn wrote: > >I also have vague memories of SLS and Ygdrassil (sp?) - not sure if they > >came before or after MCC. > > well after - gotta be around 93-94. > (I wish I'd kept my MCC floppies, at least to be bronzed or something.) Want a copy of mine? SLS was the first multi-floppy (>10) distro I'm aware of. Yggdrasil was something on CDs (nothing I could afford at that time, also since there was no CD drive for the laptop I was using. USB wasn't invented yet. 
SCSI PCMCIA I couldn't afford) Steffen From pub at acnlab.csie.ncu.edu.tw Wed Jul 2 09:45:00 2008 From: pub at acnlab.csie.ncu.edu.tw (publication acnlab) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] [CFP] IEEE ICPADS Workshop P2PNVE'08 (Submission Deadline Extended to July 15, 2008) Message-ID: <486BB08C.4050506@acnlab.csie.ncu.edu.tw> [Paper Submission Deadline Extend to July 15, 2008] Call for Papers The 2nd International Workshop on Peer-to-Peer Networked Virtual Environments in conjunction with The 14th International Conference on Parallel and Distributed Systems (ICPADS 2008) December 8 -10, 2008 Melbourne, Victoria, AUSTRALIA http://acnlab.csie.ncu.edu.tw/P2PNVE2008/ PURPOSE AND SCOPE The rapid growth and popularity of networked virtual environments (NVEs) such as Massively Multiplayer Online Games (MMOGs) in recent years have spawned a series of research interests in constructing such large-scale virtual environments. For increasing scalability and decreasing the cost of management and deployment, more and more studies propose using peer-to-peer (P2P) architectures to construct large-scale NVEs for games, multimedia virtual worlds and other applications. The goal of such research may be to support an Earth-scale virtual environment or to make hosting virtual worlds more affordable than existing client-server approaches. However, existing solutions for consistency control, persistent data storage, multimedia data dissemination, and cheat-prevention may not be straightforwardly adapted to such new environments, novel ideas and designs thus are needed to realize the potential of P2P-based NVEs. The 1st International Workshop on Peer-to-Peer Networked Virtual Environments (P2P-NVE 2007) was held in Hsinchu, Taiwan, in 2007. To adhere to the theme of P2P-NVE 2007, the theme of P2P-NVE 2008 is to solicit original and previously unpublished new ideas on general P2P schemes and on the design and realization of P2P-based NVEs. The workshops aim to facilitate discussions and idea exchanges by both academics and practitioners. Student participations are also strongly encouraged. Topics of interest include, but are not limited to: *P2P systems and infrastructures *Applications of P2P systems *Performance evaluation of P2P systems *Trust and security issues in P2P systems *Network support for P2P systems *Fault tolerance in P2P systems *Efficient P2P resource lookup and sharing *Distributed Hash Tables (DHTs) and related issues *Constructions of P2P overlays for NVEs *Multicast for P2P NVEs *P2P NVE content distribution *3D streaming for P2P NVEs *Voice communication on P2P NVEs *Persistent storage for P2P NVEs *Security and cheat-prevention mechanisms for P2P games *Data structures and queries for P2P NVEs *Consistency control for P2P NVEs *Design considerations for P2P NVEs *Prototypes of P2P NVEs *P2P control for mobile NVEs *P2P NVE applications on mobile devices IMPORTANT DATES Submission: July 15, 2008 Notification: August 15, 2008 Camera ready: September 12, 2008 PAPER SUBMISSION Authors are invited to submit an electronic version of original, unpublished manuscripts, not to exceed 6 double-columned, single-spaced pages, to web site http://acnlab.csie.ncu.edu.tw/P2PNVE2008. Submitted papers should be in be in PDF format in accordance with IEEE Computer Society guidelines (ftp://pubftp.computer.org/press/outgoing/proceedings). All submitted papers will be refereed by reviewers in terms of originality, contribution, correctness, and presentation. 
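The OpenMPI hang Joe describes in the next message is the kind of problem that is often narrowed down by pinning the transport explicitly and pulling stack traces from every stuck rank, rather than strace'ing only the master; a rough sketch, using OpenMPI 1.2.x option syntax and a hypothetical binary name:

  # shared memory + TCP only, excluding the InfiniBand BTL
  mpirun -np 8 --mca btl self,sm,tcp ./mycode
  # or keep IB and exclude TCP, to see whether the hang follows the transport
  mpirun -np 8 --mca btl ^tcp ./mycode

  # once it hangs, grab a backtrace from every rank on the node
  for pid in $(pgrep mycode); do
      gdb -batch -p $pid -ex "thread apply all bt" > bt.$pid.txt
  done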
From landman at scalableinformatics.com Tue Jul 8 19:01:48 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] An annoying MPI problem Message-ID: <48741C0C.2050303@scalableinformatics.com> Hi folks, Dealing with an MPI problem that has me scratching my head. Quite beowulfish, as that's where this code runs. Short version: The code starts and runs. Reads in its data. Starts its iterations. And then somewhere after this, it hangs. But not always at the same place. It doesn't write state data back out to the disk, just logs. Rerunning it gets it to a different point, sometimes hanging sooner, sometimes later. Seems to be the case on multiple different machines, with different OSes. Working on comparing MPI distributions, and it hangs with IB as well as with shared memory and tcp sockets. Right now we are using OpenMPI 1.2.6, and this code does use allreduce. When it hangs, an strace of the master process shows lots of polling: c1-1:~ # strace -p 8548 Process 8548 attached - interrupt to quit rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 rt_sigaction(SIGCHLD, {0x2b061f65c9b2, [CHLD], SA_RESTORER|SA_RESTART, 0x2b062049b130}, NULL, 8) = 0 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN}], 6, 0) = 0 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 rt_sigaction(SIGCHLD, {0x2b061f65c9b2, [CHLD], SA_RESTORER|SA_RESTART, 0x2b062049b130}, NULL, 8) = 0 [spin forever] ... So it looks like the process is waiting for the appropriate posting on the internal scoreboard, and just hanging in a tight loop until this actually happens. But hangs caused by a logic error usually happen at the same place each time. This is what I have seen in the past from other MPI codes where you have enough sends and receives, but everyone posts their send before their receive ... ordering is important of course. But the odd thing about this code is that it worked fine 12 - 18 months ago, and we haven't touched it since (nor has it changed). What has changed is that we are now using OpenMPI 1.2.6. So the code hasn't changed, and the OS on which it runs hasn't changed, but the MPI stack has. Yeah, that's a clue. Turning off openib and tcp doesn't make a great deal of impact. This is also a clue. I am looking now at trying mvapich2 and seeing how that goes. Using Intel and gfortran compilers (Fortran/C mixed code). Anyone see strange things like this with their MPI stacks? OpenMPI? Mvapich2? I should try the Intel MPI as well (a rebuilt mvapich2, as I remember). I'll try all the usual things (reduce the optimization level, etc). Sage words of advice (and clue sticks) welcome. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From rgb at phy.duke.edu Tue Jul 8 20:25:58 2008 From: rgb at phy.duke.edu (Robert G.
Brown) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <487377C2.2070503@aplpi.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> Message-ID: On Tue, 8 Jul 2008, stephen mulcahy wrote: > > > Prentice Bisbal wrote: >> Actually, no, I don't remember MCC Interim Linux. It was before my time. >> My experience with Linux started in December 1996 or January 1997 with >> Red Hat Linux 4. > > I also have vague memories of SLS and Ygdrassil (sp?) - not sure if they came > before or after MCC. Wikipedia has at least one article on the history of linux and talks about the very first distros -- there were two or three that were close to simultaneous, although the VERY first ones were very thin indeed. SLS (later Slackware) was one of the first, but not quite THE first. It was probably the first "complete" distribution, or close to it. IIRC, SLS was fewer floppies that Slackware, but Slackware was arguably easier to install and worked better (as one would expect). And it had more, better packages. I'm not certain that W. information is totally accurate here, but it probably isn't bad. People don't usually write articles like this unless they are at least moderately authoritative. rgb > > -stephen > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From shaeffer at neuralscape.com Tue Jul 8 20:53:00 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> Message-ID: <20080709035300.GA31703@synapse.neuralscape.com> On Tue, Jul 08, 2008 at 11:25:58PM -0400, Robert G. Brown wrote: > > Wikipedia has at least one article on the history of linux and talks > about the very first distros -- there were two or three that were close > to simultaneous, although the VERY first ones were very thin indeed. > SLS (later Slackware) was one of the first, but not quite THE first. It > was probably the first "complete" distribution, or close to it. > > IIRC, SLS was fewer floppies that Slackware, but Slackware was arguably > easier to install and worked better (as one would expect). And it had > more, better packages. > > rgb Hi, OK, here is your linux history buff quiz. We all know Patrick V. was the technical spirit of slackware. Who was the original sales and marketing wizard for slackware? (smiles ;) Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From rgb at phy.duke.edu Tue Jul 8 21:21:50 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <20080709035300.GA31703@synapse.neuralscape.com> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> Message-ID: On Tue, 8 Jul 2008, Karen Shaeffer wrote: > On Tue, Jul 08, 2008 at 11:25:58PM -0400, Robert G. Brown wrote: >> >> Wikipedia has at least one article on the history of linux and talks >> about the very first distros -- there were two or three that were close >> to simultaneous, although the VERY first ones were very thin indeed. >> SLS (later Slackware) was one of the first, but not quite THE first. It >> was probably the first "complete" distribution, or close to it. >> >> IIRC, SLS was fewer floppies that Slackware, but Slackware was arguably >> easier to install and worked better (as one would expect). And it had >> more, better packages. >> >> rgb > > Hi, > > OK, here is your linux history buff quiz. We all know Patrick V. was > the technical spirit of slackware. Who was the original sales and > marketing wizard for slackware? (smiles ;) Hmmm, I'm guessing that would be "Bob", wouldn't it? I mean, a lot of famous slacker linuxites used it, e.g. Maddog, but if you are smiling, it has to be him. And he wasn't smilin' because of Enzyte! rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From prentice at ias.edu Wed Jul 9 06:53:07 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <50761.69.139.186.42.1215531006.squirrel@mail.eadline.org> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <50761.69.139.186.42.1215531006.squirrel@mail.eadline.org> Message-ID: <4874C2C3.2040007@ias.edu> Douglas Eadline wrote: > A blast from the past. I have a copy of the Yggdrasil "Linux Bible". > A phone book of Linux How-To's and other docs from around 1995. > Quite useful before Google became the help desk. > > -- > Doug > Translation: Doug is a pack rat. -- Prentice From prentice at ias.edu Wed Jul 9 06:58:21 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <20080709035300.GA31703@synapse.neuralscape.com> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> Message-ID: <4874C3FD.6010605@ias.edu> Karen Shaeffer wrote: > > Hi, > > OK, here is your linux history buff quiz. We all know Patrick V. was > the technical spirit of slackware. Who was the original sales and > marketing wizard for slackware? (smiles ;) > > Thanks, > Karen Quiz #2: Spell Patrick V.'s last name for Karen. 
-- Prentice From shaeffer at neuralscape.com Wed Jul 9 09:04:05 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <4874C3FD.6010605@ias.edu> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> Message-ID: <20080709160405.GA8974@synapse.neuralscape.com> On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: > Karen Shaeffer wrote: > > > > Hi, > > > > OK, here is your linux history buff quiz. We all know Patrick V. was > > the technical spirit of slackware. Who was the original sales and > > marketing wizard for slackware? (smiles ;) > > > > Thanks, > > Karen > > Quiz #2: Spell Patrick V.'s last name for Karen. > > -- > Prentice Hi Prentice, http://en.wikipedia.org/wiki/Patrick_Volkerding Now, if we just knew Bob's last name... Quiz #3: Where was the home base City and State for Slackware? Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From peter.st.john at gmail.com Wed Jul 9 09:14:50 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <20080709160405.GA8974@synapse.neuralscape.com> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: Re "Bob", I was guessing Dobbs, http://en.wikipedia.org/wiki/J._R._%22Bob%22_Dobbs, mentioned in that wiki article. Peter On 7/9/08, Karen Shaeffer wrote: > > On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: > > Karen Shaeffer wrote: > > > > > > Hi, > > > > > > OK, here is your linux history buff quiz. We all know Patrick V. was > > > the technical spirit of slackware. Who was the original sales and > > > marketing wizard for slackware? (smiles ;) > > > > > > Thanks, > > > Karen > > > > Quiz #2: Spell Patrick V.'s last name for Karen. > > > > -- > > Prentice > > > Hi Prentice, > > http://en.wikipedia.org/wiki/Patrick_Volkerding > > Now, if we just knew Bob's last name... > > Quiz #3: > > Where was the home base City and State for Slackware? > > > Karen > -- > Karen Shaeffer > Neuralscape, Palo Alto, Ca. 94306 > shaeffer@neuralscape.com http://www.neuralscape.com > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080709/10fded08/attachment.html From rgb at phy.duke.edu Wed Jul 9 09:59:36 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <20080709160405.GA8974@synapse.neuralscape.com> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: On Wed, 9 Jul 2008, Karen Shaeffer wrote: > Now, if we just knew Bob's last name... I'm sorry, this information is reserved only to the elitonic illuminachos of the seventh tier. However, for a small fee I can provide you with the secret instructions required to commune with YHWH-1 and be enlightmored. Now, where did I put my pipe...;-) rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From rgb at phy.duke.edu Wed Jul 9 10:01:20 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: On Wed, 9 Jul 2008, Peter St. John wrote: > Re "Bob", I was guessing Dobbs, > http://en.wikipedia.org/wiki/J._R._%22Bob%22_Dobbs, mentioned in that wiki > article. > Peter You can't just TELL people that. You don't want the subgenii showing up around your door -- those sons of yetis can be downright dangerous when secrets are revealed to anyone that hasn't paid the fee...:-) rgb > > On 7/9/08, Karen Shaeffer wrote: >> >> On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: >>> Karen Shaeffer wrote: >>>> >>>> Hi, >>>> >>>> OK, here is your linux history buff quiz. We all know Patrick V. was >>>> the technical spirit of slackware. Who was the original sales and >>>> marketing wizard for slackware? (smiles ;) >>>> >>>> Thanks, >>>> Karen >>> >>> Quiz #2: Spell Patrick V.'s last name for Karen. >>> >>> -- >>> Prentice >> >> >> Hi Prentice, >> >> http://en.wikipedia.org/wiki/Patrick_Volkerding >> >> Now, if we just knew Bob's last name... >> >> Quiz #3: >> >> Where was the home base City and State for Slackware? >> >> >> Karen >> -- >> Karen Shaeffer >> Neuralscape, Palo Alto, Ca. 94306 >> shaeffer@neuralscape.com http://www.neuralscape.com >> _______________________________________________ >> >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From shaeffer at neuralscape.com Wed Jul 9 10:22:08 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: References: <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: <20080709172208.GA9535@synapse.neuralscape.com> On Wed, Jul 09, 2008 at 12:59:36PM -0400, Robert G. Brown wrote: > On Wed, 9 Jul 2008, Karen Shaeffer wrote: > > >Now, if we just knew Bob's last name... > > I'm sorry, this information is reserved only to the elitonic illuminachos > of the seventh tier. However, for a small fee I can provide you with > the secret instructions required to commune with YHWH-1 and be > enlightmored. > > Now, where did I put my pipe...;-) > > rgb Hi rgb, Hahaha! I know Bob's last name. But someone else will have to divulge it. Patrick's last name was trivial. Hahahaha. BTW, Patrick is a really cool figure in Linux history. My source for who Bob is, is Patrick himself. (smiles ;) Patrick and I once had a long and rambling discussion about the early days of Slackware, and that is when Patrick discussed Bob to me... I won't divulge any of the content of that private discussion. (smiles ;) So my source is definitive. We still don't know the original home base city and state for Slackware: OK: It is Fargo, North Dakota. Patrick told me this himself once. But it is verified here: http://www.linuxjournal.com/article/2750 Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From shaeffer at neuralscape.com Wed Jul 9 10:26:49 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: References: <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: <20080709172649.GA9603@synapse.neuralscape.com> Hi Pete, Based on my discussions with Patrick, I don't believe this is correct. Hahaha, but it does deserve careful consideration. Karen On Wed, Jul 09, 2008 at 12:14:50PM -0400, Peter St. John wrote: > Re "Bob", I was guessing Dobbs, > http://en.wikipedia.org/wiki/J._R._%22Bob%22_Dobbs, mentioned in that wiki > article. > Peter > > On 7/9/08, Karen Shaeffer wrote: > > > > On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: > > > Karen Shaeffer wrote: > > > > > > > > Hi, > > > > > > > > OK, here is your linux history buff quiz. We all know Patrick V. was > > > > the technical spirit of slackware. Who was the original sales and > > > > marketing wizard for slackware? (smiles ;) > > > > > > > > Thanks, > > > > Karen > > > > > > Quiz #2: Spell Patrick V.'s last name for Karen. > > > > > > -- > > > Prentice > > > > > > Hi Prentice, > > > > http://en.wikipedia.org/wiki/Patrick_Volkerding > > > > Now, if we just knew Bob's last name... > > > > Quiz #3: > > > > Where was the home base City and State for Slackware? > > > > > > Karen > > -- > > Karen Shaeffer > > Neuralscape, Palo Alto, Ca. 
94306 > > shaeffer@neuralscape.com http://www.neuralscape.com > > _______________________________________________ > > > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ---end quoted text--- -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From James.P.Lux at jpl.nasa.gov Wed Jul 9 10:53:22 2008 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <20080709160405.GA8974@synapse.neuralscape.com> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: <6.2.5.6.2.20080709104852.02d1cc50@jpl.nasa.gov> At 09:04 AM 7/9/2008, Karen Shaeffer wrote: >On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: > > Karen Shaeffer wrote: > > > > > > Hi, > > > > > > OK, here is your linux history buff quiz. We all know Patrick V. was > > > the technical spirit of slackware. Who was the original sales and > > > marketing wizard for slackware? (smiles ;) > > > > > > Thanks, > > > Karen > > > > Quiz #2: Spell Patrick V.'s last name for Karen. Why bother when GIYF http://en.wikipedia.org/wiki/Patrick_Volkerding From prentice at ias.edu Wed Jul 9 11:03:09 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <6.2.5.6.2.20080709104852.02d1cc50@jpl.nasa.gov> References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> <6.2.5.6.2.20080709104852.02d1cc50@jpl.nasa.gov> Message-ID: <4874FD5D.4040802@ias.edu> Jim Lux wrote: > At 09:04 AM 7/9/2008, Karen Shaeffer wrote: >> On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: >> > Karen Shaeffer wrote: >> > > >> > > Hi, >> > > >> > > OK, here is your linux history buff quiz. We all know Patrick V. was >> > > the technical spirit of slackware. Who was the original sales and >> > > marketing wizard for slackware? (smiles ;) >> > > >> > > Thanks, >> > > Karen >> > >> > Quiz #2: Spell Patrick V.'s last name for Karen. > > Why bother when GIYF > http://en.wikipedia.org/wiki/Patrick_Volkerding > You're all cheaters! I know GIYF, and how to spell his name without "The Google." I noticed that Karen avoided spelling his last name, so I thought I'd have some fun with the implication that she used the "V." 'cause she couldn't spell it. I'm not angry, just... disappointed. 
;) -- Prentice From shaeffer at neuralscape.com Wed Jul 9 11:06:57 2008 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <4874FD5D.4040802@ias.edu> References: <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> <6.2.5.6.2.20080709104852.02d1cc50@jpl.nasa.gov> <4874FD5D.4040802@ias.edu> Message-ID: <20080709180657.GB9998@synapse.neuralscape.com> On Wed, Jul 09, 2008 at 02:03:09PM -0400, Prentice Bisbal wrote: > Jim Lux wrote: > > At 09:04 AM 7/9/2008, Karen Shaeffer wrote: > >> On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: > >> > Karen Shaeffer wrote: > >> > > > >> > > Hi, > >> > > > >> > > OK, here is your linux history buff quiz. We all know Patrick V. was > >> > > the technical spirit of slackware. Who was the original sales and > >> > > marketing wizard for slackware? (smiles ;) > >> > > > >> > > Thanks, > >> > > Karen > >> > > >> > Quiz #2: Spell Patrick V.'s last name for Karen. > > > > Why bother when GIYF > > http://en.wikipedia.org/wiki/Patrick_Volkerding > > > > You're all cheaters! I know GIYF, and how to spell his name without "The > Google." I noticed that Karen avoided spelling his last name, so I > thought I'd have some fun with the implication that she used the "V." > 'cause she couldn't spell it. > > I'm not angry, just... disappointed. ;) Hey Prentice, I had no problem spelling Volkerding. I know Patrick. And his name is very easy to spell. I didn't give his last name out of some notion of respecting his privacy. We still appear to be doing that for Bob... (giggles ;) Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From gerry.creager at tamu.edu Wed Jul 9 12:25:08 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: References: <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080709035300.GA31703@synapse.neuralscape.com> <4874C3FD.6010605@ias.edu> <20080709160405.GA8974@synapse.neuralscape.com> Message-ID: <48751094.30205@tamu.edu> Robert G. Brown wrote: > On Wed, 9 Jul 2008, Peter St. John wrote: > >> Re "Bob", I was guessing Dobbs, >> http://en.wikipedia.org/wiki/J._R._%22Bob%22_Dobbs, mentioned in that >> wiki >> article. >> Peter > > You can't just TELL people that. You don't want the subgenii showing up > around your door -- those sons of yetis can be downright dangerous when > secrets are revealed to anyone that hasn't paid the fee...:-) Isn't the plural of 'yeti', well, 'yeti'? >> On 7/9/08, Karen Shaeffer wrote: >>> >>> On Wed, Jul 09, 2008 at 09:58:21AM -0400, Prentice Bisbal wrote: >>>> Karen Shaeffer wrote: >>>>> >>>>> Hi, >>>>> >>>>> OK, here is your linux history buff quiz. We all know Patrick V. was >>>>> the technical spirit of slackware. Who was the original sales and >>>>> marketing wizard for slackware? (smiles ;) >>>>> >>>>> Thanks, >>>>> Karen >>>> >>>> Quiz #2: Spell Patrick V.'s last name for Karen. >>>> >>>> -- >>>> Prentice >>> >>> >>> Hi Prentice, >>> >>> http://en.wikipedia.org/wiki/Patrick_Volkerding >>> >>> Now, if we just knew Bob's last name... 
>>> >>> Quiz #3: >>> >>> Where was the home base City and State for Slackware? >>> >>> >>> Karen >>> -- >>> Karen Shaeffer >>> Neuralscape, Palo Alto, Ca. 94306 >>> shaeffer@neuralscape.com http://www.neuralscape.com >>> _______________________________________________ >>> >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> >> > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From matt at technoronin.com Wed Jul 9 13:17:10 2008 From: matt at technoronin.com (Matt Lawrence) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <4874C2C3.2040007@ias.edu> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <50761.69.139.186.42.1215531006.squirrel@mail.eadline.org> <4874C2C3.2040007@ias.edu> Message-ID: On Wed, 9 Jul 2008, Prentice Bisbal wrote: > Douglas Eadline wrote: >> A blast from the past. I have a copy of the Yggdrasil "Linux Bible". >> A phone book of Linux How-To's and other docs from around 1995. >> Quite useful before Google became the help desk. >> >> -- >> Doug >> > > Translation: Doug is a pack rat. No, he is merely satisfying his genetic tendency toward archivism. -- Matt It's not what I know that counts. It's what I can remember in time to use. From apittman at concurrent-thinking.com Wed Jul 9 14:25:27 2008 From: apittman at concurrent-thinking.com (Ashley Pittman) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <48741C0C.2050303@scalableinformatics.com> References: <48741C0C.2050303@scalableinformatics.com> Message-ID: <1215638727.10900.23.camel@bruce.priv.wark.uk.streamline-computing.com> On Tue, 2008-07-08 at 22:01 -0400, Joe Landman wrote: > Short version: The code starts and runs. Reads in its data. Starts > its iterations. And then somewhere after this, it hangs. But not > always at the same place. It doesn't write state data back out to the > disk, just logs. Rerunning it gets it to a different point, sometimes > hanging sooner, sometimes later. Seems to be the case on multiple > different machines, with different OSes. Working on comparing MPI > distributions, and it hangs with IB as well as with shared memory and > tcp sockets. Sounds like you've found a bug; it doesn't sound too difficult to find. Comments in-line. > Right now we are using OpenMPI 1.2.6, and this code does use > allreduce. When it hangs, an strace of the master process shows lots of > polling: Why do you mention allreduce? Does it tend to be in allreduce when it hangs? Is it happening at the same place but on a different iteration every time perhaps? This is quite important: you could either have "random" memory corruption, which can cause the program to stop anywhere and is often hard to find, or a race condition, which is easier to deal with; if there are any similarities in the stack traces then it tends to point to the latter.
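[Illustration, not taken from the code under discussion: a minimal MPI program showing the second case. One rank leaves the loop early while the others keep calling the collective, so N-1 ranks spin inside MPI_Allreduce and an strace of any of them shows the same endless poll() as above.]

/* allreduce_mismatch.c -- hypothetical example, not the customer's code.
 * Build: mpicc allreduce_mismatch.c -o allreduce_mismatch
 * Run:   mpirun -np 4 ./allreduce_mismatch    (it hangs by design)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, iter;
    double local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (iter = 0; iter < 10; iter++) {
        /* Bug: rank 0 bails out of the loop early, so from this point on
         * the other ranks wait in the collective for a call that never comes. */
        if (rank == 0 && iter == 5)
            break;

        local = (double)(rank + iter);
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }

    /* Rank 0 reaches this line; ranks 1..size-1 never do. */
    printf("rank %d of %d finished\n", rank, size);
    MPI_Finalize();
    return 0;
}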
allreduce is one of the collective functions with an implicit barrier, which means that *no* process can return from it until *all* processes have called it. If your program uses allreduce extensively it's entirely possible that one process has stopped for whatever reason and the rest have continued as far as they can until they too deadlock. Collectives often get accused of causing programs to hang when in reality N-1 processes are in the collective call and 1 is off somewhere else. > c1-1:~ # strace -p 8548 > [spin forever] Any chance of a stack trace, preferably a parallel one? I assume *all* processes in the job are in the R state? Do you have a mechanism available to allow you to see the message queues? > So it looks like the process is waiting for the appropriate posting on > the internal scoreboard, and just hanging in a tight loop until this > actually happens. > > But these hangs usually happen at the same place each time for a logic > error. Like in allreduce you mean? > But the odd thing about this code is that it worked fine 12 - 18 months > ago, and we haven't touched it since (nor has it changed). What has > changed is that we are now using OpenMPI 1.2.6. The other important thing to know here is what you have changed *from*. > So the code hasn't changed, and the OS on which it runs hasn't changed, > but the MPI stack has. Yeah, thats a clue. > Turning off openib and tcp doesn't make a great deal of impact. This is > also a clue. So it's likely algorithmic? You could turn off shared memory as well but it won't make a great deal of impact so there isn't any point. > I am looking now to trying mvapich2 and seeing how that goes. Using > Intel and gfortran compilers (Fortran/C mixed code). > > Anyone see strange things like this with their MPI stacks? All the time, it's not really strange, just what happens on large systems, especially when developing MPI or applications. > I'll try all the usual things (reduce the optimization level, etc). > Sage words of advice (and clue sticks) welcome. Is it the application which hangs or a combination of the application and the dataset you give it? What's the smallest process count and timescale you can reproduce this on? You could try valgrind, which works well with openmpi; it will help you with memory corruption but won't be of much help if you have a race condition. Going by reputation, Marmot might be of some use: it'll point out if you are doing anything silly with MPI calls. There is enough flexibility in the standard that you can do something completely illegal but have it work in 90% of cases; Marmot should pick up on these. http://www.hlrs.de/organization/amt/projects/marmot/ We could take this off-line if you prefer, this could potentially get quite involved... Ashley Pittman. From coutinho at dcc.ufmg.br Wed Jul 9 15:58:35 2008 From: coutinho at dcc.ufmg.br (Bruno Coutinho) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <1215638727.10900.23.camel@bruce.priv.wark.uk.streamline-computing.com> References: <48741C0C.2050303@scalableinformatics.com> <1215638727.10900.23.camel@bruce.priv.wark.uk.streamline-computing.com> Message-ID: Try disabling shared memory only. Open MPI's shared memory buffer is limited and it enters deadlock if you overflow it. As Open MPI uses busy waiting, it appears as a livelock. 2008/7/9 Ashley Pittman : > On Tue, 2008-07-08 at 22:01 -0400, Joe Landman wrote: > > Short version: The code starts and runs. Reads in its data. Starts > > its iterations.
And then somewhere after this, it hangs. But not > > always at the same place. It doesn't write state data back out to the > > disk, just logs. Rerunning it gets it to a different point, sometimes > > hanging sooner, sometimes later. Seems to be the case on multiple > > different machines, with different OSes. Working on comparing MPI > > distributions, and it hangs with IB as well as with shared memory and > > tcp sockets. > > Sounds like you've found a bug, doesn't sound too difficult to find, > comments in-line. > > > Right now we are using OpenMPI 1.2.6, and this code does use > > allreduce. When it hangs, an strace of the master process shows lots of > > polling: > > Why do you mention allreduce, does it tend to be in allreduce when it > hangs? Is it happening at the same place but on a different iteration > every time perhaps? This is quite important, you could either have a > "random" memory corruption which can cause the program to stop anywhere > and are often hard to find or a race condition which is easier to deal > with, if there are any similarities in the stack then it tends to point > to the latter. > > allreduce is one of the collective functions with an implicit barrier > which means that *no* process can return from it until *all* processes > have called it, if you program uses allreduce extensively it's entirely > possible that one process has stopped for whatever reason and have the > rest continued as far as they can until they too deadlock. Collectives > often get accused of causing programs to hang when in reality N-1 > processes are in the collective call and 1 is off somewhere else. > > > c1-1:~ # strace -p 8548 > > > [spin forever] > > Any chance of a stack trace, preferably a parallel one? I assume *all* > processes in the job are in the R state? Do you have a mechanism > available to allow you to see the message queues? > > > So it looks like the process is waiting for the appropriate posting on > > the internal scoreboard, and just hanging in a tight loop until this > > actually happens. > > > > But these hangs usually happen at the same place each time for a logic > > error. > > Like in allreduce you mean? > > > But the odd thing about this code is that it worked fine 12 - 18 months > > ago, and we haven't touched it since (nor has it changed). What has > > changed is that we are now using OpenMPI 1.2.6. > > The other important thing to know here is what you have changed *from*. > > > So the code hasn't changed, and the OS on which it runs hasn't changed, > > but the MPI stack has. Yeah, thats a clue. > > > Turning off openib and tcp doesn't make a great deal of impact. This is > > also a clue. > > So it's likely algorithmic? You could turn off shared memory as well > but it won't make a great deal of impact so there isn't any point. > > > I am looking now to trying mvapich2 and seeing how that goes. Using > > Intel and gfortran compilers (Fortran/C mixed code). > > > > Anyone see strange things like this with their MPI stacks? > > All the time, it's not really strange, just what happens on large > systems, expecially when developing MPI or applications. > > > I'll try all the usual things (reduce the optimization level, etc). > > Sage words of advice (and clue sticks) welcome. > > Is it the application which hangs or a combination of the application > and the dataset you give it? What's the smallest process count and > timescale you can reproduce this on? 
> > You could try valgrind which works well with openmpi, it will help you > with memory corruption but won't help be of much help if you have a race > condition. Going by reputation Marmot might be of some use, it'll point > out if you are doing anything silly with MPI calls, there is enough > flexibility in the standard that you can do something completely illegal > but have it work in 90% of cases, marmot should pick up on these. > http://www.hlrs.de/organization/amt/projects/marmot/ > > We could take this off-line if you prefer, this could potentially get > quite involved... > > Ashley Pittman. > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080709/ba4fdfaa/attachment.html From csamuel at vpac.org Wed Jul 9 18:07:17 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <2136251104.192101215651942090.JavaMail.root@zimbra.vpac.org> Message-ID: <1374084408.192231215652037798.JavaMail.root@zimbra.vpac.org> ----- "Bruno Coutinho" wrote: > Try disabling shared memory only. > Open MPI shared memory buffer is limited and it enters deadlock if > you overflow it. > As Open MPI uses busy wait, it appears as a livelock. Interesting - is it possible to disable that at runtime with one of its billions of configuration options ? cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From landman at scalableinformatics.com Wed Jul 9 18:22:27 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <1374084408.192231215652037798.JavaMail.root@zimbra.vpac.org> References: <1374084408.192231215652037798.JavaMail.root@zimbra.vpac.org> Message-ID: <48756453.704@scalableinformatics.com> Chris Samuel wrote: > ----- "Bruno Coutinho" wrote: > >> Try disabling shared memory only. >> Open MPI shared memory buffer is limited and it enters deadlock if >> you overflow it. >> As Open MPI uses busy wait, it appears as a livelock. > > Interesting - is it possible to disable that at runtime > with one of its billions of configuration options ? -mca btl ^sm > > cheers, > Chris -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From csamuel at vpac.org Wed Jul 9 18:26:50 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <709078722.192571215653129034.JavaMail.root@zimbra.vpac.org> Message-ID: <1683391331.192621215653210247.JavaMail.root@zimbra.vpac.org> ----- "Robert G. Brown" wrote: > I'm sorry, this information is reserved only to the elitonic > illuminachos of the seventh tier. I can see the Fnords! TINC. At this rate of nostalgia Kibo will turn up soon.. 
ObLinux: I can remember in 1993 approaching my then boss at the University of Wales, Aberystwyth, with a plan cooked up by myself and Piercarlo Grandi (mainly him, truth be told) to replace their DEC 5830 Ultrix boxes with a big Linux PC. We were laughed at. :-) Little did any of us suspect then.. -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Wed Jul 9 18:44:26 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <195128089.192841215653770150.JavaMail.root@zimbra.vpac.org> Message-ID: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> ----- "Joe Landman" wrote: > Shades of the Monty Python skit with the people sitting around, > talking about what happened to them when they were young, and how > unappreciative the youth is today ... http://en.wikipedia.org/wiki/Four_Yorkshiremen_sketch I remember when I could bring the complete kernel sources home on a single 3.5" floppy disk. Compressed, mind you, not this new fangled gzip! Then came 0.9.15, 1.8MB compressed.. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Wed Jul 9 18:47:41 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <48756453.704@scalableinformatics.com> Message-ID: <1946233117.193061215654461247.JavaMail.root@zimbra.vpac.org> ----- "Joe Landman" wrote: > Chris Samuel wrote: > > > ----- "Bruno Coutinho" wrote: > > > >> Try disabling shared memory only. > > > > Interesting - is it possible to disable that at runtime > > with one of its billions of configuration options ? > > -mca btl ^sm Brilliant, thanks Joe! -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From landman at scalableinformatics.com Wed Jul 9 20:58:32 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] Update on mpi problem Message-ID: <487588E8.5030507@scalableinformatics.com> Ok ... thought this would be interesting for some folks. As a reminder, using Open-MPI 1.2.6 for a customer code, seeing different behavior than in the past. Scratching my head over it (seemingly non-deterministic). I tried using '--mca btl ^sm' (turn off shared memory usage) on the non-infiniband machine, and ... it runs. Repeatedly. To completion. Ok, over to the Infiniband machine. I tried using '--mca btl ^sm'. No dice (the tcp and openib are still available). Next I tried turning off the tcp (ethernet) --mca btl ^sm,tcp Nope. Still doesn't work right. Hmmm.... One left. Turn off openib (infiniband). --mca btl ^sm,openib Yup. It works. Repeatedly. To completion. It looks like this is an MPI stack issue of some sort. I'll ping the Open-MPI list and see what they think. Thanks to all the suggestions and comments. FWIW, I also pulled down the DDT tool from Allinea, with the thought of testing it, and seeing if I could figure out where the problem was with the code. 
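[Illustration, not Joe's code: one way a program can be "correct enough" to finish over one btl and hang over another is the send-before-receive ordering mentioned earlier in the thread. Both ranks post a blocking send first and rely on the library buffering the message; below the transport's eager limit this usually completes, above it both sends wait for a matching receive that is never posted. The safe alternative is to let MPI pair the two operations, for example with MPI_Sendrecv.]

/* unsafe_exchange.c -- hypothetical two-rank exchange, not the code from
 * this thread.  Build: mpicc unsafe_exchange.c -o unsafe_exchange
 * Run with exactly two ranks: mpirun -np 2 ./unsafe_exchange
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define COUNT (1 << 20)   /* large enough to exceed typical eager limits */

int main(int argc, char **argv)
{
    int rank, other, i;
    double *sendbuf, *recvbuf;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;

    sendbuf = malloc(COUNT * sizeof(double));
    recvbuf = malloc(COUNT * sizeof(double));
    for (i = 0; i < COUNT; i++)
        sendbuf[i] = (double)rank;

    /* Unsafe ordering: both ranks send first and only then receive.
     * Whether this completes depends on how much the MPI library is
     * willing to buffer, which varies by transport and message size.
     *
     * MPI_Send(sendbuf, COUNT, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);
     * MPI_Recv(recvbuf, COUNT, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, &status);
     */

    /* Safe version: MPI matches the send and the receive internally. */
    MPI_Sendrecv(sendbuf, COUNT, MPI_DOUBLE, other, 0,
                 recvbuf, COUNT, MPI_DOUBLE, other, 0,
                 MPI_COMM_WORLD, &status);

    printf("rank %d exchanged %d doubles with rank %d\n", rank, COUNT, other);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}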
Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From ajt at rri.sari.ac.uk Thu Jul 10 01:44:42 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> Message-ID: <4875CBFA.6090109@rri.sari.ac.uk> Chris Samuel wrote: > [...] > http://en.wikipedia.org/wiki/Four_Yorkshiremen_sketch > > I remember when I could bring the complete kernel sources > home on a single 3.5" floppy disk. Compressed, mind you, > not this new fangled gzip! Then came 0.9.15, 1.8MB compressed.. Hello, Chris. 3.5" floppy - you were lucky! I 'ad t' run th'entire Minix on half a broken low-density 5.25" floppy, and father would beat me wi a stick for wastin' precious disk space... ...and he'd complain that there's nowt wrong wi' th'old 8" floppies! Eee, by gum them were probably th'happiest days of mi life... [I'm from Lancashire, not Yorkshire, BTW] Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From andrew at moonet.co.uk Thu Jul 10 02:35:47 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <4875CBFA.6090109@rri.sari.ac.uk> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> Message-ID: bloody woollybacks. Yer durnt nur nowt bout t' computers or beatin's ower there. mah dad used to etch the bits inter me skin with a bit 'o' ot' coal then beat me with 'is Commadore +4 Andrew Holway Yorkshireman On Thu, Jul 10, 2008 at 9:44 AM, Tony Travis wrote: > Chris Samuel wrote: >> >> [...] >> http://en.wikipedia.org/wiki/Four_Yorkshiremen_sketch >> >> I remember when I could bring the complete kernel sources >> home on a single 3.5" floppy disk. Compressed, mind you, >> not this new fangled gzip! Then came 0.9.15, 1.8MB compressed.. > > Hello, Chris. > > 3.5" floppy - you were lucky! > > I 'ad t' run th'entire Minix on half a broken low-density 5.25" floppy, and > father would beat me wi a stick for wastin' precious disk space... > > ...and he'd complain that there's nowt wrong wi' th'old 8" floppies! > > Eee, by gum them were probably th'happiest days of mi life... > > [I'm from Lancashire, not Yorkshire, BTW] > > Tony. > -- > Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk > Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt > Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 > Aberdeen AB21 9SB, Scotland, UK. 
| fax:+44 (0)1224 716687 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From apittman at concurrent-thinking.com Thu Jul 10 04:44:26 2008 From: apittman at concurrent-thinking.com (Ashley Pittman) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] Update on mpi problem In-Reply-To: <487588E8.5030507@scalableinformatics.com> References: <487588E8.5030507@scalableinformatics.com> Message-ID: <1215690266.28335.44.camel@bruce.priv.wark.uk.streamline-computing.com> On Wed, 2008-07-09 at 23:58 -0400, Joe Landman wrote: > Ok ... thought this would be interesting for some folks. As a reminder, > using Open-MPI 1.2.6 for a customer code, seeing different behavior than > in the past. Scratching my head over it (seemingly non-deterministic). > > I tried using '--mca btl ^sm' (turn off shared memory usage) on the > non-infiniband machine, and ... it runs. Repeatedly. To completion. See, I told you that would be a worthwhile test. > It looks like this is an MPI stack issue of some sort. I'll ping the > Open-MPI list and see what they think. That doesn't necessarily follow: if you are posting your sends before your receives then you are relying on unexpected message buffering within the MPI library. How much of this is available is up to the library, not the standard, so I think it's possible that openmpi is being MPI compliant in both cases. Ashley Pittman. From rgb at phy.duke.edu Thu Jul 10 04:54:55 2008 From: rgb at phy.duke.edu (Robert G.
>>> http://en.wikipedia.org/wiki/Four_Yorkshiremen_sketch >>> >>> I remember when I could bring the complete kernel sources >>> home on a single 3.5" floppy disk. Compressed, mind you, >>> not this new fangled gzip! Then came 0.9.15, 1.8MB compressed.. >> >> Hello, Chris. >> >> 3.5" floppy - you were lucky! >> >> I 'ad t' run th'entire Minix on half a broken low-density 5.25" >> floppy, and father would beat me wi a stick for wastin' precious disk >> space... >> >> ...and he'd complain that there's nowt wrong wi' th'old 8" floppies! >> >> Eee, by gum them were probably th'happiest days of mi life... >> >> [I'm from Lancashire, not Yorkshire, BTW] >> >> Tony. > > OK, in a minute you'll have me reminiscing about booting from paper tape > and havin' to drill me own 'oles wit' me teeth... > > ;-) Glad you smiled with that. Seems I find myself teaching an OS class all of a sudden this summer. I recounted my experience with an old HP system where the boot-loader was toggled in on front panel switches and the OS was loaded with paper tape. The kids (whose language experience ranges from Java to PHP, with C# thrown in for good measure) looked at me like I was reciting either fiction or lore. -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From rgb at phy.duke.edu Thu Jul 10 06:02:25 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <487604C8.8020602@tamu.edu> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> Message-ID: On Thu, 10 Jul 2008, Gerry Creager wrote: >> OK, in a minute you'll have me reminiscing about booting from paper tape >> and havin' to drill me own 'oles wit' me teeth... >> >> ;-) > > Glad you smiled with that. Seems I find myself teaching an OS class all of a > sudden this summer. I recounted my experience with an old HP system where > the boot-loader was toggled in on front panel switches and the OS was loaded > with paper tape. The kids (whose language experience ranges from Java to > PHP, with C# thrown in for good measure) looked at me like I was reciting > either fiction or lore. Aye, laddy, 'twas a PDP 1 with sense switches on the front that one did indeed toggle in the paper tape boot program (to get it to where it could read from other media), a green oscilloscope-style CRT for a monitor (complete with crosshatches IIRC), a real teletype console, and the ability to run simple fortran programs that did beam optic computations and displayed a simple visualization of same. The PDP itself was quite large, and I was told that it "had a few bad bits..." but that they usually didn't affect the outcome of a computation. I was kidding about the teeth, though. Next (if things go as they usually do) we all have to kid around about IBM 5100's and QIC (floppies are just plain too modern for me:-) but if I do that the Titorheads will attack and I'll have to once again unconvincingly deny that I am, in fact, John Titor. rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From gdjacobs at gmail.com Thu Jul 10 06:19:08 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] A press release In-Reply-To: <487604C8.8020602@tamu.edu> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> Message-ID: <48760C4C.70906@gmail.com> Gerry Creager wrote: > Robert G. Brown wrote: >> On Thu, 10 Jul 2008, Tony Travis wrote: >> >>> Chris Samuel wrote: >>>> [...] >>>> http://en.wikipedia.org/wiki/Four_Yorkshiremen_sketch >>>> >>>> I remember when I could bring the complete kernel sources >>>> home on a single 3.5" floppy disk. Compressed, mind you, >>>> not this new fangled gzip! Then came 0.9.15, 1.8MB compressed.. >>> >>> Hello, Chris. >>> >>> 3.5" floppy - you were lucky! >>> >>> I 'ad t' run th'entire Minix on half a broken low-density 5.25" >>> floppy, and father would beat me wi a stick for wastin' precious disk >>> space... >>> >>> ...and he'd complain that there's nowt wrong wi' th'old 8" floppies! >>> >>> Eee, by gum them were probably th'happiest days of mi life... >>> >>> [I'm from Lancashire, not Yorkshire, BTW] >>> >>> Tony. >> >> OK, in a minute you'll have me reminiscing about booting from paper tape >> and havin' to drill me own 'oles wit' me teeth... >> >> ;-) > > Glad you smiled with that. Seems I find myself teaching an OS class all > of a sudden this summer. I recounted my experience with an old HP > system where the boot-loader was toggled in on front panel switches and > the OS was loaded with paper tape. The kids (whose language experience > ranges from Java to PHP, with C# thrown in for good measure) looked at > me like I was reciting either fiction or lore. I seem to recall that Seymour Cray toggled in the operating system code on the CDC 7600 by hand, from memory. Perhaps that's just an urban legend. From mark.kosmowski at gmail.com Thu Jul 10 06:29:51 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:25 2009 Subject: [Beowulf] Re: energy costs and poor grad students Message-ID: A couple weeks ago I complained about energy costs with respect to my personal cluster used for graduate work. I received a great deal of excellent advice as well as some offers of compute time when I'm ready for production runs. Thank you everyone! My solution so far has been to consolidate my DIMMs onto one workstation - I have 14 Gb on it. During this process I learned which of my 2 Gb DIMMs was bad. I'm also in the process of upgrading my entertainment machine to a 64 bit dual-core Athlon Linux box to be used as a part-time compute node as needed. Also, I'm on the lookout for a couple cheap 2 Gb ECC registered DDR 400 DIMMs to bring the workstation to a full 16 Gb at some point. I may also keep an eye out for multi-core Opterons. I have also decided to upgrade my software to try an eek a little speed out of things. I've done a clean install of OpenSUSE 11.0 using KDE 3.5 (I need the GUI for the workstation) and will be installing the latest versions of OpenMPI, CPMD, compilers and math libraries. Some people on the CPMD list (my primary code at this point - plane wave quantum chemistry) suggest fftw as part of the math library solution. I noticed that only fftw 2.1.5 supports MPI, while the latest version of the 3.x series does not. 
Eventually I will be running large jobs and may need to go back to a cluster, so I'm interested in keeping my code MPI-ready and running two processors that way. I will likely use the ACML (AMD math library) for the functionality not provided by fftw. I am uncertain whether I will use ifort or gfortran at the moment. I'd be willing to look at the Sun suite. Other then hopefully a PhD at some point, I am receiving no compensation for my research, so ifort is a free option. Is fftw 2.1.5 and the latest acml a reasonable combination for speed / efficiency or is there a different combination of math libraries that stands out for speed? Is the choice of math library yet another instance of the actual application makes a difference on which on is fastest? Also, is there a recent compiler benchmark somewhere? The one at Polyhedron seems a little dated - the ifort cited is known to have issues with the code I use and the Sun compiler is given as 8.x when 12.x is available now. If I break down and decide to run my own benchmarks on actual code are there any restrictions on the free versions of ifort and Sun to share the results? Thanks, Mark E. Kosmowski From xclski at yahoo.com Thu Jul 10 06:39:47 2008 From: xclski at yahoo.com (Ellis Wilson) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release Message-ID: <965427.1434.qm@web37902.mail.mud.yahoo.com> Gerry Creager wrote: > The kids (whose language experience > ranges from Java to PHP, with C# thrown in for good measure) looked at > me like I was reciting either fiction or lore. Hey! Lets remember not all of us (such as those of us who are turning 21 today) were around to help change out vacuum tubes when they burned out and remove moths remains from the shorts between them. Either way, while I do know the three languages listed above (plus C, MPI, Fortran, and other more good ol' god-fearing languages) I have to complain about the way we youngin's are taught today. For instance, I put up with having to learn those better languages out of classes on my own (because all they taught in class were Java and C#) my whole collegiate career and when the capstone course comes they continued to screw me. I thought it would be a class where you could choose the topic of your research (I wanted to develop my own package management software to understand the process a little better) and the language of implementation - NOPE. They (LaSalle Professor's) forced me to write an IT oriented business-type C# and MS SQL Server program. What is the world coming to? We might as well call Computer Science Business Science with a Little Computing. Hopefully PHD-land is better. Ellis From landman at scalableinformatics.com Thu Jul 10 07:00:24 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: References: Message-ID: <487615F8.4070001@scalableinformatics.com> Mark Kosmowski wrote: > > I have also decided to upgrade my software to try an eek a little > speed out of things. I've done a clean install of OpenSUSE 11.0 using > KDE 3.5 (I need the GUI for the workstation) and will be installing > the latest versions of OpenMPI, CPMD, compilers and math libraries. Great. > > Some people on the CPMD list (my primary code at this point - plane > wave quantum chemistry) suggest fftw as part of the math library > solution. I noticed that only fftw 2.1.5 supports MPI, while the > latest version of the 3.x series does not. 
Eventually I will be > running large jobs and may need to go back to a cluster, so I'm > interested in keeping my code MPI-ready and running two processors > that way. I will likely use the ACML (AMD math library) for the > functionality not provided by fftw. I am uncertain whether I will use > ifort or gfortran at the moment. I'd be willing to look at the Sun > suite. Other then hopefully a PhD at some point, I am receiving no Hmmm... My own tests with the Sun compiler suite about a year to year and a half ago suggest it doesn't generate as good code (e.g. fast code) as the gnu compilers. This was true on Solaris 10 and Linux. Baseline RHEL 4 with gcc generated faster code for most of the tests I ran. YMMV, but I wouldn't advise going the Sun compiler route unless it generates demonstrably better code (and it didn't for me). > compensation for my research, so ifort is a free option. > > Is fftw 2.1.5 and the latest acml a reasonable combination for speed / > efficiency or is there a different combination of math libraries that > stands out for speed? Is the choice of math library yet another > instance of the actual application makes a difference on which on is > fastest? You may be able to get the Intel MKL on a similar license as the compiler, I am not sure. Check it out. > > Also, is there a recent compiler benchmark somewhere? The one at > Polyhedron seems a little dated - the ifort cited is known to have > issues with the code I use and the Sun compiler is given as 8.x when > 12.x is available now. If I break down and decide to run my own > benchmarks on actual code are there any restrictions on the free > versions of ifort and Sun to share the results? I did tests with gcc, ifort, pgi, and the sun compilers on a particular code (HMMer) about 2 years ago. pgi was the best, followed closely by ifort and gcc. Sun trailed badly. This was with baseline and maximal optimization. HMMer is very different than MD though, so YMMV -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From dnlombar at ichips.intel.com Thu Jul 10 07:53:09 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <48741C0C.2050303@scalableinformatics.com> References: <48741C0C.2050303@scalableinformatics.com> Message-ID: <20080710145309.GA10896@nlxdcldnl2.cl.intel.com> On Tue, Jul 08, 2008 at 07:01:48PM -0700, Joe Landman wrote: > Hi folks > > Dealing with an MPI problem that has me scratching my head. Quite > beowulfish, as thats where this code runs. > > Short version: The code starts and runs. Reads in its data. Starts > its iterations. And then somewhere after this, it hangs. But not > always at the same place. It doesn't write state data back out to the > disk, just logs. Rerunning it gets it to a different point, sometimes > hanging sooner, sometimes later. Seems to be the case on multiple > different machines, with different OSes. Working on comparing MPI > distributions, and it hangs with IB as well as with shared memory and > tcp sockets. ... > I'll try all the usual things (reduce the optimization level, etc). > Sage words of advice (and clue sticks) welcome. Not trying to sound like an ad... The currently shipping Intel Trace Collector and Analyzer (7.1), includes message correctness checking. 
An option is available that adds a library to an Intel MPI build that checks messages during the run. You can then view any errors it found in the Intel Trace Analyzer. This may find there's a problem that has only just started to trip the code up. I certainly have welts from those; I suspect others do too. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From landman at scalableinformatics.com Thu Jul 10 08:02:12 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <20080710145309.GA10896@nlxdcldnl2.cl.intel.com> References: <48741C0C.2050303@scalableinformatics.com> <20080710145309.GA10896@nlxdcldnl2.cl.intel.com> Message-ID: <48762474.4020908@scalableinformatics.com> Lombard, David N wrote: >> I'll try all the usual things (reduce the optimization level, etc). >> Sage words of advice (and clue sticks) welcome. > > Not trying to sound like an ad... > > The currently shipping Intel Trace Collector and Analyzer (7.1), includes > message correctness checking. An option is available that adds a > library to an Intel MPI build that checks messages during the run. > You can then view any errors it found in the Intel Trace Analyzer. > > This may find there's a problem that has only just started to trip the > code up. I certainly have welts from those; I suspect others do too. Actually, Intel MPI and related tools are in general one of the things we want to try. User may be open to that (especially if it is more pain free than the alternative). We have reliable functional non-sm/non-ib based execution on multiple machines now. New code drop coming, so we have to wait on that. Once we have that, we'll be doing more testing. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From lbickley at bickleywest.com Thu Jul 10 08:07:30 2008 From: lbickley at bickleywest.com (Lyle Bickley) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <487604C8.8020602@tamu.edu> Message-ID: <200807100807.31260.lbickley@bickleywest.com> On Thursday 10 July 2008, Robert G. Brown wrote: --snip-- > Aye, laddy, 'twas a PDP 1 with sense switches on the front that one did > indeed toggle in the paper tape boot program (to get it to where it > could read from other media), a green oscilloscope-style CRT for a > monitor (complete with crosshatches IIRC), a real teletype console, and > the ability to run simple fortran programs that did beam optic > computations and displayed a simple visualization of same. The PDP > itself was quite large, and I was told that it "had a few bad bits..." > but that they usually didn't affect the outcome of a computation. I'm on the PDP-1 and IBM 1620 Restoration Teams at the Computer History Museum (see: http://www.computerhistory.org/pdp-1/). So if the PDP-1 was your first computer (and one NEVER forgets their first computer) you can see it in action (running Spacewar!, Minskytron, Music, etc.) at the CHM! In fact, if your visiting Silicon Valley, I'll give you a private tour! > I was kidding about the teeth, though. 
;-) > Next (if things go as they usually do) we all have to kid around about > IBM 5100's and QIC (floppies are just plain too modern for me:-) but if > I do that the Titorheads will attack and I'll have to once again > unconvincingly deny that I am, in fact, John Titor. Those of us who volunteer at the Computer History Museum feel we are "time traveler's" for sure... Funny combination - setting up Beowulf clusters and restoring vintage computers. There must be a lesson here :-) Cheers, Lyle > > rgb -- Lyle Bickley Bickley Consulting West Inc. Mountain View, CA 94040 http://bickleywest.com "Black holes are where God is dividing by zero" From tjrc at sanger.ac.uk Thu Jul 10 08:54:10 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> Message-ID: <48174B0A-2152-4C15-956E-C3FDBCB30E72@sanger.ac.uk> On 10 Jul 2008, at 2:02 pm, Robert G. Brown wrote: > Next (if things go as they usually do) we all have to kid around about > IBM 5100's and QIC (floppies are just plain too modern for me:-) but > if > I do that the Titorheads will attack and I'll have to once again > unconvincingly deny that I am, in fact, John Titor. In 1998 I was still partially responsible for a machine which booted from tape. The machine in question was the data collection device for an NMR spectrometer. The machine in question is probably still going strong now - NMR machines being somewhat expensive to replace. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From rgb at phy.duke.edu Thu Jul 10 09:58:20 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <200807100807.31260.lbickley@bickleywest.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <487604C8.8020602@tamu.edu> <200807100807.31260.lbickley@bickleywest.com> Message-ID: On Thu, 10 Jul 2008, Lyle Bickley wrote: > On Thursday 10 July 2008, Robert G. Brown wrote: > --snip-- >> Aye, laddy, 'twas a PDP 1 with sense switches on the front that one did >> indeed toggle in the paper tape boot program (to get it to where it >> could read from other media), a green oscilloscope-style CRT for a >> monitor (complete with crosshatches IIRC), a real teletype console, and >> the ability to run simple fortran programs that did beam optic >> computations and displayed a simple visualization of same. The PDP >> itself was quite large, and I was told that it "had a few bad bits..." >> but that they usually didn't affect the outcome of a computation. > > I'm on the PDP-1 and IBM 1620 Restoration Teams at the Computer History Museum > (see: http://www.computerhistory.org/pdp-1/). So if the PDP-1 was your first > computer (and one NEVER forgets their first computer) you can see it in > action (running Spacewar!, Minskytron, Music, etc.) at the CHM! > > In fact, if your visiting Silicon Valley, I'll give you a private tour! Cool! I think TUNL finally got rid of theirs -- always a shame, somehow, when a magnificent thing like that dies. > Funny combination - setting up Beowulf clusters and restoring vintage > computers. There must be a lesson here :-) Vintage clusters? 
;-) rgb > > Cheers, > Lyle > > > > > >> >> rgb > > > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From James.P.Lux at jpl.nasa.gov Thu Jul 10 10:50:03 2008 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <487604C8.8020602@tamu.edu> <200807100807.31260.lbickley@bickleywest.com> Message-ID: <6.2.5.6.2.20080710104816.02cf7858@jpl.nasa.gov> At 09:58 AM 7/10/2008, Robert G. Brown wrote: >On Thu, 10 Jul 2008, Lyle Bickley wrote: > >>On Thursday 10 July 2008, Robert G. Brown wrote: >>--snip-- >>>Aye, laddy, 'twas a PDP 1 with sense switches on the front that one did >>>indeed toggle in the paper tape boot program (to get it to where it >>>could read from other media), a green oscilloscope-style CRT for a >>>monitor (complete with crosshatches IIRC), a real teletype console, and >>>the ability to run simple fortran programs that did beam optic >>>computations and displayed a simple visualization of same. The PDP >>>itself was quite large, and I was told that it "had a few bad bits..." >>>but that they usually didn't affect the outcome of a computation. >> >>I'm on the PDP-1 and IBM 1620 Restoration Teams at the Computer >>History Museum >>(see: http://www.computerhistory.org/pdp-1/). So if the PDP-1 was your first >>computer (and one NEVER forgets their first computer) you can see it in >>action (running Spacewar!, Minskytron, Music, etc.) at the CHM! >> >>In fact, if your visiting Silicon Valley, I'll give you a private tour! > >Cool! I think TUNL finally got rid of theirs -- always a shame, >somehow, when a magnificent thing like that dies. > >>Funny combination - setting up Beowulf clusters and restoring vintage >>computers. There must be a lesson here :-) > >Vintage clusters? > >;-) > > rgb in classic /. form, should we not now interject: What about a beowulf cluster of PDP-8s (or -1s or IBM 1130s or...)? (not that any of those were really "commodity", although there were an awful lot of PDP-8i and PDP-11s out there.. not really open source though) Jim From dnlombar at ichips.intel.com Thu Jul 10 13:22:06 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <6.2.5.6.2.20080710104816.02cf7858@jpl.nasa.gov> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <487604C8.8020602@tamu.edu> <200807100807.31260.lbickley@bickleywest.com> <6.2.5.6.2.20080710104816.02cf7858@jpl.nasa.gov> Message-ID: <20080710202206.GA11222@nlxdcldnl2.cl.intel.com> On Thu, Jul 10, 2008 at 10:50:03AM -0700, Jim Lux wrote: > in classic /. form, should we not now interject: > > What about a beowulf cluster of PDP-8s (or -1s or IBM 1130s or...)? Hmmm, don't know about the PDPs, but there aren't a lot of working 1130's about. ibm1130.org has one, along with cards, disk cartridges, docs, &etc. They also have a simulator that permits you to run R2 V12! Brings back memories... Too bad they can't find the EMU-Fortran. As for a cluster of 1130 simulators; IIRC, the only networking they had was in support of the RJE station. Perhaps you could run an EP code with a 360 as the headnode :p -- David N. 
Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From gerry.creager at tamu.edu Thu Jul 10 13:41:35 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <487604C8.8020602@tamu.edu> <200807100807.31260.lbickley@bickleywest.com> Message-ID: <487673FF.5010508@tamu.edu> Robert G. Brown wrote: > On Thu, 10 Jul 2008, Lyle Bickley wrote: > >> On Thursday 10 July 2008, Robert G. Brown wrote: >> --snip-- >>> Aye, laddy, 'twas a PDP 1 with sense switches on the front that one did >>> indeed toggle in the paper tape boot program (to get it to where it >>> could read from other media), a green oscilloscope-style CRT for a >>> monitor (complete with crosshatches IIRC), a real teletype console, and >>> the ability to run simple fortran programs that did beam optic >>> computations and displayed a simple visualization of same. The PDP >>> itself was quite large, and I was told that it "had a few bad bits..." >>> but that they usually didn't affect the outcome of a computation. >> >> I'm on the PDP-1 and IBM 1620 Restoration Teams at the Computer >> History Museum >> (see: http://www.computerhistory.org/pdp-1/). So if the PDP-1 was your >> first >> computer (and one NEVER forgets their first computer) you can see it in >> action (running Spacewar!, Minskytron, Music, etc.) at the CHM! >> >> In fact, if your visiting Silicon Valley, I'll give you a private tour! > > Cool! I think TUNL finally got rid of theirs -- always a shame, > somehow, when a magnificent thing like that dies. > >> Funny combination - setting up Beowulf clusters and restoring vintage >> computers. There must be a lesson here :-) > > Vintage clusters? Only if running something earlier than WINE 0.3 -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From bill at cse.ucdavis.edu Thu Jul 10 13:53:55 2008 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] CUDA stream on GTX 260/280? Message-ID: <487676E3.5030406@cse.ucdavis.edu> Has anyone run the McCalpin's stream port to CUDA on the GTX 260/280? stream.cu is available here: http://forums.nvidia.com/index.php?showtopic=52686 I'm interested in the ATI equivalent as well, but I'm not sure there's a stream port. From lbickley at bickleywest.com Thu Jul 10 14:54:37 2008 From: lbickley at bickleywest.com (Lyle Bickley) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <20080710202206.GA11222@nlxdcldnl2.cl.intel.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <6.2.5.6.2.20080710104816.02cf7858@jpl.nasa.gov> <20080710202206.GA11222@nlxdcldnl2.cl.intel.com> Message-ID: <200807101454.37990.lbickley@bickleywest.com> On Thursday 10 July 2008, Lombard, David N wrote: > On Thu, Jul 10, 2008 at 10:50:03AM -0700, Jim Lux wrote: > > in classic /. form, should we not now interject: > > > > What about a beowulf cluster of PDP-8s (or -1s or IBM 1130s or...)? > > Hmmm, don't know about the PDPs, but there aren't a lot of working 1130's > about. ibm1130.org has one, along with cards, disk cartridges, docs, &etc. > They also have a simulator that permits you to run R2 V12! Brings back > memories... Too bad they can't find the EMU-Fortran. 
There are quite a few PDP-8s around. I have a PDP-8/E, PDP-8/F and two PDP-8 compatible Decmates - all are operational. There are even more PDP-11s "about". I've got a PDP 11/34C, 11/85, several 11/23s an LSI-11 and a MINC 11/23 (laboratory computer) - all operational. Typical peripherals: 9 track tape, RX02 8" floppies, RL01/2 Cartridge Drives (5MB+10MB), etc > As for a cluster of 1130 simulators; IIRC, the only networking they had > was in support of the RJE station. Perhaps you could run an EP code with > a 360 as the headnode :p As to clustering... Latency would be a "bit" of a problem. PDP-8s only support serial and synchronous "networks". 9600 baud won't quite make it as a high performance cluster infrastructure ;-) PDP-11s do have 10Base Ethernet. But running TCP/IP on a PDP-11 is incredibly slow (even if running BSD Unix). It take just about everything the processor has just to push the packets out. So one could create a true ethernet cluster infrastructure - but there wouldn't be much processor power left over to do calculations. So as much as I love these old beasties, they just don't make great clusters, sigh... On the other hand, VAXes just might :-) Cheers, Lyle -- Lyle Bickley Bickley Consulting West Inc. Mountain View, CA 94040 http://bickleywest.com "Black holes are where God is dividing by zero" From reuti at staff.uni-marburg.de Thu Jul 10 15:02:35 2008 From: reuti at staff.uni-marburg.de (Reuti) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <48762474.4020908@scalableinformatics.com> References: <48741C0C.2050303@scalableinformatics.com> <20080710145309.GA10896@nlxdcldnl2.cl.intel.com> <48762474.4020908@scalableinformatics.com> Message-ID: <11CDCDE6-17F6-4DE5-A87A-826764808F23@staff.uni-marburg.de> Hi, Am 10.07.2008 um 17:02 schrieb Joe Landman: > Lombard, David N wrote: > >>> I'll try all the usual things (reduce the optimization level, etc). >>> Sage words of advice (and clue sticks) welcome. >> Not trying to sound like an ad... >> The currently shipping Intel Trace Collector and Analyzer (7.1), >> includes >> message correctness checking. An option is available that adds a >> library to an Intel MPI build that checks messages during the run. >> You can then view any errors it found in the Intel Trace Analyzer. >> This may find there's a problem that has only just started to trip >> the >> code up. I certainly have welts from those; I suspect others do too. > > Actually, Intel MPI and related tools are in general one of the > things we want to try. User may be open to that (especially if it > is more pain free than the alternative). isn't Intel MPI based on MPICH2 - is there any file what they added/ changed in detail? -- Reuti > We have reliable functional non-sm/non-ib based execution on > multiple machines now. New code drop coming, so we have to wait on > that. Once we have that, we'll be doing more testing. > > Joe > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics LLC, > email: landman@scalableinformatics.com > web : http://www.scalableinformatics.com > http://jackrabbit.scalableinformatics.com > phone: +1 734 786 8423 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Jul 10 15:07:38 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <200807101454.37990.lbickley@bickleywest.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <6.2.5.6.2.20080710104816.02cf7858@jpl.nasa.gov> <20080710202206.GA11222@nlxdcldnl2.cl.intel.com> <200807101454.37990.lbickley@bickleywest.com> Message-ID: > As to clustering... Latency would be a "bit" of a problem. PDP-8s only support > serial and synchronous "networks". 9600 baud won't quite make it as a high > performance cluster infrastructure ;-) > > PDP-11s do have 10Base Ethernet. But running TCP/IP on a PDP-11 is incredibly > slow (even if running BSD Unix). It take just about everything the processor > has just to push the packets out. So one could create a true ethernet cluster > infrastructure - but there wouldn't be much processor power left over to do > calculations. > > So as much as I love these old beasties, they just don't make great clusters, > sigh... > > On the other hand, VAXes just might :-) "Great" as in the meaning of "not great at all"? The real question is how many kiloVAXes it would take to equal the ONE laptop I'm typing this reply on... Moore's law is cruel, alas. OTOH, if we ignore trying to build an antique cluster for WORK and focus on doing it for FUN, a 10-system PDP-1 cluster with 9600 baud serial connections becomes a simply smashing idea... although finding ones with all their bits intact is perhaps a bit of a problem. rgb > > Cheers, > Lyle > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From Shainer at mellanox.com Thu Jul 10 15:43:36 2008 From: Shainer at mellanox.com (Gilad Shainer) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Infiniband modular switches Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> Patrick Geoffray wrote: > > Not only that I was there, but also had conversations > afterwards. It > > is a really "fair" comparison when you have different injection > > rate/network capacity parameters. You can also take 10Mb > and inject it > > into 10Gb/s network to show the same, and you always can create the > > network pattern to show what you want to show, but you prove nothing > > The injection rate is irrelevant The injection rate is super relevant. If your injection rate is 10% of the fabric capability, or 100% of the fabric capability, the histogram will be different. It was proven by the same person who did the slides you referred to, when doing the same testing on IB DDR we got much better results with IB versus Quadrics. your theory does not really meet reality. Also, when you do look on adaptive routing, make sure it is real time solution... Not some synthetic testing .... > > So far, IB only used static routing. If it still relies on > packet order on the wire for a given Queue Pair, then the > only way to do some sort of adaptive routing is to use a > different QP for each possible route (LID). > This is what Panda's group tried in a paper. However, the > number of QP explodes, each QP is still subject to HOL > blocking and the QP interleaving is static. > What Panda did is using multiple static routs, which can be chosen according to different algorithms. It is an interesting idea, and used in several cases with nice results. This is not adaptive routing. Gilad. 
From ajt at rri.sari.ac.uk Fri Jul 11 01:40:12 2008 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <487604C8.8020602@tamu.edu> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> Message-ID: <48771C6C.9080306@rri.sari.ac.uk> Gerry Creager wrote: > Robert G. Brown wrote: >> [...] >> OK, in a minute you'll have me reminiscing about booting from paper tape >> and havin' to drill me own 'oles wit' me teeth... >> >> ;-) > > Glad you smiled with that. Seems I find myself teaching an OS class all > of a sudden this summer. I recounted my experience with an old HP > system where the boot-loader was toggled in on front panel switches and > the OS was loaded with paper tape. The kids (whose language experience > ranges from Java to PHP, with C# thrown in for good measure) looked at > me like I was reciting either fiction or lore. Hello, Gerry. I have fond memories of doing something similar with a hex keypad on a pdp11/34, to boot Unix version 7 from mag tape. When I left that job, they gave me the pdp11/34 as a 'joke' leaving present, and I took it! I wouldn't be surprised if other people on this list have had pdp's or older kit running in their living rooms: I upgraded mine to a pdp11/23 when the University scrapped one. I used it to run the UK MUGNET mail and news 'backbone' site for a while, dialling up Cistron Electronics in the Netherlands in the middle of the night to get feeds at 1200 baud. I was good fun, but the weight of all the 'spare' rl02's and rk05's in my loft were a bit of a worry ;-) Tony. -- Dr. A.J.Travis, | mailto:ajt@rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687 From Dan.Kidger at quadrics.com Fri Jul 11 08:56:44 2008 From: Dan.Kidger at quadrics.com (Dan.Kidger@quadrics.com) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Infiniband modular switches In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> References: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> Message-ID: <0D49B15ACFDF2F46BF90B6E08C90048A048849198B@quadbrsex1.quadrics.com> Gilad wrote: > It was proven by the same person who did the slides > you referred to, when doing the same testing on IB DDR we got much > better results with IB versus Quadrics. your theory does not really meet > reality. Care to describe to the list what these results were, and to speculate about why IB gave better results? Daniel ------------------------------------------------------------- Dr. Daniel Kidger, Quadrics Ltd. 
daniel.kidger@quadrics.com One Bridewell St., Mobile: +44 (0)779 209 1851 Bristol, BS1 2AA, UK Office: +44 (0)117 915 5519 ----------------------- www.quadrics.com -------------------- From dnlombar at ichips.intel.com Fri Jul 11 10:47:00 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] An annoying MPI problem In-Reply-To: <11CDCDE6-17F6-4DE5-A87A-826764808F23@staff.uni-marburg.de> References: <48741C0C.2050303@scalableinformatics.com> <20080710145309.GA10896@nlxdcldnl2.cl.intel.com> <48762474.4020908@scalableinformatics.com> <11CDCDE6-17F6-4DE5-A87A-826764808F23@staff.uni-marburg.de> Message-ID: <20080711174700.GA10796@nlxdcldnl2.cl.intel.com> On Thu, Jul 10, 2008 at 03:02:35PM -0700, Reuti wrote: > Hi, > > Am 10.07.2008 um 17:02 schrieb Joe Landman: > > Lombard, David N wrote: > >> The currently shipping Intel Trace Collector and Analyzer (7.1), > >> includes > >> message correctness checking. An option is available that adds a > >> library to an Intel MPI build that checks messages during the run. > >> You can then view any errors it found in the Intel Trace Analyzer. > >> This may find there's a problem that has only just started to trip > >> the > >> code up. I certainly have welts from those; I suspect others do too. > > > > Actually, Intel MPI and related tools are in general one of the > > things we want to try. User may be open to that (especially if it > > is more pain free than the alternative). > > isn't Intel MPI based on MPICH2 - is there any file what they added/ > changed in detail? Here's the documentation: Let me know offlist if you want more info. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From josip at lanl.gov Mon Jul 14 07:35:31 2008 From: josip at lanl.gov (Josip Loncaric) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <02f001c8cd6c$3c5c1990$4f833909@oberon> References: <02f001c8cd6c$3c5c1990$4f833909@oberon> Message-ID: <487B6433.80406@lanl.gov> Egan Ford wrote: > Perhaps this will help: > > http://www.lanl.gov/roadrunner/ > > And: > > http://www.lanl.gov/orgs/hpc/roadrunner/pdfs/Koch%20-%20Roadrunner%20Overvie > w/RR%20Seminar%20-%20System%20Overview.pdf > > Pages 20 - 29 > > IANS, the triblade is really a quadblade, blade 1 is the Opteron Blade, > blade 2 is a bridge, blades 3 and 4 are the Cell blades. > > Lots of other good stuff here: > > http://www.lanl.gov/orgs/hpc/roadrunner/rrseminars.shtml > Another good link: http://www.lanl.gov/orgs/hpc/roadrunner/rrtechnicalseminars2008.shtml found from: http://www.lanl.gov/orgs/hpc/roadrunner/index.shtml Sincerely, Josip From patrick at myri.com Mon Jul 14 10:42:07 2008 From: patrick at myri.com (Patrick Geoffray) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Infiniband modular switches In-Reply-To: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> References: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> Message-ID: <487B8FEF.4030708@myri.com> Gilad Shainer wrote: >> The injection rate is irrelevant > > The injection rate is super relevant. If your injection rate is 10% of The injection rate is absolutely irrelevant for contention due to Head-of-Line Blocking. You will have the same fabric efficiency under contention if your link rate is 1 Mb/s or 100 Gb/s. For example, the efficiency of a single-channel *full* crossbar under random traffic is 58.6%, whatever is the link rate. Google "HOL blocking" or read a book. 
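The 58.6% figure is the classic Karol/Hluchyj/Morgan result for FIFO input queueing under uniform random traffic: saturation throughput per port tends to 2 - sqrt(2) as the switch gets large, independent of link speed. A toy Monte Carlo of that textbook model -- a sketch of the abstract saturated crossbar, not of any particular vendor's fabric -- reproduces the number:

    /* Saturated N x N input-queued crossbar with FIFO head-of-line blocking.
     * Every input always has a head packet aimed at a uniformly random output;
     * each output serves one contending input per slot; blocked heads stay put.
     * For large N the served fraction approaches 2 - sqrt(2) ~= 0.586.
     * Compile with: cc -std=c99 -O2 hol.c
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define N     512
    #define SLOTS 20000

    int main(void)
    {
        int head[N], winner[N];
        long served = 0;

        srand(12345);
        for (int i = 0; i < N; i++)
            head[i] = rand() % N;

        for (int t = 0; t < SLOTS; t++) {
            int count[N] = {0};
            for (int o = 0; o < N; o++)
                winner[o] = -1;

            /* each output picks uniformly among the inputs contending for it */
            for (int i = 0; i < N; i++) {
                int o = head[i];
                count[o]++;
                if (rand() % count[o] == 0)   /* reservoir trick: fair choice */
                    winner[o] = i;
            }
            for (int o = 0; o < N; o++) {
                if (winner[o] >= 0) {
                    served++;
                    head[winner[o]] = rand() % N;  /* fresh packet at the head */
                }
            }
        }
        printf("throughput per port: %.3f\n",
               (double)served / ((double)N * SLOTS));
        return 0;
    }
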
> you referred to, when doing the same testing on IB DDR we got much > better results with IB versus Quadrics. your theory does not really meet The routing efficiency of 3-year old Quadrics QsNetII is greater than latest Mellanox IB in these tests. For IB SDR link rate, Quadrics has more throughput because of the better efficiency, as their link rate is similar. For DDR throughput, it is possible that the increased link rate compensate for the bad routing efficiency. The only comparison I have seen with IB DDR was by Woven Systems at Sandia last year, and 10 Gb/s Ethernet (higher link rate than QsNetII) was beating IB DDR for a similar benchmark. > reality. Also, when you do look on adaptive routing, make sure it is > real time solution... Not some synthetic testing .... AlltoAll of large messages is not a useless synthetic benchmark IMHO. > What Panda did is using multiple static routs, which can be chosen > according to different algorithms. It is an interesting idea, and used > in several cases with nice results. This is not adaptive routing. That's right, this is not adaptive routing, it's lip stick. And I am sorry but those are not nice results compared to real adaptive routing. Patrick From eagles051387 at gmail.com Tue Jul 15 07:59:14 2008 From: eagles051387 at gmail.com (Jon Aquilina) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <487B6433.80406@lanl.gov> References: <02f001c8cd6c$3c5c1990$4f833909@oberon> <487B6433.80406@lanl.gov> Message-ID: whats the temperature that somethign thsi big generates and what is used to cool it. also why opterons wouldnt there be a better performance gain from the new 45nm quad core intel's with 12mb cache? On Mon, Jul 14, 2008 at 4:35 PM, Josip Loncaric wrote: > Egan Ford wrote: > >> Perhaps this will help: >> >> http://www.lanl.gov/roadrunner/ >> >> And: >> >> >> http://www.lanl.gov/orgs/hpc/roadrunner/pdfs/Koch%20-%20Roadrunner%20Overvie >> w/RR%20Seminar%20-%20System%20Overview.pdf >> >> Pages 20 - 29 >> >> IANS, the triblade is really a quadblade, blade 1 is the Opteron Blade, >> blade 2 is a bridge, blades 3 and 4 are the Cell blades. >> >> Lots of other good stuff here: >> >> http://www.lanl.gov/orgs/hpc/roadrunner/rrseminars.shtml >> >> > Another good link: > > http://www.lanl.gov/orgs/hpc/roadrunner/rrtechnicalseminars2008.shtml > > found from: > > http://www.lanl.gov/orgs/hpc/roadrunner/index.shtml > > Sincerely, > Josip > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080715/e413c264/attachment.html From csamuel at vpac.org Tue Jul 15 17:43:39 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <631175139.230791216168872174.JavaMail.root@zimbra.vpac.org> Message-ID: <495547425.230811216169019459.JavaMail.root@zimbra.vpac.org> ----- "Jon Aquilina" wrote: > whats the temperature that somethign thsi big generates > and what is used to cool it. Top500 says it's measured at 2.3 megawatts. > also why opterons wouldnt there be a better performance > gain from the new 45nm quad core intel's with 12mb cache? 
Read the introduction to Roadrunner - the Opterons are intended to be mostly used as host processors (though I guess initially they'll be doing more than that as people will need to do a lot of porting work), the Cell's are meant to do most of the grunt work. For instance with Linpack it appears they just used the Cell to get the petaflop number, they didn't bother to add the 49TF's from the Opterons.. cheers! Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From toon at moene.indiv.nluug.nl Wed Jul 9 13:21:44 2008 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <20080708175844.GA5087@nlxdcldnl2.cl.intel.com> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080708175844.GA5087@nlxdcldnl2.cl.intel.com> Message-ID: <48751DD8.1080005@moene.indiv.nluug.nl> Lombard, David N wrote: > All were after. MCC dates from '91; Ah, that explains :-) As I bought a NeXT workstation in November '91 (at the tune of ~ $ 20,000) I had no need for a PC-type Unix-look-a-like. Besides, the NeXT was good enough to start working on g77 ... It could even run our (then operational) weather forecasting model. -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ Progress of GNU Fortran: http://gcc.gnu.org/ml/gcc/2008-01/msg00009.html From mdidomenico4 at gmail.com Thu Jul 10 03:36:14 2008 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] IPoIB arp's disappearing Message-ID: I'm having a bit of a weird problem that i cannot figure out. If anyone can help from the community it would be appreciated. Here's the packet flow cn(ib0)->io(ib0)->io(eth5)->pan(*) cn = compute node io = io node pan = panasas storage network We have 12 shelves of panasas network storage on a seperate network, which is being fronted by bridge servers which are routing IPoIB traffic to 10G ethernet traffic. We're using Mellanox Connect-X Ethernet/IB adapters everwhere. We're running Ofed 1.3.1 and the latest firmwares for IB/Eth everywhere. Here's the problem. I can mount the storage on the compute nodes, but if i try to send anything more then 50MB of data via dd. I seem to loose the ARP entries for the compute nodes on the IO servers. This seems to happen whether I use the filesystem or a netperf run from the compute node to the panasas storage I can run netperf between the compute node and io node and get full IPoIB line rate with no issues I can run netperf between the io node and the panasas storage and get full 10G ethernet line rate with no issues When looking at the TCP traces, i can clearly see that a big chunk of data is sent between the end-points and then it stalls. Immediately after the stall is an ARP request and then another chunk of data, and this scenario repeats over and over. Any thoughts or questions? Thanks - Michael -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080710/410effbc/attachment.html From kspaans at student.math.uwaterloo.ca Thu Jul 10 08:00:40 2008 From: kspaans at student.math.uwaterloo.ca (Kyle Spaans) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> Message-ID: <20080710150040.GA18058@student.math> OK, is it just me? All of my messages seem to take days to get to the list because the moderator has to approve them? I'm pretty sure that I'm subscribed properly. On Thu, Jul 10, 2008 at 09:02:25AM -0400, Robert G. Brown wrote: > I do that the Titorheads will attack and I'll have to once again > unconvincingly deny that I am, in fact, John Titor. John Titor? WOW! I remember reading about ``him'' a couple of years ago when I was in highschool. I never thought I'd see a reference to ``him'' in a place like this! ;) On a "programming languages" note, my school may be a little better than most. The first-year CS cirriculum has moved entirely away from Java, and the main class is now taught in Scheme, with C as the steping-stone to 2nd-year imperative languages. There are also "CS for non-majors" classes in Python I believe. I can't comment on upper-year project type things though, because I'm only in my 2nd year. :) From kspaans at student.math.uwaterloo.ca Thu Jul 10 08:03:46 2008 From: kspaans at student.math.uwaterloo.ca (Kyle Spaans) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: References: Message-ID: <20080710150346.GB18058@student.math> On Thu, Jul 10, 2008 at 09:29:51AM -0400, Mark Kosmowski wrote: > Also, is there a recent compiler benchmark somewhere? The one at It may not be exactly what you're looking for, but I believe you'll find some useful information at The Great Programming Languages Shootout: http://shootout.alioth.debian.org/ gl & hf From dberkholz at gentoo.org Thu Jul 10 09:40:23 2008 From: dberkholz at gentoo.org (Donnie Berkholz) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Re: energy costs and poor grad students In-Reply-To: References: Message-ID: <20080710164023.GA19169@comet> On 09:29 Thu 10 Jul , Mark Kosmowski wrote: > Some people on the CPMD list (my primary code at this point - plane > wave quantum chemistry) suggest fftw as part of the math library > solution. I noticed that only fftw 2.1.5 supports MPI, while the > latest version of the 3.x series does not. Just for completeness, 3.2alpha3 does support MPI, if you're willing to use an alpha. -- Thanks, Donnie Donnie Berkholz Developer, Gentoo Linux Blog: http://dberkholz.wordpress.com -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available Url : http://www.scyld.com/pipermail/beowulf/attachments/20080710/fd4cfe2b/attachment.bin From per at computer.org Fri Jul 11 01:32:01 2008 From: per at computer.org (Per Jessen) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> Message-ID: Gerry Creager wrote: > I recounted my experience with an old HP system where the boot-loader > was toggled in on front panel switches and the OS was loaded with > paper tape. 
Yep, we had a similar one in highschool - a Danish RC7000 with 32K of genuine core-memory, papertape and three terminals. Later on (probably early 1980s) I'm pretty certain I encountered a somewhat bigger Texas Instruments machine, but with front panel toggle switches too. /Per Jessen, Z?rich From robertkubrick at gmail.com Fri Jul 11 11:33:46 2008 From: robertkubrick at gmail.com (Robert Kubrick) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] ScyId Message-ID: I was browsing through the Penguing Computing website and on the ScyId page I read the following: Single System Process Space Cluster acts and feels like a single virtual machine Does this mean that you can now write a program that allocates SYSV shared memory and semaphores and automatically spread your allocation over the entire cluster? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080711/a7c4a316/attachment.html From forum.san at gmail.com Sat Jul 12 04:56:22 2008 From: forum.san at gmail.com (Sangamesh B) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Need guidelines for NASA's NAS Parallel Benchmarks Message-ID: Dear all, This is the first time am doing benchmark of a system - Intel quad core, quad processor with RHEL 5 64 bit OS. After unpacking the NAS NPB, NPB3.3.tar.gz package, I got following directories: Changes.log NPB3.3-HPF.README NPB3.3-JAV.README NPB3.3-MPI NPB3.3-OMP NPB3.3-SER README I need to do the both MPI and OpenMP version benchmarks. I did some benchmarks on a test machine from NBP3.3-MPI. It has: BENCHMARK NAME CLASS TYPE [9] [7] [4] BT S FULL CG W SIMPLE DT A FORTRAN EP B EPIO FT C IS D LU E MG SP Obviously, the numebr of benchmarks will be 9 * 7 * 4 = 252. So I need to get the benchmarks for all 252? A sample benchmark: [root@test NPB3.3-MPI]# make BT NPROCS=4 CLASS=S SUBTYPE=full VERSION=VEC Since this benchmark is done on a test machine - dual core, dual opteron AMD64 processor , I used MPICH2 and GNU Compilers. To run the benchmark, I used the sample input data file given with NPB: [root@test btbin]# mpdtrace -l test_33638 (10.1.1.1) [root@test btbin]# mpiexec -np 4 ./bt.S.4.mpi_io_full ./inputbt.data NAS Parallel Benchmarks 3.3 -- BT Benchmark Reading from input file inputbt.data collbuf_nodes 0 collbuf_size 1000000 Size: 64x 64x 64 Iterations: 200 dt: 0.0008000 Number of active processes: 4 BTIO -- FULL MPI-IO write interval: 5 0 1 32 32 32 Problem size too big for compiled array sizes 1 1 32 32 32 Problem size too big for compiled array sizes 2 1 32 32 32 Problem size too big for compiled array sizes 3 1 32 32 32 Problem size too big for compiled array sizes [2] 48 at [0x00000000006c1088], mpid_vc.c[62] [0] 48 at [0x00000000006be4b8], mpid_vc.c[62] [1] 48 at [0x00000000006bfdf8], mpid_vc.c[62] [3] 48 at [0x00000000006c1088], mpid_vc.c[62] [root@test btbin]# Looks like 'run' is not successful. What's the wrong? The input file contains: [root@test btbin]# cat inputbt.data 200 number of time steps 0.0008d0 dt for class A = 0.0008d0. class B = 0.0003d0 class C = 0.0001d0 64 64 64 5 0 write interval (optional read interval) for BTIO 0 1000000 number of nodes in collective buffering and buffer size for BTIO [root@test btbin]# As I doing the benchmarks first time, have no idea to prepare a new input file. What parameters should be changed? How these data will affect the benchmark results? Is it ok, if I just run [root@test btbin]# mpiexec -np 4 ./bt.S.4.mpi_io_full without using any input file? 
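The "Problem size too big for compiled array sizes" lines point at the real issue: NPB compiles the problem size into each binary, so bt.S.4 only has 12x12x12 arrays, while the sample inputbt.data above describes the 64x64x64, 200-step class A problem. Either rebuild with CLASS=A, or give the class S binary an input file matching its compiled defaults (the ones the dry run below echoes). A class-S-sized file in the same layout as the sample would look something like this -- the values are just those compiled defaults written back out, so double-check them against the sample input file shipped in the BT source directory:

    60          number of time steps
    0.010d0     dt for class S
    12 12 12    grid size (must not exceed the compiled class)
    5 0         write interval (optional read interval) for BTIO
    0 1000000   number of nodes in collective buffering and buffer size for BTIO

Run with no input file at all, the binary simply falls back to those compiled class S defaults, which is what the dry run shows.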
The output of above dry run is: [root@test btbin]# mpiexec -np 4 ./bt.S.4.mpi_io_full NAS Parallel Benchmarks 3.3 -- BT Benchmark No input file inputbt.data. Using compiled defaults Size: 12x 12x 12 Iterations: 60 dt: 0.0100000 Number of active processes: 4 BTIO -- FULL MPI-IO write interval: 5 Time step 1 Writing data set, time step 5 Writing data set, time step 10 Writing data set, time step 15 Time step 20 Writing data set, time step 20 Writing data set, time step 25 Writing data set, time step 30 Writing data set, time step 35 Time step 40 Writing data set, time step 40 Writing data set, time step 45 Writing data set, time step 50 Writing data set, time step 55 Time step 60 Writing data set, time step 60 Reading data set 1 Reading data set 2 Reading data set 3 Reading data set 4 Reading data set 5 Reading data set 6 Reading data set 7 Reading data set 8 Reading data set 9 Reading data set 10 Reading data set 11 Reading data set 12 Verification being performed for class S accuracy setting for epsilon = 0.1000000000000E-07 Comparison of RMS-norms of residual 1 0.1703428370954E+00 0.1703428370954E+00 0.6680519237820E-14 2 0.1297525207005E-01 0.1297525207003E-01 0.9351949888112E-12 3 0.3252792698950E-01 0.3252792698949E-01 0.4859455174690E-12 4 0.2643642127515E-01 0.2643642127517E-01 0.7155062549945E-12 5 0.1921178413174E+00 0.1921178413174E+00 0.9101712010679E-14 Comparison of RMS-norms of solution error 1 0.1149036328945E+02 0.1149036328945E+02 0.4854294277047E-13 2 0.9156788904727E+00 0.9156788904727E+00 0.4195107810359E-13 3 0.2857899428614E+01 0.2857899428614E+01 0.9649723729104E-13 4 0.2598273346734E+01 0.2598273346734E+01 0.1391264769245E-12 5 0.2652795397547E+02 0.2652795397547E+02 0.3629324024933E-13 Verification Successful BTIO -- statistics: I/O timing in seconds : 0.02 I/O timing percentage : 16.06 Total data written (MB)[0] 712 at [0x00000000006d7b98], dataloop.c[505] [0] 296 at [0x00000000006d79c8], dataloop.c[324] [0] 288 at [0x00000000006d7398], dataloop.c[324] [0] 648 at [0x00000000006d7698], dataloop.c[324] [0] 296 at [0x00000000006d71c8], dataloop.c[324] [0] 288 at [0x00000000006d6ff8], dataloop.c[324] [0] 56 at [0x00000000006bf458], mpid_datatype_contents.c[62] [0] 72 at [0x00000000006bee68], mpid_datatype_contents.c[62] [0] 72 at [0x00000000006bf538], mpid_datatype_contents.c[62] [0] 864 at [0x00000000006bea58], dataloop.c[505] [0] 368 at [0x00000000006d6b18], dataloop.c[324] [0] 368 at [0x00000000006d68f8], dataloop.c[324] [0] 648 at [0x00000000006d65c8], dataloop.c[324] [0] 368 at [0x00000000006d63a8], dataloop.c[324] [0] 368 at [0x00000000006bf238], dataloop.c[324] [0] 56 at [0x00000000006be778], mpid_datatype_contents.c[62] [0] 80 at [0x00000000006be958], mpid_datatype_contents.c[62] [0] 80 at [0x00000000006be858], mpid_datatype_contents.c[62] [0] 72 at [0x00000000006be688], dataloop.c[324] [0] 72 at [0x000000000[1] 720 at [0x00000000006d7b98], dataloop.c[505] [1] 296 at [0x00000000006d79c8], dataloop.c[324] [1] 296 at [0x00000000006d7398], dataloop.c[324] [1] 648 at [0x00000000006d7698], dataloop.c[324] [1] 296 at [0x00000000006d71c8], dataloop.c[324] [1] 296 at [0x00000000006d6ff8], dataloop.c[324] [1] 56 at [0x00000000006c0d98], mpid_datatype_contents.c[62] [1] 72 at [0x00000000006c07a8], mpid_datatype_contents.c[62] [1] 72 at [0x00000000006c0e78], mpid_datatype_contents.c[62] [1] 864 at [0x00000000006c0398], dataloop.c[505] [1] 368 at [0x00000000006d6b18], dataloop.c[324] [1] 368 at [0x00000000006d68f8], dataloop.c[324] [1] 648 at [0x00000000006d65c8], 
dataloop.c[324] [1] 368 at [0x00000000006d63a8], dataloop.c[324] [1] 368 at [0x00000000006c0b78], dataloop.c[324] [1] 56 at [0x00000000006c00b8], mpid_datatype_contents.c[62] [1] 80 at [0x00000000006c0298], mpid_datatype_contents.c[62] [1] 80 at [0x00000000006c0198], mpid_datatype_contents.c[62] [1] 72 at [0x00000000006bffc8], dataloop.c[324] [1] 72 at [0x000000000[2] 720 at [0x00000000006d7b28], dataloop.c[505] [2] 296 at [0x00000000006d7958], dataloop.c[324] [2] 296 at [0x00000000006d6c98], dataloop.c[324] [2] 648 at [0x00000000006d7628], dataloop.c[324] [2] 296 at [0x00000000006d7158], dataloop.c[324] [2] 296 at [0x00000000006d6f88], dataloop.c[324] [2] 56 at [0x00000000006d5b98], mpid_datatype_contents.c[62] [2] 72 at [0x00000000006d5d68], mpid_datatype_contents.c[62] [2] 72 at [0x00000000006d5c78], mpid_datatype_contents.c[62] [2] 864 at [0x00000000006d5788], dataloop.c[505] [2] 368 at [0x00000000006d68f8], dataloop.c[324] [2] 368 at [0x00000000006d66d8], dataloop.c[324] [2] 648 at [0x00000000006d63a8], dataloop.c[324] [2] 368 at [0x00000000006d6188], dataloop.c[324] [2] 368 at [0x00000000006d5f68], dataloop.c[324] [2] 56 at [0x00000000006c1348], mpid_datatype_contents.c[62] [2] 80 at [0x00000000006d5688], mpid_datatype_contents.c[62] [2] 80 at [0x00000000006c1428], mpid_datatype_contents.c[62] [2] 72 at [0x00000000006c1258], dataloop.c[324] [2] 72 at [0x00000000006be598], dataloop.c[324] [0] 32 at [0x00000000006be318], mpid_datatype_contents.c[62] [0] 48 at [0x00000000006be4b8], mpid_vc.c[62] 06bfed8], dataloop.c[324] [1] 32 at [0x00000000006bfd28], mpid_datatype_contents.c[62] [1] 48 at [0x00000000006bfdf8], mpid_vc.c[62] [3] 720 at [0x00000000006d7b28], dataloop.c[505] [3] 296 at [0x00000000006d7958], dataloop.c[324] [3] 296 at [0x00000000006d6c98], dataloop.c[324] [3] 648 at [0x00000000006d7628], dataloop.c[324] [3] 296 at [0x00000000006d7158], dataloop.c[324] [3] 296 at [0x00000000006d6f88], dataloop.c[324] [3] 56 at [0x00000000006d5b98], mpid_datatype_contents.c[62] [3] 72 at [0x00000000006d5d68], mpid_datatype_contents.c[62] [3] 72 at [0x00000000006d5c78], mpid_datatype_contents.c[62] [3] 864 at [0x00000000006d5788], dataloop.c[505] [3] 368 at [0x00000000006d68f8], dataloop.c[324] [3] 368 at [0x00000000006d66d8], dataloop.c[324] [3] 648 at [0x00000000006d63a8], dataloop.c[324] [3] 368 at [0x00000000006d6188], dataloop.c[324] [3] 368 at [0x00000000006d5f68], dataloop.c[324] [3] 56 at [0x00000000006c1348], mpid_datatype_contents.c[62] [3] 80 at [0x00000000006d5688], mpid_datatype_contents.c[6206c1168], dataloop.c[324] [2] 32 at [0x00000000006c0fb8], mpid_datatype_contents.c[62] [2] 48 at [0x00000000006c1088], mpid_vc.c[62] ] [3] 80 at [0x00000000006c1428], mpid_datatype_contents.c[62] [3] 72 at [0x00000000006c1258], dataloop.c[324] [3] 72 at [0x00000000006c1168], dataloop.c[324] [3] 32 at [0x00000000006c0fb8], mpid_datatype_contents.c[62] [3] 48 at [0x00000000006c1088], mpid_vc.c[62] : 0.83 I/O data rate (MB/sec) : 49.85 BT Benchmark Completed. 
Class = S Size = 12x 12x 12 Iterations = 60 Time in seconds = 0.10 Total processes = 4 Compiled procs = 4 Mop/s total = 2204.23 Mop/s/process = 551.06 Operation type = floating point Verification = SUCCESSFUL Version = 3.3 Compile date = 12 Jul 2008 Compile options: MPIF77 = /opt/libs/mpi/mpich2/1.0.6p1/bin/mpif77 FLINK = $(MPIF77) FMPI_LIB = (none) FMPI_INC = (none) FFLAGS = -O FLINKFLAGS = -O RAND = (none) Please send the results of this run to: NPB Development Team Internet: npb@nas.nasa.gov If email is not available, send this to: MS T27A-1 NASA Ames Research Center Moffett Field, CA 94035-1000 Fax: 650-604-3957 [root@test btbin]# Can any one this list have the experience on NAS Parallel Benchmarks? If so, give some guidelines to do the benchmarks properly. I need to produce the Benchmark results within three days. Is this can be done? Thanks in advance, Sangamesh -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080712/a706cfce/attachment.html From Dries.Kimpe at cs.kuleuven.be Sat Jul 12 08:04:36 2008 From: Dries.Kimpe at cs.kuleuven.be (Dries Kimpe) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Update on mpi problem In-Reply-To: <1215690266.28335.44.camel@bruce.priv.wark.uk.streamline-computing.com> References: <487588E8.5030507@scalableinformatics.com> <1215690266.28335.44.camel@bruce.priv.wark.uk.streamline-computing.com> Message-ID: <20080712150436.GA2217@mhdmobile.lan> * Ashley Pittman [2008-07-10 12:44:26]: > That doesn't necessarily follow, if you are posing your sends before > your receives then you are relying on unexpected message buffering > within the MPI library. How much of this is available is up the the > library, not the standard so I think it's possible that openmpi is being > MPI compliant in both cases. The latter is easy to check: replace all MPI_Send and MPI_Rsend by MPI_Ssend. (MPI_Isend / MPI_Irsend -> MPI_Issend) You can do this through the profiling interface (create a library that provides MPI_Isend, MPI_Irsend, MPI_Send, MPI_Rsend and just calls PMPI_(I)ssend; link it before linking mpi) If it hangs, it will be at the point where the application relies on buffering within the MPI library. Greetings, Dries From mdidomenico4 at gmail.com Sat Jul 12 15:51:31 2008 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] cuda benchmark Message-ID: There has been a lot of talk here lately about cuda, i'm just curious what people here use to benchmark (if at all) GPU's. I have access to twelve nodes of a larger cluster that have high end nvidia cards and infiniband. It's used for visual processing of CFD. Does anyone have a simple micro benchmark that can be spread across the nodes and crunch some numbers using cuda? I haven't been able find anything like that or i'm just looking in the wrong place. - Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080712/43765165/attachment.html From mambom1902 at yahoo.com Sun Jul 13 06:29:16 2008 From: mambom1902 at yahoo.com (loc duong ding) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Mpich problem Message-ID: <50398.76500.qm@web38804.mail.mud.yahoo.com> Dear G.Vinodh Kumar, I read your problem form internet. hi, i setup a two node cluster with mpich2-1.0. the name of the master node is aarya the name of the slave node is desktop2 i enabled the passwordless ssh session. 
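On the earlier "annoying MPI problem" thread: to make the profiling-interface trick Dries describes a few messages up concrete, four tiny wrappers are all it takes. This is a sketch assuming the MPI-1 C bindings and a hypothetical file name, not anything specific to one MPI implementation:

    /* sendsync.c -- intercept the standard sends through the PMPI profiling
     * interface and make them all synchronous, as Dries suggests. */
    #include <mpi.h>

    int MPI_Send(void *buf, int count, MPI_Datatype type, int dest,
                 int tag, MPI_Comm comm)
    {
        return PMPI_Ssend(buf, count, type, dest, tag, comm);
    }

    int MPI_Rsend(void *buf, int count, MPI_Datatype type, int dest,
                  int tag, MPI_Comm comm)
    {
        return PMPI_Ssend(buf, count, type, dest, tag, comm);
    }

    int MPI_Isend(void *buf, int count, MPI_Datatype type, int dest,
                  int tag, MPI_Comm comm, MPI_Request *req)
    {
        return PMPI_Issend(buf, count, type, dest, tag, comm, req);
    }

    int MPI_Irsend(void *buf, int count, MPI_Datatype type, int dest,
                   int tag, MPI_Comm comm, MPI_Request *req)
    {
        return PMPI_Issend(buf, count, type, dest, tag, comm, req);
    }

Compiled into a library and linked ahead of the MPI library (the exact link order depends on the MPI in use), any hang that appears afterwards marks the spot where the code was counting on unexpected-message buffering.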
in the mpd.hosts, i included the name of both nodes. the command, mpdboot -n 2 works fine. the command, mpdtrace gives the name of both machines. i copied the example program cpi on /home/vinodh/ on both the nodes. mpiexec -n 2 cpi gives the output, Process 0 of 2 is on aarya Process 1 of 2 is on desktop2 aborting job: Fatal error in MPI_Bcast: Other MPI error, error stack: MPI_Bcast(821): MPI_Bcast(buf=0xbfffbf28, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed MPIR_Bcast(229): MPIC_Send(48): MPIC_Wait(308): MPIDI_CH3_Progress_wait(207): an error occurred while handling an event returned by MPIDU_Sock_Wait() MPIDI_CH3I_Progress_handle_sock_event(1053): [ch3:sock] failed to connnect to remote process kvs_aarya_40892_0:1 MPIDU_Socki_handle_connect(767): connection failure (set=0,sock=1,errno=113:No route to host) rank 0 in job 1 aarya_40878 caused collective abort of all ranks exit status of rank 0: return code 13 but, the other example hellow works fine. let me know, why theres an error for the program cpi. Regards, G. Vinodh Kumar At present, I have the same problem when I install Mpich2.1.0.7. I think that you have solved this problem. Could you please instruct me how to solve this problem? I look forward to your reply. Thank you first. Sincerely, DUong Dinh Loc. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080713/dfd03e76/attachment.html From maurice at harddata.com Tue Jul 15 12:41:20 2008 From: maurice at harddata.com (Maurice Hilarius) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <200807151900.m6FJ07Xl014582@bluewest.scyld.com> References: <200807151900.m6FJ07Xl014582@bluewest.scyld.com> Message-ID: <487CFD60.4020209@harddata.com> "Jon Aquilina" wrote: > ..also why opterons wouldnt there be a better performance gain from > the new 45nm quad core intel's with 12mb cache? > The memory bandwidth on the XEONS is quite restrictive for many types of calculations, plus the overall power consumption on large memory systems using XEONS and FBDIMMS is also a factor. -- With our best regards, //Maurice W. Hilarius Telephone: 01-780-456-9771/ /Hard Data Ltd. FAX: 01-780-456-9772/ /11060 - 166 Avenue email:maurice@harddata.com/ /Edmonton, AB, Canada http://www.harddata.com// / T5X 1Y3/ / -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080715/3aca512b/attachment.html From eng_amirsaad at yahoo.com Wed Jul 16 00:28:32 2008 From: eng_amirsaad at yahoo.com (Amir Saad) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] how do I get started? Message-ID: <102077.53774.qm@web65415.mail.ac4.yahoo.com> I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I did a lot of search and yet couldn't find a getting started guide. Any ideas? All the articles I found are using RSH, is it possible to replace this with password-less SSH? Would it be possible to use Ubuntu? Which packages should I install to build the cluster? Please advice. Thank you Amir -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080716/f484af60/attachment.html From andrew.robbie at gmail.com Wed Jul 16 11:42:00 2008 From: andrew.robbie at gmail.com (Andrew Robbie (GMail)) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <487B6433.80406@lanl.gov> References: <02f001c8cd6c$3c5c1990$4f833909@oberon> <487B6433.80406@lanl.gov> Message-ID: On Tue, Jul 15, 2008 at 12:35 AM, Josip Loncaric wrote: > > Another good link: > > http://www.lanl.gov/orgs/hpc/roadrunner/rrtechnicalseminars2008.shtml As I was reading the slides, one question leap out at me: they have a huge IB network connecting every 'node', but instead of connecting direct to storage, this connects to an IB-to-10GigE bridge/IO processor board. Why? If they used avoided the protocol conversion going on that would be inherently simpler, and I've seen nothing to indicate that 10GigE is faster or cheaper (certainly not cheaper!). Is this (ethernet) a Panasas requirement? Particularly as there is one paper directly relating checkpoint time (IO performance) to overall throughput, I would have though IO was quite a central requirement. I can see not wanting to change an existing storage backend though. It would be great to see some graphs of bus contention (Cell <-> PPC, PPC<->IB<->Cell three-way etc) for the various codes, rather than just latency/bandwidth figures. And maybe GFlops/MMF (mythical man month) (or have MMFs been related to Watts in some fashion?) \Andrew From peter.st.john at gmail.com Wed Jul 16 13:47:49 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <20080710150040.GA18058@student.math> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> Message-ID: Kyle, I received this just now, on the 16th, almost a week after you sent it on the 10th. My receipt from the list seems to come in bulk packages, sometimes. Maybe server maintenance?. Peter On 7/10/08, Kyle Spaans wrote: > > OK, is it just me? All of my messages seem to take days to get to the list > because the moderator has to approve them? > I'm pretty sure that I'm subscribed properly. > > > On Thu, Jul 10, 2008 at 09:02:25AM -0400, Robert G. Brown wrote: > > I do that the Titorheads will attack and I'll have to once again > > unconvincingly deny that I am, in fact, John Titor. > > > John Titor? WOW! I remember reading about ``him'' a couple of years ago > when I was in highschool. > I never thought I'd see a reference to ``him'' in a place like this! ;) > > > On a "programming languages" note, my school may be a little better than > most. The first-year CS cirriculum has moved entirely away from Java, and > the main class is now taught in Scheme, with C as the steping-stone to > 2nd-year imperative languages. > There are also "CS for non-majors" classes in Python I believe. I can't > comment on upper-year project type things though, because I'm only in my 2nd > year. :) > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080716/224417c9/attachment.html From peter.st.john at gmail.com Wed Jul 16 13:56:09 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] how do I get started? In-Reply-To: <102077.53774.qm@web65415.mail.ac4.yahoo.com> References: <102077.53774.qm@web65415.mail.ac4.yahoo.com> Message-ID: Amir, Did you try any of the documents referenced at wiki, http://en.wikipedia.org/wiki/Beowulf_%28computing%29#See_also? There is a free downloadable book at Professor Brown's website, start at http://www.phy.duke.edu/~rgb/Beowulf/beowulf.php And I believe yes, you can use SSH to replace RSH pretty much anywhere. I haven't done that myself but I'm only a hobbyist. Good luck, Peter On 7/16/08, Amir Saad wrote: > > I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I did > a lot of search and yet couldn't find a getting started guide. Any ideas? > > All the articles I found are using RSH, is it possible to replace this with > password-less SSH? Would it be possible to use Ubuntu? Which packages should > I install to build the cluster? > > Please advice. > > Thank you > > Amir > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080716/64f6019d/attachment.html From landman at scalableinformatics.com Wed Jul 16 14:47:35 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <20080710150040.GA18058@student.math> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> Message-ID: <487E6C77.6080706@scalableinformatics.com> Kyle Spaans wrote: > OK, is it just me? All of my messages seem to take days to get to the list because the moderator has to approve them? > I'm pretty sure that I'm subscribed properly. Don is a massively overworked/underpaid person. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From rgb at phy.duke.edu Wed Jul 16 15:00:04 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <48751DD8.1080005@moene.indiv.nluug.nl> References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <20080708175844.GA5087@nlxdcldnl2.cl.intel.com> <48751DD8.1080005@moene.indiv.nluug.nl> Message-ID: On Wed, 9 Jul 2008, Toon Moene wrote: > Lombard, David N wrote: > >> All were after. MCC dates from '91; > > Ah, that explains :-) > > As I bought a NeXT workstation in November '91 (at the tune of ~ $ 20,000) I > had no need for a PC-type Unix-look-a-like. You have my deep sympathies. 
I had to manage a stable of some ten NextStations, and while their GUI was lovely, I wanted to apply a blunt instrument to the inventor of "netinfo" as well as to the (probably same) joker who claimed that the NeXT could run on a standard Unix /etc model with NIS etc. I'd save a few good licks for the person who left all the blank commands in -- complete with their man pages -- and who wrote the NeXT versions but didn't bother making the binaries actually correspond to their man pages. Nice (if expensive) "Unix-like" personal systems; really dark side stuff to try to manage in a scalable way in a mixed Unix environment of mostly-Sun workstations. > Besides, the NeXT was good enough to start working on g77 ... It could even > run our (then operational) weather forecasting model. They also ran Mathematica beautifully. That was primarily why we got them -- they did a find MMA notebook, and we had a prof who was writing great notebooks to support some core graduate courses. But to do computations, for my $20K -- or even my $10K, as we got "great" prices -- I was all Sun. Or maybe a little bit SGI, as they had some nice (but much more expensive) MIPS based workstations that I ran a fair bit of code on. I'm a late finger -- I didn't start using linux much until 1994 or thereabouts -- some version of SLS first, and then transitioned to slackware on genuine floppies. At first I only ran it at home on 486s and then a couple of different cheap 586 clones, but in 1996 I bought some of the first dual Pentium Pro's and popped slackware on them with the 2.0.0 kernel and made them into a cluster. They weren't really bootable and stable until 2.0.5 or 2.0.6 or thereabouts -- the early SMP kernels were pretty disasterous and actually would do things like eat disks because of locking problems and some serious bugs in disk and network. So MCC's day came and went without my notice... and even SLS was a bit thin as I absolutely needed stuff like TeX and IIRC it didn't have it but slackware did. But it was long ago and I pitched my SLS and Slackware floppies (smiling face of bob and all) a few years ago, since floppies are obsolete. rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From rgb at phy.duke.edu Wed Jul 16 15:22:04 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] A press release In-Reply-To: <20080710150040.GA18058@student.math> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> Message-ID: On Thu, 10 Jul 2008, Kyle Spaans wrote: > OK, is it just me? All of my messages seem to take days to get to the list because the moderator has to approve them? > I'm pretty sure that I'm subscribed properly. > > On Thu, Jul 10, 2008 at 09:02:25AM -0400, Robert G. Brown wrote: >> I do that the Titorheads will attack and I'll have to once again >> unconvincingly deny that I am, in fact, John Titor. > > John Titor? WOW! I remember reading about ``him'' a couple of years ago when I was in highschool. > I never thought I'd see a reference to ``him'' in a place like this! ;) I'd never heard of him until the moment that the 'heads decided I was he, so to speak. 
Now I know far, far too much about him, and have actually read and critiqued a pretty fair chunk of his primary time travel threads. If only I could figure out some way of exploiting this to make money... but writing a book entitled "I am not John Titor and Think He's a Dweeb" seems kinda lame. > On a "programming languages" note, my school may be a little better > than most. The first-year CS cirriculum has moved entirely away from > Java, and the main class is now taught in Scheme, with C as the > steping-stone to 2nd-year imperative languages. > > There are also "CS for non-majors" classes in Python I believe. I > can't comment on upper-year project type things though, because I'm only > in my 2nd year. :) C is good. Scheme I'm not so sure about. Maybe it's just my curmudgeonly upbringing, but learning to code in a "standard compiled language" has its benefits, if only separating the people destined for coding greatness from the ones who should become accountants or lawyers or something instead. In fact, it would be good for teachers of programming to take note of things like the size of the application base written in a language and what kinds of applications are represented there. Number of commonly used applications written in Scheme, hmmmm, I'm guessing that is a number of order unity, much as was the case with Pascal (another idiotic favorite of CS teachers over the years). Then count C -- uhhh, that would be tens of thousands, including nearly all the systems code in the universe. C++ -- thousands at least, probably tens of thousands if one includes Windoze. Fortran -- hundreds of thousands, although one has to be a total masochist to write character code or systems code in fortran (but it is quite nice for straight numerical code). Even Lisp (Scheme's grandparent) probably has a decent code base, and then of the scripting languages perl and python each are easily in the thousands, with octave/matlab a strong contender in the straight numerical arena. On this scale, even java isn't insane, nor is php. Lots of apps, some of them quite solid and professional. Given ALL OF THESE CHOICES -- each with an enormous base of programs, each with a strong base of commercial and research demand, each with a strong programming model that favors the development of certain kinds of commonly needed programs, why teach a language nobody actually uses to write applications, probably for some really excellent reasons? I personally would favor teaching coding with any of the really gritty languages -- C (yeah!) or Fortran, compiled, C++ maybe as a followup, perl if you're not a fascist coder, python if you are. But this is an old and standard rant by now. rgb > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From john.hearns at streamline-computing.com Wed Jul 16 15:29:05 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: References: <02f001c8cd6c$3c5c1990$4f833909@oberon> <487B6433.80406@lanl.gov> Message-ID: <1216247355.6680.27.camel@Vigor13> On Thu, 2008-07-17 at 04:42 +1000, Andrew Robbie (GMail) wrote: > On Tue, Jul 15, 2008 at 12:35 AM, Josip Loncaric wrote: > > > > Another good link: > > > > http://www.lanl.gov/orgs/hpc/roadrunner/rrtechnicalseminars2008.shtml > > As I was reading the slides, one question leap out at me: they have a > huge IB network connecting every 'node', but instead of connecting > direct to storage, this connects to an IB-to-10GigE bridge/IO > processor board. Why? If they used avoided the protocol conversion > going on that would be inherently simpler, and I've seen nothing to > indicate that 10GigE is faster or cheaper (certainly not cheaper!). > > Is this (ethernet) a Panasas requirement? Yes indeed it is. I'll stick my head up above the parapet as someone who cares for and feeds several Panasas installations, though I don't work for the company. For installations using Infiniband, Panasas will advise on how to implement Infiniband to (10gig) Ethernet storage routers. I take it here that Roadrunner is using a blade in the Infiniband switch, rather than a discreet router. This sounds a very good idea to me, and I looked into it for a particular project in the UK, though we didn't go for that approach in the end. To answer your question more directly, Panasas is a storage cluster to complement your compute cluster. Each storage blade is connected into a shelf (chassis) with an internal ethernet network. Each shelf is then connected to your ethernet switch with at least 4Gbps of bandwidth. It might look on the front like a big RAID array - and hence the questions as to why you don't have a fibrechannel or native Infiniband connector on it. But its not really a RAID array - its a storage cluster. Files are RAIDed over the filesystem and your client stripes IO over several storage blades at any one time. And please lets not start any Infiniband good/ethernet bad wars here, or bandwidth willy-waving. Panasas have made some damned good engineering decisions and their system scales like crazy - just what you need for something like Roadrunner. From john.hearns at streamline-computing.com Wed Jul 16 15:50:28 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <1216247355.6680.27.camel@Vigor13> References: <02f001c8cd6c$3c5c1990$4f833909@oberon> <487B6433.80406@lanl.gov> <1216247355.6680.27.camel@Vigor13> Message-ID: <1216248639.6680.30.camel@Vigor13> On Wed, 2008-07-16 at 23:29 +0100, John Hearns wrote: > > To answer your question more directly, Panasas is a storage cluster to > complement your compute cluster. Each storage blade is connected into a > shelf (chassis) with an internal ethernet network. Each shelf is then > connected to your ethernet switch with at least 4Gbps of bandwidth. Before I damn Panasas with faint praise, there is an option for higher bandwidth connectivity which I'd hazard a guess is in use here. 
And remember that's the connectivity to each one of the chassis - not to the system as a whole. From atchley at myri.com Wed Jul 16 18:09:31 2008 From: atchley at myri.com (Scott Atchley) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] Roadrunner picture In-Reply-To: <1216248639.6680.30.camel@Vigor13> References: <02f001c8cd6c$3c5c1990$4f833909@oberon> <487B6433.80406@lanl.gov> <1216247355.6680.27.camel@Vigor13> <1216248639.6680.30.camel@Vigor13> Message-ID: On Jul 16, 2008, at 6:50 PM, John Hearns wrote: > On Wed, 2008-07-16 at 23:29 +0100, John Hearns wrote: > >> To answer your question more directly, Panasas is a storage cluster >> to >> complement your compute cluster. Each storage blade is connected >> into a >> shelf (chassis) with an internal ethernet network. Each shelf is then >> connected to your ethernet switch with at least 4Gbps of bandwidth. > > Before I damn Panasas with faint praise, there is an option for higher > bandwidth connectivity which I'd hazard a guess is in use here. > And remember that's the connectivity to each one of the chassis - > not to > the system as a whole. They do. Here is more info: http://www.byteandswitch.com/document.asp?doc_id=155938&WT.svl=news1_2 Scott From hahn at mcmaster.ca Wed Jul 16 21:36:30 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] cuda benchmark In-Reply-To: References: Message-ID: > There has been a lot of talk here lately about cuda, i'm just curious what > people here use to benchmark (if at all) GPU's. I have access to twelve I don't think the gp-gpu world is that far developed... > nodes of a larger cluster that have high end nvidia cards and infiniband. > It's used for visual processing of CFD. > Does anyone have a simple micro benchmark that can be spread across the > nodes and crunch some numbers using cuda? I haven't been able find anything > like that or i'm just looking in the wrong place. cuda concerns itself with a single system, so I can't see why it would return different results on different nodes... From mfatica at gmail.com Wed Jul 16 21:56:53 2008 From: mfatica at gmail.com (Massimiliano Fatica) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] cuda benchmark In-Reply-To: References: Message-ID: <8e6393ac0807162156y54b8dcddg18d3692c707a0fa6@mail.gmail.com> You can use NAMD. http://www.ks.uiuc.edu/Research/vmd/cuda/ Massimiliano On Wed, Jul 16, 2008 at 9:36 PM, Mark Hahn wrote: > There has been a lot of talk here lately about cuda, i'm just curious what >> people here use to benchmark (if at all) GPU's. I have access to twelve >> > > I don't think the gp-gpu world is that far developed... > > nodes of a larger cluster that have high end nvidia cards and infiniband. >> It's used for visual processing of CFD. >> Does anyone have a simple micro benchmark that can be spread across the >> nodes and crunch some numbers using cuda? I haven't been able find >> anything >> like that or i'm just looking in the wrong place. >> > > cuda concerns itself with a single system, so I can't see why it would > return different results on different nodes... > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... 
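On Michael's original question: since, as Mark says, CUDA itself only sees a single box, the low-tech baseline is to run a per-node GPU micro-benchmark everywhere and collect the numbers, for example the deviceQuery and bandwidthTest samples that ship with the CUDA SDK. A sketch follows, assuming the SDK samples are built in the same location on all twelve nodes and that passwordless ssh works; the path and the node list file are assumptions.

    SDK=~/NVIDIA_CUDA_SDK/bin/linux/release      # wherever the SDK samples were built
    for n in $(cat gpunodes.txt); do             # gpunodes.txt: one hostname per line
        echo "== $n =="
        ssh "$n" "$SDK/deviceQuery; $SDK/bandwidthTest"
    done > gpu-baseline.log 2>&1

That only establishes per-node health and host-to-device bandwidth; for something that exercises CUDA and the interconnect together, the NAMD/VMD CUDA work Massimiliano points to is a better fit.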
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080716/5eff1b19/attachment.html From Bogdan.Costescu at iwr.uni-heidelberg.de Thu Jul 17 02:07:23 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] List moderation (was: A press release) In-Reply-To: <487E6C77.6080706@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <487E6C77.6080706@scalableinformatics.com> Message-ID: On Wed, 16 Jul 2008, Joe Landman wrote: > Kyle Spaans wrote: >> OK, is it just me? All of my messages seem to take days to get to the list >> because the moderator has to approve them? >> I'm pretty sure that I'm subscribed properly. > > Don is a massively overworked/underpaid person. How about sharing the moderation load ? Especially with people from different timezones this could reduce considerably the time spent by messages in the moderation queue. And then Don could also have holidays from time to time :-) -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From mark.kosmowski at gmail.com Thu Jul 17 07:03:30 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:26 2009 Subject: [Beowulf] how do I get started? Message-ID: I have a small cluster too (ok, had, I'm condensing to one big RAM workstation at the moment). Mine was a 3 node dual Opteron setup. The first trick is to get each node to be able to ssh to every other node without getting a password prompt. It doesn't matter for the calculation whether this is done by passwordless ssh or using some keychain type applet. All that matters for the calculation is that each node be able to ssh without prompt to each other node. This means n! checks. For my 3 node baby cluster I did this manually, since 3! is 6. More nodes - might want to make a script or something. Next, you need to make your application parellel. This typically involves compiling with some sort of MPI. I use OpenMPI. MPICH is another option. This is going to be part of the learning curve. When you can and get an output for each CPU (each core?) you're ready to start wrestling with an MPI compile of your application code. If you're not used to compiling your own code, there may be a bit of a learning curve for this also. Good luck! Mark E. Kosmowski > > Amir, > Did you try any of the documents referenced at wiki, > http://en.wikipedia.org/wiki/Beowulf_%28computing%29#See_also? > > There is a free downloadable book at Professor Brown's website, start at > http://www.phy.duke.edu/~rgb/Beowulf/beowulf.php > > And I believe yes, you can use SSH to replace RSH pretty much anywhere. I > haven't done that myself but I'm only a hobbyist. > > Good luck, > Peter > > > On 7/16/08, Amir Saad wrote: > > > > I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I did > > a lot of search and yet couldn't find a getting started guide. Any ideas? > > > > All the articles I found are using RSH, is it possible to replace this with > > password-less SSH? Would it be possible to use Ubuntu? Which packages should > > I install to build the cluster? > > > > Please advice. 
> > > > Thank you > > > > Amir > > > > From kyron at neuralbs.com Thu Jul 17 07:55:45 2008 From: kyron at neuralbs.com (Eric Thibodeau) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] how do I get started? In-Reply-To: References: Message-ID: <487F5D71.6070709@neuralbs.com> Amir: See below, Mark Kosmowski wrote: > I have a small cluster too (ok, had, I'm condensing to one big RAM > workstation at the moment). Mine was a 3 node dual Opteron setup. > > The first trick is to get each node to be able to ssh to every other > node without getting a password prompt. It doesn't matter for the > calculation whether this is done by passwordless ssh or using some > keychain type applet. All that matters for the calculation is that > each node be able to ssh without prompt to each other node. This > means n! checks. For my 3 node baby cluster I did this manually, > since 3! is 6. More nodes - might want to make a script or something. > Here is an example with one-lined "script" for the ssh password less logon: http://wiki.neuralbs.com/index.php/Howto_Build_a_Basic_Gentoo_Beowulf_Cluster#Passwordless_Logon_to_the_Nodes_and_Node_List_Creation > Next, you need to make your application parellel. This typically > involves compiling with some sort of MPI. I use OpenMPI. MPICH is > another option. This is going to be part of the learning curve. When > you can and get an output for each CPU (each core?) > you're ready to start wrestling with an MPI compile of your > application code. > > If you're not used to compiling your own code, there may be a bit of a > learning curve for this also. > Well... once I "finish" my Google Summer of Code project, people like you will have a fun playground with graphical examples (as well as serious people for serious HPC...but they won't admit to that and keep using RH as an OS :P) http://soc.gentooexperimental.org/projects/show/gentoo-cluster-seed > Good luck! > > Mark E. Kosmowski > Enjoy, Eric Thibodeau >> Amir, >> Did you try any of the documents referenced at wiki, >> http://en.wikipedia.org/wiki/Beowulf_%28computing%29#See_also? >> >> There is a free downloadable book at Professor Brown's website, start at >> http://www.phy.duke.edu/~rgb/Beowulf/beowulf.php >> >> And I believe yes, you can use SSH to replace RSH pretty much anywhere. I >> haven't done that myself but I'm only a hobbyist. >> >> Good luck, >> Peter >> >> >> On 7/16/08, Amir Saad wrote: >> >>> I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I did >>> a lot of search and yet couldn't find a getting started guide. Any ideas? >>> >>> All the articles I found are using RSH, is it possible to replace this with >>> password-less SSH? Would it be possible to use Ubuntu? Which packages should >>> I install to build the cluster? >>> >>> Please advice. >>> >>> Thank you >>> >>> Amir >>> >>> >>> > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080717/97f01ea6/attachment.html From jpilldev at gmail.com Thu Jul 17 20:46:32 2008 From: jpilldev at gmail.com (J Pill) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] how do I get started? In-Reply-To: References: <487F5D71.6070709@neuralbs.com> Message-ID: Hello. 
i'm working with beowulf around 2 years, but i think that mi knowledge is not enough, so, is posible make a web portal about an beowulf academy? or that exist? Thanks a lot. > > On Thu, Jul 17, 2008 at 9:55 AM, Eric Thibodeau > wrote: > >> Amir: >> See below, >> >> Mark Kosmowski wrote: >> >> I have a small cluster too (ok, had, I'm condensing to one big RAM >> workstation at the moment). Mine was a 3 node dual Opteron setup. >> >> The first trick is to get each node to be able to ssh to every other >> node without getting a password prompt. It doesn't matter for the >> calculation whether this is done by passwordless ssh or using some >> keychain type applet. All that matters for the calculation is that >> each node be able to ssh without prompt to each other node. This >> means n! checks. For my 3 node baby cluster I did this manually, >> since 3! is 6. More nodes - might want to make a script or something. >> >> >> Here is an example with one-lined "script" for the ssh password less >> logon: >> >> >> http://wiki.neuralbs.com/index.php/Howto_Build_a_Basic_Gentoo_Beowulf_Cluster#Passwordless_Logon_to_the_Nodes_and_Node_List_Creation >> >> Next, you need to make your application parellel. This typically >> involves compiling with some sort of MPI. I use OpenMPI. MPICH is >> another option. This is going to be part of the learning curve. When >> you can and get an output for each CPU (each core?) >> you're ready to start wrestling with an MPI compile of your >> application code. >> >> If you're not used to compiling your own code, there may be a bit of a >> learning curve for this also. >> >> >> Well... once I "finish" my Google Summer of Code project, people like you >> will have a fun playground with graphical examples (as well as serious >> people for serious HPC...but they won't admit to that and keep using RH as >> an OS :P) >> >> http://soc.gentooexperimental.org/projects/show/gentoo-cluster-seed >> >> Good luck! >> >> Mark E. Kosmowski >> >> >> Enjoy, >> >> Eric Thibodeau >> >> Amir, >> Did you try any of the documents referenced at wiki,http://en.wikipedia.org/wiki/Beowulf_%28computing%29#See_also? >> >> There is a free downloadable book at Professor Brown's website, start athttp://www.phy.duke.edu/~rgb/Beowulf/beowulf.php >> >> And I believe yes, you can use SSH to replace RSH pretty much anywhere. I >> haven't done that myself but I'm only a hobbyist. >> >> Good luck, >> Peter >> >> >> On 7/16/08, Amir Saad wrote: >> >> >> I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I did >> a lot of search and yet couldn't find a getting started guide. Any ideas? >> >> All the articles I found are using RSH, is it possible to replace this with >> password-less SSH? Would it be possible to use Ubuntu? Which packages should >> I install to build the cluster? >> >> Please advice. >> >> Thank you >> >> Amir >> >> >> >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> >> >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080717/81367374/attachment.html From gdjacobs at gmail.com Fri Jul 18 06:18:59 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] ScyId In-Reply-To: References: Message-ID: <48809843.4000603@gmail.com> Robert Kubrick wrote: > I was browsing through the Penguing Computing website and on the ScyId > page I read the following: > > Single System Process Space > > * Cluster acts and feels like a single virtual machine > > Does this mean that you can now write a program that allocates SYSV > shared memory and semaphores and automatically spread your allocation > over the entire cluster? http://bproc.sourceforge.net/ No, process control and migration can be handled through the shell of the master, but memory is not shared. From gdjacobs at gmail.com Fri Jul 18 06:51:18 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> Message-ID: <48809FD6.8070907@gmail.com> Robert G. Brown wrote: > > C is good. Scheme I'm not so sure about. Maybe it's just my > curmudgeonly upbringing, but learning to code in a "standard compiled > language" has its benefits, if only separating the people destined for > coding greatness from the ones who should become accountants or lawyers > or something instead. > > In fact, it would be good for teachers of programming to take note of > things like the size of the application base written in a language and > what kinds of applications are represented there. Number of commonly > used applications written in Scheme, hmmmm, I'm guessing that is a > number of order unity, much as was the case with Pascal (another idiotic > favorite of CS teachers over the years). > > Then count C -- uhhh, that would be tens of thousands, including nearly > all the systems code in the universe. C++ -- thousands at least, > probably tens of thousands if one includes Windoze. Fortran -- hundreds > of thousands, although one has to be a total masochist to write > character code or systems code in fortran (but it is quite nice for > straight numerical code). Even Lisp (Scheme's grandparent) probably has > a decent code base, and then of the scripting languages perl and python > each are easily in the thousands, with octave/matlab a strong contender > in the straight numerical arena. On this scale, even java isn't insane, > nor is php. Lots of apps, some of them quite solid and professional. > > Given ALL OF THESE CHOICES -- each with an enormous base of programs, > each with a strong base of commercial and research demand, each with a > strong programming model that favors the development of certain kinds of > commonly needed programs, why teach a language nobody actually uses to > write applications, probably for some really excellent reasons? > > I personally would favor teaching coding with any of the really gritty > languages -- C (yeah!) or Fortran, compiled, C++ maybe as a followup, > perl if you're not a fascist coder, python if you are. > > But this is an old and standard rant by now. > > rgb I have never programmed Scheme, so I can't comment on it directly. I have a good base in most of the procedural programming languages, so I'll stick to that... Point in fact, Pascal was seldom used in production because Wirth didn't design it for that. 
Pascal was designed to be syntactically clean in order to make the semantic structure of the software more evident to the novice. It fulfilled this function well, and to this day remains a good introductory language. C is more appropriate for lowish level tasks (which was almost everything on MS-DOS computers) because C gives you the versatility and control required to extract every ounce of performance from your application. Unfortunately, C almost seems to reinforce what I would consider bad habits: pointer arithmetic, preprocessor abuse, rampant inlining. At the end of a long coding session, I very often have to consciously avoid these things which, while fun, are deadly as far as review and maintainability is concerned. It's worth noting that Borland Delphi (based on Pascal) was at one time popular for RAD programming on Windows, and a much superior alternative to VB. -- Geoffrey D. Jacobs From gdjacobs at gmail.com Fri Jul 18 06:58:38 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] how do I get started? In-Reply-To: <102077.53774.qm@web65415.mail.ac4.yahoo.com> References: <102077.53774.qm@web65415.mail.ac4.yahoo.com> Message-ID: <4880A18E.9040605@gmail.com> Amir Saad wrote: > I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I > did a lot of search and yet couldn't find a getting started guide. Any > ideas? > > All the articles I found are using RSH, is it possible to replace this > with password-less SSH? Would it be possible to use Ubuntu? Which > packages should I install to build the cluster? Yes, yes, see below. > Please advice. > > Thank you > > Amir The packages you need are openssh-server and openssh-client. The server should be on all machines you must sign in to, the client on the machines you wish to sign in from. Here's a guide on setting up ssh so passwords will not be required (just keys): http://www.debian-administration.org/articles/152 -- Geoffrey D. Jacobs From landman at scalableinformatics.com Fri Jul 18 07:13:08 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] A press release In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> Message-ID: <4880A4F4.4050509@scalableinformatics.com> Robert G. Brown wrote: > On Thu, 10 Jul 2008, Kyle Spaans wrote: [...] >> >> There are also "CS for non-majors" classes in Python I believe. I >> can't comment on upper-year project type things though, because I'm only >> in my 2nd year. :) > > C is good. Scheme I'm not so sure about. Maybe it's just my > curmudgeonly upbringing, but learning to code in a "standard compiled > language" has its benefits, if only separating the people destined for > coding greatness from the ones who should become accountants or lawyers > or something instead. Eeekk... return of the language wars. They all have the same morphology Person1: "My language is better than yours" Person2: "Oh yeah? At least I use a real OS, CP/M!" and its downhill in a cascade of responses from there ... [...] > But this is an old and standard rant by now. :) But aren't they fun to rehash ? 
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From Hakon.Bugge at scali.com Fri Jul 18 07:46:59 2008 From: Hakon.Bugge at scali.com (=?iso-8859-1?Q?H=E5kon?= Bugge) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Again about NUMA (numactl and taskset) In-Reply-To: <48648B2B.40405@myri.com> References: <20080624101512.A0C2435A90B@mail.scali.no> <674893953.65671214349788737.JavaMail.root@zimbra.vpac.org> <20080626090551.7EC2A35A8CC@mail.scali.no> <48648B2B.40405@myri.com> Message-ID: <20080718144659.EF5BC35A751@mail.scali.no> At 08:39 27.06.2008, Patrick Geoffray wrote: >Hi Hakon, > >H?kon Bugge wrote: >>This is information we're using to optimize how >>pnt-to-pnt communication is implemented. The >>code-base involved is fairly complicated and I >>do not expect resource management systems to cope with it. > >Why not ? It's its job to know the resources it >has to manage. The resource manager has more >information than you, it does not have to detect >at runtime for each job, and it can manage cores >allocation across jobs. You cannot expect the >granularity of the allocation to stay at the >node level with the core count increasing. This raises two questions: a) Which job schedulers are able to optimize placement on cores thereby _improving_ application performance? b) which job schedulers are able to deduct which cores share a L3 cache and are situated on the same socket? ... and a clarification. Systems using Scali MPI Connect _can_ have finer granularity than the node level; the job scheduler must just not oversubscribe. Assignment of cores to processes is _dynamically_ done by Scali MPI Connect. >If the MPI implementation does the spawning, it >should definitively have support to enforce core >affinity (most do AFAIK). However, core affinity >should be dictated by the scheduler. Heck, the >MPI implementation should not do the spawning in the first place. > >Historically, resource managers have been pretty >dumb. These days, there is enough competition in this domain to expect better. I am fine with the schedulers dictating it, but not if the performance is hurt. H?kon From gus at ldeo.columbia.edu Thu Jul 17 08:47:16 2008 From: gus at ldeo.columbia.edu (Gus Correa) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] How much RAM per core is right? Message-ID: <487F6984.3000006@ldeo.columbia.edu> Dear Beowulf list subscribers A quick question: How much memory per core/processor is right for a Beowulf cluster node? *** I am new to this list, so please forgive me if my question hurts the list etiquette. I will take the question back if so. *** To qualify my question, here are some details of my problem: I plan to buy 8GB per node on a dual-processor quad-core machine (1GB per core), most likely AMD Opteron. I based this on some classic scaling calculations based on the programs we use and the computations we do, and shrunk it down somewhat due to budget constraints. However, I also saw this number on several postings on other mailing lists, where people described their cluster configurations, which gave me some confidence 1GB per core is acceptable and used by a number of people out there. Still, I would like to hear your thoughts and experiences about this. I may be missing an important point, and your advice is important and welcome. 
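Picking up Hakon's point in the numactl/taskset thread above: when neither the scheduler nor the MPI library does placement for you, it can be done by hand from the job script. A minimal sketch for a dual-socket Opteron node, assuming numactl is installed and each socket shows up as one NUMA node; the binary name and the socket chosen are just illustrations.

    numactl --hardware                               # list the NUMA nodes with their cores and memory
    numactl --cpunodebind=0 --membind=0 ./mymodel    # run on socket 0, allocate from socket 0's memory
    taskset -c 0-3 ./mymodel                         # or pin to an explicit core set, leaving memory policy alone

Whether explicit binding helps or hurts depends on how the MPI library already places its ranks, which is exactly the tension Hakon describes.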
Actually not long ago RAM-per-core ratio used to be 512MB per core (which were physical CPUs back then), and it seems to me some non-PC HPC machines (IBM BlueGene, SiCortex machines, etc) still use the 512MB RAM-per-core ratio. PC server motherboards can fit 32, 64, even 128GB of RAM these days. Hence, one can grow really big, particularly for desktop/interactive applications like Matlab, etc. However, what IBM and others do (on machines of admittedly very different architecture) makes me think whether for PC cluster nodes the "big is better" philosophy is wise, or if there is a saturation point for the efficiency of RAM-to-core ratio. What do you think? For PC-based cluster compute nodes, is 1GB per core right? Is it too much? Is it too little? "Big is better" is really the best, and minimalism is just an excuse for the cheap and the poor? We do climate model number crunching with MPI here. We use domain decomposition, finite-differences, some FFTs, etc. The "models" are "memory intensive", with big 3D arrays being read from and written to memory all the time, and not so big 2D arrays (sub domain boundary values) being passed across processes through MPI at every time step of the simulation. I/O happens at a slower pace, typically every ~100 time steps or more, and can be either funneled through a master process, or distributed across all processes. One goal of our new "cluster-to-be" is to run the programs at higher spatial resolution. Most algorithms that march the solution in time are conditionally stable. Therefore, due to the Courant-Friedrichs-Levy stability condition, the time step must be proportional to the smallest spatial grid interval. Hence, for 3D climate problems, the computational effort scales as N**4, where N is a typical number of grid points in a spatial dimension. Our old dual-processor single-core production cluster has 1GB per node (512MB per "core"). Most of our models fit this configuration. The larger problems use up to 70-80% RAM, but safely avoid memory paging, process switching, etc. However, on multicore machines there are other issues to consider, particularly memory bandwidth, cache size vs. RAM size, NUMA, cache eviction, etc, etc. So, the classic scaling I mentioned above may need to be combined with memory bandwidth and other factors of this kind. In any case, at this point it seems to me that "get as much RAM as your money can buy and your motherboard can fit" may not be a wise choice. Is there anybody out there using 64 or 128GB per node? I wonder if there is an optimal choice of RAM-per-core. What is your rule of thumb? Or does it depend? And on what does it depend? Many thanks, Gus Correa -- --------------------------------------------------------------------- Gustavo J. Ponce Correa, PhD - Email: gus@ldeo.columbia.edu Lamont-Doherty Earth Observatory - Columbia University P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA --------------------------------------------------------------------- From dcardosoa at yahoo.com.br Fri Jul 18 06:50:43 2008 From: dcardosoa at yahoo.com.br (Daniel Cardoso Alves) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Building a BeoWulf Message-ID: <633706.21976.qm@web65602.mail.ac4.yahoo.com> Hi people... I am a newbie on Clusters... I ever have heard about BeoWulf cluster and now I need implement a Cluster... I would like where I can get the BeoWulf and How to I can build a BoeWulf Cluster... Thanks! Best Regards Daniel Cardoso Novos endere?os, o Yahoo! que voc? conhece. 
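To put rough numbers on that scaling argument, here is a back-of-the-envelope sketch; the grid and field counts below are invented for illustration, not the actual model. The 3D state arrays make memory grow roughly as N^3, while the CFL-limited time step makes the work per simulated year grow as N^4, so doubling the resolution in every direction needs about 8x the RAM but about 16x the compute.

    # rough per-copy state size for F double-precision 3D fields (hypothetical sizes)
    Nx=288; Ny=181; Nz=40; F=50
    echo "$((8 * F * Nx * Ny * Nz / 1024 / 1024)) MiB per copy of the model state"
    # prints 795 MiB with these numbers; doubling Nx, Ny and Nz would make it roughly 6.2 GiB

In other words, the RAM-per-core budget is set mostly by the target resolution and the number of prognostic fields, rather than by any general rule of thumb.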
Crie um email novo com a sua cara @ymail.com ou @rocketmail.com. http://br.new.mail.yahoo.com/addresses -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080718/6be89bf2/attachment.html From peter.st.john at gmail.com Fri Jul 18 08:32:27 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Building a BeoWulf In-Reply-To: <633706.21976.qm@web65602.mail.ac4.yahoo.com> References: <633706.21976.qm@web65602.mail.ac4.yahoo.com> Message-ID: Daniel, Buenas dias. I'd start with these two resources: wiki, http://en.wikipedia.org/wiki/Beowulf_%28computing%29#See_also? There is a free downloadable book at Professor Brown's website, start at http://www.phy.duke.edu/~rgb/Beowulf/beowulf.php Buenas suerte, Peter (1,$s/Espan~ol/Portugues/g) On 7/18/08, Daniel Cardoso Alves wrote: > > Hi people... > > I am a newbie on Clusters... > I ever have heard about BeoWulf cluster and now I need implement a > Cluster... > > I would like where I can get the BeoWulf and How to I can build a BoeWulf > Cluster... > > Thanks! > > Best Regards > Daniel Cardoso > > > ------------------------------ > Novos endere?os, o Yahoo! que voc? conhece. Crie um email novocom a sua cara @ > ymail.com ou @rocketmail.com. > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080718/ba4a3c43/attachment.html From hahn at mcmaster.ca Fri Jul 18 09:15:37 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] How much RAM per core is right? In-Reply-To: <487F6984.3000006@ldeo.columbia.edu> References: <487F6984.3000006@ldeo.columbia.edu> Message-ID: > How much memory per core/processor is right for a Beowulf cluster node? easy: depends on your app. > I plan to buy 8GB per node on a dual-processor quad-core machine (1GB per > core), that's reasonably minimal. > computations we do, and shrunk it down somewhat due to budget constraints. memory is relatively cheap right now. > Actually not long ago RAM-per-core ratio used to be 512MB per core (which > were physical CPUs back then), there's a large population of jobs which have trivial memory footprints (just a few MB). but even 5 years ago, lots of non-obscure computations needed more like 2 GB/core. > and it seems to me some non-PC HPC machines (IBM BlueGene, SiCortex machines, > etc) still use > the 512MB RAM-per-core ratio. they're outliers, since those two specific machines are what I'd call "many-mini-core" machines designed primarily for power efficiency and which assume that the workload will be specialized, highly-tuned massively parallel codes. note that they have exceptionally slow processors, for instance, and relatively fast/elaborate network fabrics. > For PC-based cluster compute nodes, is 1GB per core right? > Is it too much? > Is it too little? 2G/core with dual-socket quad-core machines seems right to me. 4G/core is definitely needed by some people, but many fewer (typical power-law falloff), but is quite a lot more expensive. ultimately, it also depends on the dimm socket config of your hardware. 
for instance, if you go with 4-socket amd boxes (of course more expensive), you can use lower-density dimms, so a higher mem/core ratio. it may be that the upcoming nehalem chips will permit this as well. > "Big is better" is really the best, and minimalism is just an excuse for the > cheap and the poor? no - too much memory will definitely hurt, not just the pocketbook. normally, any memory controller can run at full speed for only a limited number of dimms (actually, sides of dimms, since dual-sided dimms usually count as 2 loads.) > Is there anybody out there using 64 or 128GB per node? sure - we expect to buy some fat nodes soon, but the mainstream nodes will probably be 2G/core (16G/node). From jmdavis1 at vcu.edu Fri Jul 18 09:45:29 2008 From: jmdavis1 at vcu.edu (Mike Davis) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] How much RAM per core is right? In-Reply-To: References: <487F6984.3000006@ldeo.columbia.edu> Message-ID: <4880C8A9.8050805@vcu.edu> Since 2004, we have used 2GB/core as our standard memory. We provide a few machines with 4GB/core and even 8GB/core to meet the needs of applications requiring significantly more RAM. From lindahl at pbm.com Fri Jul 18 11:43:49 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] How much RAM per core is right? In-Reply-To: <487F6984.3000006@ldeo.columbia.edu> References: <487F6984.3000006@ldeo.columbia.edu> Message-ID: <20080718184348.GC20324@bx9.net> On Thu, Jul 17, 2008 at 11:47:16AM -0400, Gus Correa wrote: > How much memory per core/processor is right for a Beowulf cluster node? It really depends on your apps. Some people spend more than 50% of their $$ on ram, others only need a few hundred megabytes per node. A few years ago, it was the case that 1 GB/core was a number that many clusters used, but I suspect it's crept up since them. > In any case, at this point it seems to me that > "get as much RAM as your money can buy and your motherboard can fit" may > not be a wise choice. > Is there anybody out there using 64 or 128GB per node? Sure, because their problems call for it. For example, many CFD computations are just trying to find steady-state airflow around an object. These computations don't run for very many timesteps, with a very big grid, and huge messages. Now in your case it sounds like you know how much RAM to buy, given your experience on your existing machine. You can project to your new cluster: "I have $X. If I bought P cores, 1 GB/core, that gives me an N*N*L grid, it will take H hours to finish a 1000 year run. OK, that finished too quickly. So I'll buy fewer cores and more memory, run a bigger grid, that takes longer..." Iterate until done. BTW, you said it was N**4: isn't the vertical direction treated very differently from lat/lon? -- greg From gerry.creager at tamu.edu Fri Jul 18 12:07:16 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] How much RAM per core is right? In-Reply-To: <4880C8A9.8050805@vcu.edu> References: <487F6984.3000006@ldeo.columbia.edu> <4880C8A9.8050805@vcu.edu> Message-ID: <4880E9E4.40603@tamu.edu> Hmmm. Same for us. And same timeframe. I've found with the weather codes I use, regardless of how they profile, they want significantly more RAM than they claim to. We've found them very comfortable in 2GB/core. *I* am still learning about other apps with some of our new users on a cluster we're still getting into production. 
We've bought some nodes w/ 4GB/core for specific apps for one of our stakeholders, but we don't know if others' problems will need or benefit from that. Mike Davis wrote: > Since 2004, we have used 2GB/core as our standard memory. We provide a > few machines with 4GB/core and even 8GB/core to meet the needs of > applications requiring significantly more RAM. > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From kyron at neuralbs.com Fri Jul 18 14:26:47 2008 From: kyron at neuralbs.com (Eric Thibodeau) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Building a BeoWulf In-Reply-To: <633706.21976.qm@web65602.mail.ac4.yahoo.com> References: <633706.21976.qm@web65602.mail.ac4.yahoo.com> Message-ID: <48810A97.9060308@neuralbs.com> My Gentoo Beowulf Clustering LiveCD might be ready sometime next week...one current caveat, your hardware has to be x86_64. Details at http://soc.gentooexperimental.org/projects/show/gentoo-cluster-seed ;) Eric Thibodeau Daniel Cardoso Alves wrote: > Hi people... > > I am a newbie on Clusters... > I ever have heard about BeoWulf cluster and now I need implement a > Cluster... > > I would like where I can get the BeoWulf and How to I can build a > BoeWulf Cluster... > > Thanks! > > Best Regards > Daniel Cardoso > > > ------------------------------------------------------------------------ > Novos endere?os, o Yahoo! que voc? conhece. Crie um email novo > > com a sua cara @ymail.com ou @rocketmail.com. > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080718/cbec02c6/attachment.html From fkruggel at uci.edu Fri Jul 18 22:20:54 2008 From: fkruggel at uci.edu (fkruggel@uci.edu) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? Message-ID: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu> Hi All, I am wondering whether there is any mechanism to automatically power down nodes (e.g., ACPI S3) when idle for some time, and automatically wake up when requested (e.g., by WOL, some cluster scheduler, ssh). I imagine that I could cut down power & cooling on our system by more than 50%. Any hints? Thanks, Frithjof From perry at piermont.com Sat Jul 19 09:53:59 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu> (fkruggel@uci.edu's message of "Fri\, 18 Jul 2008 22\:20\:54 -0700 \(PDT\)") References: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu> Message-ID: <87d4l9stlk.fsf@snark.cb.piermont.com> fkruggel@uci.edu writes: > I am wondering whether there is any mechanism to automatically > power down nodes (e.g., ACPI S3) when idle for some time, and > automatically wake up when requested (e.g., by WOL, some cluster > scheduler, ssh). 
I imagine that I could cut down power & cooling > on our system by more than 50%. Any hints? Depending on the motherboard, there are ways to do this. You can do wake on network and other tricks. However, if you would really save half the power, that implies that your cluster is half idle. If it is really half idle, why aren't you simply shutting half of it down? Perry From ntmoore at gmail.com Sat Jul 19 13:01:34 2008 From: ntmoore at gmail.com (Nathan Moore) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <87d4l9stlk.fsf@snark.cb.piermont.com> References: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu> <87d4l9stlk.fsf@snark.cb.piermont.com> Message-ID: <6009416b0807191301w3eb74fftd2af70523e1b40bf@mail.gmail.com> I think the feature you're looking for is "Wake on LAN", http://en.wikipedia.org/wiki/Wake-on-LAN I've wondered similar things - the small cluster I run for a students/departmental use is generally off, except when I'm teaching computational physics, or have a student interested in a specific research project. It would be nice to be able to "turn on" a few machines (from home, at 11:30pm) when I have to run something substantial. If you find a good step-by-step resource describing how to do this, I'd love to hear about it. Nathan Moore On Sat, Jul 19, 2008 at 11:53 AM, Perry E. Metzger wrote: > > fkruggel@uci.edu writes: > > I am wondering whether there is any mechanism to automatically > > power down nodes (e.g., ACPI S3) when idle for some time, and > > automatically wake up when requested (e.g., by WOL, some cluster > > scheduler, ssh). I imagine that I could cut down power & cooling > > on our system by more than 50%. Any hints? > > Depending on the motherboard, there are ways to do this. You can do > wake on network and other tricks. However, if you would really save > half the power, that implies that your cluster is half idle. If it is > really half idle, why aren't you simply shutting half of it down? > > Perry > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- - - - - - - - - - - - - - - - - - - - - - Nathan Moore Assistant Professor, Physics Winona State University AIM: nmoorewsu - - - - - - - - - - - - - - - - - - - - - -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080719/2fae91b5/attachment.html From geoff at galitz.org Sat Jul 19 13:42:00 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <6009416b0807191301w3eb74fftd2af70523e1b40bf@mail.gmail.com> References: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu><87d4l9stlk.fsf@snark.cb.piermont.com> <6009416b0807191301w3eb74fftd2af70523e1b40bf@mail.gmail.com> Message-ID: <9E1DA96988FC4E6282063A9BA3C39F05@geoffPC> Many, many, many moons ago I wrote a plugin for the clustering framework (now defunct) that we used and I was a developer on back at UC Berkeley. It was quite simple... it simply checked to see if jobs were in the queue, if not it looked to see what nodes were free (using OpenPBS/Torque native commands), did the necessary parsing of a few backend config files and then issued standard shutdown commands to the idle nodes. When jobs started to back up in the queue, the plugin used WOL to start up nodes. It was in perl and less than 100 lines. 
I easily could have used IPMI instead, but many of the boxes we were using had better WOL support than IPMI. WOL is standard while IPMI can vary from vendor to vendor... so if your needs are no more complex than this, WOL is a good way to go. -geoff Geoff Galitz Blankenheim NRW, Deutschland http://www.galitz.org -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Nathan Moore Sent: Samstag, 19. Juli 2008 22:02 To: beowulf@beowulf.org Subject: Re: [Beowulf] Green Cluster? I think the feature you're looking for is "Wake on LAN", http://en.wikipedia.org/wiki/Wake-on-LAN I've wondered similar things - the small cluster I run for a students/departmental use is generally off, except when I'm teaching computational physics, or have a student interested in a specific research project. It would be nice to be able to "turn on" a few machines (from home, at 11:30pm) when I have to run something substantial. If you find a good step-by-step resource describing how to do this, I'd love to hear about it. Nathan Moore On Sat, Jul 19, 2008 at 11:53 AM, Perry E. Metzger wrote: fkruggel@uci.edu writes: > I am wondering whether there is any mechanism to automatically > power down nodes (e.g., ACPI S3) when idle for some time, and > automatically wake up when requested (e.g., by WOL, some cluster > scheduler, ssh). I imagine that I could cut down power & cooling > on our system by more than 50%. Any hints? Depending on the motherboard, there are ways to do this. You can do wake on network and other tricks. However, if you would really save half the power, that implies that your cluster is half idle. If it is really half idle, why aren't you simply shutting half of it down? Perry _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- - - - - - - - - - - - - - - - - - - - - - Nathan Moore Assistant Professor, Physics Winona State University AIM: nmoorewsu - - - - - - - - - - - - - - - - - - - - - From csamuel at vpac.org Sun Jul 20 01:18:07 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Again about NUMA (numactl and taskset) In-Reply-To: <435519322.253861216541470181.JavaMail.root@zimbra.vpac.org> Message-ID: <207063333.253881216541887696.JavaMail.root@zimbra.vpac.org> ----- "H?kon Bugge" wrote: > This raises two questions: a) Which job > schedulers are able to optimize placement on > cores thereby _improving_ application > performance? I don't know if building in that knowledge would be useful, it would be better, I feel, to just provide a method for the admin to configure the scheduler appropriately for their hardware. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From perry at piermont.com Sun Jul 20 10:31:27 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> (fkruggel@uci.edu's message of "Sat\, 19 Jul 2008 13\:40\:59 -0700 \(PDT\)") References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> Message-ID: <877ibgv4wg.fsf@snark.cb.piermont.com> fkruggel@uci.edu writes: > Thanks for your suggestions. Let me be more specific. 
> I would like to have nodes automatically wake up when > needed and go to sleep when idle for some time. My > ganglia logs tell me that there is considerable idle > time on our cluster. So probably you don't want to turn off the nodes automatically, you want to turn them *off*. If your cluster is half idle, why bother turning it on at all? However, if you want to do this, see below. > The issue is that I would like to have the cluster adapt > *automatically* to the load, without interaction of an > administrator. > Here is how far I got: > I can set a node to sleep (suspend-to-ram) using ACPI. > But for powering on, I have to press the power button. > No automatic solution. > Is it possible to wake up a node over lan (without reboot)? Here is how you can turn them on: http://en.wikipedia.org/wiki/Wake_on_lan There is even open source software described on that page. Again, though, I'd ask if you're really solving the right problem. If you think you should be "automatically" powering down on average half the cluster, you could indeed build software to do that -- you can detect in your job scheduler that many of the machines are unneeded, ACPI sleep them, and wake them up with wake-on-lan packets. The software should be straightforward. However, why do you think you need it? If half the machines are unused, something is wrong. If you have a good job scheduler, it should be keeping the nodes automatically loaded, with a nice queue of waiting jobs. If you're like most places, there are more jobs that people want to do than there is CPU. If half of the time there is no waiting job, you're in an unusual situation. I'd just turn off the half the nodes permanently or give them to someone who can use them (and that won't be hard to find). Generally speaking, if you have a large cluster, and you have enough work for it, it is going to be running flat out 24x7. If it isn't, you've bought more hardware than you need. > How can I detect that a node was idle for some specific time? If your scheduler program doesn't schedule a node because it has no work for it, you know it is idle, right? A node could also detect that its average load was near 0, but that seems like the wrong way to do it because you want to be scheduling your job queue centrally, so your job scheduler should both know what is and is not in use, and should decide on what to shut down. Again, though, I think you might want to ask if you're doing the right thing here. If all your machines are not working flat out 100% of the time, you have hardware depreciating (and rapidly becoming obsolete) to no purpose, and there are loads of people out there, probably even on your own campus, who probably are desperate for compute cycles. (Hell, I could use extra cycles -- I can never afford enough.) Optimally, a cluster will be working 100% of the time, until one day when it is obsolete (that is, the cost in space, power, and cooling is more than replacing it with faster/better hardware), it gets shut down, replaced, and sold off. Perry -- Perry E. Metzger perry@piermont.com From lindahl at pbm.com Sun Jul 20 17:57:12 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <877ibgv4wg.fsf@snark.cb.piermont.com> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> Message-ID: <20080721005711.GB3465@bx9.net> On Sun, Jul 20, 2008 at 01:31:27PM -0400, Perry E. 
Metzger wrote: > So probably you don't want to turn off the nodes automatically, you > want to turn them *off*. If your cluster is half idle, why bother > turning it on at all? Some people have clusters used for undergraduate instruction. So they're very busy before assignments are due, and not so busy during the rest of the time. That's only one of many reasons someone might intentionally have idle nodes. BTW, there are many solutions to remote power management. The most common one I've seen deployed is "intelligent" PDUs, although this is pretty expensive, $100 per node. Cheaper methods include X10, IPMI, and Wake on Lan. -- greg From landman at scalableinformatics.com Sun Jul 20 18:20:27 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <20080721005711.GB3465@bx9.net> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> <20080721005711.GB3465@bx9.net> Message-ID: <4883E45B.2010902@scalableinformatics.com> Greg Lindahl wrote: > BTW, there are many solutions to remote power management. The most > common one I've seen deployed is "intelligent" PDUs, although this is > pretty expensive, $100 per node. Cheaper methods include X10, IPMI, > and Wake on Lan. Intelligent switchable PDUs cost ~$50/usable port. IPMI costs ~$80/usable port + switch costs and cabling. Oddly enough, the combination of serial port concentrators and switchable PDUs scales better for larger clusters than it does smaller. Dell DRAC costs somewhat more ($250/node ?), IBM/Sun/HP all integrate it for you. We did an analysis of the costs last year, and the crossover point was between 24 and 32 nodes in most cases (for our common configurations). That said, we like IPMI in general, and even better when it works :( Sometimes it does go south, in a hurry (gets into a strange state). In which case, removing power is the only option. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From smulcahy at aplpi.com Mon Jul 21 03:13:02 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <877ibgv4wg.fsf@snark.cb.piermont.com> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> Message-ID: <4884612E.8090100@aplpi.com> Perry E. Metzger wrote: > Here is how you can turn them on: > http://en.wikipedia.org/wiki/Wake_on_lan > > There is even open source software described on that page. Wake-on-Lan works very well - when it works. We were using it quite successfully on a small cluster we built for remote power on (after remote power off). We had to upgrade the BIOS recently for an AMD erratum and the BIOS upgrade had the unfortunate side-effect of mangling WOL while fixing the processor problem. We can reset the WOL flags and re-enable it on individual nodes, but unfortunately when the nodes are power-cycled, the WOL flag goes back to its default (off) state. Given the typical use case of WOL, it seems unfortunate that a power cycle resets it. We've made a stab at raising the matter with the vendor but since the cluster in question is only 20 nodes I don't think they take us seriously (or at least not as seriously as you folks with hundreds of nodes). So YMMV. 
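For what it's worth, many NICs let the driver re-arm wake-on-LAN at every boot via ethtool, independent of what the BIOS setup screen says; whether that survives the kind of cold power cycle Stephen describes depends on the board, so treat this as a hedged workaround rather than a fix. Assuming the interface is eth0:

    # See what the NIC claims to support ("g" means wake on magic packet):
    ethtool eth0 | grep -i wake-on

    # Re-enable magic-packet wake; put this in a boot script (rc.local or
    # an ifup hook) so it is reapplied after every restart:
    ethtool -s eth0 wol g

    # Where WOL is hopeless, IPMI-capable boards can be switched instead:
    #   ipmitool -H <bmc-address> -U <user> -P <password> chassis power on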
-stephen -- Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) From smulcahy at aplpi.com Mon Jul 21 03:19:39 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <4880A4F4.4050509@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> Message-ID: <488462BB.4060308@aplpi.com> Joe Landman wrote: > Eeekk... return of the language wars. They all have the same morphology > > Person1: "My language is better than yours" > > Person2: "Oh yeah? At least I use a real OS, CP/M!" > > and its downhill in a cascade of responses from there ... You probably edit your language with a terrible editor like emacs though /me ducks -stephen -- Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) From mark.kosmowski at gmail.com Mon Jul 21 04:16:49 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: how do I get started? Message-ID: >>> Mark Kosmowski wrote: >>> >>> I have a small cluster too (ok, had, I'm condensing to one big RAM >>> workstation at the moment). Mine was a 3 node dual Opteron setup. >>> >>> The first trick is to get each node to be able to ssh to every other >>> node without getting a password prompt. It doesn't matter for the >>> calculation whether this is done by passwordless ssh or using some >>> keychain type applet. All that matters for the calculation is that >>> each node be able to ssh without prompt to each other node. This >>> means n! checks. For my 3 node baby cluster I did this manually, >>> since 3! is 6. More nodes - might want to make a script or something. Here is a link for the keychain script I mentioned earlier: http://www.ibm.com/developerworks/library/l-keyc2/ >>> >>> >>> Here is an example with one-lined "script" for the ssh password less >>> logon: >>> >>> >>> http://wiki.neuralbs.com/index.php/Howto_Build_a_Basic_Gentoo_Beowulf_Cluster#Passwordless_Logon_to_the_Nodes_and_Node_List_Creation >>> >>> Next, you need to make your application parellel. This typically >>> involves compiling with some sort of MPI. I use OpenMPI. MPICH is >>> another option. This is going to be part of the learning curve. When >>> you can and get an output for each CPU (each core?) >>> you're ready to start wrestling with an MPI compile of your >>> application code. >>> >>> If you're not used to compiling your own code, there may be a bit of a >>> learning curve for this also. >>> >>> >>> Well... once I "finish" my Google Summer of Code project, people like you >>> will have a fun playground with graphical examples (as well as serious >>> people for serious HPC...but they won't admit to that and keep using RH as >>> an OS :P) >>> >>> http://soc.gentooexperimental.org/projects/show/gentoo-cluster-seed >>> >>> Good luck! >>> >>> Mark E. Kosmowski >>> >>> >>> Enjoy, >>> >>> Eric Thibodeau >>> >>> Amir, >>> Did you try any of the documents referenced at wiki,http://en.wikipedia.org/wiki/Beowulf_%28computing%29#See_also? 
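The passwordless-ssh setup discussed above usually comes down to a couple of commands; a sketch assuming OpenSSH and a home directory that is shared (or copied) across the nodes, with the restricted-key line using made-up host and script names:

    # Generate a key with no passphrase:
    ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

    # Append the public key to authorized_keys (ssh-copy-id does the same
    # where it is installed), and keep the permissions tight:
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys

    # For unattended jobs, a key can be locked down to one command and one
    # source host with the command= and from= options in authorized_keys:
    #   from="headnode.example.org",command="/usr/local/bin/sync-results.sh" ssh-rsa AAAA... jobkey

With a shared NFS home directory this single keypair is enough for every node to reach every other node without a prompt.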
>>> >>> There is a free downloadable book at Professor Brown's website, start athttp://www.phy.duke.edu/~rgb/Beowulf/beowulf.php >>> >>> And I believe yes, you can use SSH to replace RSH pretty much anywhere. I >>> haven't done that myself but I'm only a hobbyist. >>> >>> Good luck, >>> Peter >>> >>> >>> On 7/16/08, Amir Saad wrote: >>> >>> >>> I'm trying to build a small Beowulf cluster using 5 Ubuntu machines. I did >>> a lot of search and yet couldn't find a getting started guide. Any ideas? >>> >>> All the articles I found are using RSH, is it possible to replace this with >>> password-less SSH? Would it be possible to use Ubuntu? Which packages should >>> I install to build the cluster? >>> >>> Please advice. >>> >>> Thank you >>> >>> Amir From landman at scalableinformatics.com Mon Jul 21 04:59:24 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <488462BB.4060308@aplpi.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> Message-ID: <48847A1C.1000309@scalableinformatics.com> stephen mulcahy wrote: > > > Joe Landman wrote: >> Eeekk... return of the language wars. They all have the same morphology >> >> Person1: "My language is better than yours" >> >> Person2: "Oh yeah? At least I use a real OS, CP/M!" >> >> and its downhill in a cascade of responses from there ... > > You probably edit your language with a terrible editor like emacs though :( I could never get my head around emacs. Or vi for that matter. I wrote my thesis with nedit and used latex to compile it. Yes, it had a makefile. It took ~5 minutes to build on my PCs back then and about 1 minute on the SGI Indy. This was about 1995 or so, and that was the time I started using an old Redhat on my laptop. Worked out pretty well as I remember. Had nedit (and vi and emacs). As nedit (www.nedit.org) is dying from lack of interest, I am slowly looking around for programming editor that is as good. Just an editor. Not an entire user experience. Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make you a cup of coffee. :^ I personally want an editor without all these fancy things: just syntax highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, that has line numbers, and intelligent wrapping/splitting. Can run from a GUI. Does split windows. gvim does all these things. But you have to be very careful typing. Because it it vi. If Komodo had window splitting and intelligent wrapping, it would be good. I looked at kate, but it requires kde. pico/nano are ok, but they don't do line numbers, or split windows, or intelligent wrapping. Ugh. > > /me ducks > > -stephen > -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From perry at piermont.com Mon Jul 21 06:19:24 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? 
In-Reply-To: <4884612E.8090100@aplpi.com> (stephen mulcahy's message of "Mon\, 21 Jul 2008 11\:13\:02 +0100") References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> <4884612E.8090100@aplpi.com> Message-ID: <87zlobtlwj.fsf@snark.cb.piermont.com> stephen mulcahy writes: > We can reset the WOL flags and re-enable it on individual nodes, but > unfortunately when the nodes are power-cycled, the WOL flag goes back > to its default (off) state. Given the typical use case of WOL, it > seems unfortunate that a power cycle resets it. You might try the following: Take a machine with the flag off. Dump /dev/cmos. Change ONLY the flag in the BIOS, again dump /dev/cmos, and see if you can find where the flag resides. You can then build a hackish script to set the flag on machine boot. Perry -- Perry E. Metzger perry@piermont.com From rgb at phy.duke.edu Mon Jul 21 06:18:59 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <48847A1C.1000309@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> Message-ID: On Mon, 21 Jul 2008, Joe Landman wrote: > Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make you a cup > of coffee. :^ > > I personally want an editor without all these fancy things: just syntax > highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, that has line > numbers, and intelligent wrapping/splitting. Can run from a GUI. Does split > windows. > > gvim does all these things. But you have to be very careful typing. Because > it it vi. > > If Komodo had window splitting and intelligent wrapping, it would be good. > > I looked at kate, but it requires kde. > > pico/nano are ok, but they don't do line numbers, or split windows, or > intelligent wrapping. I don't know if it has all the features you want -- line numbers? Ugh. You must be coding in runes -- oh, wait, I mean Fortran;-) -- but you might look at another ancient editor from the elder age of elves and men -- jove (Jonathan's Own Version of Emacs). Call it "emacs lite". Call it "emacs written in C instead of lisp so it isn't infinitely and pointlessly extendable". Call it "I'll have an order of emacs, please, but hold the kitchen sink". It don't be doin' colors. It is intelligent enough to do errors in only a handful of programming languages. It can be gussied up a bit with macros and keymaps, but we're talking hanging your own pictures on the wall, not rebuilding the house so it supports martian lifeforms using nothing but lisp. On a good day it can be enticed into managing indentation for you in code Now mind you, jove doesn't do GUI's. xterm, please, and none of these fancy "smart" xterms, neither, just the plain old vanilla xterm. You can split screens, edit 3 or four files at once, invoke make from inside and keystroke through errors. Once upon a time I did use it to run an editable shell in a subwindow (this was before e.g. bash or tcsh, when if you wanted editing in /bin/sh or /bin/csh you had to do it this way) but tcsh or bash are both much better native and I haven't done it for years. The bad thing about jove is that so few people still use it that it doesn't ever make it into e.g. fedora. 
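Perry's CMOS-dumping idea above can be approximated on Linux with the nvram driver, which exposes the CMOS bytes behind the RTC; this is only a sketch, and the offset of the WOL flag will differ from board to board:

    modprobe nvram                                   # provides /dev/nvram
    dd if=/dev/nvram of=/tmp/cmos.before bs=1 2>/dev/null
    # ... change only the WOL setting in the BIOS, reboot, then:
    dd if=/dev/nvram of=/tmp/cmos.after bs=1 2>/dev/null
    cmp -l /tmp/cmos.before /tmp/cmos.after          # byte offsets that changed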
I'm sure it is in Debian (what isn't?). I have a perfectly functional personal rpm build, though (and would be happy to donate it to your cause), and the FIRST thing I do when moving into a new system is import jove's rpm and do a rebuild and install. Otherwise I can't function. I just use one editor, you see. I'm typing this reply in jove. I use jove to write poetry and prose (latex makefiles and templates). I use jove to write C. perl. php. text. I resort to ooffice only in desperation, and then get pissed when Ctrl-A or Ctrl-Shift-< don't do what they are "supposed" to do (move me to head of line or head of document) and instead pop up some inane window offering to polish my frobnitz. Alas, we live in a dark age, and one day jove may pass beyond human ken when men and elves forget it. But it is not THIS day, and we are not THOSE men. Or elves, for that matter. Special project number 113 in my list of special projects I'll never get to is to actually make xjove work, using gtk widgets, so that it no longer needs an xterm to function correctly. And I'd really like to tweak its reformatting routines -- it sometimes gets overzealous, especially with email. And its file recovery facility is terse to the point of being cryptic and could be a tiny bit warmer and fuzzier and nurturing. Still, it stands as an example of enduring greatness. I've tried -- hard -- to wean myself from jove and move on to emacs, since emacs is "supported". My record is a whole week on emacs, at the end of which time my facial tic began to worry my wife and the boys began to wonder why I was wandering the house shaking my head and uttering obscenities. At the end of it I woke up in the middle of the night in a cold sweat, having had a horrible dream in which every other word of a document was presented to me in chartreuse, and chartreuse had some sort of >>meaning<<, a color out of space as it were, and I could hear the hounds scratching at the corners of the document trying to get in and wreak their will on all the hapless words within. I immediately sacrificed a chicken onto the keyboard to purge the hounds, cranked up jove in its comforting smooth black on white text, and spent an hour just moving up and down in the document and felt much better. rgb > > Ugh. > >> >> /me ducks >> >> -stephen >> > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From eugen at leitl.org Mon Jul 21 06:35:44 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] cuda benchmark In-Reply-To: <8e6393ac0807162156y54b8dcddg18d3692c707a0fa6@mail.gmail.com> References: <8e6393ac0807162156y54b8dcddg18d3692c707a0fa6@mail.gmail.com> Message-ID: <20080721133544.GZ9875@leitl.org> On Wed, Jul 16, 2008 at 09:56:53PM -0700, Massimiliano Fatica wrote: > > You can use NAMD. > > [1]http://www.ks.uiuc.edu/Research/vmd/cuda/ Interesting. Any suggestions for a cheap consumer CUDA-suitable nVidia card which would fit in a 1U slot (2x PCIe x8 available), and don't tax the power supply (and cooling capacity) overmuch? 
-- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From hahn at mcmaster.ca Mon Jul 21 06:34:57 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: how do I get started? In-Reply-To: References: Message-ID: > Here is a link for the keychain script I mentioned earlier: > > http://www.ibm.com/developerworks/library/l-keyc2/ I've used ssh and ssh-agent for a long time, and don't really see much value to thsi keychain thing. the main premise seems to be that you want to leave your ssh-agent running even after logout. I find this kind of strange. the article mentions as desirable that by leaving ssh-agent running with keys and stashing its parameters in .ssh-agent, things like your cron jobs can act as you. I don't see this as a significant advantage - if I want unattended jobs to do ssh authentication, I do it with a dedicated, unencrypted key (which on the target machine can _only_ perform the desired function using the command= syntax, preferably also with the from= constrain.) yes, that means that someone could steal the private key and perform the function. leaving ssh-agent running with keys means that any compromise, even just of the user-level account, now _owns_ the account, locally and remotely. I prefer to run ssh-agent as part of my X session - processes inherit the SSH_AUTH_SOCK parameter in their environment, and ssh-agent goes away when I logout. I've been thinking about tweaking ssh-agent so that keys timeout when idle (ssh-add _can_ already provide a TTL, but I'd like ssh-agent to forget my keys after a period of unuse.) it's also tempting to see whether the kernel's keyring feature might be useful in handling ssh keys - I think it would remove the need for a process (and worrying about $SSH_AUTH_SOCK), but wouldn't actually add any additional safety. regards, mark hahn. From gerry.creager at tamu.edu Mon Jul 21 06:43:10 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> Message-ID: <4884926E.4040404@tamu.edu> Hey! I still code fortran. I don't require line numbers, save after those long weekends when I have trouble remembering where my office is... It's Basic that liked line numbers, RGB. Robert G. Brown wrote: > On Mon, 21 Jul 2008, Joe Landman wrote: > >> Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make you >> a cup of coffee. :^ >> >> I personally want an editor without all these fancy things: just >> syntax highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, >> that has line numbers, and intelligent wrapping/splitting. Can run >> from a GUI. Does split windows. >> >> gvim does all these things. But you have to be very careful typing. >> Because it it vi. >> >> If Komodo had window splitting and intelligent wrapping, it would be >> good. >> >> I looked at kate, but it requires kde. >> >> pico/nano are ok, but they don't do line numbers, or split windows, or >> intelligent wrapping. 
> > I don't know if it has all the features you want -- line numbers? Ugh. > You must be coding in runes -- oh, wait, I mean Fortran;-) -- but you > might look at another ancient editor from the elder age of elves and men > -- jove (Jonathan's Own Version of Emacs). Call it "emacs lite". Call > it "emacs written in C instead of lisp so it isn't infinitely and > pointlessly extendable". Call it "I'll have an order of emacs, please, > but hold the kitchen sink". > > It don't be doin' colors. It is intelligent enough to do errors in only > a handful of programming languages. It can be gussied up a bit with > macros and keymaps, but we're talking hanging your own pictures on the > wall, not rebuilding the house so it supports martian lifeforms using > nothing but lisp. On a good day it can be enticed into managing > indentation for you in code > > Now mind you, jove doesn't do GUI's. xterm, please, and none of these > fancy "smart" xterms, neither, just the plain old vanilla xterm. You > can split screens, edit 3 or four files at once, invoke make from inside > and keystroke through errors. Once upon a time I did use it to run an > editable shell in a subwindow (this was before e.g. bash or tcsh, when > if you wanted editing in /bin/sh or /bin/csh you had to do it this way) > but tcsh or bash are both much better native and I haven't done it for > years. > > The bad thing about jove is that so few people still use it that it > doesn't ever make it into e.g. fedora. I'm sure it is in Debian (what > isn't?). I have a perfectly functional personal rpm build, though (and > would be happy to donate it to your cause), and the FIRST thing I do > when moving into a new system is import jove's rpm and do a rebuild and > install. Otherwise I can't function. > > I just use one editor, you see. I'm typing this reply in jove. I use > jove to write poetry and prose (latex makefiles and templates). I use > jove to write C. perl. php. text. I resort to ooffice only in > desperation, and then get pissed when Ctrl-A or Ctrl-Shift-< don't do > what they are "supposed" to do (move me to head of line or head of > document) and instead pop up some inane window offering to polish my > frobnitz. > > Alas, we live in a dark age, and one day jove may pass beyond human ken > when men and elves forget it. But it is not THIS day, and we are not > THOSE men. Or elves, for that matter. > > Special project number 113 in my list of special projects I'll never get > to is to actually make xjove work, using gtk widgets, so that it no > longer needs an xterm to function correctly. And I'd really like to > tweak its reformatting routines -- it sometimes gets overzealous, > especially with email. And its file recovery facility is terse to the > point of being cryptic and could be a tiny bit warmer and fuzzier and > nurturing. > > Still, it stands as an example of enduring greatness. I've tried -- > hard -- to wean myself from jove and move on to emacs, since emacs is > "supported". My record is a whole week on emacs, at the end of which > time my facial tic began to worry my wife and the boys began to wonder > why I was wandering the house shaking my head and uttering obscenities. 
> At the end of it I woke up in the middle of the night in a cold sweat, > having had a horrible dream in which every other word of a document was > presented to me in chartreuse, and chartreuse had some sort of >>> meaning<<, a color out of space as it were, and I could hear the > hounds scratching at the corners of the document trying to get in and > wreak their will on all the hapless words within. > > I immediately sacrificed a chicken onto the keyboard to purge the > hounds, cranked up jove in its comforting smooth black on white text, > and spent an hour just moving up and down in the document and felt much > better. > > rgb > >> >> Ugh. >> >>> >>> /me ducks >>> >>> -stephen >>> >> >> > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From landman at scalableinformatics.com Mon Jul 21 07:11:52 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> Message-ID: <48849928.7030408@scalableinformatics.com> Robert G. Brown wrote: > On Mon, 21 Jul 2008, Joe Landman wrote: > >> Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make you >> a cup of coffee. :^ >> >> I personally want an editor without all these fancy things: just >> syntax highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, >> that has line numbers, and intelligent wrapping/splitting. Can run >> from a GUI. Does split windows. >> >> gvim does all these things. But you have to be very careful typing. >> Because it it vi. >> >> If Komodo had window splitting and intelligent wrapping, it would be >> good. >> >> I looked at kate, but it requires kde. >> >> pico/nano are ok, but they don't do line numbers, or split windows, or >> intelligent wrapping. > > I don't know if it has all the features you want -- line numbers? Ugh. > You must be coding in runes -- oh, wait, I mean Fortran;-) -- but you Hey, we have a fair number of current customers with Fortran needs. It is not going away any time soon (didn't I say somethin bout them language wars?) I like line numbers to help me figure out if I have a really long line of text. Most text editors do a poor job of handling this case, happily wrapping it, without telling you, so your key navigation across the long lines looks really funky. And back to the beowulf topic ... I seem to have discovered the issue I was running into last week. Some sort of weird timing problem with MPI_Waitsome on OpenMPI with Infiniband (and shared memory). I tested the IB stack and MPI stacks, and all report full functionality. I can run MPI over the IB. And it works well. The problem is when I run this code which uses MPI_Waitsome for part of its algorithm. With gigabit ethernet, it behaves well (under OpenMPI 1.2.7-rc2). With Infiniband, it does not. Also the OpenMPI seems to get awfully confused if you have IPoIB enabled (mostly for diagnostics for the user). Turning that off helped stability. Of course, since it was remote, I used vi (vim) as my editor. And a little pico. This is a Fortran 9x code. 
I added a few extra debugging bits around the troublesome MPI calls. Nice to do remotely, hard to do with a remote gui over slower links. I may just give in, and go full force into VIM. It is active, has lots of features, and I already know basic vi from use for years. And its undo feature doesn't blow chunks. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From franz.marini at mi.infn.it Mon Jul 21 07:23:01 2008 From: franz.marini at mi.infn.it (Franz Marini) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] cuda benchmark In-Reply-To: <20080721133544.GZ9875@leitl.org> References: <8e6393ac0807162156y54b8dcddg18d3692c707a0fa6@mail.gmail.com> <20080721133544.GZ9875@leitl.org> Message-ID: <1216650181.15072.8.camel@merlino.mi.infn.it> On Mon, 2008-07-21 at 15:35 +0200, Eugen Leitl wrote: > On Wed, Jul 16, 2008 at 09:56:53PM -0700, Massimiliano Fatica wrote: > > > > You can use NAMD. > > > > [1]http://www.ks.uiuc.edu/Research/vmd/cuda/ > > Interesting. Any suggestions for a cheap consumer CUDA-suitable > nVidia card which would fit in a 1U slot (2x PCIe x8 available), and > don't tax the power supply (and cooling capacity) overmuch? > I'm getting quite good results with a 8800GT card, although, given the current prices, I would suggest going for a 9800GTX. If you want to test out the double-precision support of the new cards, a GTX 260 could be a good choice, provided the power supply has enough juice for it, and provided it fits in your particular 1U slot. Keep in mind that both the 9800GTX and the GTX 260 coolers take up two slots. F. --------------------------------------------------------- Franz Marini Prof. R. A. Broglia Theoretical Physics of Nuclei, Atomic Clusters and Proteins Research Group Dept. of Physics, University of Milan, Italy. email : franz.marini@mi.infn.it phone : +39 02 50317226 --------------------------------------------------------- From perry at piermont.com Mon Jul 21 07:25:35 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <20080721005711.GB3465@bx9.net> (Greg Lindahl's message of "Sun\, 20 Jul 2008 17\:57\:12 -0700") References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> <20080721005711.GB3465@bx9.net> Message-ID: <87abgbtiu8.fsf@snark.cb.piermont.com> Greg Lindahl writes: > On Sun, Jul 20, 2008 at 01:31:27PM -0400, Perry E. Metzger wrote: > >> So probably you don't want to turn off the nodes automatically, you >> want to turn them *off*. If your cluster is half idle, why bother >> turning it on at all? > > Some people have clusters used for undergraduate instruction. So > they're very busy before assignments are due, and not so busy during > the rest of the time. That's only one of many reasons someone might > intentionally have idle nodes. If the cluster is large enough, there are probably other potential users on the campus who would be very very happy to get at the cycles on a "lower priority" basis. That said, if one really wants to do it, I'd go for wake-on-lan... -- Perry E. Metzger perry@piermont.com From perry at piermont.com Mon Jul 21 07:27:18 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: how do I get started? 
In-Reply-To: (Mark Hahn's message of "Mon\, 21 Jul 2008 09\:34\:57 -0400 \(EDT\)") References: Message-ID: <8763qztird.fsf@snark.cb.piermont.com> Mark Hahn writes: > I don't see this as a significant advantage - if I want unattended > jobs to do ssh authentication, I do it with a dedicated, unencrypted > key (which on the target machine can _only_ perform the desired function > using the command= syntax, preferably also with the from= constrain.) > yes, that means that someone could steal the private key and perform > the function. I agree. There is no security advantage to leaving ssh-agent running instead of just having an unencrypted key on the box. -- Perry E. Metzger perry@piermont.com From peter.st.john at gmail.com Mon Jul 21 08:05:05 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <48849928.7030408@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: Joe, If you go with vim (as I do), your faction with BellLabs/Unix/Berkely/C Alliance for Salvation will go up, but with Stanford/MIT/LISP Empire of Evil will go down. Be careful near their gates because the guards will aggro from a mile away. Peter (Embrace the Editor of the Beast, vi vi vi) On 7/21/08, Joe Landman wrote: > > Robert G. Brown wrote: > >> On Mon, 21 Jul 2008, Joe Landman wrote: >> >> Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make you a >>> cup of coffee. :^ >>> >>> I personally want an editor without all these fancy things: just syntax >>> highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, that has line >>> numbers, and intelligent wrapping/splitting. Can run from a GUI. Does split >>> windows. >>> >>> gvim does all these things. But you have to be very careful typing. >>> Because it it vi. >>> >>> If Komodo had window splitting and intelligent wrapping, it would be >>> good. >>> >>> I looked at kate, but it requires kde. >>> >>> pico/nano are ok, but they don't do line numbers, or split windows, or >>> intelligent wrapping. >>> >> >> I don't know if it has all the features you want -- line numbers? Ugh. >> You must be coding in runes -- oh, wait, I mean Fortran;-) -- but you >> > > Hey, we have a fair number of current customers with Fortran needs. It is > not going away any time soon (didn't I say somethin bout them language > wars?) > > I like line numbers to help me figure out if I have a really long line of > text. Most text editors do a poor job of handling this case, happily > wrapping it, without telling you, so your key navigation across the long > lines looks really funky. > > And back to the beowulf topic ... > > I seem to have discovered the issue I was running into last week. Some > sort of weird timing problem with MPI_Waitsome on OpenMPI with Infiniband > (and shared memory). I tested the IB stack and MPI stacks, and all report > full functionality. I can run MPI over the IB. And it works well. The > problem is when I run this code which uses MPI_Waitsome for part of its > algorithm. With gigabit ethernet, it behaves well (under OpenMPI > 1.2.7-rc2). With Infiniband, it does not. Also the OpenMPI seems to get > awfully confused if you have IPoIB enabled (mostly for diagnostics for the > user). 
Turning that off helped stability. > > Of course, since it was remote, I used vi (vim) as my editor. And a little > pico. > > This is a Fortran 9x code. I added a few extra debugging bits around the > troublesome MPI calls. Nice to do remotely, hard to do with a remote gui > over slower links. > > I may just give in, and go full force into VIM. It is active, has lots of > features, and I already know basic vi from use for years. And its undo > feature doesn't blow chunks. > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics LLC, > email: landman@scalableinformatics.com > web : http://www.scalableinformatics.com > http://jackrabbit.scalableinformatics.com > phone: +1 734 786 8423 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080721/f980e845/attachment.html From prentice at ias.edu Mon Jul 21 08:32:17 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: [torqueusers] pbs_server -t create can't find torque library In-Reply-To: <4884A898.1010001@byu.edu> References: <4881124C.9050905@mines.edu> <4884A898.1010001@byu.edu> Message-ID: <4884AC01.3040606@ias.edu> Lloyd Brown wrote: > > - Add the path (/usr/local/lib) either to /etc/ld.so.conf file, or to a > file in /etc/ld.so.conf.d/, then run "ldconfig" to update the path > cache, etc. This is the recommended system-wide way of doing things. > > What happens when you have two different library paths with that contain libraries with the same name? How do you determine the search order when using individual files in /etc/ld.so.conf.d? -- Prentice From James.P.Lux at jpl.nasa.gov Mon Jul 21 08:51:53 2008 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <48849928.7030408@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> At 07:11 AM 7/21/2008, Joe Landman wrote: >Robert G. Brown wrote: >>On Mon, 21 Jul 2008, Joe Landman wrote: >> >>>Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make >>>you a cup of coffee. :^ >>> >>>I personally want an editor without all these fancy things: just >>>syntax highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, >>>that has line numbers, and intelligent wrapping/splitting. Can >>>run from a GUI. Does split windows. >>> >>>gvim does all these things. But you have to be very careful >>>typing. Because it it vi. >>> >>>If Komodo had window splitting and intelligent wrapping, it would be good. >>> >>>I looked at kate, but it requires kde. >>> >>>pico/nano are ok, but they don't do line numbers, or split >>>windows, or intelligent wrapping. >>I don't know if it has all the features you want -- line numbers? Ugh. 
>>You must be coding in runes -- oh, wait, I mean Fortran;-) -- but you > >Hey, we have a fair number of current customers with Fortran >needs. It is not going away any time soon (didn't I say somethin >bout them language wars?) > >I like line numbers to help me figure out if I have a really long >line of text. Most text editors do a poor job of handling this >case, happily wrapping it, without telling you, so your key >navigation across the long lines looks really funky. Line numbers are handy when you get that "syntax error in line 34 of file xyz.c" too.. Jim From peter.st.john at gmail.com Mon Jul 21 09:02:53 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> Message-ID: Line numbers are super convenient for peer-review, so humans can refer to lines. I've written C programs just to preprend every line with a consequtive integer. Peter On 7/21/08, Jim Lux wrote: > > At 07:11 AM 7/21/2008, Joe Landman wrote: > >> Robert G. Brown wrote: >> >>> On Mon, 21 Jul 2008, Joe Landman wrote: >>> >>> Rumor has it that C-c C-o C-f C-f C-e C-e instructs emacs to make you a >>>> cup of coffee. :^ >>>> >>>> I personally want an editor without all these fancy things: just syntax >>>> highlighting for C/C++/Perl/Bash/Tcsh/Fortran/config files, that has line >>>> numbers, and intelligent wrapping/splitting. Can run from a GUI. Does split >>>> windows. >>>> >>>> gvim does all these things. But you have to be very careful typing. >>>> Because it it vi. >>>> >>>> If Komodo had window splitting and intelligent wrapping, it would be >>>> good. >>>> >>>> I looked at kate, but it requires kde. >>>> >>>> pico/nano are ok, but they don't do line numbers, or split windows, or >>>> intelligent wrapping. >>>> >>> I don't know if it has all the features you want -- line numbers? Ugh. >>> You must be coding in runes -- oh, wait, I mean Fortran;-) -- but you >>> >> >> Hey, we have a fair number of current customers with Fortran needs. It is >> not going away any time soon (didn't I say somethin bout them language >> wars?) >> >> I like line numbers to help me figure out if I have a really long line of >> text. Most text editors do a poor job of handling this case, happily >> wrapping it, without telling you, so your key navigation across the long >> lines looks really funky. >> > > > Line numbers are handy when you get that > > "syntax error in line 34 of file xyz.c" > > > too.. > > Jim > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080721/993daa0e/attachment.html From smulcahy at aplpi.com Mon Jul 21 09:08:22 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> Message-ID: <4884B476.8040304@aplpi.com> Peter St. John wrote: > Line numbers are super convenient for peer-review, so humans can refer > to lines. I've written C programs just to preprend every line with a > consequtive integer. > Peter cat -n is your friend. -stephen -- Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) From dnlombar at ichips.intel.com Mon Jul 21 09:42:57 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu> References: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu> Message-ID: <20080721164257.GB6880@nlxdcldnl2.cl.intel.com> On Fri, Jul 18, 2008 at 10:20:54PM -0700, fkruggel@uci.edu wrote: > > Hi All, > > I am wondering whether there is any mechanism to automatically > power down nodes (e.g., ACPI S3) when idle for some time, and > automatically wake up when requested (e.g., by WOL, some cluster > scheduler, ssh). I imagine that I could cut down power & cooling > on our system by more than 50%. Any hints? The key is whether S3 is supported on your nodes; it's usually not on servers. Once you have that, there are a various of resource manager issues to contend with--such work is being investigated and GIYF here. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From perry at piermont.com Mon Jul 21 09:50:26 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> (Jim Lux's message of "Mon\, 21 Jul 2008 08\:51\:53 -0700") References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> Message-ID: <87sku3rxkd.fsf@snark.cb.piermont.com> Jim Lux writes: >> I like line numbers to help me figure out if I have a really long >> line of text. Most text editors do a poor job of handling this >> case, happily wrapping it, without telling you, so your key >> navigation across the long lines looks really funky. > > Line numbers are handy when you get that > > "syntax error in line 34 of file xyz.c" > > too.. Both emacs and vi will display line numbers if you ask them. Emacs has a really nice compile mode where it will compile in a second window, and jump right to every line in the source files that caused an error in sequence as you ask it. 
(It even accounts for line changes because of edits.) BSD Unix has a command called "error" that does something similar for you if you are using vi. (I don't know why it doesn't seem to be in most Linuxes, but it is open source and trivially ported.) Thanks to such tools, no one with a real editor need ever find lines with problems by hand, which means that although both editors will show you line numbers, you don't really need them. Perry -- Perry E. Metzger perry@piermont.com From James.P.Lux at jpl.nasa.gov Mon Jul 21 10:00:37 2008 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <87sku3rxkd.fsf@snark.cb.piermont.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> <87sku3rxkd.fsf@snark.cb.piermont.com> Message-ID: <6.2.5.6.2.20080721095712.02d59218@jpl.nasa.gov> At 09:50 AM 7/21/2008, Perry E. Metzger wrote: >Jim Lux writes: > >> I like line numbers to help me figure out if I have a really long > >> line of text. Most text editors do a poor job of handling this > >> case, happily wrapping it, without telling you, so your key > >> navigation across the long lines looks really funky. > > > > Line numbers are handy when you get that > > > > "syntax error in line 34 of file xyz.c" > > > > too.. > >Both emacs and vi will display line numbers if you ask them. > >Emacs has a really nice compile mode where it will compile in a second >window, and jump right to every line in the source files that caused >an error in sequence as you ask it. (It even accounts for line changes >because of edits.) BSD Unix has a command called "error" that does >something similar for you if you are using vi. (I don't know why it >doesn't seem to be in most Linuxes, but it is open source and trivially >ported.) > >Thanks to such tools, no one with a real editor need ever find lines >with problems by hand, which means that although both editors will >show you line numbers, you don't really need them. A lot of "lightweight" non-IDE development environments for embedded systems tend not to provide this, particularly if you're using some form of cross compiler/cross assembler. Sometimes, you're thankful that you have a compiler at all, much less whether it happens to be well integrated with an editor. And, of course, "open source and trivially ported" still means there's non-zero work in getting it working. Jim From perry at piermont.com Mon Jul 21 10:19:33 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <6.2.5.6.2.20080721095712.02d59218@jpl.nasa.gov> (Jim Lux's message of "Mon\, 21 Jul 2008 10\:00\:37 -0700") References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> <87sku3rxkd.fsf@snark.cb.piermont.com> <6.2.5.6.2.20080721095712.02d59218@jpl.nasa.gov> Message-ID: <87hcajrw7u.fsf@snark.cb.piermont.com> Jim Lux writes: >> > Line numbers are handy when you get that >> > >> > "syntax error in line 34 of file xyz.c" >> > >> > too.. >> >>Both emacs and vi will display line numbers if you ask them. >> >>Emacs has a really nice compile mode where it will compile in a second >>window, and jump right to every line in the source files that caused >>an error in sequence as you ask it. (It even accounts for line changes >>because of edits.) BSD Unix has a command called "error" that does >>something similar for you if you are using vi. (I don't know why it >>doesn't seem to be in most Linuxes, but it is open source and trivially >>ported.) >> >>Thanks to such tools, no one with a real editor need ever find lines >>with problems by hand, which means that although both editors will >>show you line numbers, you don't really need them. > > > A lot of "lightweight" non-IDE development environments for embedded > systems tend not to provide this, particularly if you're using some > form of cross compiler/cross assembler. Sometimes, you're thankful > that you have a compiler at all, much less whether it happens to be > well integrated with an editor. Emacs and vi both handle the cross case pretty well. The "editor integration" all takes place via parsing the compiler's stdout -- it isn't difficult to make work for almost any case. > And, of course, "open source and trivially ported" still means there's > non-zero work in getting it working. If you're a developer, you know how to type "make" already pretty well. error(1) is stock C. If someone is desperate for a copy that will compile for normal Linuxes, though, I can make one available. -- Perry E. Metzger perry@piermont.com From perry at piermont.com Mon Jul 21 10:23:27 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <4884B476.8040304@aplpi.com> (stephen mulcahy's message of "Mon\, 21 Jul 2008 17\:08\:22 +0100") References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> <4884B476.8040304@aplpi.com> Message-ID: <878wvvrw1c.fsf@snark.cb.piermont.com> stephen mulcahy writes: > Peter St. John wrote: >> Line numbers are super convenient for peer-review, so humans can >> refer to lines. I've written C programs just to preprend every line >> with a consequtive integer. >> Peter > > cat -n is your friend. and if it didn't exist, the corresponding awk program is: awk '{printf "%d %s\n", FNR, $0}' and Unix has about 10 other trivial ways to do this. (That's probably not even the simplest awk program, but I'm lazy today.) 
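A few of those other trivial ways, all standard tools, with file.c standing in for whatever you are numbering:

    nl -ba file.c                  # number every line, blanks included
    grep -n '' file.c              # grep's line numbers, nothing else
    pr -t -n file.c                # pr numbers lines too
    awk '{print NR": "$0}' file.c  # close cousin of the one above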
Perry From lbickley at bickleywest.com Mon Jul 21 10:52:34 2008 From: lbickley at bickleywest.com (Lyle Bickley) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Distros - Fermi, Scientific Linux Message-ID: <200807211052.34591.lbickley@bickleywest.com> It's been several months since RGB and others discussed Fermi Linux (https://fermilinux.fnal.gov/), and there was some discussion on this list of Scientific Linux (https://www.scientificlinux.org/) a few months ago. I just revisited these distro's sites after reading the August issue of "Linux Journal" - which discussed both in its "Upfront" section. It appears that Fermi Linux basically supports their own users - while Scientific Linux has broader more general support. Both are based on Red Hat Enterprise sources. A major premise for using either distro is their claim for long term support (LTS). I'm curious to know how many folks on this list use either distro and what has been your experience (and usefulness) of LTS? I'd also like to hear from folks who tried either and moved elsewhere for their distro (and why). Regards, Lyle -- Lyle Bickley Bickley Consulting West Inc. Mountain View, CA 94040 http://bickleywest.com "Black holes are where God is dividing by zero" From rgb at phy.duke.edu Mon Jul 21 10:47:02 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <48849928.7030408@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: On Mon, 21 Jul 2008, Joe Landman wrote: > I like line numbers to help me figure out if I have a really long line of > text. Most text editors do a poor job of handling this case, happily > wrapping it, without telling you, so your key navigation across the long > lines looks really funky. Agreed, but not jove. jove simply continues on to the right and marks it with an exclamation point. Of course, C just continues lines transparently until it hits the terminating ";", so this is still a runish problem. Speaking of which, wouldn't it be a kick to design a programming language CALLED Rune, one that only can be used on GUI systems, that uses elvish runes for all the standard commands and parameters? Programming has become too easy; we need to make it more arcane again to guarantee the high salaries that support our dissipated lifestyles. If we ensure that Rune only does arithmetic in reverse polish with load/store instead of any equivalent of an equals sign we can be certain that no more than fifty people on the planet ever really master it, and that even they cannot read six month old Rune code. That way applications will have to constantly be rewritten and we can all get rich. > And back to the beowulf topic ... > > I seem to have discovered the issue I was running into last week. Some sort > of weird timing problem with MPI_Waitsome on OpenMPI with Infiniband (and > shared memory). I tested the IB stack and MPI stacks, and all report full > functionality. I can run MPI over the IB. And it works well. The problem > is when I run this code which uses MPI_Waitsome for part of its algorithm. > With gigabit ethernet, it behaves well (under OpenMPI 1.2.7-rc2). With > Infiniband, it does not. 
Also the OpenMPI seems to get awfully confused if > you have IPoIB enabled (mostly for diagnostics for the user). Turning that > off helped stability. > > Of course, since it was remote, I used vi (vim) as my editor. And a little > pico. This is the sad truth. I can survive without using emacs, and besides, I can use it in an emergency. But nobody can manage systems without knowing vi. You may use it only long enough to edit /etc/hosts and your firewall and your yum repo data so you can install and rebuild jove, but that much cannot be avoided... > This is a Fortran 9x code. I added a few extra debugging bits around the > troublesome MPI calls. Nice to do remotely, hard to do with a remote gui > over slower links. > > I may just give in, and go full force into VIM. It is active, has lots of > features, and I already know basic vi from use for years. And its undo > feature doesn't blow chunks. Aaaiiieeee! Spawn of Satan! Get thee behind me, Devil! Speaking personally, I'd rather burn off my pre-cancerous old-age spots with a wood-burning kit than use vi for more than two minutes at a time ("... only long enough..." see above) but to each their own, I suppose. rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From mathog at caltech.edu Mon Jul 21 10:58:56 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Green Cluster? Message-ID: "Lombard, David N" wrote > On Fri, Jul 18, 2008 at 10:20:54PM -0700, fkruggel@uci.edu wrote: > > > > Hi All, > > > > I am wondering whether there is any mechanism to automatically > > power down nodes (e.g., ACPI S3) when idle for some time, and > > automatically wake up when requested (e.g., by WOL, some cluster > > scheduler, ssh). I imagine that I could cut down power & cooling > > on our system by more than 50%. Any hints? > > The key is whether S3 is supported on your nodes; it's usually not > on servers. The key is more often than not "how badly broken is the BIOS"? We have two IBM X3455 systems with identical CPUs and the one with the older BIOS has CPU frequency control working, whereas the latest and greatest BIOS breaks it. Here we're not talking about going all the way to S3, just providing the option to spin a bit more slowly when the system isn't busy. Also, as somebody mentioned earlier in this thread, WOL which works but is not correctly initialized by the BIOS at power up is incredibly common (here is one: Asus A8N5X motherboard). This is one instance where I wish the Federal government (and/or the European Union) would get off its collective duff and set a standard which says that if computer hardware supports energy saving modes, the BIOS must do so too. They should also provide for a hot line to report violators. Vendors are notorious about not fixing their energy wasting broken BIOS versions, but they would hopefully take a little more care with this software when shipping a broken BIOS would result in fines. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From rgb at phy.duke.edu Mon Jul 21 10:54:11 2008 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Wed Nov 25 01:07:27 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> Message-ID: On Mon, 21 Jul 2008, Jim Lux wrote: > Line numbers are handy when you get that > > "syntax error in line 34 of file xyz.c" Well, in jove this is just Ctrl-X-N (next error), but in other languages or contexts, Ctrl-Shift-< (top of file) Esc 3 4 (repeat 34) Ctrl-N (down one line) is pretty easy, and moves you to precisely line 34. Seriously, for C in particular jove is close to a no-brainer. It is a really great coder's editor for C. Not bad for perl, fortran etc, but sheer genius for C. It could be more tightly integrated with latex (it doesn't do the Ctrl-X-N trick in Latex, which is a shame) but in C syntax errors simply do not happen. Ctrl-X-Ctrl-E (run makefile), step through syntax errors in real time, done, end of story. Logic errors you own, but syntax errors evaporate like the morning mist of a hot day. rgb > > > too.. > > Jim > > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From perry at piermont.com Mon Jul 21 11:46:16 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:27 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: (Robert G. Brown's message of "Mon\, 21 Jul 2008 13\:47\:02 -0400 \(EDT\)") References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: <87tzejqdmv.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: > Speaking of which, wouldn't it be a kick to design a programming > language CALLED Rune, one that only can be used on GUI systems, that > uses elvish runes for all the standard commands and parameters? > Programming has become too easy; Given how crappy most programs I read are, I'd say we don't have to worry about it having become too easy. > This is the sad truth. I can survive without using emacs, and besides, > I can use it in an emergency. But nobody can manage systems without > knowing vi. It is also sometimes good to know how to use ed, though it has been some years since I have dealt with a machine so screwed up that this was necessary. >> I may just give in, and go full force into VIM. It is active, has >> lots of features, and I already know basic vi from use for years. >> And its undo feature doesn't blow chunks. > > Aaaiiieeee! Spawn of Satan! Get thee behind me, Devil! nvi (which is the only "real" vi worth speaking of these days) doesn't have trouble with undo etc. -- Perry E. Metzger perry@piermont.com From perry at piermont.com Mon Jul 21 11:48:19 2008 From: perry at piermont.com (Perry E. 
Metzger) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: (Robert G. Brown's message of "Mon\, 21 Jul 2008 13\:54\:11 -0400 \(EDT\)") References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> Message-ID: <87prp7qdjg.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: > On Mon, 21 Jul 2008, Jim Lux wrote: > >> Line numbers are handy when you get that >> >> "syntax error in line 34 of file xyz.c" > > Well, in jove this is just Ctrl-X-N (next error), but in other languages > or contexts, Ctrl-Shift-< (top of file) Esc 3 4 (repeat 34) Ctrl-N (down > one line) is pretty easy, and moves you to precisely line 34. emacs has that feature (in fact, that jove feature comes from emacs), and I've already mentioned error(1) under BSD. -- Perry E. Metzger perry@piermont.com From dnlombar at ichips.intel.com Mon Jul 21 11:53:22 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: References: Message-ID: <20080721185322.GA7338@nlxdcldnl2.cl.intel.com> On Mon, Jul 21, 2008 at 10:58:56AM -0700, David Mathog wrote: > > "Lombard, David N" wrote > > > On Fri, Jul 18, 2008 at 10:20:54PM -0700, fkruggel@uci.edu wrote: > > > > > > Hi All, > > > > > > I am wondering whether there is any mechanism to automatically > > > power down nodes (e.g., ACPI S3) when idle for some time, and > > > automatically wake up when requested (e.g., by WOL, some cluster > > > scheduler, ssh). I imagine that I could cut down power & cooling > > > on our system by more than 50%. Any hints? > > > > The key is whether S3 is supported on your nodes; it's usually not > > on servers. > > The key is more often than not "how badly broken is the BIOS"? We have > two IBM X3455 systems with identical CPUs and the one with the older > BIOS has CPU frequency control working, whereas the latest and greatest > BIOS breaks it. Here we're not talking about going all the way to S3, > just providing the option to spin a bit more slowly when the system > isn't busy. Also, as somebody mentioned earlier in this thread, WOL > which works but is not correctly initialized by the BIOS at power up is > incredibly common (here is one: Asus A8N5X motherboard). I meant "supported" in the larger sense, so, in order: Hardware BIOS OS (kernel & drivers) How you configure and manage it all in the userland... with lots of fun interactions amongst them. Once you can get the node to go into, and *most* importantly, return from, S3, then you can work on managing it via the resource manager. > This is one instance where I wish the Federal government (and/or the > European Union) would get off its collective duff and set a standard ... > with this software when shipping a broken BIOS would result in fines. As a Libertarian, I will respectfully disagree :) -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. 
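To make that layering concrete, here is a minimal by-hand test of the suspend/wake path on a single node, before any resource manager gets involved. It is a sketch only: it assumes a Linux node with the usual /sys interfaces and an ethtool-capable NIC, eth0 and the MAC address are placeholders, and everything runs as root.

    # on the compute node: does the kernel advertise suspend-to-RAM at all?
    cat /sys/power/state                      # look for "mem"
    # arm wake-on-LAN before suspending (g = wake on magic packet)
    ethtool -s eth0 wol g
    ethtool eth0 | grep Wake-on               # should now report "Wake-on: g"
    # note the MAC so the head node can wake this box later
    ip link show eth0 | grep ether
    # short of full S3, "spin more slowly when idle" is just cpufreq:
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    # finally, suspend to RAM (S3)
    echo mem > /sys/power/state

    # later, from the head node: send the magic packet (placeholder MAC;
    # use whichever of ether-wake/wakeonlan your distro ships)
    ether-wake -i eth0 00:11:22:33:44:55

If the node comes back with its interfaces and clocks intact, wiring those same two steps into the scheduler's idle/launch hooks is the comparatively easy part; if it doesn't, you have found which of the hardware/BIOS/kernel layers is the broken one.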
From gdjacobs at gmail.com Mon Jul 21 12:45:48 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> Message-ID: <4884E76C.2030301@gmail.com> Robert G. Brown wrote: > Special project number 113 in my list of special projects I'll never get > to is to actually make xjove work, using gtk widgets, so that it no > longer needs an xterm to function correctly. And I'd really like to > tweak its reformatting routines -- it sometimes gets overzealous, > especially with email. And its file recovery facility is terse to the > point of being cryptic and could be a tiny bit warmer and fuzzier and > nurturing. There isn't anything wrong with a good xterm. By the way, you are forgetting the other noteworthy 'j' CLI editors: jed and joe. You can't do much better for less than a meg on either score. I used and loved nedit for a long time, but eventually decided that we would have to part. Nedit (or possibly lesstif) was glitching far more than I considered proper. Right now I'm using Scite. From hahn at mcmaster.ca Mon Jul 21 12:47:07 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <20080721185322.GA7338@nlxdcldnl2.cl.intel.com> References: <20080721185322.GA7338@nlxdcldnl2.cl.intel.com> Message-ID: >> This is one instance where I wish the Federal government (and/or the >> European Union) would get off its collective duff and set a standard > ... >> with this software when shipping a broken BIOS would result in fines. > > As a Libertarian, I will respectfully disagree :) your reasoning, I think, is that the gov shouldn't be messing around with fine-grained regulation like that. I'd argue that the gov should make it clear that the existing legal approach to product quality should apply to software and firmware as well. one or two class-actions later and companies will take bios support a lot more seriously. (as seriously as, for instance, new product quality and fixes are taken in, say, the auto industry. yes, people can die when an auto component fails, but OTOH the cost is quite small for a vendor to provide self-serve bugfixes to a motherboard bios...) I voted Libertarian in college, but grew out of it ;) From lindahl at pbm.com Mon Jul 21 13:08:11 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Distros - Fermi, Scientific Linux In-Reply-To: <200807211052.34591.lbickley@bickleywest.com> References: <200807211052.34591.lbickley@bickleywest.com> Message-ID: <20080721200810.GA12444@bx9.net> On Mon, Jul 21, 2008 at 10:52:34AM -0700, Lyle Bickley wrote: > A major premise for using either distro is their claim for long term support > (LTS). > > I'm curious to know how many folks on this list use either distro and what has > been your experience (and usefulness) of LTS? Is their LTS support any different from what RHEL is providing? e.g. these guys are just saying, "We're going to track the updates until EOL from Red Hat" ? If so, you should probably ask if there is anyone still on RHEL < 4. I see that CentOS 2 has gotten updates as recently as July 17th of this year. 
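One quick way to answer the "are they really just tracking upstream until EOL" question on a running node is to look at what the update stream has actually delivered lately. Nothing below is Fermi- or SL-specific; it is plain rpm/yum and works on any RHEL rebuild:

    # which rebuild and point release is this, exactly?
    cat /etc/redhat-release
    # most recently installed or updated packages, newest first
    rpm -qa --last | head -20
    # what the configured repos would still update today
    yum check-update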
-- greg From perry at piermont.com Mon Jul 21 13:49:47 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: (Mark Hahn's message of "Mon\, 21 Jul 2008 15\:47\:07 -0400 \(EDT\)") References: <20080721185322.GA7338@nlxdcldnl2.cl.intel.com> Message-ID: <87r69notck.fsf@snark.cb.piermont.com> Mark Hahn writes: >>> This is one instance where I wish the Federal government (and/or the >>> European Union) would get off its collective duff and set a standard >> ... >>> with this software when shipping a broken BIOS would result in fines. >> >> As a Libertarian, I will respectfully disagree :) > > your reasoning, I think, is that the gov shouldn't be messing around > with fine-grained regulation like that. I'd argue that the gov should I'm also a Libertarian, but I'd prefer if we kept the political arguments to other mailing lists. Lets argue about clusters instead. Perry From geoff at galitz.org Mon Jul 21 14:16:17 2008 From: geoff at galitz.org (Geoff Galitz) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Distros - Fermi, Scientific Linux In-Reply-To: <20080721200810.GA12444@bx9.net> References: <200807211052.34591.lbickley@bickleywest.com> <20080721200810.GA12444@bx9.net> Message-ID: > A major premise for using either distro is their claim for long term support > (LTS). > > I'm curious to know how many folks on this list use either distro and what has > been your experience (and usefulness) of LTS? ------------------------------------------------ In my UC Berkeley days I routinely installed Scientific Linux on new clusters. The long LTS was a major factor contributing to my usage of it. Security updates, in particular, were of high-priority. I'd also do a general update if we made a major change to the cluster (new nodes or some particularly finicky major new code)... so I found the LTS to be quite useful. In terms of general support we were self-sufficient enough that we could apply bugfixes or patches out-of-stream if necessary. -geoff From peter.st.john at gmail.com Mon Jul 21 14:30:31 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <878wvvrw1c.fsf@snark.cb.piermont.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> <4884B476.8040304@aplpi.com> <878wvvrw1c.fsf@snark.cb.piermont.com> Message-ID: (re line numbers) Ah, I should have said, that was in VMS. I did get VIM for VMS though but I was never a maestro. There are happier VMS installations with unix workalike interfaces, not there then though. Peter On 7/21/08, Perry E. Metzger wrote: > > > stephen mulcahy writes: > > Peter St. John wrote: > >> Line numbers are super convenient for peer-review, so humans can > >> refer to lines. I've written C programs just to preprend every line > >> with a consequtive integer. > >> Peter > > > > cat -n is your friend. > > and if it didn't exist, the corresponding awk program is: > > awk '{printf "%d %s\n", FNR, $0}' > > and Unix has about 10 other trivial ways to do this. (That's probably > not even the simplest awk program, but I'm lazy today.) > > > Perry > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080721/3a26fabe/attachment.html From andrew at moonet.co.uk Mon Jul 21 15:10:13 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: > Speaking personally, I'd rather burn off my pre-cancerous old-age spots > with a wood-burning kit than use vi for more than two minutes at a time > ("... only long enough..." see above) but to each their own, I suppose. Now come on, I am but an ungainly fawn in the unix wilderness and have found vim to be albeit comfortably confusing but incredibly powerful editor. I thought that it was the mark of a true unix beard to be able to use vi(m). you should all be ashamed. Especially after all that talk of paper tapes and computers that made heavy clunking sounds etc. I have never even used a 5 1/4 inch floppy in anger. From landman at scalableinformatics.com Mon Jul 21 15:25:41 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: <48850CE5.7020201@scalableinformatics.com> andrew holway wrote: > you should all be ashamed. Especially after all that talk of paper > tapes and computers that made heavy clunking sounds etc. I have never > even used a 5 1/4 inch floppy in anger. [must ... resist ... urge...] 5.25 inch floppy? Luxury! [must ... stop ... now....] -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From rgb at phy.duke.edu Mon Jul 21 16:23:45 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <4884E76C.2030301@gmail.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4875CBFA.6090109@rri.sari.ac.uk> <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <4884E76C.2030301@gmail.com> Message-ID: On Mon, 21 Jul 2008, Geoff Jacobs wrote: > There isn't anything wrong with a good xterm. By the way, you are I agree, Geoff, but my younger computer geek friends ("younger" in their mid to late 30's, god help me:-) tend to mock me gently for using xterms/jove to code, to write books, to read my mail in xterm/pine/jove (license of evil and all) instead of modern GUI tools. And when I add hosts, I edit /etc/hosts. And when I add users I edit /etc/passwd, shadow, group. They've made printing too complex for humans to handle any more or I'd still be editing /etc/printcap (and still have to anyway, rarely, to fix things when the damn gui breaks). But jove doesn't "like" all the other modern xterm-alikes. 
It likes xterm. The real deal. With the others you have to tell them to stop trying to be so smart and let jove handle the interface, or they drop random lines when they don't update correctly (from the display, not from the actual memory image). So it might be worthwhile to get xjove to work just to stop the laughter ("Look, guys, I'm using a GUI editor too!":-). > forgetting the other noteworthy 'j' CLI editors: jed and joe. You can't > do much better for less than a meg on either score. Not exactly forgetting -- a good friend of mine swore by joe, which IIRC is basically wordstar implemented as a text editor. And since wordstar was my very first "real word processor" on the IBM PC, I have a soft spot for it although I've never really used it. jed I've never used at all, so someone else will have to sing its praises. But as far as image size goes: rgb@lilith|B:1330>ll /usr/bin/jove -rwxr-xr-x 1 root root 199312 2007-10-17 17:09 /usr/bin/jove in x86_64 form. Or even more impressive: rgb 17457 0.0 0.0 82484 1184 pts/8 S+ 18:43 0:00 jove /tmp/pico.452401 VSZ of 82484 is nice and tight, and a whole K of RSS. > I used and loved nedit for a long time, but eventually decided that we > would have to part. Nedit (or possibly lesstif) was glitching far more > than I considered proper. Right now I'm using Scite. If jove's keystrokes weren't engraved in my nervous system by 20 consecutive years of using nothing else (how else could the rgbbot keep up, if it weren't by having hardwired firmware interfacing with the text output interface:-) I'd give all of these a try, but truthfully, jove has everything I need, and it is absolutely bulletproof the way only 25 or 30 years of nearly frozen code can be bulletproof. I had to patch it a half dozen years ago to cope with a few posix issues as some of the underlying libraries were brought up to date, but the moribund code base is actually a blessing. Even the kinda 1/3 finished xjove sources in the tree are "cute" or "quaint" (I think they probably use athena widgets) more than a hassle. As I said, the one thing I could wish for is fixing it so it grokked latex errors the way it groks C errors. And one day, when it is time, I'm sure that one of the six living humans who still use it will patch it up so that it does. No need to be hasty -- it is easy enough to switch windows and read the errors from make (which are available, after all) and switch back, and latex is notoriously lousy about locating the probable line source of its errors anyway. The bottom line is that I can open a file (<1 second), page down to a particular string (Ctrl-\ string, Ctrl-\ Ctrl-\... interactive), change it to something else (four seconds total), save and exit (another second) in the time something like OpenOffice is still thinking about coming up, without EVER stressing the memory or disk resources of a modern system. Fast, small, and oh so powerful, the way C-sourced programs should be...;-) rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From rgb at phy.duke.edu Mon Jul 21 16:29:53 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? 
In-Reply-To: References: <20080721185322.GA7338@nlxdcldnl2.cl.intel.com> Message-ID: On Mon, 21 Jul 2008, Mark Hahn wrote: > I voted Libertarian in college, but grew out of it ;) There's libertarian, as believing in freedom and so on, there's rabid libertarian (betrayed by foaming at the mouth about the wickedness of having to pay taxes to keep up things like roads and police forces), and there's practical libertarian, which maintains the libertarian philosophy internally, recognizes that there are still a whole lot of things that are best done by a government, and votes sensibly in such a way that things work towards a better, and sure, freer, future. So maybe you're STILL a libertarian, but no longer rabid...;-) rgb > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From kilian at stanford.edu Mon Jul 21 16:49:15 2008 From: kilian at stanford.edu (Kilian CAVALOTTI) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <4883E45B.2010902@scalableinformatics.com> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <20080721005711.GB3465@bx9.net> <4883E45B.2010902@scalableinformatics.com> Message-ID: <200807211649.15461.kilian@stanford.edu> On Sunday 20 July 2008 06:20:27 pm Joe Landman wrote: > Dell DRAC costs > somewhat more ($250/node ?), IBM/Sun/HP all integrate it for you. Actually, you don't need a DRAC to use IPMI on modern Dell hardware: all the recent PowerEdge machines (>=8th generation) come with an integrated BMC (Baseboard Management Controller) which does IPMI. So you can remotely power cycle your Dell servers for free (no extra card to buy) Cheers, -- Kilian From bob at drzyzgula.org Mon Jul 21 16:53:26 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: <20080721235326.GA21388@drzyzgula.org> On Mon, Jul 21, 2008 at 01:47:02PM -0400, Robert G. Brown wrote: > > On Mon, 21 Jul 2008, Joe Landman wrote: > > > This is the sad truth. I can survive without using emacs, and besides, > I can use it in an emergency. But nobody can manage systems without > knowing vi. You may use it only long enough to edit /etc/hosts and your > firewall and your yum repo data so you can install and rebuild jove, but > that much cannot be avoided... This, I find, is a strong dividing line. By and large (not with exclusivity, but IME there is certainly a trend) systems programmers use vi and applications programmers use emacs. Systems programmers spend far to much time just getting in and out to make quick fixes to things -- and for that matter spend far too much time working with broken machines -- to ever allow themselves to become dependant on anything with as much overhead as Emacs. Some of them will master both, and use Emacs for scripting and such. 
But most that I've known just never bother with it. > Speaking personally, I'd rather burn off my pre-cancerous old-age spots > with a wood-burning kit than use vi for more than two minutes at a time > ("... only long enough..." see above) but to each their own, I suppose. See, I cut my teeth [1] on a Sun 2/120 with a multibus SCSI adapter, with a 71MB hard drive and a QIC tape drive. This was running SunOS 1.1 (cf. BSD 4.1), and I can assure you that it didn't have no stinkin' Emacs; Bill Joy ran the OS development for Sun and anyway, James Gosling's Unix/C port of Emacs was just starting to make the rounds [2]. The only real choices were ed, ex and vi -- vi of course being a mode you entered from ex, which still was important and a vast improvement over ed. By the time there were any other reasonable editors available to me, the vi command set had moved down into my brain stem. The only command I ever mastered in Emacs was . FWIW, with vi being so cryptic and Emacs being even worse, for a while we supported the Rand Editor -- in particular e19 [3]. Now there was an editor for the masses -- virtually the whole thing was driven by function keys. --Bob [1] Unix teeth, that is. The first machine I programmed -- with punchcards -- was an IBM 1130... [2] We did at one point buy some licenses for Unipress Emacs (the commercialized version of Gosling Emacs), but only a few hardy souls ever forced themselves to make use of it. [3] http://www.rand.org/pubs/notes/N2239-1/ http://perrioll.web.cern.ch/perrioll/Rand_Editor/Linux/ http://www.beowulf.org/archive/2001-April/002901.html From rgb at phy.duke.edu Mon Jul 21 17:05:39 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: On Mon, 21 Jul 2008, andrew holway wrote: >> Speaking personally, I'd rather burn off my pre-cancerous old-age spots >> with a wood-burning kit than use vi for more than two minutes at a time >> ("... only long enough..." see above) but to each their own, I suppose. > > Now come on, I am but an ungainly fawn in the unix wilderness and have > found vim to be albeit comfortably confusing but incredibly powerful > editor. > > I thought that it was the mark of a true unix beard to be able to use vi(m). > > you should all be ashamed. Especially after all that talk of paper > tapes and computers that made heavy clunking sounds etc. I have never > even used a 5 1/4 inch floppy in anger. I just never liked hopping into and out of text insertion mode, and even though nethack DOES use the vi cursor movement keys to move around (this was one of the original motivations for the game, IIRC) that doesn't mean that I find them particularly natural. IMO, one can be good, really good, at just one editor. One's fingers learn to do things without thinking, as they should, and the keyboard becomes a true extension of one's consciousness, as it should. Long, long ago, I tried both and preferred jove. As a consequence, while I CAN use vi -- and when I must, I do -- the lack of the direct neural connection I have with jove is frustrating and slows me down. Alas, this means that you've found me out. I'm no true beard, but merely a pretender. A slacker, in fact. 
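Back on Kilian's point a few messages up about the on-board BMC: once the BMC's LAN channel is configured, stock ipmitool is all it takes to power-cycle a node remotely. A sketch; the hostname, user and password are placeholders, and older IPMI-1.5-only BMCs want -I lan rather than lanplus:

    # from the node itself, over the local interface
    ipmitool chassis power status
    ipmitool lan print 1              # see how (or whether) the LAN channel is set up

    # from the head node, over the network (placeholder credentials)
    ipmitool -I lanplus -H node023-bmc -U admin -P secret chassis power status
    ipmitool -I lanplus -H node023-bmc -U admin -P secret chassis power cycle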
rgb -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From laytonjb at charter.net Mon Jul 21 17:23:06 2008 From: laytonjb at charter.net (laytonjb@charter.net) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: Message-ID: <20080721202306.Z5Y5V.74329.root@fepweb14> ---- "Robert G. Brown" wrote: > On Mon, 21 Jul 2008, Mark Hahn wrote: > > > I voted Libertarian in college, but grew out of it ;) > > There's libertarian, as believing in freedom and so on, there's rabid > libertarian (betrayed by foaming at the mouth about the wickedness of > having to pay taxes to keep up things like roads and police forces), > and there's practical libertarian, which maintains the libertarian > philosophy internally, recognizes that there are still a whole lot of > things that are best done by a government, and votes sensibly in such a > way that things work towards a better, and sure, freer, future. > > So maybe you're STILL a libertarian, but no longer rabid...;-) But... But... Fortran is the one language to rules them all!!! Libertarian, Republican, Democrat, Communist, Socialist, Jedi, whatever. Fortran can solve all of the world's problems and allow man kind to progress!!! Oops... Never mind. From gdjacobs at gmail.com Mon Jul 21 17:27:06 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> <4884B476.8040304@aplpi.com> <878wvvrw1c.fsf@snark.cb.piermont.com> Message-ID: <4885295A.20207@gmail.com> Peter St. John wrote: > (re line numbers) Ah, I should have said, that was in VMS. I did get VIM > for VMS though but I was never a maestro. There are happier VMS > installations with unix workalike interfaces, not there then though. > Peter What was your poison? EDT or TPU? -- Geoffrey D. Jacobs From rgb at phy.duke.edu Mon Jul 21 17:28:19 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <20080721235326.GA21388@drzyzgula.org> References: <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> Message-ID: On Mon, 21 Jul 2008, Bob Drzyzgula wrote: > On Mon, Jul 21, 2008 at 01:47:02PM -0400, Robert G. Brown wrote: >> >> On Mon, 21 Jul 2008, Joe Landman wrote: >> >> >> This is the sad truth. I can survive without using emacs, and besides, >> I can use it in an emergency. But nobody can manage systems without >> knowing vi. You may use it only long enough to edit /etc/hosts and your >> firewall and your yum repo data so you can install and rebuild jove, but >> that much cannot be avoided... > > This, I find, is a strong dividing line. By and large > (not with exclusivity, but IME there is certainly a trend) > systems programmers use vi and applications programmers > use emacs. 
Systems programmers spend far to much time just > getting in and out to make quick fixes to things -- and > for that matter spend far too much time working with broken > machines -- to ever allow themselves to become dependant > on anything with as much overhead as Emacs. Some of them > will master both, and use Emacs for scripting and such. > But most that I've known just never bother with it. Fair enough. At the time I started out with Unix was a physicist first, a coder second (to program my physics stuff) and a novice unix sysadmin third (but with a very good guru). vi (as of the latter 1980's) was not particularly coder-friendly, no matter what the really old hands would say. >> Speaking personally, I'd rather burn off my pre-cancerous old-age spots >> with a wood-burning kit than use vi for more than two minutes at a time >> ("... only long enough..." see above) but to each their own, I suppose. > > See, I cut my teeth [1] on a Sun 2/120 with a multibus SCSI > adapter, with a 71MB hard drive and a QIC tape drive. This > was running SunOS 1.1 (cf. BSD 4.1), and I can assure you > that it didn't have no stinkin' Emacs; Bill Joy ran the Whew! I didn't do Suns (or Unix) until the 4/110, which didn't have much more of a hard drive (two 60's, IIRC) but supported some 16 serial terminals and several sun 3 and 386i and SGI clients. None of the systems "came" with emacs back then. It was one of the packages one built from source, once you got it set up and cc installed and working, if you had room and could get the sources to build. To be able to set up the system, to be able to edit the makefiles to be able to do the build, to be able to hack the sources as needed to be able to complete the build successfully, you needed vi; minimal competence with vi was (and really is still not) an option for a sysadmin full or part time. What you had in /sbin and maybe /bin was it on a new/naked system. > OS development for Sun and anyway, James Gosling's Unix/C > port of Emacs was just starting to make the rounds [2]. > The only real choices were ed, ex and vi -- vi of course > being a mode you entered from ex, which still was important > and a vast improvement over ed. By the time there were any > other reasonable editors available to me, the vi command > set had moved down into my brain stem. The only command > I ever mastered in Emacs was . Yeah, I've heard people speak of the joys of just using raw ed, and used QED and a few other editors (edlin?) of that ilk. Just long enough to write a front end (in basica) to hide it, in the case of QED. I was possibly the only person on the planet to have a fullscreen editor on a TSO connection to a QED session on a mainframe for several years, with all its line-at-a-time commands carefully hidden and virtualized on a local PC display of the sources. Or, maybe not. There are a lot of hackers out there, and QED was slow (and TSO expensive), hence it sucked. But by the time I got to Unix from the mainframes, jove was by far the way to go for somebody that did both sysadmin and coding, with the latter marginally dominating. Emacs had already grown incredibly bloated, and was the source of some rather famous root exploits (it had a suid root component in there somewhere) -- got hit by that indirectly back in maybe 1989. jove has hardly changed from 1989 on. A tiny bit more/better compiler support. Perhaps a few bugfixes. But to change it, one has to write C and do a full recompile. 
I've always felt that emacs bloat resulted from the fact that anybody could bloat the damn thing via lisp without needing to actually work through the core interface and insert code that is sufficiently bugfree that it would recompile. rgb > FWIW, with vi being so cryptic and Emacs being even worse, > for a while we supported the Rand Editor -- in particular > e19 [3]. Now there was an editor for the masses -- virtually > the whole thing was driven by function keys. > > --Bob > > [1] Unix teeth, that is. The first machine I programmed -- > with punchcards -- was an IBM 1130... > > [2] We did at one point buy some licenses for Unipress > Emacs (the commercialized version of Gosling Emacs), but > only a few hardy souls ever forced themselves to make use > of it. > > [3] http://www.rand.org/pubs/notes/N2239-1/ > http://perrioll.web.cern.ch/perrioll/Rand_Editor/Linux/ > http://www.beowulf.org/archive/2001-April/002901.html > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
From larry.stewart at sicortex.com Mon Jul 21 17:50:25 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <20080721202306.Z5Y5V.74329.root@fepweb14> References: <20080721202306.Z5Y5V.74329.root@fepweb14> Message-ID: <543E6E0E-1701-4012-8BC3-CA923B1DE159@sicortex.com> On Jul 21, 2008, at 8:23 PM, wrote: > > But... But... Fortran is the one language to rules them all!!! > Libertarian, > Republican, Democrat, Communist, Socialist, Jedi, whatever. Fortran > can > solve all of the world's problems and allow man kind to progress!!! I don't know what the language of the 21st century will be like, but it will be called FORTRAN. -C.A.R Hoare The fastest programs are Fortran programs, no matter what language they are written in. -Larry
From peter.st.john at gmail.com Mon Jul 21 19:22:29 2008 From: peter.st.john at gmail.com (Peter St.
John) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <4885295A.20207@gmail.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <6.2.5.6.2.20080721085122.02d8ea38@jpl.nasa.gov> <4884B476.8040304@aplpi.com> <878wvvrw1c.fsf@snark.cb.piermont.com> <4885295A.20207@gmail.com> Message-ID: all the guys (with long VMS experience) use(d) EDT, but I installed VIM a week or two into the job. DCL wasn't so bad but your fingers are happy with their happy editor. But then we installed perl so DCL was pretty useless too, except for legacy stuff. Peter On 7/21/08, Geoff Jacobs wrote: > > Peter St. John wrote: > > (re line numbers) Ah, I should have said, that was in VMS. I did get VIM > > for VMS though but I was never a maestro. There are happier VMS > > installations with unix workalike interfaces, not there then though. > > Peter > > > What was your poison? EDT or TPU? > > > -- > Geoffrey D. Jacobs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080721/43e01503/attachment.html From peter.st.john at gmail.com Mon Jul 21 19:28:35 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <20080721235326.GA21388@drzyzgula.org> References: <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> Message-ID: My feeling is that some of us like to construct long sentences from a small vocabulary, while others like short sentences from a huge vocabulary. Or substitute expressions and alphabets. Long proofs of symbolic logic or short proofs citing lemmas. Emacs is for one, vi the other. I prefer long chains built with few commands. Like planning many moves ahead with a few chesspieces. Peter On 7/21/08, Bob Drzyzgula wrote: > > On Mon, Jul 21, 2008 at 01:47:02PM -0400, Robert G. Brown wrote: > > > > On Mon, 21 Jul 2008, Joe Landman wrote: > > > > > > > This is the sad truth. I can survive without using emacs, and besides, > > I can use it in an emergency. But nobody can manage systems without > > knowing vi. You may use it only long enough to edit /etc/hosts and your > > firewall and your yum repo data so you can install and rebuild jove, but > > that much cannot be avoided... > > > This, I find, is a strong dividing line. By and large > (not with exclusivity, but IME there is certainly a trend) > systems programmers use vi and applications programmers > use emacs. Systems programmers spend far to much time just > getting in and out to make quick fixes to things -- and > for that matter spend far too much time working with broken > machines -- to ever allow themselves to become dependant > on anything with as much overhead as Emacs. Some of them > will master both, and use Emacs for scripting and such. > But most that I've known just never bother with it. > > > > Speaking personally, I'd rather burn off my pre-cancerous old-age spots > > with a wood-burning kit than use vi for more than two minutes at a time > > ("... only long enough..." see above) but to each their own, I suppose. 
> > > See, I cut my teeth [1] on a Sun 2/120 with a multibus SCSI > adapter, with a 71MB hard drive and a QIC tape drive. This > was running SunOS 1.1 (cf. BSD 4.1), and I can assure you > that it didn't have no stinkin' Emacs; Bill Joy ran the > OS development for Sun and anyway, James Gosling's Unix/C > port of Emacs was just starting to make the rounds [2]. > The only real choices were ed, ex and vi -- vi of course > being a mode you entered from ex, which still was important > and a vast improvement over ed. By the time there were any > other reasonable editors available to me, the vi command > set had moved down into my brain stem. The only command > I ever mastered in Emacs was . > > FWIW, with vi being so cryptic and Emacs being even worse, > for a while we supported the Rand Editor -- in particular > e19 [3]. Now there was an editor for the masses -- virtually > the whole thing was driven by function keys. > > --Bob > > [1] Unix teeth, that is. The first machine I programmed -- > with punchcards -- was an IBM 1130... > > [2] We did at one point buy some licenses for Unipress > Emacs (the commercialized version of Gosling Emacs), but > only a few hardy souls ever forced themselves to make use > of it. > > [3] http://www.rand.org/pubs/notes/N2239-1/ > http://perrioll.web.cern.ch/perrioll/Rand_Editor/Linux/ > http://www.beowulf.org/archive/2001-April/002901.html > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080721/2bab1ef9/attachment.html From peter.st.john at gmail.com Mon Jul 21 19:49:23 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Human vs Compuer Go on a cluster Message-ID: There is going to be a match at Go between a human professional and a computer (a 3000 node cluster in France). KGS is a free Go server; you can download a java client and watch the game in progress. KGS is www.goKGS.com, and you can get CGoban from http://www.gokgs.com/download.xhtml. Drop me a note if any questions. The human is "8p", meaning 8-dan professional; not quite 3 stones stronger than the bottom 1d pro, who in turn would give me (an amateur 1d) at least 6 stones (probably more). Edward Lasker said that 3 stones handicap at Go is comparable to knight odds at chess (although I think that overstates). Peter The notice from the AGA is: *HUMAN-COMPUTER SHOWDOWN AT CONGRESS*: While computers long ago surpassed humans at chess, the best go programs haven't been able to hold a candle to professional human players. In 1997, Janice Kim -- then a professional 1-dan -- beat Handtalk, then the strongest program, despite giving the program a 25-stone handicap. There has been considerable progress in computer go research since then - do humans still reign supreme? Find out on Thursday, August 7 at 1P, when Kim MyungWan 8P takes on MoGo, the world's strongest computer go program. MoGo will connect remotely from France, where it will be running on a supercomputer boasting over 3,000 processor cores. The game will be broadcast live on KGS. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080721/f2417ebe/attachment.html From bdobbins at gmail.com Mon Jul 21 21:09:22 2008 From: bdobbins at gmail.com (Brian Dobbins) Date: Wed Nov 25 01:07:28 2009 Subject: Q: Fortran 'era' - 66/77/90/95/03/08? [Was: Re: [Beowulf] Green Cluster?] Message-ID: <2b5e0c120807212109r38cd1cdciab23341a51faea18@mail.gmail.com> Hi everyone, I don't know what the language of the 21st century will be like, > but it will be called FORTRAN. > -C.A.R Hoare I know this was made in a somewhat off-topic thread touching on politics, but I couldn't resist adding my two cents and seeing what other people see from codes these days. This post stems from a few conversations I've had with people at Yale on development practices and languages, and also from the realization that a university operates quite differently (I think) than research labs or corporations. Thus, this list is a great way to ping the 'real world' and learn from some different points of view... See, I love Fortran.. or, more specifically, sensible Fortran. The problem is that there's a *lot* of nightmare-inducing Fortran code (mostly F77/F66) that is chock-full of things like computed GOTOs, 'equivalence' statements, Hollerith constants, variable names that need their own Rosetta stone to comprehend, etc. To borrow an image from Douglas Adams, that stuff can be like Vogon Poetry - it's torture to me, plain and simple, but somehow other people love it. ;-) Now, I grew up on C/C++, and I definitely agree that Fortran is going to be the language of choice for codes (in the physical sciences) in the near to mid-distant future, but *as* a C/C++ guy who has seen more than his fair share of 'object oriented' inspired abstraction where it was most definitely not warranted, I worry a little bit about people leapfrogging from the ancient F66/F77 ways to, "Ooh, someone said Fortran 2003 has lots of OO features! Let's redesign everything with 34,782 levels of abstraction!". I cringe at the thought. On the plus side, as evidenced by the fact that much of the code I see *is* so old, it seems many developers aren't keen on adopting 'new' things in their language, but on the other, with the size of the projects faced in scientific computing and the increasing importance of multi-threading or even well-designed MPI-based parallelism, *lots* of people in the physical sciences could benefit from people with a software engineering background... one that *often* comes with C/C++ familiarty and its assorted demons. D'oh. In terms of the in-house applications I work on, I readily use F90's allocatable arrays and parse tons of command-line arguments, since I figure having flexible codes that are easy for a user to alter without recompiling are worth their weight in gold, and I also tend to use array syntax (sometimes with explicit indices for clarity) to eliminate some loops and make the code more natural in the mathematical sense. And, of course, I use more descriptive variable names and modules for organizing things. In terms of the complexity of the data structures and flow, though, I try not to mess with simplicity much - this seems to not only be a half-decent application of the 'KISS' principle, but it also lets people who are not too familiar with F90 still understand what is going on since most things are left relatively unchanged. The above 'changes' to the F66/F77 base often seem a bit radical to some, so I'd love to know what people see, in general, at other places. 
I know some places such as NASA run training classes on the advanced features of F2003 (and even F2008, I think!), and surely the National Labs do, too, but are modern codes taking advantage of such features in any real capacity yet? (Ex: IBM's XLF already has procedure pointers implemented, so *surely *someone asked for that? Is it being used in any open production codes?) In other words, what 'era' of Fortran do most people find themselves in, and are things changing? (PS. I hope nobody minds the title change -- I figured it was 'different enough' from "Green Computing" to warrant it!) Cheers, - Brian Brian Dobbins Yale Engineering HPC -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080722/eed83aa3/attachment.html From kyron at neuralbs.com Mon Jul 21 21:30:14 2008 From: kyron at neuralbs.com (Eric Thibodeau) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] A press release In-Reply-To: References: <4863E551.8090802@scalableinformatics.com> <0D49B15ACFDF2F46BF90B6E08C90048A04884918AC@quadbrsex1.quadrics.com> <486A6760.5010006@ias.edu> <20080702072709.GX11428@casco.aei.mpg.de> <4873750E.1020407@ias.edu> <487377C2.2070503@aplpi.com> <50761.69.139.186.42.1215531006.squirrel@mail.eadline.org> <4874C2C3.2040007@ias.edu> Message-ID: <48856256.6020808@neuralbs.com> Matt Lawrence wrote: > On Wed, 9 Jul 2008, Prentice Bisbal wrote: > >> Douglas Eadline wrote: >>> A blast from the past. I have a copy of the Yggdrasil "Linux Bible". >>> A phone book of Linux How-To's and other docs from around 1995. >>> Quite useful before Google became the help desk. >>> >>> -- >>> Doug >>> >> >> Translation: Doug is a pack rat. > > No, he is merely satisfying his genetic tendancy toward archivism. I have windows 3.1 in it's original packaging...sealed! (The guy that gave it to me asked if it could be of any use...I smirked and told him "no way, you can do all sorts of things with this!"...an went on reading the boxe's cool features... > > -- Matt > It's not what I know that counts. > It's what I can remember in time to use. > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From perry at piermont.com Mon Jul 21 21:49:22 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080721235326.GA21388@drzyzgula.org> (Bob Drzyzgula's message of "Mon\, 21 Jul 2008 19\:53\:26 -0400") References: <487604C8.8020602@tamu.edu> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> Message-ID: <87ljzuo759.fsf@snark.cb.piermont.com> Bob Drzyzgula writes: > This, I find, is a strong dividing line. By and large > (not with exclusivity, but IME there is certainly a trend) > systems programmers use vi and applications programmers > use emacs. I've seen more than my share of Unix hackers over the years, and I spot no such trends. Generally, it seems to be a question of what you learned first. > See, I cut my teeth [1] on a Sun 2/120 with a multibus SCSI > adapter, with a 71MB hard drive and a QIC tape drive. This > was running SunOS 1.1 (cf. BSD 4.1), Vax 11/750 for me. 
Before that I hacked on Tops-20, where the editor of choice was the original emacs written in teco, thus my brain has been wired for Emacs for a quarter century or so. (The stuff I used before that was all line editor oriented and didn't stick in my brain.) > and I can assure you that it didn't have no stinkin' Emacs; It most certainly did, you simply didn't install it. :) > Bill Joy ran the OS development for Sun and anyway, James Gosling's > Unix/C port of Emacs was just starting to make the rounds [2]. By the time of SunOS 1.1, I believe there was Unipress emacs around, as you note. In any case, the Suns I used of that vintage had Emacs available. (I have a genuine Sun 1 sitting in my mom's garage still -- double digit serial number.) > [2] We did at one point buy some licenses for Unipress > Emacs (the commercialized version of Gosling Emacs), but > only a few hardy souls ever forced themselves to make use > of it. Where I was, the '20 heads kind of insisted on Emacsen. Unfortunately, gosmacs didn't have a real extension language, so Gnu Emacs (which arrived quite shortly) was considered a big plus... Perry -- Perry E. Metzger perry@piermont.com From gerry.creager at tamu.edu Mon Jul 21 22:30:21 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <48850CE5.7020201@scalableinformatics.com> References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <48850CE5.7020201@scalableinformatics.com> Message-ID: <4885706D.2030903@tamu.edu> Joe Landman wrote: > andrew holway wrote: > >> you should all be ashamed. Especially after all that talk of paper >> tapes and computers that made heavy clunking sounds etc. I have never >> even used a 5 1/4 inch floppy in anger. > > [must ... resist ... urge...] > > 5.25 inch floppy? Luxury! > > [must ... stop ... now....] Ah, but I've used (and flung) 8" floppies, usually in either anger or anguish. And remember when "maximum Maytag" described a hard seek cycle rather than a washing machine. Oh, yeah: The washing machine was considerably lighter than the disk drive described by the term. -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From tjrc at sanger.ac.uk Mon Jul 21 22:42:22 2008 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: References: <635365514.192961215654266556.JavaMail.root@zimbra.vpac.org> <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> Message-ID: <60FEF374-FCEF-413A-93F2-3EAF8589B98A@sanger.ac.uk> On 22 Jul 2008, at 1:05 am, Robert G. Brown wrote: > I just never liked hopping into and out of text insertion mode, and > even > though nethack DOES use the vi cursor movement keys to move around > (this > was one of the original motivations for the game, IIRC) that doesn't > mean that I find them particularly natural. That's them main reason I prefer vim to normal vi. 
You can navigate around the document using the cursor keys *without* having to leave insertion mode, which makes it a vast amount more useable than normal vi, for my tastes. I use vim for most editing tasks, but I do still use emacs for some things, especially programming - its integration with the perl debugger and with gdb is very useful, as is the make error-following mode someone else mentioned. > IMO, one can be good, really good, at just one editor. Oh, I don't know - I seem to be able to swap vi and emacs bindings in and out of my fingers fairly quickly. Tim -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From lindahl at pbm.com Mon Jul 21 23:14:23 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:28 2009 Subject: Religious wars (was Re: [Beowulf] A press release) In-Reply-To: <60FEF374-FCEF-413A-93F2-3EAF8589B98A@sanger.ac.uk> References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <60FEF374-FCEF-413A-93F2-3EAF8589B98A@sanger.ac.uk> Message-ID: <20080722061423.GA26772@bx9.net> >> IMO, one can be good, really good, at just one editor. > > Oh, I don't know - I seem to be able to swap vi and emacs bindings in > and out of my fingers fairly quickly. I think the rule to be learned is that people always extrapolate from their personal experiences, instead of asking other people questions. I had to laugh about the "system programmers use vi" comment. And resource usage? Eight MEGABYTES And Constantly Swapping, yeah, that's big. On my system emacs and vim are about the same size at startup. -- greg From prentice at ias.edu Tue Jul 22 06:31:30 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: [torqueusers] pbs_server -t create can't find torque library In-Reply-To: <4884AC01.3040606@ias.edu> References: <4881124C.9050905@mines.edu> <4884A898.1010001@byu.edu> <4884AC01.3040606@ias.edu> Message-ID: <4885E132.9020900@ias.edu> Oops. I sent this to the wrong list. Sorry! Prentice Bisbal wrote: > Lloyd Brown wrote: >> - Add the path (/usr/local/lib) either to /etc/ld.so.conf file, or to a >> file in /etc/ld.so.conf.d/, then run "ldconfig" to update the path >> cache, etc. This is the recommended system-wide way of doing things. >> >> > > What happens when you have two different library paths with that contain > libraries with the same name? How do you determine the search order when > using individual files in /etc/ld.so.conf.d? 
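For what it's worth, ldconfig scans the directories in the order they are listed (the fragments under /etc/ld.so.conf.d are normally pulled in by an "include" line in /etc/ld.so.conf, in glob order), and as far as I know the first matching entry that lands in the cache is the one the runtime linker uses. The quickest way to see which copy wins is to ask the cache directly; a minimal sketch, with libfoo standing in for whichever library is duplicated:

  ldconfig -v 2>/dev/null | less   # run as root: directories are printed in the order they are scanned
  ldconfig -p | grep libfoo        # shows which path the cache actually resolved for the duplicated name

(LD_LIBRARY_PATH, if set, is searched before the cache, which is its own source of surprises.)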
> > -- > Prentice > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From bob at drzyzgula.org Tue Jul 22 07:54:47 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <87ljzuo759.fsf@snark.cb.piermont.com> References: <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> Message-ID: <20080722145447.GB2451@drzyzgula.org> On Tue, Jul 22, 2008 at 12:49:22AM -0400, Perry E. Metzger wrote: > > Bob Drzyzgula writes: > > This, I find, is a strong dividing line. By and large > > (not with exclusivity, but IME there is certainly a trend) > > systems programmers use vi and applications programmers > > use emacs. > > I've seen more than my share of Unix hackers over the years, and I > spot no such trends. Generally, it seems to be a question of what you > learned first. Well, I did say "IME". This remains a split -- not a perfect dividing line, but still identifiable -- today in my office, even among the younger staff that comes in. Yes, it is always dangerous to extrapolate from such limited anecdotal evidence. Still, even today almost any Linux repair or minimal "live boot cd" will have vi installed, but rarely will they have any version of Emacs. It is not even certain that the default, base install of a Linux system will include Emacs -- I just checked a couple of Ubuntu 8.04 Desktop and Server systems I have here; neither have Emacs, and I did nothing to actively exclude it. Whatever the case, I believe that it is true that no systems programmer can work *exclusively* in Emacs, while an applications programmer typically is able to do so. > > See, I cut my teeth [1] on a Sun 2/120 with a Multibus SCSI > > adapter, with a 71MB hard drive and a QIC tape drive. This > > was running SunOS 1.1 (cf. BSD 4.1), > > Vax 11/750 for me. Before that I hacked on Tops-20, where the editor > of choice was the original emacs written in teco, thus my brain has > been wired for Emacs for a quarter century or so. (The stuff I used > before that was all line editor oriented and didn't stick in my > brain.) Well, pre-Unix my experience was card punches, TSO, Wylbur and ISPF/PDF (a TSO-based full screen environment that had it's own text editor), pretty much in that order, so not much of that gave me any initial bias, except that maybe my expectations weren't all that high. That first 2/120 was brought in by my Division (actually it was a long-term loaner from Sun), set down in an office in a standalone mode, and I pretty much had to figure out how to use it by myself from the man pages, Kernighan & Ritchie, Kernighan & Pike, and a one-week class in System V -- I found no BSD classes available, at least locally. > > and I can assure you that it didn't have no stinkin' Emacs; > > It most certainly did, you simply didn't install it. :) Absolutely SunOS 1.1 did not include any version of Emacs. I didn't have a uucp connection, much less an Internet connection at that time, or know anyone, other than the Sun staff, who had access to these resources. 
Our first uucp connection was through the a local systems integrator, around 1987; our first TCP/IP connection was through uunet a couple of years later (we'd also moved our uucp to them by then). And thus I was largely limited to what came on the OS distribution. When we wanted to install mh and e19, we paid the Rand Corporation a nominal fee to put the source code on a 9-track tape and mail it to us -- this was a common enough request that they had a standard price for doing this. > > Bill Joy ran the OS development for Sun and anyway, James Gosling's > > Unix/C port of Emacs was just starting to make the rounds [2]. Here I see my memory was somewhat faulty. According to http://www.666.com/xemacs-internals/internals_3.html Gosling wrote his Emacs in 1981, and Unipress Emacs started shipping in 1983 for $399 per seat. That first 2/120 showed up in 1984; there was no way at that time that I could have gotten that kind of money for a text editor when the OS already included one. And even if I could have gotten my hands on a copy of Gosling's pre-Unipress code, I'm not sure what -- given that I was working pretty much just by myself -- might have driven me in that direction. FWIW, Stallman didn't start writing GNU Emacs until the same year -- 1984 -- that SunOS 1.1 was released. It wasn't until we had a few of these machines around and a few dozen users that any interest in Emacs started to surface. By that time the vi firmware had already been loaded into my brain stem. > By the time of SunOS 1.1, I believe there was Unipress emacs > around, as you note. In any case, the Suns I used of that vintage had > Emacs available. (I have a genuine Sun 1 sitting in my mom's garage > still -- double digit serial number.) As I mentioned, Emacs was not included in the OS distribution from Sun. If the Sun systems you were working on had Emacs, someone went to the trouble of installing it from some other source. > > [2] We did at one point buy some licenses for Unipress > > Emacs (the commercialized version of Gosling Emacs), but > > only a few hardy souls ever forced themselves to make use > > of it. > > Where I was, the '20 heads kind of insisted on Emacsen. Unfortunately, > gosmacs didn't have a real extension language, so Gnu Emacs (which > arrived quite shortly) was considered a big plus... IIRC there were maybe three or four of our users who toughed it out with Unipress, in maybe the 1985-1986 timeframe. It wasn't until GNU Emacs became available to us that Emacs got any traction in my office. --Bob From bob at drzyzgula.org Tue Jul 22 08:09:03 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722145447.GB2451@drzyzgula.org> References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> Message-ID: <20080722150903.GA5720@drzyzgula.org> On Tue, Jul 22, 2008 at 10:54:47AM -0400, Bob Drzyzgula wrote: > > Gosling wrote his Emacs in 1981, and Unipress Emacs started > shipping in 1983 for $399 per seat. Sorry -- actually looking again I see it said $395, not $399, not that this makes any difference. But thinking back I expect that this was $395 per *system*, which is only per seat if you're talking about workstations, and in those days we used workstations as multi-user systems anyway. 
I doubt there was any license management mechanism available at the time that could have enforced a per-user license. Still, it was a lot of money. --Bob From peter.st.john at gmail.com Tue Jul 22 09:01:23 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722150903.GA5720@drzyzgula.org> References: <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722150903.GA5720@drzyzgula.org> Message-ID: That's a nice point; I got thrown into the deep end by a bunch of crazy mathematicians in the summer of 81: C, K&R, Unix, commodities forecasting. I learned vi then. So at the time there was no choice. My first experience with emacs I can't quite date; my macsyma program crashed, the OS (whatever it was, probabaly VMS on vaxen) command interpreter prompt suddenly went away, and I was flabbergasted. Did I just crash the DuPont Experimental Station Vax network? So I learned to quit out of emacs before I learned to enter into it. Peter On 7/22/08, Bob Drzyzgula wrote: > > On Tue, Jul 22, 2008 at 10:54:47AM -0400, Bob Drzyzgula wrote: > > > > Gosling wrote his Emacs in 1981, and Unipress Emacs started > > shipping in 1983 for $399 per seat. > > > Sorry -- actually looking again I see it said $395, not > $399, not that this makes any difference. But thinking > back I expect that this was $395 per *system*, which is > only per seat if you're talking about workstations, and > in those days we used workstations as multi-user systems > anyway. I doubt there was any license management mechanism > available at the time that could have enforced a per-user > license. Still, it was a lot of money. > > > --Bob > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080722/0d28e8d6/attachment.html From perry at piermont.com Tue Jul 22 09:02:33 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722145447.GB2451@drzyzgula.org> (Bob Drzyzgula's message of "Tue\, 22 Jul 2008 10\:54\:47 -0400") References: <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> Message-ID: <87vdyxlxeu.fsf@snark.cb.piermont.com> Bob Drzyzgula writes: >> > and I can assure you that it didn't have no stinkin' Emacs; >> >> It most certainly did, you simply didn't install it. :) > > Absolutely SunOS 1.1 did not include any version of Emacs. No, it didn't *include* it. As I said, we got gosmacs from Unipress. The environment I was at bought it because they had so many Dec-20 heads that they needed it. I'm sorry if I was ambiguous on that. -- Perry E. 
Metzger perry@piermont.com From gerry.creager at tamu.edu Tue Jul 22 09:16:19 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: References: <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722150903.GA5720@drzyzgula.org> Message-ID: <488607D3.9010006@tamu.edu> Didn't EVERYONE learn c by reading K&R? I started with joe because of muscle memory in my fingers for WordStar but went to vi 'cause it's almost everywhere. Peter St. John wrote: > That's a nice point; I got thrown into the deep end by a bunch of crazy > mathematicians in the summer of 81: C, K&R, Unix, commodities > forecasting. I learned vi then. So at the time there was no choice. > > My first experience with emacs I can't quite date; my macsyma program > crashed, the OS (whatever it was, probabaly VMS on vaxen) command > interpreter prompt suddenly went away, and I was flabbergasted. Did I > just crash the DuPont Experimental Station Vax network? So I learned to > quit out of emacs before I learned to enter into it. > > Peter > > On 7/22/08, *Bob Drzyzgula* > wrote: > > On Tue, Jul 22, 2008 at 10:54:47AM -0400, Bob Drzyzgula wrote: > > > > Gosling wrote his Emacs in 1981, and Unipress Emacs started > > shipping in 1983 for $399 per seat. > > > Sorry -- actually looking again I see it said $395, not > $399, not that this makes any difference. But thinking > back I expect that this was $395 per *system*, which is > only per seat if you're talking about workstations, and > in those days we used workstations as multi-user systems > anyway. I doubt there was any license management mechanism > available at the time that could have enforced a per-user > license. Still, it was a lot of money. > > > --Bob > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From bob at drzyzgula.org Tue Jul 22 09:30:23 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] [OT] Re: Religious wars In-Reply-To: References: <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722150903.GA5720@drzyzgula.org> Message-ID: <20080722163023.GC2451@drzyzgula.org> On Tue, Jul 22, 2008 at 12:01:23PM -0400, Peter St. John wrote: > > Did I just > crash the DuPont Experimental Station Vax network? So I learned to quit out > of emacs before I learned to enter into it. When I first started at my current employer in 1983, there were a scattering of mainframe (3270) terminals, all of which presented a short list of login options. 
One of these was TSO; that's what I had an account for and was supposed to be using. But one of the other choices was IMS [1]. One day, curiosity got the better of me and I selected that. Anyone who has used IMS probably knows what happened next: Big blank screen. I tried everything I could think of to get out of there: "quit", "exit", "bye", "logoff", "logout" -- nothing worked. Even power cycling the terminal of course does nothing on a 3270. Panicing at that point, I sheepishly called the mainframe help desk, and was told -- of course in a BOFH-ish tone of voice -- that the way you get out of there is with the "/RCL" command. As with in emacs, this was the only command I ever learned in IMS. --Bob [1] http://www-306.ibm.com/software/data/ims/ From rgb at phy.duke.edu Tue Jul 22 10:39:48 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722145447.GB2451@drzyzgula.org> References: <20080710150040.GA18058@student.math> <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> Message-ID: On Tue, 22 Jul 2008, Bob Drzyzgula wrote: >> By the time of SunOS 1.1, I believe there was Unipress emacs >> around, as you note. In any case, the Suns I used of that vintage had >> Emacs available. (I have a genuine Sun 1 sitting in my mom's garage >> still -- double digit serial number.) > > As I mentioned, Emacs was not included in the OS > distribution from Sun. If the Sun systems you were working > on had Emacs, someone went to the trouble of installing > it from some other source. This mirrors my own experience, except that I took over managing our Unix network from a REAL Unix Old Hand, who was running it on a PDP (20?) for several years in the department before I started doing unix (and sysadmin) in 1986/1987. We had jove in part because we WERE on uucp, and actually were on the early internet with our own IP numbers and domain in the 128.x.x.x block (the first one heavily populated in .edu) over a 56K leased line by what maybe 1983 or 1984? Cyrus had the PDP nicely outfitted with the earliest sources -- the PDP ran homebuilt BSD IIRC, and "came" with various build your own packages that were being passed around on the early internet even back then via FTP as well as uucp. I think jove was just smaller and tighter than emacs (which was already suffering from bloat, and back then megabytes WERE extremely dear, both on disk and in "core". Which was still core, in many cases...;-). By 88 or 89 I was the department's more or less full time part time unix sysadmin (as well as teaching physics, doing research, coding). I had a sun 386i originally (department server a 4/110 that we upgraded to a 4/310 before it finally went away), and got the very first sparcstation 1 in the department if not the school (followed in due time with a 2) a few years later. By maybe 1988 I was already in the habit of going in and rebuilding the sources in /usr/local (NFS mounted) per architecture, per new machine, per hacks as we got e.g. SGIs (sysv-ish) and ran them with Suns (BSD-ish). At that point jove was WAY better than VI, as it is a coder's editor and rebuilding all that source was really a coding problem, but you absolutely had to use vi -- no "m" appended on a new, naked system as that was what you had. 
Adequate for editing /etc/passwd. Not so good for editing a few thousand lines of code, a couple of Makefiles, running Make from inside and flipping through errors. So I was "spoiled" by being on the internet from the beginning, basically. Not its beginning, but my own experience with Unix. And having an uberunix perfect master for a guru. So I took source access for granted; heck, we were just down the road from one of the first big source repos at UNC and I did work oat ORNL where another (netlib) resided. rgb > >>> [2] We did at one point buy some licenses for Unipress >>> Emacs (the commercialized version of Gosling Emacs), but >>> only a few hardy souls ever forced themselves to make use >>> of it. >> >> Where I was, the '20 heads kind of insisted on Emacsen. Unfortunately, >> gosmacs didn't have a real extension language, so Gnu Emacs (which >> arrived quite shortly) was considered a big plus... > > IIRC there were maybe three or four of our users who toughed it out > with Unipress, in maybe the 1985-1986 timeframe. It wasn't until > GNU Emacs became available to us that Emacs got any traction in > my office. > > --Bob > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From lindahl at pbm.com Tue Jul 22 10:54:06 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722145447.GB2451@drzyzgula.org> References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> Message-ID: <20080722175405.GA17358@bx9.net> On Tue, Jul 22, 2008 at 10:54:47AM -0400, Bob Drzyzgula wrote: > It is not even certain that the default, base install of a Linux > system will include Emacs This just indicates a conspiracy of vi users. Or, more likely, vi users complained that emacs was in the default. Emacs users aren't bothered by having vi around. Interestingly, looking at the Red Hat RPMs, a full emacs install is only 2X the size of a full vim install. The main difference is that a minimal vim install is very small because it doesn't need vim-common, but no one's done the work to get emacs-nox to run sans emacs-common. Either way, it's a tiny fraction of a DVD, so you can look forward to full emacs on your rescue disk soon. Assuming the conspirators and complainers don't have their way. -- greg From mark.kosmowski at gmail.com Tue Jul 22 12:33:12 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars Message-ID: > > > Bob Drzyzgula writes: > >> > and I can assure you that it didn't have no stinkin' Emacs; > >> > >> It most certainly did, you simply didn't install it. :) > > > > Absolutely SunOS 1.1 did not include any version of Emacs. > > No, it didn't *include* it. As I said, we got gosmacs from > Unipress. The environment I was at bought it because they had so many > Dec-20 heads that they needed it. I'm sorry if I was ambiguous on > that. > > > -- > Perry E. Metzger perry@piermont.com > > I dabbled with Linux as an enthusiast as a wee lad back in the 80386 days. 
Took a C in the Unix environment class at college back in the late 1990s and the instructor taught vi since "There's always vi - but maybe not something else - you can read chapter 11 [of the Sobell text we were using - the emacs chapter - going from memory so chapter number may be wrong] on your own if you're interested." I briefly played a little with emacs on my own but having no compelling reason to use it didn't get very far. Right now I use kate for most user level editing and vi for most root editing or editing over ssh. I also try to remember to use a sed script (I even modified the one that came with the O'Reilly book to span directories semi-recursively!) for making minor changes to bunches of input files all at once. Sometimes (for n < 10 or so) this just gets done manually though. I will agree [with rgb?] that OpenOffice takes a hideously long time to load, but, in fairness, this is, to an extent, an apples and oranges comparison. Kind of like using bash for a quick arithmetic calculation vs. firing up Mathematica or a similar program. Having said all this, I'm not a programmer by any stretch. The best units for measuring my coding productivity is projects / decade. I do tend to try the gui admin tools before going to the text files, though /etc/hosts and /etc/fstab are pretty common to at least read when something gets broken. In fact, I do so little programming that it has been a year since I've compiled anything - running into some issues with a bunch of upgrades - if by two weekends from now I don't even have a serial CPMD 3.13.1 running I'm going to break down and seek help from the appropriate lists. Want to wrestle with it a little more - I feel very, very close. Mark Kosmowski From bob at drzyzgula.org Tue Jul 22 13:19:10 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722175405.GA17358@bx9.net> References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> Message-ID: <20080722201910.GD2451@drzyzgula.org> On Tue, Jul 22, 2008 at 10:54:06AM -0700, Greg Lindahl wrote: > > > It is not even certain that the default, base install of a Linux > > system will include Emacs > > This just indicates a conspiracy of vi users. Or, more likely, > vi users complained that emacs was in the default. Emacs users > aren't bothered by having vi around. But I don't understand... if resources aren't an issue (and certainly they haven't been for at least a decade, since BIOSs started supporting El Torito) and systems programmers are *not* more likely to be vi users than emacs users, would't we be seeing emacs on more live and rescue CDs by now? I'm curious as to how the vi conspiracy effects its apparent influence... 
:-) :-) :-) --Bob From john.hearns at streamline-computing.com Tue Jul 22 14:20:40 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722201910.GD2451@drzyzgula.org> References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <20080722201910.GD2451@drzyzgula.org> Message-ID: <1216761650.4901.9.camel@Vigor13> On Tue, 2008-07-22 at 16:19 -0400, Bob Drzyzgula wrote: > But I don't understand... if resources aren't an issue (and > certainly they haven't been for at least a decade, since > BIOSs started supporting El Torito) and systems programmers > are *not* more likely to be vi users than emacs users, > would't we be seeing emacs on more live and rescue CDs by > now? I'm curious as to how the vi conspiracy effects its > apparent influence... :-) :-) :-) We have our methods. Let's just say that the dental probe I carry around for freeing the latches on Infiniband cables has.... other uses. ps. when Dan Brown's book on the Vi Conspiracy is made into a movie I bags Jean Reno to play me. From peter.st.john at gmail.com Tue Jul 22 14:36:36 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <1216761650.4901.9.camel@Vigor13> References: <4880A4F4.4050509@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <20080722201910.GD2451@drzyzgula.org> <1216761650.4901.9.camel@Vigor13> Message-ID: Fair enough, I'll settle for Gary Oldman. We'll let RGB have Anthony Hopkins. Peter On 7/22/08, John Hearns wrote: > > On Tue, 2008-07-22 at 16:19 -0400, Bob Drzyzgula wrote: > > > But I don't understand... if resources aren't an issue (and > > certainly they haven't been for at least a decade, since > > BIOSs started supporting El Torito) and systems programmers > > are *not* more likely to be vi users than emacs users, > > would't we be seeing emacs on more live and rescue CDs by > > now? I'm curious as to how the vi conspiracy effects its > > apparent influence... :-) :-) :-) > > > We have our methods. > Let's just say that the dental probe I carry around for freeing the > latches on Infiniband cables has.... other uses. > > ps. when Dan Brown's book on the Vi Conspiracy is made into a movie I > bags Jean Reno to play me. > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080722/748dd966/attachment.html From fkruggel at uci.edu Sat Jul 19 13:40:59 2008 From: fkruggel at uci.edu (fkruggel@uci.edu) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? Message-ID: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> Thanks for your suggestions. Let me be more specific. I would like to have nodes automatically wake up when needed and go to sleep when idle for some time. 
My ganglia logs tell me that there is considerable idle time on our cluster. The issue is that I would like to have the cluster adapt *automatically* to the load, without interaction of an administrator. Here is how far I got: I can set a node to sleep (suspend-to-ram) using ACPI. But for powering on, I have to press the power button. No automatic solution. I can shut down and wake up a node over the network (e.g., using etherwake). But I consider the time to boot a node as too long. Is it possible to wake up a node over lan (without reboot)? How can I detect that a node was idle for some specific time? Thanks, Frithjof From alex at tinkergeek.com Sun Jul 20 08:25:21 2008 From: alex at tinkergeek.com (Alex Younts) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <9E1DA96988FC4E6282063A9BA3C39F05@geoffPC> References: <1184.68.109.69.214.1216444854.squirrel@webmail.uci.edu><87d4l9stlk.fsf@snark.cb.piermont.com> <6009416b0807191301w3eb74fftd2af70523e1b40bf@mail.gmail.com> <9E1DA96988FC4E6282063A9BA3C39F05@geoffPC> Message-ID: <488358E1.6030807@tinkergeek.com> A graduate student at Purdue did research into this topic. He presented his work at the 2007 Linux Cluster Institute conference and the professor he works with still uses the same technique to dynamically add or remove nodes from his cluster. His paper can be found at: http://www.linuxclustersinstitute.org/conferences/archive/2007/PDF/fengping_24145.pdf -Alex Geoff Galitz wrote: > > > Many, many, many moons ago I wrote a plugin for the clustering framework > (now defunct) that we used and I was a developer on back at UC Berkeley. It > was quite simple... it simply checked to see if jobs were in the queue, if > not it looked to see what nodes were free (using OpenPBS/Torque native > commands), did the necessary parsing of a few backend config files and then > issued standard shutdown commands to the idle nodes. When jobs started to > back up in the queue, the plugin used WOL to start up nodes. > > It was in perl and less than 100 lines. > > I easily could have used IPMI instead, but many of the boxes we were using > had better WOL support than IPMI. WOL is standard while IPMI can vary from > vendor to vendor... so if your needs are no more complex than this, WOL is a > good way to go. > > > > > > -geoff > > Geoff Galitz > Blankenheim NRW, Deutschland > http://www.galitz.org > > -----Original Message----- > From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On > Behalf Of Nathan Moore > Sent: Samstag, 19. Juli 2008 22:02 > To: beowulf@beowulf.org > Subject: Re: [Beowulf] Green Cluster? > > I think the feature you're looking for is "Wake on LAN", > http://en.wikipedia.org/wiki/Wake-on-LAN > > I've wondered similar things - the small cluster I run for a > students/departmental use is generally off, except when I'm teaching > computational physics, or have a student interested in a specific research > project. It would be nice to be able to "turn on" a few machines (from > home, at 11:30pm) when I have to run something substantial. > > If you find a good step-by-step resource describing how to do this, I'd love > to hear about it. > > Nathan Moore > > > On Sat, Jul 19, 2008 at 11:53 AM, Perry E. Metzger > wrote: > > > > fkruggel@uci.edu writes: > > I am wondering whether there is any mechanism to automatically > > power down nodes (e.g., ACPI S3) when idle for some time, and > > automatically wake up when requested (e.g., by WOL, some cluster > > scheduler, ssh). 
I imagine that I could cut down power & cooling > > on our system by more than 50%. Any hints? > > > Depending on the motherboard, there are ways to do this. You can do > wake on network and other tricks. However, if you would really save > half the power, that implies that your cluster is half idle. If it > is > really half idle, why aren't you simply shutting half of it down? > > Perry > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > From oborn at iac.isu.edu Sun Jul 20 15:15:39 2008 From: oborn at iac.isu.edu (Brian Oborn) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <877ibgv4wg.fsf@snark.cb.piermont.com> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> Message-ID: <4883B90B.2070408@iac.isu.edu> [snip] > Generally speaking, if you have a large cluster, and you have enough > work for it, it is going to be running flat out 24x7. If it isn't, > you've bought more hardware than you need. > There are many cases where a cluster is not used continuously for calculations, but rather to reduce turn-around. Our cluster, for instance, is sitting idle at least 1/2 the time, but keeping its size to the point where, when people do run, they can get their results back in 8 hours or less makes them much more productive. This way they can look at the results, tweak things as necessary, and run again while the setup is still fresh in their mind. > Again, though, I think you might want to ask if you're doing the right > thing here. If all your machines are not working flat out 100% of the > time, you have hardware depreciating (and rapidly becoming obsolete) > to no purpose, and there are loads of people out there, probably even > on your own campus, who probably are desperate for compute > cycles. (Hell, I could use extra cycles -- I can never afford enough.) > We've been trying to encourage other departments on campus to use some of our extra cluster time, but with rather mixed results. > Optimally, a cluster will be working 100% of the time, until one day > when it is obsolete (that is, the cost in space, power, and cooling > is more than replacing it with faster/better hardware), it gets shut > down, replaced, and sold off. > > Perry > From lynesh at cardiff.ac.uk Mon Jul 21 01:42:49 2008 From: lynesh at cardiff.ac.uk (Huw Lynes) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <4883E45B.2010902@scalableinformatics.com> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> <20080721005711.GB3465@bx9.net> <4883E45B.2010902@scalableinformatics.com> Message-ID: <1216629769.3177.10.camel@w1199.insrv.cf.ac.uk> On Sun, 2008-07-20 at 21:20 -0400, Joe Landman wrote: > Greg Lindahl wrote: > That said, we like IPMI in general, and even better when it works :( > Sometimes it does go south, in a hurry (gets into a strange state). In > which case, removing power is the only option. Agreed. Currently my most common reason for going into the machine room is to pull the power cable out of a node in the hope that the BMC will reset in a sane state, which it usually does. If I had network switched PDUs I'd never have to leave my office. The advantage of smart PDUs is that they can switch off anything whereas IPMI and other lights-out systems usually only exist on computers.
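For reference, the remote power dance over IPMI is only a few commands (a minimal sketch, not any particular site's setup -- the BMC hostname and credentials here are made up):

  ipmitool -I lanplus -H node01-bmc -U admin -P secret chassis power status
  ipmitool -I lanplus -H node01-bmc -U admin -P secret chassis power cycle
  ipmitool -I lanplus -H node01-bmc -U admin -P secret mc reset cold    # ask a confused BMC to reset itself

When the BMC is wedged badly enough that it no longer answers on the LAN, none of this helps, which is exactly when the switched PDU (or the walk to the machine room) earns its keep.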
All things being equal I'd rather have both. There is also a green issue here. Modern systems draw power even when they are "off" in order to keep the BMC and network card live. Obviously a system switched off at the PDU draws no power. Having said that I haven't looked at how much power an unloaded smart PDU draws.

Thanks,
Huw

--
Huw Lynes | Advanced Research Computing HEC Sysadmin | Cardiff University | Redwood Building, Tel: +44 (0) 29208 70626 | King Edward VII Avenue, CF10 3NB

From sassmannshausen at tugraz.at Tue Jul 22 03:04:26 2008
From: sassmannshausen at tugraz.at (=?iso-8859-15?q?J=F6rg_Sa=DFmannshausen?=)
Date: Wed Nov 25 01:07:28 2009
Subject: [Beowulf] problem with binary file on NFS
Message-ID: <200807221204.26132.sassmannshausen@tugraz.at>

Dear all, I have a problem with a self-written program on my small cluster. The cluster nodes are PIII 500/800 MHz machines, the /home is distributed via NFS from a PIII 1 GHz machine. All nodes are running on Debian Etch. The program in question (polymc_s) is in the user's /home directory and is running on all nodes but one. I get the following error messages on that particular node (node4):

ldd polymc_s
        not a dynamic executable

file polymc_s
polymc_s: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), for GNU/Linux 2.4.1, not stripped

strace ./polymc_s
execve("./polymc_s", ["./polymc_s"], [/* 18 vars */]) = -1 ENOMEM (Cannot allocate memory)
+++ killed by SIGKILL +++
Process 2177 detached

I am currently running memtest, no problems thus far. Other programs like the BOINC stuff (I am using that for stress-testing) are ok. Reuti already suggested to do:

readelf -a polymc_s

which gave identical outputs on node4 and node3 (both are PIII 500 MHz machines). I am somehow stuck here, has anybody got a good idea? Running the software locally does not make any difference, i.e. same errors as above. Changing the file permissions did not mend it either. I am aware these are old nodes, but for the purpose they are ok. All the best from Graz!

Jörg

--
*************************************************************
Jörg Saßmannshausen
Institut für Chemische Technologie von Materialien
TU-Graz
Stremayrgasse 16
8010 Graz
Austria

phone: +43 (0)316 873 8954
fax: +43 (0)316 873 4959
homepage: http://sassy.formativ.net/

Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html

From schoenk at utulsa.edu Tue Jul 22 14:54:53 2008
From: schoenk at utulsa.edu (Schoenefeld, Keith)
Date: Wed Nov 25 01:07:28 2009
Subject: [Beowulf] Strange SGE scheduling problem
Message-ID: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu>

My cluster has 8 slots (cores)/node in the form of two quad-core processors. Only recently we've started running jobs on it that require 12 slots. We've noticed significant speed problems running multiple 12 slot jobs, and quickly discovered that the node that was running 4 slots of one job and 4 slots of another job was running both jobs on the same processor cores (i.e. both job1 and job2 were running on CPUs #0-#3, and CPUs #4-#7 were left idling). The result is that the jobs were competing for time on half the processors that were available. In addition, a 4 slot job started well after the 12 slot job has ramped up runs into the same problem (both the 12 slot job and the four slot job get assigned to the same slots on a given node).
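The usual suspect for this symptom is CPU affinity being set inside the MPI library rather than by the batch system: if every job pins its local ranks starting at core 0 of whatever node it lands on, two jobs sharing a node will stack onto the same cores. A quick way to confirm it on a shared node (the binary name and PID below are placeholders):

  ps -eo pid,psr,args | grep my_mpi_binary    # psr = the core each process last ran on
  taskset -pc 12345                           # print the affinity list a given PID is locked to, e.g. 0-3

If both jobs report affinity lists like 0-3, look at the MPI library's processor-affinity settings rather than at the scheduler; MVAPICH, if I remember right, enables affinity by default and documents an environment variable to disable or remap it -- check the user guide for the 1.0 build in question.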
Any insight as to what is occurring here and how I could prevent it from happening? We were are using SGE + mvapich 1.0 and a PE that has the $fill_up allocation rule. I have also posted this question to the hpc_training-l@georgetown.edu mailing list, so my apologies for people who get this email multiple times. Any help is appreciated. -- KS From rgb at phy.duke.edu Wed Jul 23 13:37:16 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080722175405.GA17358@bx9.net> References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> Message-ID: On Tue, 22 Jul 2008, Greg Lindahl wrote: > On Tue, Jul 22, 2008 at 10:54:47AM -0400, Bob Drzyzgula wrote: > >> It is not even certain that the default, base install of a Linux >> system will include Emacs > > This just indicates a conspiracy of vi users. Or, more likely, > vi users complained that emacs was in the default. Emacs users > aren't bothered by having vi around. More likely it is history and inertia. I think even systems people tend to be a bit dazed by Moore's Law. Note that Bob and I started out on systems with far less than 100 MB of DISK and perhaps a MB of system memory on a fat SERVER in the latter 80's. And the P(o)DP(eople) made do with even less in the early 80's. And the costs for these servers were staggeringly higher -- $40K and up without bound (hundreds of thousands of dollars to millions of dollars in 1980 money), until pretty much the end of the 80's when Suns got sufficiently commoditized and pressure from the descendants of the IBM PC continued to mount and prices started to drop on non-PC iron and workstations. Still, we paid just about $100K even for a refrigerator-sized SGI 220S with two processors, 8 MB of memory, and sometime like a 100 MB drive in 1989 or 1990 or thereabouts. Software maintenance alone on it was $3500 a year. We sold it in maybe 1995 for $3500 -- Sparc 2's were down to that or even less, ultrasparcs were coming out, one could put that much compute power on your desk with much more disk for the maintenance cost alone (and no need for 1.5 tons of AC just to run two processors!). Back then (80's) resources were tight and expensive and fitting into a small footprint was key, as the 100 MB or so one could afford had to hold the OS itself, all e.g. emacs etc sources and build spaces, and /home for all users. Our Sun 4/110 served something like 50 or 100 users on a mix of tty connections (into big multiport serial bus interfaces) and IP connections, on a processor far slower and with far less memory than the one in my phone or PDA, and its performance was quite acceptable. BUT, things like emacs had big memory footprints for the time; some people wouldn't install it on a public server just because enough simultaneous users would bring it to its knees, shared libraries or not. People bought whole workstations in part so they could run emacs and code development tools off the public servers without resource competition. vi back then was little more than a shell on ed IIRC -- tiny and efficient. More importantly, ed and vi were compiled static, along with a handful of other key tools, and lived in /sbin on early systems. 
They were literally the first things installed in the bootstrap install of any unixoid operating system, and they were the only things that would WORK if libc failed (or the drive/partition holding libc failed, but the / or /boot partition survived with the /sbin image intact). Which happened. Not even that infrequently. So if you wanted to NOT have to do a full reinstall from a QIC tape following an elaborate and arcane bootstrapping procedure, followed by a rebuild and reconfiguration, or a restore from a tape backup that one would pray actually worked and in any event would cost SOME user SOME time or critical files, you learned vi. It was the editor that worked when all others failed. I learned this (like so many other lessons back then) the hard way -- a system I was managing died -- I think it was one of the SGIs -- and I had to go in to perform some /etc surgery to try to bring it back without a reinstall. However, its network was gone, its access to /usr/share was gone, and /usr (a separate partition) had come disconnected somehow. jove (which I had already learned and mostly used) Did Not Work. Emacs Did Not Work. Both binaries needed something that was gone. I thought I was dead in the water, and talked to my guru about it looking for help as I did NOT want to do a reinstall and he was a guru, right? A magic worker. At which time he gave me his usual withering look, mumbled about the importance of my reading all of the man pages -- I mean ALL of them, and I mean whether or not I ever needed to use what they described -- and whacked me upside the head with a banana while describing the proper function and purpose of /etc/sbin, what the "s" stood for, and why I had to learn to use (and remember!) vi or even ed because without an editor in the base system life was bad. I humbly spent weeks popping in and out of insert mode and learning the embedded ed commands until I could at least reliably survive, but I missed my jove. NOW it doesn't matter any more. I carry a workable, bootable linux around in my pocket at all times nowadays on a USB thumb drive that has more memory than any of my computers or servers did (including those that served whole departments) until maybe twelve or thirteen years ago. The thumb drive there at this very moment holds a bit of a small linux, just a rescue system, and I think it has both vim and emacs on it (but not jove, alas). In my bag I have my somewhat more expensive 8 GB thumb drive, large enough to hold a kitchen-sink installation of Fedora 8 and still leave me a few GB to use as a "small" home directory. And by next year, I'm guessing that 16 GB thumbs will cost what 8's do now (if not 4's), and one will be able to carry around a kitchen sink linux PLUS 10-12 GB of personal workspace for maybe $50. Give me a machine that can boot USB and four minutes and I'll be working on a linux machine no matter what it has installed, that kind of thing. Even networks and devices, the things that plagued that sort of freedom for so long, are being tamed by hal and friends so that e.g. NetworkManager makes the network "just work" no matter what hardware it finds at boot time. The issue of resource consumption between vi(m) and emacs is hence truly irrelevant to modern resource scales, and becoming more irrelevant daily. RAM size is scaling up at roughly 10 MB a day, amortized over a year. HD is scaling up by what, 10 GB a day? More? 
So if an 8 MB memory footprint was relevant yesterday, it probably isn't today (he says typing the reply on a system with 4 GB of RAM, where last year I bought a system with 2 GB of RAM and two years previously got a system with 512 MB of RAM). Like that. Even mighty emacs vanishes without a trace in 4 GB, far less than 1% of the resource -- now bloat is represented by the ever expanding maw of e.g. ooffice (or better yet, by Vista of Evil, which crawls in 2 GB and needs 4 GB to really get happy). X at over 100 MB is "suddenly" almost inconsequential. ooffice eating 100 MB more on a personal laptop (not even a server) -- who cares? And next year, or the year after that, 8 GB RAM systems for $1000, thumb drives with 32 GB, computer/PDAs that run at a GHz or more and have many GB of internal memory hard and soft, TBs of disk standard. > Interestingly, looking at the Red Hat RPMs, a full emacs install is > only 2X the size of a full vim install. The main difference is that a > minimal vim install is very small because it doesn't need vim-common, > but no one's done the work to get emacs-nox to run sans emacs-common. > Either way, it's a tiny fraction of a DVD, so you can look forward > to full emacs on your rescue disk soon. Assuming the conspirators and > complainers don't have their way. This is dead certain correct. A tiny fraction of a DVD, a thumb drive, of main memory, and a truly miniscule fraction of the TB scale OTC drives here and coming. But it was not always so, and it is the Old Guys that still configure many of the base setups. Its like my parents -- they grew up in the great depression, and never quite got used to the idea that they weren't actually hungry and poor even when they were quite comfortably off. They would still go dumpster diving into their late 70's, because why throw a chair away only because it had a broken leg or a stain on the seat? Why buy a chair when one can find one in a dumpster? Never mind that it takes days of work to fix up the chair but only hours of work to earn the money to buy a new one. Bad ecology, but (perhaps unfortunately) good economics. We are now embarrassed by computer resource riches, but our minds were set by our early experiences with poverty and scarcity. The same issue (this isn't entire OT) comes up in coding practice. I know people who work for days, sometimes weeks, tightening up code so it is absolutely efficient, and everything is done in a clean way that doesn't waste memory or time. OTOH, I personally write code (and teach my students to write code) that is resource aware, but to be SENSIBLE about it. By this I mean that if one can accomplish some task in 1 millisecond in the initialation phase of some program using an hour of programming in a straightforward way, but reduce it to a microsecond if one spends a week reordering loops and and optimizing, NOBODY CARES. 1 millisecond is less than human reaction time -- nobody could tell the difference, literally. Even a half second is probably irrelevant. If there is a way of programming that reduces the memory footprint by a few hundred K but requires great care at managing the memory and coding vs just allocating a few big blocks ditto -- a few hundred K is quite irrelevant now, in nearly all cases (where once it would have been TERRIBLE practice). The opposite is true (of course) in core loops. Memory leaks, wasted cycles, all add up there (depending). 
Even there, adding a millisecond to a loop that takes a minute to complete is invisible, where adding it to a loop that takes 100 microseconds to complete is a disaster. This sort of style infuriates some purists. They'll work for days to avoid wasting something of which there is a nearly inexhaustible supply, a supply growing exponentially, so fast that in the time it takes them to complain about it the net resource has grown by more than the marginal difference consumed. This is the ssh vs rsh question -- rsh is maybe 10 or 20 times faster than ssh, but for most cluster purposes nobody should really care -- ssh isn't the actual IPC channel, it is just used out of band to start and stop tasks, and if it takes ten seconds to do this instead of one second on a task that will run for a week, what difference does it make? There the time trade-off cuts the OTHER way: it is the time it might take you to cope with a cracking episode caused by rsh's utter lack of security, which can be an issue even inside a cluster if multiple people use the cluster and some of them are untrustworthy. You know. Grad students. Postdocs. Disgruntled employees. IP thieves.
Somewhere in there somebody introduced mouse-clickable buttons on a button panel, and I stopped even LOOKING at emacs or xemacs in new releases. It now largely looks like, and functions like, other WYSIWYG editors in many contexts and for many users. The cleanness and fingers-on-the-keysness of it that were its original appeal are now distant memories; even compiling tends to be done by means of pulldown mouse menus. I can now go head to head with many emacs users just as I could any ooffice user, and open a file, make a key-based edit, compile it, and run it in about the time it takes them to open it, make the same change, and reach for their mouse to initiate the compile. Ease of learning has at long last started to win out over speed. Learn to use a Mac in a day, pay for it forever, used to be an instructive adage. GUIs are easy to learn and use, but slow as molasses when you want to do certain kinds of work. Many/most new emacs users I know -- ones that have started in the 2000's, say -- don't even know the elementary cursor movement key combinations. They just use the cursor keys -- it is easy, if slow. They don't know how to move or invoke make or split screens with just their fingers. They use the mouse. They don't grok Ctrl-space, Alt-F, Alt-F, Ctrl-W, Ctrl-Ifrog, space, Ctrl-Y to move two words from whereever you are to the location of the word "frog", which takes less than the time required to actually get to the mouse with your hand, let alone move the cursor, highlight the text, cut it, scroll down to the word frog (which might be ten pages down), click on an insert point, and select "paste" from a menu. And leaves your fingers right on the home keys, still typing, your train of thought unbroken. Ultimately it is THIS that is the real shame. The really good Unix text editors, especially the ones for programmers, were expert friendly to be sure -- one has to WORK to learn to do this sort of thing at the speed of thought. As they are GUIfied, made idiot-simple, made to look like all those Mac interfaces and MS Word and Ooffice-alikes, something important is lost. Speed. I mean serious speed, speed of work done by humans. Human time is the resource that really costs money, just as much today as it did back in 1982 or 1988 when computers were extraordinarily expensive where now they are cheap. I've written entire functioning multilevel autobuilding mysql integrated websites in php in a day, and debugged them and extended them in a day or two more -- less than 40 hours of work -- using jove. I can rip out code in jove (and y'all KNOW I can type like the wind in jove, as jove is one reason I am so list-prolific:-). One can pop in and out of it to test, flop windows and test, run multiple source windows and test, in fractions of a second and without conscious thought, so the connection of brain to actual task semantics is never broken. And this isn't to tout it -- I'm sure vim or emacs users of the old school could do the same as long as they keep the fingers on those home keys and have mastered the keystroke-based shortcuts. Or joe users -- wordstar may have been the last great PC/DOS editor to keep fingers firmly on the keys where they belong when working with text. But show me a "programmer" who cannot work without their mouse and a GUI-based text editor, who has to scroll slowly up and down or constantly move hands from the keys to the mouse and back to select even elementary functions, and I'll show you somebody that will take a month to do the work I did in 3 days... 
rgb > > -- greg > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From rgb at phy.duke.edu Wed Jul 23 13:48:03 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: References: <4880A4F4.4050509@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <20080722201910.GD2451@drzyzgula.org> <1216761650.4901.9.camel@Vigor13> Message-ID: On Tue, 22 Jul 2008, Peter St. John wrote: > Fair enough, I'll settle for Gary Oldman. We'll let RGB have Anthony > Hopkins. No, no, no. John Malkovitch. The resemblance is actually fairly striking. Bald, pudgy, whiny sardonic voice, sexy as all hell. Might even fool my wife...;-) rgb > Peter > > On 7/22/08, John Hearns wrote: >> >> On Tue, 2008-07-22 at 16:19 -0400, Bob Drzyzgula wrote: >> >>> But I don't understand... if resources aren't an issue (and >>> certainly they haven't been for at least a decade, since >>> BIOSs started supporting El Torito) and systems programmers >>> are *not* more likely to be vi users than emacs users, >>> would't we be seeing emacs on more live and rescue CDs by >>> now? I'm curious as to how the vi conspiracy effects its >>> apparent influence... :-) :-) :-) >> >> >> We have our methods. >> Let's just say that the dental probe I carry around for freeing the >> latches on Infiniband cables has.... other uses. >> >> ps. when Dan Brown's book on the Vi Conspiracy is made into a movie I >> bags Jean Reno to play me. >> >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From reuti at staff.uni-marburg.de Wed Jul 23 14:35:25 2008 From: reuti at staff.uni-marburg.de (Reuti) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Strange SGE scheduling problem In-Reply-To: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> References: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> Message-ID: Hi, Am 22.07.2008 um 23:54 schrieb Schoenefeld, Keith: > My cluster has 8 slots (cores)/node in the form of two quad-core > processors. Only recently we've started running jobs on it that > require > 12 slots. We've noticed significant speed problems running > multiple 12 > slot jobs, and quickly discovered that the node that was running 4 > slots > on one job and 4 slots on another job was running both jobs on the > same > processor cores (i.e. both job1 and job2 were running on CPU's #0-#3, > and the CPUs #4-#7 were left idling. 
The result is that the jobs were > competing for time on half the processors that were available. how did you check this? With `top`? You have one queue with 8 slots per machine? -- Reuti > In addition, a 4 slot job started well after the 12 slot job has ramped > up results in the same problem (both the 12 slot job and the four slot > job get assigned to the same slots on a given node). > > Any insight as to what is occurring here and how I could prevent it > from > happening? We were are using SGE + mvapich 1.0 and a PE that has the > $fill_up allocation rule. > > I have also posted this question to the hpc_training-l@georgetown.edu > mailing list, so my apologies for people who get this email multiple > times. > > Any help is appreciated. > > -- KS > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From dnlombar at ichips.intel.com Wed Jul 23 14:43:55 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> Message-ID: <20080723214355.GA14808@nlxdcldnl2.cl.intel.com> On Sat, Jul 19, 2008 at 01:40:59PM -0700, fkruggel@uci.edu wrote: > Thanks for your suggestions. Let me be more specific. > I would like to have nodes automatically wake up when > needed and go to sleep when idle for some time. My > ganglia logs tell me that there is considerable idle > time on our cluster. The issue is that I would like to > have the cluster adapt *automatically* to the load, > without interaction of an administrator. Sounds like a plan... > Here is how far I got: > I can set a node to sleep (suspend-to-ram) using ACPI. > But for powering on, I have to press the power button. > No automatic solution. ... > Is it possible to wake up a node over lan (without reboot)? It depends. (Did you actually expect a different answer?) Setting the wakeup events *may* help. What does /proc/acpi/wakeup show? Here's an example from a D975PBZ running F7's 2.6.23:
Device  S-state  Status    Sysfs node
TANA      S4     disabled  pci:0000:02:01.0
P0P3      S4     disabled  pci:0000:00:1e.0
AC97      S4     disabled
USB0      S3     disabled  pci:0000:00:1d.0
USB1      S3     disabled  pci:0000:00:1d.1
USB2      S3     disabled  pci:0000:00:1d.2
USB3      S3     disabled  pci:0000:00:1d.3
USB7      S3     disabled  pci:0000:00:1d.7
UAR1      S4     disabled  pnp:00:07
SLPB      S4    *enabled
Note, only SLPB (sleep button) is enabled by default on this system. NB:
- the "TANA" device on *this* system is the NIC
- setting wol via ethtool doesn't affect the above.
And here's an old Dell Inspiron running kernel.org's 2.6.23.8:
# cat /proc/acpi/wakeup
Device  S-state  Status    Sysfs node
LID       S3    *enabled
PBTN      S4    *enabled
PCI0      S3     disabled  no-bus:pci0000:00
UAR1      S3     disabled  pnp:00:0d
MPCI      S3     disabled
Where both the lid (LID) and power (PBTN) buttons are enabled by default. Also note the maximum ACPI sleep levels whence the wakeup will work. If you need to enable a device, use
# echo _device_ enable > /proc/acpi/wakeup
where _device_ is the name listed in /proc/acpi/wakeup Here's the Dell responding to a lid close in a very very minimal system (kernel, busybox, uClibc):
#
Stopping tasks ... done.
Suspending console(s)
Opening the lid produces this after about 6 seconds:
pnp: Device 00:0d disabled. 
ACPI: PCI Interrupt 0000:00:03.0[A] -> Link [LNKD] -> GSI 11 (level, low) -> IR1 ACPI: PCI Interrupt 0000:00:03.1[A] -> Link [LNKD] -> GSI 11 (level, low) -> IR1 pnp: Device 00:0d activated. Restarting tasks ... done. # > How can I detect that a node was idle for some specific time? This all really needs to be run from the RM (resource manager). The RM can know when a job ends on a node and that a node will or will not be free in the future. The RM can also manage the scheduler to avoid bringing sleeping nodes up until they're actually needed--a SMOP left as an exercise to the reader ;) I *think* Moab may do some of this stuff already. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From mathog at caltech.edu Wed Jul 23 15:25:44 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:28 2009 Subject: [Beowulf] Drive screw fixed with LocTite Message-ID: A vendor who shall remain nameless graced us with a hot swappable drive caddy in which one of the three mounting screws used to fasten the drive to the caddy had been treated with blue LocTite. This wasn't obvious from external inspection, but the telltale blue glop was on the threads when the screw finally let go and came out. It was beginning to look like power tools were going to be needed to get it out, and the screw head was pretty badly torn up after removal. This is the first time I have encountered a drive screw on a removable drive which was, well, unremovable. Is this a trend or are we just dealing with a sadistic assembler? Thanks, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From matt at technoronin.com Wed Jul 23 15:49:54 2008 From: matt at technoronin.com (Matt Lawrence) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Green Cluster? In-Reply-To: <1216629769.3177.10.camel@w1199.insrv.cf.ac.uk> References: <1373.68.109.69.214.1216500059.squirrel@webmail.uci.edu> <877ibgv4wg.fsf@snark.cb.piermont.com> <20080721005711.GB3465@bx9.net> <4883E45B.2010902@scalableinformatics.com> <1216629769.3177.10.camel@w1199.insrv.cf.ac.uk> Message-ID: On Mon, 21 Jul 2008, Huw Lynes wrote: > The advantage of smart PDUs is that they can switch off anything whereas > IPMI and other lights-out systems usually only exist on computers. All > things being equal I'd rather have both. It's also entertaining watching upper management do a doubletake when you say "I'm waiting for the power strip to boot up". -- Matt It's not what I know that counts. It's what I can remember in time to use. From kilian at stanford.edu Wed Jul 23 16:20:25 2008 From: kilian at stanford.edu (Kilian CAVALOTTI) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: References: <20080722175405.GA17358@bx9.net> Message-ID: <200807231620.25900.kilian@stanford.edu> On Wednesday 23 July 2008 01:37:16 pm Robert G. Brown wrote: > But show me a "programmer" who cannot work without their mouse > and a GUI-based text editor, who has to scroll slowly up and down or > constantly move hands from the keys to the mouse and back to select > even elementary functions, and I'll show you somebody that will take > a month to do the work I did in 3 days... That's exactly why human species will evolve to grow a third hand. We *need* it, so we won't have to take our hands off the keyboard to reach the mouse. I see no other way. Cheers, -- Kilian From svdavidson at charter.net Wed Jul 23 14:02:08 2008 From: svdavidson at charter.net (Shannon V. 
Davidson) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Strange SGE scheduling problem In-Reply-To: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> References: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> Message-ID: <48879C50.5000607@charter.net> Schoenefeld, Keith wrote: > My cluster has 8 slots (cores)/node in the form of two quad-core > processors. Only recently we've started running jobs on it that require > 12 slots. We've noticed significant speed problems running multiple 12 > slot jobs, and quickly discovered that the node that was running 4 slots > on one job and 4 slots on another job was running both jobs on the same > processor cores (i.e. both job1 and job2 were running on CPU's #0-#3, > and the CPUs #4-#7 were left idling. The result is that the jobs were > competing for time on half the processors that were available. > > In addition, a 4 slot job started well after the 12 slot job has ramped > up results in the same problem (both the 12 slot job and the four slot > job get assigned to the same slots on a given node). > > Any insight as to what is occurring here and how I could prevent it from > happening? We were are using SGE + mvapich 1.0 and a PE that has the > $fill_up allocation rule. > > I have also posted this question to the hpc_training-l@georgetown.edu > mailing list, so my apologies for people who get this email multiple > times. > Any insight as to what is occurring here and how I could prevent it from > happening? We were are using SGE + mvapich 1.0 and a PE that has the > $fill_up allocation rule. > This sounds like MVAPICH is assigning your MPI tasks to your CPUs starting with CPU#0. If you are going to run multiple MVAPICH jobs on the same host, turn off CPU affinity by starting the MPI tasks with the environment variable VIADEV_USE_AFFINITY=0 and VIADEV_ENABLE_AFFINITY=0. Cheers, Shannon > Any help is appreciated. > > -- KS > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > -- _________________________________________ Shannon V. Davidson Software Engineer Appro International 636-633-0380 (office) 443-383-0331 (fax) _________________________________________ From perry at piermont.com Wed Jul 23 18:06:03 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: (Robert G. Brown's message of "Wed\, 23 Jul 2008 16\:37\:16 -0400 \(EDT\)") References: <4880A4F4.4050509@scalableinformatics.com> <488462BB.4060308@aplpi.com> <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> Message-ID: <87prp4f5vo.fsf@snark.cb.piermont.com> "Robert G. Brown" writes: > Note that Bob and I started out on systems with far less than 100 MB > of DISK and perhaps a MB of system memory on a fat SERVER in the > latter 80's. And the P(o)DP(eople) made do with even less in the > early 80's. My first machine was a PDP-8. 4k of 12 bit words of genuine magnetic core memory, and two DECtape units with some small amount of storage (I can't remember, but I think it was on the order of 100k). I believe there are icons on my modern desktop that take up more space than that whole machine had for core. 
> vi back then was little more than a shell on ed IIRC It was (for nvi, is) the visual mode of ex, which is/was an extended line editor in the lineage of ed, kind of an extended ed. Perry -- Perry E. Metzger perry@piermont.com From perry at piermont.com Wed Jul 23 18:09:54 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Drive screw fixed with LocTite In-Reply-To: (David Mathog's message of "Wed\, 23 Jul 2008 15\:25\:44 -0700") References: Message-ID: <87ljzsf5p9.fsf@snark.cb.piermont.com> "David Mathog" writes: > A vendor who shall remain nameless graced us with a hot swappable drive > caddy in which one of the three mounting screws used to fasten the drive > to the caddy had been treated with blue LocTite. This wasn't obvious > from external inspection, but the telltale blue glop was on the threads > when the screw finally let go and came out. It was beginning to look > like power tools were going to be needed to get it out, and the screw > head was pretty badly torn up after removal. I believe a touch from a soldering iron will usually loosen LocTite, but that might also damage a drive, so be careful. > This is the first time I have encountered a drive screw on a removable > drive which was, well, unremovable. Is this a trend or are we just > dealing with a sadistic assembler? I've never seen it used with a drive, it is certainly not normal. Perry From bob at drzyzgula.org Wed Jul 23 18:31:10 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Drive screw fixed with LocTite In-Reply-To: <87ljzsf5p9.fsf@snark.cb.piermont.com> References: <87ljzsf5p9.fsf@snark.cb.piermont.com> Message-ID: <4887DB5E.50307@drzyzgula.org> If the hot-swappable drives are sold by the nameless vendor pre-installed in the caddy, it is possible that the LocTite's primary purpose was for tamper evidence, as in "if you pulled that screw you must have been messing with the drives and we don't have honor the warranty no more". Perry E. Metzger wrote: > "David Mathog" writes: > >> A vendor who shall remain nameless graced us with a hot swappable drive >> caddy in which one of the three mounting screws used to fasten the drive >> to the caddy had been treated with blue LocTite. This wasn't obvious >> from external inspection, but the telltale blue glop was on the threads >> when the screw finally let go and came out. It was beginning to look >> like power tools were going to be needed to get it out, and the screw >> head was pretty badly torn up after removal. >> > > I believe a touch from a soldering iron will usually loosen LocTite, > but that might also damage a drive, so be careful. > > >> This is the first time I have encountered a drive screw on a removable >> drive which was, well, unremovable. Is this a trend or are we just >> dealing with a sadistic assembler? >> > > I've never seen it used with a drive, it is certainly not normal. 
> > Perry > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From gerry.creager at tamu.edu Wed Jul 23 18:55:30 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: References: <4880A4F4.4050509@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <20080722201910.GD2451@drzyzgula.org> <1216761650.4901.9.camel@Vigor13> Message-ID: <4887E112.2090800@tamu.edu> Too much information. Robert G. Brown wrote: > On Tue, 22 Jul 2008, Peter St. John wrote: > >> Fair enough, I'll settle for Gary Oldman. We'll let RGB have Anthony >> Hopkins. > > No, no, no. John Malkovitch. > > The resemblance is actually fairly striking. Bald, pudgy, whiny > sardonic voice, sexy as all hell. Might even fool my wife...;-) > > rgb > >> Peter >> >> On 7/22/08, John Hearns wrote: >>> >>> On Tue, 2008-07-22 at 16:19 -0400, Bob Drzyzgula wrote: >>> >>>> But I don't understand... if resources aren't an issue (and >>>> certainly they haven't been for at least a decade, since >>>> BIOSs started supporting El Torito) and systems programmers >>>> are *not* more likely to be vi users than emacs users, >>>> would't we be seeing emacs on more live and rescue CDs by >>>> now? I'm curious as to how the vi conspiracy effects its >>>> apparent influence... :-) :-) :-) >>> >>> >>> We have our methods. >>> Let's just say that the dental probe I carry around for freeing the >>> latches on Infiniband cables has.... other uses. >>> >>> ps. when Dan Brown's book on the Vi Conspiracy is made into a movie I >>> bags Jean Reno to play me. >>> >>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> >> > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From rgb at phy.duke.edu Wed Jul 23 15:27:58 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <200807231620.25900.kilian@stanford.edu> References: <20080722175405.GA17358@bx9.net> <200807231620.25900.kilian@stanford.edu> Message-ID: On Wed, 23 Jul 2008, Kilian CAVALOTTI wrote: > On Wednesday 23 July 2008 01:37:16 pm Robert G. Brown wrote: >> But show me a "programmer" who cannot work without their mouse >> and a GUI-based text editor, who has to scroll slowly up and down or >> constantly move hands from the keys to the mouse and back to select >> even elementary functions, and I'll show you somebody that will take >> a month to do the work I did in 3 days... > > That's exactly why human species will evolve to grow a third hand. We > *need* it, so we won't have to take our hands off the keyboard to reach > the mouse. I see no other way. A Lysenko-Lamarcian heretic, I see. Sigh. Well, the way natural selection works, we have to create selection pressure. 
Either humans have to preferentially select three handed mates (which is a somewhat interesting idea, hmmm, but it seems a bit unlikely) or we have to go out and start killing people who have only two hands, ideally before they reproduce. So ('ching-kachink' as he chambers a round) -- if you want to survive the very first round of "selection", please raise your hands...;-) rgb P.S. -- maybe genetic engineering is a better option... > > Cheers, > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From bob at drzyzgula.org Wed Jul 23 19:54:13 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <87prp4f5vo.fsf@snark.cb.piermont.com> References: <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <87prp4f5vo.fsf@snark.cb.piermont.com> Message-ID: <20080724025413.GA20372@drzyzgula.org> On Wed, Jul 23, 2008 at 09:06:03PM -0400, Perry E. Metzger wrote: > > "Robert G. Brown" writes: > > Note that Bob and I started out on systems with far less than 100 MB > > of DISK and perhaps a MB of system memory on a fat SERVER in the > > latter 80's. And the P(o)DP(eople) made do with even less in the > > early 80's. > > My first machine was a PDP-8. 4k of 12 bit words of genuine magnetic > core memory, and two DECtape units with some small amount of storage > (I can't remember, but I think it was on the order of 100k). I believe > there are icons on my modern desktop that take up more space than that > whole machine had for core. Although it wasn't my first machine [1], I did work with an admittedly-old-at-the-time PDP-8 for a while in the early 1980s. It was used to run a Perkin-Elmer microdensitometer (think quarter-million-dollar film scanner). IIRC it had no non-volatile memory, and thus one needed to hand-load the bootstrap program using the front panel switches [2]. With the one I worked on, there was a hand-written sequence of octal codes taped up on the machine rack, and to fire it up you would mount a certain 9-track tape in the drive, toggle [3] the bootstrap code into memory using switches, and start it to running. The toggled-in code would load the rest of the OS from the tape drive. There was another version that would load the OS from a paper tape reader attached to the teletype, but no one ever bothered with it because it was such a PITA to use. Once you got it going it would read the data from the microdensitometer and write it to another 9-track tape (same drive, you'd unload the OS tape and mount the data tape). And once you had the data tape, you'd take it over to a PDP-11 and process the image using routines coded in Forth... Anyway, this is a good example of the sort of expectation management that many of us went through in those early days. By comparison, even ed starts to look pretty darned functional. Did I ever mention the months of my life I lost to an attempt to get TeX to (a) compile and run in TSO on OS/MVS, and (b) get it to generate output for an IBM 3820 remote SNA-attached laser printer? 
I suppose the bright side, we didn't have to trouble ourselves with firewalls, encryption, virus scanners, security patches, or in many cases even authentication systems... > > vi back then was little more than a shell on ed IIRC > > It was (for nvi, is) the visual mode of ex, which is/was an extended > line editor in the lineage of ed, kind of an extended ed. Correct. In ex you would enter the command "vi" at the colon-prompt to enter visual mode. You should still be able to do this on any system with vi installed -- give it a try! :-) FWIW, the shell command "vi" simply fires up ex in that mode to start with. --Bob [1] that was an IBM 1130 which bootsrapped off a single, 80-column puchcard containing a small amount of object code. [2] http://en.wikipedia.org/wiki/Image:Multiplex-80_after_30_years.jpg [3] You would enter the address you wanted to start at in octal (actually just binary grouped into three digits) using the switches -- IIRC up for "1" and down for "0", and then throw another switch that would load that number into the address register. Then you'd reset the switches to the pattern for the data you wanted there, and throw the "deposit" switch. Again IIRC as long as you were loading data into sequential addresses, it would auto-increment the address register, so from then on you needed only to keep entering each data value and pressing the deposit switch. And as long as I'm blathering about toggling things in from the front panel, I will go ahead and mention that I'm just old enough to have once been invited over to the home of one of my college professors to see this new Altair 8800 thing he was putting together... From gerry.creager at tamu.edu Wed Jul 23 20:12:02 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Drive screw fixed with LocTite In-Reply-To: <87ljzsf5p9.fsf@snark.cb.piermont.com> References: <87ljzsf5p9.fsf@snark.cb.piermont.com> Message-ID: <4887F302.8060909@tamu.edu> Perry E. Metzger wrote: > "David Mathog" writes: >> A vendor who shall remain nameless graced us with a hot swappable drive >> caddy in which one of the three mounting screws used to fasten the drive >> to the caddy had been treated with blue LocTite. This wasn't obvious >> from external inspection, but the telltale blue glop was on the threads >> when the screw finally let go and came out. It was beginning to look >> like power tools were going to be needed to get it out, and the screw >> head was pretty badly torn up after removal. > > I believe a touch from a soldering iron will usually loosen LocTite, > but that might also damage a drive, so be careful. Acetone or mineral spirits will also take care of locktite. Based on some rather harsh experience showed that the piddly little heat generated by a soldering iron won't really cause much damage. >> This is the first time I have encountered a drive screw on a removable >> drive which was, well, unremovable. Is this a trend or are we just >> dealing with a sadistic assembler? > > I've never seen it used with a drive, it is certainly not normal. 
> > Perry > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From rgb at phy.duke.edu Wed Jul 23 16:23:00 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080724025413.GA20372@drzyzgula.org> References: <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <87prp4f5vo.fsf@snark.cb.piermont.com> <20080724025413.GA20372@drzyzgula.org> Message-ID: On Wed, 23 Jul 2008, Bob Drzyzgula wrote: > Although it wasn't my first machine [1], I did work with an > admittedly-old-at-the-time PDP-8 for a while in the early > 1980s. It was used to run a Perkin-Elmer microdensitometer > (think quarter-million-dollar film scanner). IIRC it had > no non-volatile memory, and thus one needed to hand-load > the bootstrap program using the front panel switches [2]. > With the one I worked on, there was a hand-written sequence > of octal codes taped up on the machine rack, and to fire > it up you would mount a certain 9-track tape in the drive, > toggle [3] the bootstrap code into memory using switches, > and start it to running. The toggled-in code would load > the rest of the OS from the tape drive. There was another > version that would load the OS from a paper tape reader > attached to the teletype, but no one ever bothered with it > because it was such a PITA to use. Once you got it going it > would read the data from the microdensitometer and write > it to another 9-track tape (same drive, you'd unload the > OS tape and mount the data tape). And once you had the > data tape, you'd take it over to a PDP-11 and process the > image using routines coded in Forth... That's how I booted the PDP-1, almost exactly. Good memory. Except it had the boot program on a paper loop that was permanently installed. Toggle, fire, paper loop goes swoosh, tape drive lives, boot continues. > Anyway, this is a good example of the sort of expectation > management that many of us went through in those early > days. By comparison, even ed starts to look pretty darned > functional. > > Did I ever mention the months of my life I lost to an > attempt to get TeX to (a) compile and run in TSO on OS/MVS, > and (b) get it to generate output for an IBM 3820 > remote SNA-attached laser printer? Sounds fascinating;-) > I suppose the bright side, we didn't have to trouble > ourselves with firewalls, encryption, virus scanners, > security patches, or in many cases even authentication > systems... > >>> vi back then was little more than a shell on ed IIRC >> >> It was (for nvi, is) the visual mode of ex, which is/was an extended >> line editor in the lineage of ed, kind of an extended ed. > > Correct. In ex you would enter the command "vi" at the > colon-prompt to enter visual mode. You should still be > able to do this on any system with vi installed -- give > it a try! :-) FWIW, the shell command "vi" simply fires > up ex in that mode to start with. Fires or fired -- I have no idea what vim does now. 
I'd have thought that it long ago divorced itself from ed (or em, en, ... ex) at the source level. But I used that trick (hopping from ed/ex into and out of vi) fewer times than I have fingers on one hand back in the day. Why, if one had fullscreen, would one ever use single line? Unless, of course, one was working on a genuine tty lineprinter, which I exceedingly rarely had to do (gnashing teeth most of the time) because the console had crashed somehow and yes, we had a teletype console to log all the messages. Just from working on PC's for five years before starting on Unix, I had higher expectations than ed if there was anything BUT a teletype -- anything with an actual screen. Ed reminded me of edlin, and edlin was a pretty pitiful editor (probably derived in some way from ed, come to think of it). As in one could write a better editor for any PC in maybe 500 lines of basica, and a WAY better editor with any compiler (and still have it fit on a floppy, or at most two). sed, on the other hand, I still use quite regularly, and it is basically an ed extension as well. Being scriptable and grokking regex's makes all the difference in the world, and if you have to change frog to toad in an entire directory of files or manage any number of other clever global changes, sed is hard to beat. Again an arcane tool and not for the timid (and more than a bit dangerous in terms of side effects:-) but if you ask your average a Windows MCSE to go through a directory tree and change all those frogs into toads (or perhaps princes:-) either he'll still be working a week later with some of the frogs turned into prinecs or pirnces or he'll have installed cygwin and done it using sed in less than an hour INCLUDING the cygwin download and install. rgb > > --Bob > > [1] that was an IBM 1130 which bootsrapped off a single, 80-column > puchcard containing a small amount of object code. > > [2] http://en.wikipedia.org/wiki/Image:Multiplex-80_after_30_years.jpg > > [3] You would enter the address you wanted to start at in > octal (actually just binary grouped into three digits) > using the switches -- IIRC up for "1" and down for "0", > and then throw another switch that would load that number > into the address register. Then you'd reset the switches to > the pattern for the data you wanted there, and throw the > "deposit" switch. Again IIRC as long as you were loading > data into sequential addresses, it would auto-increment the > address register, so from then on you needed only to keep > entering each data value and pressing the deposit switch. > > And as long as I'm blathering about toggling things in from > the front panel, I will go ahead and mention that I'm just > old enough to have once been invited over to the home of > one of my college professors to see this new Altair 8800 > thing he was putting together... > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From john.hearns at streamline-computing.com Thu Jul 24 00:47:15 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: <20080724025413.GA20372@drzyzgula.org> References: <48847A1C.1000309@scalableinformatics.com> <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <87prp4f5vo.fsf@snark.cb.piermont.com> <20080724025413.GA20372@drzyzgula.org> Message-ID: <1216885646.5066.15.camel@Vigor13> On Wed, 2008-07-23 at 22:54 -0400, Bob Drzyzgula wrote: > On Wed, Jul 23, 2008 at 09:06:03PM -0400, Perry E. Metzger wrote: > > > > "Robert G. Brown" writes: > > > Note that Bob and I started out on systems with far less than 100 MB > > > of DISK and perhaps a MB of system memory on a fat SERVER in the > > > latter 80's. And the P(o)DP(eople) made do with even less in the > > > early 80's. > > > > My first machine was a PDP-8. 4k of 12 bit words of genuine magnetic > > core memory, and two DECtape units with some small amount of storage > > (I can't remember, but I think it was on the order of 100k). I believe > > there are icons on my modern desktop that take up more space than that > > whole machine had for core. > > Although it wasn't my first machine [1], I did work with an > admittedly-old-at-the-time PDP-8 for a while in the early > 1980s. It was used to run a Perkin-Elmer microdensitometer > (think quarter-million-dollar film scanner). My first machine was a PDP 11/45, which was installed at my fathers place of work in the Southern General Hospital in Glasgow. The Diagnostic Methodology Researtch Unit - they did early research in computer assisted diagnosis of GI complaints, using a green CRT terminal which asked the patient questions. Coded in Fortran (of course), and they used Baysian statistics which in those days was pretty cutting edge stuff. I also remember programming on the HP 85 belonging to the Professor in the unit http://www.hpmuseum.org/hp85.htm From bob at drzyzgula.org Thu Jul 24 02:50:00 2008 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Re: Religious wars In-Reply-To: References: <48849928.7030408@scalableinformatics.com> <20080721235326.GA21388@drzyzgula.org> <87ljzuo759.fsf@snark.cb.piermont.com> <20080722145447.GB2451@drzyzgula.org> <20080722175405.GA17358@bx9.net> <87prp4f5vo.fsf@snark.cb.piermont.com> <20080724025413.GA20372@drzyzgula.org> Message-ID: <20080724095000.GA20953@drzyzgula.org> On Wed, Jul 23, 2008 at 07:23:00PM -0400, Robert G. Brown wrote: > > On Wed, 23 Jul 2008, Bob Drzyzgula wrote: > >> >>>> vi back then was little more than a shell on ed IIRC >>> >>> It was (for nvi, is) the visual mode of ex, which is/was an extended >>> line editor in the lineage of ed, kind of an extended ed. >> >> Correct. In ex you would enter the command "vi" at the >> colon-prompt to enter visual mode. You should still be >> able to do this on any system with vi installed -- give >> it a try! :-) FWIW, the shell command "vi" simply fires >> up ex in that mode to start with. > > Fires or fired -- I have no idea what vim does now. I'd have thought > that it long ago divorced itself from ed (or em, en, ... ex) at the > source level. 
On an ubuntu system here:
bob@ubi:~$ ls -l /usr/bin/vi
lrwxrwxrwx 1 root root 20 2008-06-25 04:55 /usr/bin/vi -> /etc/alternatives/vi
bob@ubi:~$ ls -l /etc/alternatives/vi
lrwxrwxrwx 1 root root 17 2008-06-25 04:55 /etc/alternatives/vi -> /usr/bin/vim.tiny
bob@ubi:~$ ls -l /usr/bin/vim.tiny
-rwxr-xr-x 1 root root 703496 2008-01-31 07:26 /usr/bin/vim.tiny
bob@ubi:~$ ls -l /usr/bin/ex
lrwxrwxrwx 1 root root 20 2008-06-25 04:55 /usr/bin/ex -> /etc/alternatives/ex
bob@ubi:~$ ls -l /etc/alternatives/ex
lrwxrwxrwx 1 root root 17 2008-06-25 04:55 /etc/alternatives/ex -> /usr/bin/vim.tiny
Thus at least today, with vim, it appears that ex just starts up vi *not* in visual mode. But that's just a matter of what you call the executable -- the main point is that they are typically the same executable, or at least they have been for as long as I've been using Unix. In earlier systems they may well have been hard linked rather than soft linked, and thus the question of which was the real name of the executable was moot. > But I used that trick (hopping from ed/ex into and out of > vi) fewer times than I have fingers on one hand back in the day. Why, > if one had fullscreen, would one ever use single line? Unless, of > course, one was working on a genuine tty lineprinter, which I > exceedingly rarely had to do (gnashing teeth most of the time) because > the console had crashed somehow and yes, we had a teletype console to > log all the messages. The other reason is if you are working on a system that is so broken that termcap (or terminfo or whatever) isn't set up correctly. In the early days of working with Suns this was frequently the case, IIRC, in single-user mode. Your choices then were usually ed or ex, and ex was a lot more familiar than ed if you were used to vi. --Bob From Daniel.Pfenniger at obs.unige.ch Thu Jul 24 03:06:14 2008 From: Daniel.Pfenniger at obs.unige.ch (Daniel Pfenniger) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network Message-ID: <48885416.1030805@obs.unige.ch> Hi, I have the problem of connecting with InfiniBand 50 1-HCA nodes with 6 24-port switches. Several configurations may be imagined, but which one is the best? What is the general method to solve such a problem? Thanks, Dan From gerry.creager at tamu.edu Thu Jul 24 05:33:02 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <981498.39566.qm@web54106.mail.re2.yahoo.com> References: <981498.39566.qm@web54106.mail.re2.yahoo.com> Message-ID: <4888767E.90905@tamu.edu> My next home will have multiple fiber pairs to high-use rooms, plus convenience wireless. I don't intend to pull copper through the walls. I plan to put switches in rooms that need multi-drop and have at least one pair of fiber for high-speed access for a server, NAS, or cluster leading back to the wiring closet and patch panel. With a glass infrastructure you can support a lot of technologies. gerry MDG wrote: > I am wiring my home for a high-speed intranet with an internet gateway. I > had planned 10/100/1000 and Cat 6 cables, but with the merger of > entertainment, my 50,000 book digital library (and growing), as well as > statistical modeling of econometrics and companies including hedging > (Monte Carlo simulations), I will also be doing video and audio editing and > some web site hosting. 
> > > > With 40/100 gigabit ethernet being talked about as well as InfiniBand, should > I just wire for one of those instead, as it is senseless to cut a wall open twice. > The storage area for the centralized computers and data storage (nodes > will also have some) is already wired, cooling vents cut and installed to > dump excess heat into the building (it is a condo), with an exhaust system as > well; the room can be closed and kept air-conditioned with the heat > dumps turned off. > > > > My question is the wiring: with the work I do, my 2 terabytes is full, I am > bringing 3 more online, and I expect much more, so it only makes sense to > look at the backbone to see if it will be a bottleneck. > > > > What are your feelings: 10/100/1000, 40 gigabit ethernet, 100 gigabit > ethernet, or InfiniBand? I can run the Cat 6 and just change switches > and routers later as needed, but it is far cheaper to put in the wire growth > path now. What do you recommend? We will be running anywhere from 6 cores at > the start to 40 cores; the database will be a dedicated node, and maybe, if > overloaded, a 2nd database or NAS will be added. I use SCSI systems preferably, > as I was trained that way, but may also look at RAID, at least RAID 5 > SATA systems, with a fast dual or quad core, or multiple dual or quad > cores in the growth path. > > > > What would you suggest, as homes will soon need a central data management > vault where even game consoles feed the system instead of multiple > computers everywhere? > > > > Later we will be doing the same for a TESDA accredited Private Philippine > Technical College with approximately 150 nodes, and multiple servers and > NAS systems, so planning goes for both. And my home HPC may be the daily > offsite (out of the country, even) backup; I can get guaranteed > bandwidth so they could actually use a server here, but that pushes it with > international work in real time, as the Philippines is far from Hawaii > in reliability. And the Philippine Static Modem is too slow for that many > to access in real time. Thank you > > > > Mike > --- On *Wed, 7/23/08, Gerry Creager //* wrote: > > From: Gerry Creager > Subject: Re: [Beowulf] Drive screw fixed with LocTite > To: "Perry E. Metzger" > Cc: beowulf@beowulf.org, "David Mathog" > Date: Wednesday, July 23, 2008, 5:12 PM > > Perry E. Metzger wrote: > > "David Mathog" writes: > >> A vendor who shall remain nameless graced us with a hot swappable > drive > >> caddy in which one of the three mounting screws used to fasten the > drive > >> to the caddy had been treated with blue LocTite. This wasn't > obvious > >> from external inspection, but the telltale blue glop was on the > threads > >> when the screw finally let go and came out. It was beginning to look > >> like power tools were going to be needed to get it out, and the screw > >> head was pretty badly torn up after removal. > > > > I believe a touch from a soldering iron will usually loosen LocTite, > > but that might also damage a drive, so be careful. > > Acetone or mineral spirits will also take care of locktite. Based on > some rather harsh experience showed that the piddly little heat > generated by a soldering iron won't really cause much damage. > > >> This is the first time I have encountered a drive screw on a removable > >> drive which was, well, unremovable. Is this a trend or are we just > >> dealing with a sadistic assembler? > > > > I've never seen it used with a drive, it is certainly not normal. 
> > > > Perry > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > Gerry Creager -- gerry.creager@tamu.edu > Texas Mesonet -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 > Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From andrew at moonet.co.uk Thu Jul 24 05:42:22 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <48885416.1030805@obs.unige.ch> References: <48885416.1030805@obs.unige.ch> Message-ID: Daniel To give a half bisectional bandwidth the best approach is to set up two as core switches and the other 4 as edge switches. Each edge switch will have four connections to each core switch leaving 16 node connections on each edge switch. Should provide a 64 port network. Make sense? Ta Andy On Thu, Jul 24, 2008 at 11:06 AM, Daniel Pfenniger wrote: > Hi, > > I have the problem of connecting with InfiniBand 50 1-HCA nodes with 6 > 24-port switches. Several configurations may be imagined, but which one is > the best? What is the general method to solve such a problem? > > Thanks, > > Dan > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From kspaans at student.math.uwaterloo.ca Thu Jul 24 06:03:34 2008 From: kspaans at student.math.uwaterloo.ca (Kyle Spaans) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <4888767E.90905@tamu.edu> References: <981498.39566.qm@web54106.mail.re2.yahoo.com> <4888767E.90905@tamu.edu> Message-ID: <20080724130334.GB20718@student.math> On Thu, Jul 24, 2008 at 07:33:02AM -0500, Gerry Creager wrote: > My next home will have multiple fiber pairs to high-use rooms, plus > convenience wireless. I don't intend to pull copper through the walls. Sorry, but won't you still have to pull fiber through the walls? Is fiber getting close enough to commodity pricing that it could overtake Cat[56] UTP ethernet cabling? From gerry.creager at tamu.edu Thu Jul 24 06:21:31 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080724130334.GB20718@student.math> References: <981498.39566.qm@web54106.mail.re2.yahoo.com> <4888767E.90905@tamu.edu> <20080724130334.GB20718@student.math> Message-ID: <488881DB.9030003@tamu.edu> I'll put it in conduit but I've been doing that in my office/shop (as well as electrical cabling, antenna feedlines, etc, not in the SAME conduit) for years. Among other things, with good conduit practice, it makes replacement easier. It does drive residential builders nuts, though. 
Allows me to have a real, reliable single-point house-wide grounding system, too. The price of multimode fiber has dropped off nicely, but single-mode's still a little pricier. Still, for the utility and potential benefits, it's the way I'm planning on and budgeting for. Kyle Spaans wrote: > On Thu, Jul 24, 2008 at 07:33:02AM -0500, Gerry Creager wrote: >> My next home will have multiple fiber pairs to high-use rooms, plus >> convenience wireless. I don't intend to pull copper through the walls. > > Sorry, but won't you still have to pull fiber through the walls? Is fiber getting close enough to commodity pricing that it could overtake Cat[56] UTP ethernet cabling? > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From peter.st.john at gmail.com Thu Jul 24 06:22:35 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] problem with binary file on NFS In-Reply-To: <200807221204.26132.sassmannshausen@tugraz.at> References: <200807221204.26132.sassmannshausen@tugraz.at> Message-ID: Jorg, I checked the man page for ldd and it says that it may not work if an old compiler was used to produce the executable. I think it's like symbolic debugging, you need to compile with a switch to build the symbol table; the compiler has to know you will want library information later, and builds a table embedded into the executable that ldd can read (but as always, this is something I haven't done myself :-( Do you have other DLLs you made with the same compiler that work OK and report to ldd OK? Peter On 7/22/08, Jörg Saßmannshausen wrote: > > Dear all, > > I have a problem with a selfwritten program on my small cluster. The > cluster > nodes are PIII 500/800 MHz machines, the /home is distributed via NFS from > a > PIII 1 GHz machine. All nodes are running on Debian Etch. The program in > question (polymc_s) is in the users /home directory and is running on all > nodes but one. I get the following error messages on that particular node > (node4): > ldd polymc_s > not a dynamic executable > > file polymc_s > polymc_s: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for > GNU/Linux 2.4.1, dynamically linked (uses shared libs), for GNU/Linux > 2.4.1, > not stripped > > strace ./polymc_s > execve("./polymc_s", ["./polymc_s"], [/* 18 vars */]) = -1 ENOMEM (Cannot > allocate memory) > +++ killed by SIGKILL +++ > Process 2177 detached > > I am currently running memtest, no problems thus far. Other programs like > the > BOINC stuff (I am using that for stress-testing) are ok. Reuti already > suggested to do: > readelf -a polymc_s > which gave identical outputs on node4 and node3 (both are PIII 500 MHz > machines). I am somehow stuck here, has anybody got a good idea? Running > the > software locally does not make any difference, i.e. same errors as above. > Changing the file permissions did not mend it either. I am aware these are > old nodes, but for the purpose they are ok. > > All the best from Graz! 
> > J?rg > -- > ************************************************************* > J?rg Sa?mannshausen > Institut f?r Chemische Technologie von Materialien > TU-Graz > Stremayrgasse 16 > 8010 Graz > Austria > > phone: +43 (0)316 873 8954 > fax: +43 (0)316 873 4959 > homepage: http://sassy.formativ.net/ > > Please avoid sending me Word or PowerPoint attachments. > See http://www.gnu.org/philosophy/no-word-attachments.html > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080724/ab88e6cf/attachment.html From hahn at mcmaster.ca Thu Jul 24 06:51:17 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] problem with binary file on NFS In-Reply-To: References: <200807221204.26132.sassmannshausen@tugraz.at> Message-ID: > I checked the man page for ldd and it says that it may not work if an old > comiler was used to produced the executable. I think it's like symbolic ldd basically just runs the executable with some special flags talking to ld.so. >> PIII 1 GHz machine. All nodes are running on Debian Etch. The program in all nodes have exactly the same configuration? same versions, same /etc/ld.so.conf, etc? >> ldd polymc_s >> not a dynamic executable which just means that ld.so failed for it, I think. >> strace ./polymc_s >> execve("./polymc_s", ["./polymc_s"], [/* 18 vars */]) = -1 ENOMEM (Cannot >> allocate memory) >> +++ killed by SIGKILL +++ >> Process 2177 detached >> >> I am currently running memtest, no problems thus far. Other programs like no, those are not the symptoms of flakey memory, but rather not enough. there's nothing else on that node consuming memory? does it actually have the same amount as the other nodes? what does "ulimit -a" say? how about /proc/meminfo? are you using /proc/sys/vm/overcommit_memory=2? From mark.kosmowski at gmail.com Thu Jul 24 07:31:45 2008 From: mark.kosmowski at gmail.com (Mark Kosmowski) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Drive screw fixed with LocTite Message-ID: > "David Mathog" writes: >> A vendor who shall remain nameless graced us with a hot swappable drive >> caddy in which one of the three mounting screws used to fasten the drive >> to the caddy had been treated with blue LocTite. This wasn't obvious >> from external inspection, but the telltale blue glop was on the threads >> when the screw finally let go and came out. It was beginning to look >> like power tools were going to be needed to get it out, and the screw >> head was pretty badly torn up after removal. > > I believe a touch from a soldering iron will usually loosen LocTite, > but that might also damage a drive, so be careful. Blue Loctite is removable with just a little more force than needed with mechanical lock washers. It is critical to get a good, solid fit with the tool to the bolt / screw though. Were you using the correct driver or just one that fit "good enough"? Red Loctite is a permanent Loctite. > >> This is the first time I have encountered a drive screw on a removable >> drive which was, well, unremovable. Is this a trend or are we just >> dealing with a sadistic assembler? > > I've never seen it used with a drive, it is certainly not normal. 
> > Perry From peter.st.john at gmail.com Thu Jul 24 07:38:49 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Drive screw fixed with LocTite In-Reply-To: References: Message-ID: Perhaps the hole for that screw was defective (e.g., somebody tried to screw in a screw one size too big) and rather than replace the case, he covered his mistake with the loctite. I'm still mistified by the hard drive soldered into the case of a brand-name computer, the trend to preventing user maintenance makes me nuts, and sometimes gets excessive, but in a hot-swappable array it would be patently oxymoronic. I think just some coincidence of mistakes. Peter On 7/23/08, David Mathog wrote: > > A vendor who shall remain nameless graced us with a hot swappable drive > caddy in which one of the three mounting screws used to fasten the drive > to the caddy had been treated with blue LocTite. This wasn't obvious > from external inspection, but the telltale blue glop was on the threads > when the screw finally let go and came out. It was beginning to look > like power tools were going to be needed to get it out, and the screw > head was pretty badly torn up after removal. > > This is the first time I have encountered a drive screw on a removable > drive which was, well, unremovable. Is this a trend or are we just > dealing with a sadistic assembler? > > Thanks, > > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080724/631544ec/attachment.html From peter.st.john at gmail.com Thu Jul 24 08:20:50 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] regarding Religious Wars, vi Message-ID: My last post bounced (I cancelled it out of the moderator queue, he's busy enough) on account of excessive length (growing included predecessors that I can forget, on account of the way Gmail keeps track and hides text I've seen already). So in brief: I just realized that vi is the **only** thing where I never use the mouse. I even use the mouse (now) in the DOS command interpreter ("edit", "mark"...) instead of control-V etc. It must just be that there was no mouse when I learned vi. It's the only environment where I'm really fast. I vim scripts when real power-hacker sysadmins would compose at the command line, and I code C rather than hack scripts if I don't have exactly the command I want for my script. Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080724/d55c5b1c/attachment.html From Michael.Frese at NumerEx-LLC.com Thu Jul 24 08:38:28 2008 From: Michael.Frese at NumerEx-LLC.com (Michael H. Frese) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] problem with binary file on NFS In-Reply-To: <200807221204.26132.sassmannshausen@tugraz.at> References: <200807221204.26132.sassmannshausen@tugraz.at> Message-ID: <6.2.5.6.2.20080724093141.04f95d30@NumerEx-LLC.com> Jorg, It might be that the executable is corrupted by NFS during delivery to that node. Once that happens, the cached copy can stay bad. 
You can check it by comparing md5sum results on that node and on the node that owns the original. There's a thread back in December of last year titled "NFS Read Errors" that tells about my experience. I never found a solution except to get rid of the Redhat 9 systems.... Mike At 07:24 AM 7/23/2008, you wrote: >Dear all, > >I have a problem with a selfwritten program on my small cluster. The cluster >nodes are PIII 500/800 MHz machines, the /home is distributed via NFS from a >PIII 1 GHz machine. All nodes are running on Debian Etch. The program in >question (polymc_s) is in the users /home directory and is running on all >nodes but one. I get the following error messages on that particular node >(node4): >ldd polymc_s > not a dynamic executable > > file polymc_s >polymc_s: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for >GNU/Linux 2.4.1, dynamically linked (uses shared libs), for GNU/Linux 2.4.1, >not stripped > > strace ./polymc_s >execve("./polymc_s", ["./polymc_s"], [/* 18 vars */]) = -1 ENOMEM (Cannot >allocate memory) >+++ killed by SIGKILL +++ >Process 2177 detached > >I am currently running memtest, no problems thus far. Other programs like the >BOINC stuff (I am using that for stress-testing) are ok. Reuti already >suggested to do: >readelf -a polymc_s >which gave identical outputs on node4 and node3 (both are PIII 500 MHz >machines). I am somehow stuck here, has anybody got a good idea? Running the >software locally does not make any difference, i.e. same errors as above. >Changing the file permissions did not mend it either. I am aware these are >old nodes, but for the purpose they are ok. > >All the best from Graz! > >J?rg >-- >************************************************************* >J?rg Sa?mannshausen >Institut f?r Chemische Technologie von Materialien >TU-Graz >Stremayrgasse 16 >8010 Graz >Austria > >phone: +43 (0)316 873 8954 >fax: +43 (0)316 873 4959 >homepage: http://sassy.formativ.net/ > >Please avoid sending me Word or PowerPoint attachments. >See http://www.gnu.org/philosophy/no-word-attachments.html >_______________________________________________ >Beowulf mailing list, Beowulf@beowulf.org >To change your subscription (digest mode or >unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080724/a76414ca/attachment.html From kilian at stanford.edu Thu Jul 24 09:42:57 2008 From: kilian at stanford.edu (Kilian CAVALOTTI) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> Message-ID: <200807240942.57699.kilian@stanford.edu> On Thursday 24 July 2008 05:42:22 am andrew holway wrote: > To give a half bisectional bandwidth the best approach is to set up > two as core switches and the other 4 as edge switches. > > Each edge switch will have four connections to each core switch > leaving 16 node connections on each edge switch. > > Should provide a 64 port network. I'm also curious to know if there's a general formula to determine the required number of IB swicthes (given their ports count) to create a full (or half) bisectional network capable of interconnecting say N leaf nodes, and especially, if there's a way to deterministically infer the manner to (inter)connect them. I've seen numerous examples involving small amounts of nodes and swicthes, but I can't figure a way to scale those examples to larger networks. 
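For a plain two-level tree built from identical switches, the counting can at least be written down. Here is a rough sketch of my own (it ignores cable symmetry, routing balance and redundancy, so treat the result as a starting point rather than a design):

import math

def two_level_switches(n_nodes, ports, oversub=1.0):
    """Switch count for a two-level fat tree of identical 'ports'-port switches
    connecting n_nodes hosts, at a chosen oversubscription ratio
    (1.0 = full bisection, 2.0 = 'half', and so on)."""
    uplinks = int(ports // (oversub + 1.0))   # ports each edge switch sends to the core layer
    node_ports = ports - uplinks              # ports left for hosts on each edge switch
    edges = int(math.ceil(n_nodes / float(node_ports)))
    cores = int(math.ceil(edges * uplinks / float(ports)))
    return edges, cores, uplinks

# The case in this thread: 50 nodes and 24-port switches.
print(two_level_switches(50, 24, oversub=2.0))   # -> (4, 2, 8): four edge, two core, 8 uplinks per edge
print(two_level_switches(50, 24, oversub=1.0))   # -> (5, 3, 12): full bisection needs 8 switches total

With eight uplinks per edge switch spread four-and-four over two core switches you get exactly the 4-edge/2-core layout described earlier in the thread; pushing the same 50 nodes to full bisection costs five edge plus three core switches.
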
Any pointers? Thanks, -- Kilian From Daniel.Pfenniger at obs.unige.ch Thu Jul 24 10:16:05 2008 From: Daniel.Pfenniger at obs.unige.ch (Daniel Pfenniger) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> Message-ID: <4888B8D5.4000308@obs.unige.ch> Andrew, Here are joined some possible topologies I was contemplating, with some remarks about them. Many other topologies are possible. The first one is the one you mention. If 12 nodes linked to one switch communicate with 12 nodes on another switch the bandwidth is reduced to 8/12 = 2/3. All the packets needs either 1 or 3 hops through a switch. The second topology improves the bandwidth between the core and edge switches. The bandwidth for the above case is not reduced except that some routes need 4 hops. The third topology has the feature that 2 hops node to node communications are possible, but global communications are slightly degraded with respect to the previous case. In the fourth case we have one core switch and 4 edge switches. When 10 nodes communicate with 10 other nodes on another edge switch 2 or 3 routes need 2 hops and the rest 3 hops, without bandwidth reduction. It seems to me that this topology is better than the previous ones. Finally the last topology has no core switch. All the routes need either 1 or 2 hops. This one seems to me even better. Since I am not network expert I would be glad if somebody explains why the first solution is the best one. Dan andrew holway wrote: > Daniel > > To give a half bisectional bandwidth the best approach is to set up > two as core switches and the other 4 as edge switches. > > Each edge switch will have four connections to each core switch > leaving 16 node connections on each edge switch. > > Should provide a 64 port network. > Make sense? > > Ta > > Andy > > > On Thu, Jul 24, 2008 at 11:06 AM, Daniel Pfenniger > wrote: >> Hi, >> >> I have the problem of connecting with InfiniBand 50 1-HCA nodes with 6 >> 24-port switches. Several configurations may be imagined, but which one is >> the best? What is the general method to solve such a problem? >> >> Thanks, >> >> Dan >> >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> -------------- next part -------------- A non-text attachment was scrubbed... Name: Connexion of 50 nodes with 6 24-port IB switches.pdf Type: application/pdf Size: 12626 bytes Desc: not available Url : http://www.scyld.com/pipermail/beowulf/attachments/20080724/4a55ab58/Connexionof50nodeswith624-portIBswitches.pdf From niftyompi at niftyegg.com Thu Jul 24 10:42:40 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <200807240942.57699.kilian@stanford.edu> References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> Message-ID: <20080724174240.GA4349@hpegg.niftyegg.com> On Thu, Jul 24, 2008 at 09:42:57AM -0700, Kilian CAVALOTTI wrote: > On Thursday 24 July 2008 05:42:22 am andrew holway wrote: > > To give a half bisectional bandwidth the best approach is to set up > > two as core switches and the other 4 as edge switches. > > > > Each edge switch will have four connections to each core switch > > leaving 16 node connections on each edge switch. > > > > Should provide a 64 port network. 
> > I'm also curious to know if there's a general formula to determine the > required number of IB switches (given their ports count) to create a > full (or half) bisectional network capable of interconnecting say N > leaf nodes, and especially, if there's a way to deterministically infer > the manner to (inter)connect them. > > I've seen numerous examples involving small amounts of nodes and > switches, but I can't figure a way to scale those examples to larger > networks. > > Any pointers? Pointers yes... clear answers not sure. http://www.infinibandta.org/home Since 99% of all the IB switch silicon is from Mellanox today give the Mellanox web site a big look. Lots of vendors build switches with Mellanox silicon... Cisco and QLogic come to mind. http://www.cisco.com/en/US/prod/collateral/ps6418/ps6419/ps6421/prod_white_paper0900aecd8043ba1d.html Your most cost effective solution will be a large port count switch. Most are not 'ideal' but they are close to ideal and cost effective. At the bottom of all this is cross-bar technology (KeyHint=cross-bar). Some good research was done at Stanford on this. Plug "Dan Lenoski crossbar" into your favorite search engine. Dan Lenowski and others at Stanford did some good work that resulted in the ccNUMA machines at SGI. The SGI ccNUMA memory subsystem was built on cross-bar switches and modest to large Orign systems had X-sectional bandwidth setup issues. There is also a lot of telco research and work on this. Next some attention needs to be given to the subnet manager as it sets up the maps that the devices use to build a fabric. Expect to start in 2D space then to N space when building switched fabrics. It pays to play with some hot glue bamboo skewers and yarn for the 2D, 3D and 4D(hypercube) space... The 2D, 3D, 4D,.... ND meshes are in part why this can get hard. -- T o m M i t c h e l l Looking for a place to hang my hat. From mathog at caltech.edu Thu Jul 24 10:55:29 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Drive screw fixed with LocTite Message-ID: Mark Kosmowski wrote: > Blue Loctite is removable with just a little more force than needed > with mechanical lock washers. It is critical to get a good, solid fit > with the tool to the bolt / screw though. Were you using the correct > driver or just one that fit "good enough"? It was the right tool. For a tiny Philips head screw which has been fixed with LocTite the problem is that the force that will cause the indentations in the head to fail is perilously close to the force needed to turn the screw. It did not help that the screw was definitely not made of the hardest steel. In order to get the screw out in the end the disk had to be stood on its side and substantial downward force applied to the screwdriver to keep it from jumping out of the grooves while it was turned. The initial attempt to remove the screw had already damaged the head somewhat - because normal force did not turn the screw, and the screwdriver jumped back and chewed up the head a bit. By that I mean, I stuck the screwdriver in, applied as much force as one would normally apply, and rather than the screw moving, the screwdriver rode up a bit in the groves and damaged them. Had I been using a power screwdriver it certainly would have completely stripped the head on the first attempt, because one would not normally apply as much force as was required along the axis of the screw. 
Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From andrew at moonet.co.uk Thu Jul 24 11:15:17 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <4888B8D5.4000308@obs.unige.ch> References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: Well the top configuration(and the one that I suggested) is the one that we have tested and know works. We have implimented it into hundereds of clusters. It also provides redundancy for the core switches. With any network you need to avoid like the plauge any kind of loop, they can cause weird problems and are pretty much unnessasary. for instance, why would you put a line between the two core switches? Why would that line carry any traffic? When you consider that it takes 2-4?s for an mpi message to get from one node to another on the same switch, each extra hop will only introduce another 0.02?s (I think?) to that latency so its not really worth worrying about especially at the expence of reliability. Most applications dont use anything like the full bandwidth of the interconnect so the half bisectionalness of everything can generally be safeley ignored. All the spare ports you have on the edge switches can be used for extra connections to the core switches. ta Andy On Thu, Jul 24, 2008 at 6:16 PM, Daniel Pfenniger wrote: > Andrew, > > Here are joined some possible topologies I was contemplating, with some > remarks about them. Many other topologies are possible. > > The first one is the one you mention. If 12 nodes linked to one switch > communicate with 12 nodes on another switch the bandwidth is reduced to > 8/12 = 2/3. All the packets needs either 1 or 3 hops through a switch. > > The second topology improves the bandwidth between the core and edge > switches. The bandwidth for the above case is not reduced except that some > routes need 4 hops. > > The third topology has the feature that 2 hops node to node communications > are possible, but global communications are slightly degraded with respect > to the previous case. > > In the fourth case we have one core switch and 4 edge switches. When > 10 nodes communicate with 10 other nodes on another edge switch > 2 or 3 routes need 2 hops and the rest 3 hops, without bandwidth reduction. > It seems to me that this topology is better than the previous ones. > > Finally the last topology has no core switch. All the routes need either 1 > or 2 hops. This one seems to me even better. > > Since I am not network expert I would be glad if somebody explains > why the first solution is the best one. > > Dan > > > > andrew holway wrote: >> >> Daniel >> >> To give a half bisectional bandwidth the best approach is to set up >> two as core switches and the other 4 as edge switches. >> >> Each edge switch will have four connections to each core switch >> leaving 16 node connections on each edge switch. >> >> Should provide a 64 port network. >> Make sense? >> >> Ta >> >> Andy >> >> >> On Thu, Jul 24, 2008 at 11:06 AM, Daniel Pfenniger >> wrote: >>> >>> Hi, >>> >>> I have the problem of connecting with InfiniBand 50 1-HCA nodes with 6 >>> 24-port switches. Several configurations may be imagined, but which one >>> is >>> the best? What is the general method to solve such a problem? 
>>> >>> Thanks, >>> >>> Dan >>> >>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> > > From jan.heichler at gmx.net Thu Jul 24 11:14:43 2008 From: jan.heichler at gmx.net (Jan Heichler) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <4888B8D5.4000308@obs.unige.ch> References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: <177925861.20080724201443@gmx.net> Hallo Daniel, Donnerstag, 24. Juli 2008, meintest Du: [network configurations] I have to say i am not sure that all the configs you sketched really work. I never saw somebody creating loops in an IB fabric. DP> Since I am not network expert I would be glad if somebody explains DP> why the first solution is the best one. Let's say it as follows: 1) most applications are latency driven - not bandwidth driven. That means that half bisectional bandwidth is not cutting your application performance down to 50%. For most applications the impact should be less than 5% - for some it is really 0%. 2) Static routing in IB networks limits your bandwidth for many of the possible communication patterns anyway. For completely random communication it was like below 50%. So you buy a IB fabric with full bisectional but can't use it anyway - reducing the bisectional bandwidth is not impacting that much anymore (as far as i understood most whitepapers) 3) today you have usually 4 or 8 cores in one node. 12 nodes times 4/8 cores makes 48 or 92 cores that are connected with one HOP on the same switch. Many applications don't scale to that number of processes anyway. Before you try to think about optimizing the network to the maximum maybe it is better to think about your application, your ususal job sizes and the scheduling of the jobs. Try to avoid "cross switch communication" if possible. If you run small jobs like let's say of 8 nodes and you have 12 nodes on each switch and half bisectional bandwidth between them then it is 8 nodes on the first switch for job 1. For job 2 it is 4 nodes on switch one and 4 on switch two. Your bisectional bandwidth is big enough to handle this. I vote for the fat tree in picture one because i know it works and with 1) to 3) mentioned above it will give you good performance - especially if you run more than just one application (because optimizing is mostly optimizing for a single use case - if you have more than one it is hard to optimize). Regards, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080724/0b91482a/attachment.html From andrew at moonet.co.uk Thu Jul 24 11:17:38 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <177925861.20080724201443@gmx.net> References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> <177925861.20080724201443@gmx.net> Message-ID: :) me and jan work together at ClusterVision. On Thu, Jul 24, 2008 at 7:14 PM, Jan Heichler wrote: > Hallo Daniel, > > Donnerstag, 24. Juli 2008, meintest Du: > > [network configurations] > > I have to say i am not sure that all the configs you sketched really work. I > never saw somebody creating loops in an IB fabric. 
> > DP> Since I am not network expert I would be glad if somebody explains > > DP> why the first solution is the best one. > > Let's say it as follows: > > 1) most applications are latency driven - not bandwidth driven. That means > that half bisectional bandwidth is not cutting your application performance > down to 50%. For most applications the impact should be less than 5% - for > some it is really 0%. > > 2) Static routing in IB networks limits your bandwidth for many of the > possible communication patterns anyway. For completely random communication > it was like below 50%. So you buy a IB fabric with full bisectional but > can't use it anyway - reducing the bisectional bandwidth is not impacting > that much anymore (as far as i understood most whitepapers) > > 3) today you have usually 4 or 8 cores in one node. 12 nodes times 4/8 cores > makes 48 or 92 cores that are connected with one HOP on the same switch. > Many applications don't scale to that number of processes anyway. Before you > try to think about optimizing the network to the maximum maybe it is better > to think about your application, your ususal job sizes and the scheduling of > the jobs. Try to avoid "cross switch communication" if possible. If you run > small jobs like let's say of 8 nodes and you have 12 nodes on each switch > and half bisectional bandwidth between them then it is 8 nodes on the first > switch for job 1. For job 2 it is 4 nodes on switch one and 4 on switch two. > Your bisectional bandwidth is big enough to handle this. > > I vote for the fat tree in picture one because i know it works and with 1) > to 3) mentioned above it will give you good performance - especially if you > run more than just one application (because optimizing is mostly optimizing > for a single use case - if you have more than one it is hard to optimize). > > Regards, > > Jan > From andrew at moonet.co.uk Thu Jul 24 11:27:56 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <20080724174240.GA4349@hpegg.niftyegg.com> References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> Message-ID: > Your most cost effective solution will be a large port count switch. > Most are not 'ideal' but they are close to ideal and cost effective. That is not really the case in practice; You can buy a Mellanox 144-Port Modular InfiniBand DDR Switch (60-Ports enabled) for around 22k EUR or so the 24 port switches are around 2k EUR As most people dont need anything like the full 16 Gbit/s bandwidth you can connect up your 50 ports for about 13/14k EUR ta Andy From niftyompi at niftyegg.com Thu Jul 24 13:07:34 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> Message-ID: <20080724200734.GA4513@hpegg.niftyegg.com> On Thu, Jul 24, 2008 at 07:27:56PM +0100, andrew holway wrote: > Sender: andrew.holway@googlemail.com > > > Your most cost effective solution will be a large port count switch. > > Most are not 'ideal' but they are close to ideal and cost effective. 
> > That is not really the case in practice; > > You can buy a Mellanox 144-Port Modular InfiniBand DDR Switch > (60-Ports enabled) for around 22k EUR or so > > the 24 port switches are around 2k EUR > > As most people dont need anything like the full 16 Gbit/s bandwidth > you can connect up your 50 ports for about 13/14k EUR 60 ports on a 144 port box.... Hmmm... You are paying for the empty holes and future expansion. If you need 144 ports a single switch will be be more cost effective than a gaggle of 24 ports gathered together into a tangled hairy cable ball with 144 ports exposed. Your point about "most people don't need" is important! With large multi core, multiple socket systems external and internal bandwidth can be interesting to ponder. Bandwidth, message rate and latency all come to play and differing applications need more or less of each to go fast. -- T o m M i t c h e l l Looking for a place to hang my hat. From kyron at neuralbs.com Thu Jul 24 13:41:48 2008 From: kyron at neuralbs.com (Eric Thibodeau) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] Vi _and_ emacs on the Gentoo Clustering LievCD Message-ID: <4888E90C.9080609@neuralbs.com> Yeah...you read that right, I'll put BOTH on the CD...incredible but true! Now, you emacs users out there, take your pick, which of these do you want: kyron ldap-auth # eix emacs -c [U] app-admin/eselect-emacs (1.3-r2@04/14/2008 -> 1.5): Manages Emacs versions [N] app-editors/emacs (21.4-r17(21) 22.2-r2(22)): The extensible, customizable, self-documenting real-time display editor [N] app-editors/emacs-cvs ( (22) ~22.2.9999 (23) ~23.0.50_pre20080201 ~23.0.9999 {X Xaw3d alsa dbus gif gpm gtk gzip-el hesiod jpeg kerberos m17n-lib motif png sound source spell svg tiff toolkit-scroll-bars xft xpm}): The extensible, customizable, self-documenting real-time display editor [N] app-editors/ersatz-emacs (20060515): A very minimal imitation of the famous GNU Emacs editor [N] app-editors/jasspa-microemacs (20060909-r1): Jasspa Microemacs [N] app-editors/qemacs (0.3.2_pre20070226): QEmacs (for Quick Emacs) is a very small but powerful UNIX editor [N] app-editors/uemacs-pk (4.0.18): uEmacs/PK is an enhanced version of MicroEMACS [N] app-editors/xemacs (21.4.21-r1): highly customizable open source text editor and application development system [N] app-emacs/aspectj4emacs (~1.1_beta2): AspectJ support for GNU Emacs java-mode and JDEE [N] app-emacs/emacs-jabber (0.7.1): A Jabber client for Emacs [N] app-emacs/emacs-w3m (1.4.4-r2): emacs-w3m is an interface program of w3m on Emacs [N] app-emacs/emacs-wget (0.5.0): Wget interface for Emacs [N] app-emacs/emacs-wiki (~2.72-r1): Maintain a local Wiki using Emacs-friendly markup [N] app-emacs/emacs-wiki-blog (~0.4-r1 ~0.5): Emacs-Wiki add-on for maintaining a weblog [N] app-emacs/http-emacs (~1.1 ~1.1-r1): Fetch, render and post html pages and edit wiki pages via Emacs. [U] app-xemacs/xemacs-base (2.08@04/14/2008 -> 2.10): Fundamental XEmacs support, you almost certainly need this. [N] app-xemacs/xemacs-devel (1.75): Emacs Lisp developer support. [N] app-xemacs/xemacs-eterm (1.17): Terminal emulation. [N] app-xemacs/xemacs-ispell (1.32): Spell-checking with GNU ispell. [N] app-xemacs/xemacs-packages-all (2007.04.27-r1): Meta package for XEmacs elisp packages, similar to the sumo archives. 
[N] app-xemacs/xemacs-packages-sumo (2006.12.21): The SUMO bundle of ELISP packages for Xemacs [N] dev-lisp/emacs-cl (~0_pre20060526): An implementation of Common Lisp written in Emacs Lisp [N] virtual/emacs (22): Virtual for GNU Emacs [N] x11-misc/emacs-desktop (0.3): Desktop entry and icon for Emacs Cheers, Eric Thibodeau PS: now stop polluting my inbox with ideological dead ends!...humbug!.. :P From hahn at mcmaster.ca Thu Jul 24 14:00:33 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: > Well the top configuration(and the one that I suggested) is the one > that we have tested and know works. We have implimented it into > hundereds of clusters. It also provides redundancy for the core > switches. just for reference, it's commonly known as "fat tree", and is indeed widely used. > With any network you need to avoid like the plauge any kind of loop, > they can cause weird problems and are pretty much unnessasary. for well, I don't think that's true - the most I'd say is that given the usual spanning-tree protocol for eth switches, loops are a bug. but IB doesn't use eth's STP, and even smarter eth networks can take good advantage of multiple paths, even loopy ones. > instance, why would you put a line between the two core switches? Why > would that line carry any traffic? indeed - those examples don't make much sense. but there are many others that involve loops that could be quite nice. consider 36 nodes: with 2x24pt, you get 3:1 blocking (6 inter-switch links). with 3 switches, you can do 2:1 blocking (6 interlinks in a triangle, forming a loop.) dual-port nics provide even more entertainment (FNN, but also the ability to tolerate a leaf-switch failure...) > When you consider that it takes 2-4ìs for an mpi message to get from depends on the nic - mellanox claims ~1 us for connectx (haven't seen it myself yet.) I see 4-4.5 us latency (worse than myri 2g mx!) on pre-connectx mellanox systems. > one node to another on the same switch, each extra hop will only > introduce another 0.02ìs (I think?) to that latency so its not really with current hardware, I think 100ns per hop is about right. mellanox claims 60ns for the latest stuff. > Most applications dont use anything like the full bandwidth of the > interconnect so the half bisectionalness of everything can generally > be safeley ignored. everything is simple for single-purpose clusters. for a shared cluster with a variety of job types, especially for large user populations, large jobs and large clusters, you want to think carefully about how much to compromise the fabric. consider, for instance, interference between a bw-heavy weather code and some latency-sensitive application (big and/or tightly-coupled.) From hahn at mcmaster.ca Thu Jul 24 15:39:00 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <20080724200734.GA4513@hpegg.niftyegg.com> References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: > If you need 144 ports a single switch will be be more cost effective you'd think so - larger switches let you factor out lots of separate little power supplies, etc. not to mention transforming lots of cables into compact, reliable, cheap backplanes. 
but I haven't seen chassis switches actually wind up cheaper. of course, IB hardware prices seem to be extremely fuzzy (heavily discounted from list). > than a gaggle of 24 ports gathered together into a tangled hairy cable ball with > 144 ports exposed. once very nice thing about the leaf/trunk fat-tree approach is that your switches can be distributed in node racks. so any individual rack has a fairly managable bundle of cables coming out of it. > Your point about "most people don't need" is important! With large > multi core, multiple socket systems external and internal bandwidth > can be interesting to ponder. that makes it sound like inter-node networks in general are doomed ;) while cores-per-node is increasing, users love to increase cores-per-job. From patrick at myri.com Thu Jul 24 15:51:20 2008 From: patrick at myri.com (Patrick Geoffray) Date: Wed Nov 25 01:07:29 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <177925861.20080724201443@gmx.net> References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> <177925861.20080724201443@gmx.net> Message-ID: <48890768.6080505@myri.com> Hi Jan, Jan Heichler wrote: > 1) most applications are latency driven - not bandwidth driven. That > means that half bisectional bandwidth is not cutting your application > performance down to 50%. For most applications the impact should be less > than 5% - for some it is really 0%. If the app is purely latency driven, bandwidth (link or bisection) is indeed irrelevant. However, don't underestimate the impact of contention on collective communication: once you exceed the internal buffering in the crossbars, you will have back-pressure. Typically, each crossbar port can buffer in the order of 1-10K these days. So, the larger the message size for the collective and the larger the communicator, the greater the need for effective bisection. At this scale (ie 50 nodes), I agree it's not that important, unless you are bandwidth bounded to begin with. > 2) Static routing in IB networks limits your bandwidth for many of the > possible communication patterns anyway. For completely random > communication it was like below 50%. So you buy a IB fabric with full > bisectional but can't use it anyway - reducing the bisectional bandwidth > is not impacting that much anymore (as far as i understood most whitepapers) With static routing on Fat Tree or Clos and pseudo-random traffic (ie real world), you waste ~50% of the bisection you have (actually, the more hops the more waste, but it's not linear). So, if you start with half the theoretical bisection, your effective bisection will roughly be a quarter of that. Patrick From lindahl at pbm.com Thu Jul 24 15:55:51 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080724130334.GB20718@student.math> References: <981498.39566.qm@web54106.mail.re2.yahoo.com> <4888767E.90905@tamu.edu> <20080724130334.GB20718@student.math> Message-ID: <20080724225550.GD6378@bx9.net> On Thu, Jul 24, 2008 at 09:03:34AM -0400, Kyle Spaans wrote: > On Thu, Jul 24, 2008 at 07:33:02AM -0500, Gerry Creager wrote: > > My next home will have multiple fiber pairs to high-use rooms, plus > > convenience wireless. I don't intend to pull copper through the walls. > > Sorry, but won't you still have to pull fiber through the walls? Is fiber getting close enough to commodity pricing that it could overtake Cat[56] UTP ethernet cabling? 
Fiber is a commodity. Perhaps you were looking for pricing close enough to twisted pair copper? In any case, it's not just the cost per length of cable, the endpoints for fiber are also more expensive. -- g From patrick at myri.com Thu Jul 24 16:02:11 2008 From: patrick at myri.com (Patrick Geoffray) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: <488909F3.50000@myri.com> Hi Mark, Mark Hahn wrote: >> With any network you need to avoid like the plauge any kind of loop, >> they can cause weird problems and are pretty much unnessasary. for > > well, I don't think that's true - the most I'd say is that given It is kind of true for wormhole switches, you can deadlock if you have loops (direct of indirect). The subnet manager / mapper will often prune some loopy links to be able to generate deadlock-free routes in polynomial time. So, you could have links that are just not used by any routes in funky topologies. Ethernet spanning tree is the most extreme paranoia, it will always prune all links but one between 2 switches (modulo link aggregation). Patrick From lindahl at pbm.com Thu Jul 24 17:18:38 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <177925861.20080724201443@gmx.net> References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> <177925861.20080724201443@gmx.net> Message-ID: <20080725001837.GF6378@bx9.net> On Thu, Jul 24, 2008 at 08:14:43PM +0200, Jan Heichler wrote: > 1) most applications are latency driven - not bandwidth driven. As a guy who's a big fan of low latency, I had to say that this is not a good generalization. Some apps become latency or message-rate sensitive if you scale to enough nodes at a fixed problem size. But most of the time, if you're running an app that is scaling really well on your network, it's neither bandwidth or latency bound. It's cpu bound. As another by the way (not directed at you, Jan), fat trees and Clos networks are not the same. -- greg From lindahl at pbm.com Thu Jul 24 17:22:24 2008 From: lindahl at pbm.com (Greg Lindahl) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Infiniband modular switches In-Reply-To: <487B8FEF.4030708@myri.com> References: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> <487B8FEF.4030708@myri.com> Message-ID: <20080725002223.GA14358@bx9.net> On Mon, Jul 14, 2008 at 01:42:07PM -0400, Patrick Geoffray wrote: > AlltoAll of large messages is not a useless synthetic benchmark IMHO. AlltoAll is a real thing used by real codes, but do keep in mind that there are many algorithms for AlltoAll with various message sizes and network topologies, so it's testing both the raw interconnect and the AlltoAll implementation. I don't know of the results you mention were run with an optimal AlltoAll... do you? -- greg From niftyompi at niftyegg.com Thu Jul 24 18:17:19 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: <20080725011719.GA9120@hpegg.niftyegg.com> On Thu, Jul 24, 2008 at 06:39:00PM -0400, Mark Hahn wrote: ....... > >> Your point about "most people don't need" is important! 
With large >> multi core, multiple socket systems external and internal bandwidth >> can be interesting to ponder. > > that makes it sound like inter-node networks in general are doomed ;) > while cores-per-node is increasing, users love to increase cores-per-job. Not doomed but currently limiting. But, with CPU core to CPU core and socket to socket memory improvements who knows. Another shared commons inside a chassis to factor-in is cache memory. For some time AMD had an advantage on core to core and socket to socket communication but that can change quickly. Still we do not like to link IB switches with a single cable so why should we limit eight cores in a single chassis to the bandwidth of a single cable. The more cores that hide behind a link the more the bandwidth has to be shared by those cores (MPI ranks). In practice many applications need not contend on the wire for rank to rank communication at the exactly the same time so YMMV. This reminds me to ask about all the Xen questions.... Virtual machines (sans dynamic migration) seem to address the inverse of the problem that MPI and other computational clustering solutions address. Virtual machines assume that the hardware is vastly more worthy than the OS and application where Beowulf style clustering exists because the hardware is one N'th what is necessary to get to the solution. Where does Xen and other VME (not the system bus) solutions play in Beowulf land. The "virtual machine environment" stuff will enable CPU vendors to add more cores to a box but how does that help/hurt an MPI cluster environment? -- T o m M i t c h e l l Looking for a place to hang my hat. From tmattox at gmail.com Thu Jul 24 20:28:25 2008 From: tmattox at gmail.com (Tim Mattox) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: Cool, FNN's are still being mentioned on the Beowulf mailing list... For those not familiar with the Flat Neighborhood Network (FNN) idea, check out this URL: http://aggregate.org/FNN/ For those who haven't played with our FNN generator cgi script, do try it out. Hank (my Ph.D. advisor) enhanced the cgi awhile back to generate pretty multi-color pictures of the resulting FNNs. Unfortunately, for the particular input parameters from this thread of six 24-port switches and 50 nodes, each node would need a 3-port HCA (or 3 HCAs) and a 7th switch to generate a Universal FNN. FNNs don't really shine until you have 3 or 4 NICs/HCAs per compute node. Anyway, you would get a LOT more bandwidth with an FNN in this case... and of course, the "single-switch-latency" that is characteristic of FNNs. Though, as others have mentioned, IB switch latency is pretty darn small, so latency would not be the primary reason to use FNNs with IB. I wonder if anyone has built a FNN using IB... or for that matter, any link technology other than Ethernet? On Thu, Jul 24, 2008 at 5:00 PM, Mark Hahn wrote: >> Well the top configuration(and the one that I suggested) is the one >> that we have tested and know works. We have implimented it into >> hundereds of clusters. It also provides redundancy for the core >> switches. > > just for reference, it's commonly known as "fat tree", and is indeed > widely used. > >> With any network you need to avoid like the plauge any kind of loop, >> they can cause weird problems and are pretty much unnessasary. 
for > > well, I don't think that's true - the most I'd say is that given > the usual spanning-tree protocol for eth switches, loops are a bug. > but IB doesn't use eth's STP, and even smarter eth networks can take > good advantage of multiple paths, even loopy ones. > >> instance, why would you put a line between the two core switches? Why >> would that line carry any traffic? > > indeed - those examples don't make much sense. but there are many others > that involve loops that could be quite nice. consider 36 nodes: with > 2x24pt, you get 3:1 blocking (6 inter-switch links). with 3 switches, you > can do 2:1 blocking (6 interlinks in a triangle, forming a loop.) > dual-port nics provide even more entertainment (FNN, but also the ability to > tolerate a leaf-switch failure...) > >> When you consider that it takes 2-4?s for an mpi message to get from > > depends on the nic - mellanox claims ~1 us for connectx (haven't seen it > myself yet.) I see 4-4.5 us latency (worse than myri 2g mx!) on > pre-connectx > mellanox systems. > >> one node to another on the same switch, each extra hop will only >> introduce another 0.02?s (I think?) to that latency so its not really > > with current hardware, I think 100ns per hop is about right. mellanox > claims > 60ns for the latest stuff. > >> Most applications dont use anything like the full bandwidth of the >> interconnect so the half bisectionalness of everything can generally >> be safeley ignored. > > everything is simple for single-purpose clusters. for a shared cluster > with a variety of job types, especially for large user populations, large > jobs and large clusters, you want to think carefully about how much to > compromise the fabric. consider, for instance, interference between a > bw-heavy weather code and some latency-sensitive application (big and/or > tightly-coupled.) > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmattox@gmail.com || timattox@open-mpi.org I'm a bright... http://www.the-brights.net/ From hahn at mcmaster.ca Thu Jul 24 22:18:40 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <20080725011719.GA9120@hpegg.niftyegg.com> References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> <20080725011719.GA9120@hpegg.niftyegg.com> Message-ID: > This reminds me to ask about all the Xen questions.... Virtual machines > (sans dynamic migration) seem to address the inverse of the problem that > MPI and other computational clustering solutions address. Virtual machines > assume that the hardware is vastly more worthy than the OS and application > where Beowulf style clustering exists because the hardware is one N'th what is necessary > to get to the solution. I don't agree. virtualization is a big deal because so many servers run at low duty cycles (utilization). VM lets you overlap them in time while preserving the fiction that they're on separate machines. this is perfect for latency-tolerant operations (like anything involving humans...). virtualization is a throughput thing. > Where does Xen and other VME (not the system bus) solutions play in Beowulf land. 
> > The "virtual machine environment" stuff will enable CPU vendors to add > more cores to a box but how does that help/hurt an MPI cluster environment? throughput or "real" parallel? it's all about how tight your coupling is. virtualization is like sharing interconnect links - great if you're latency tolerant (that is, loose-coupled, not synchronized), but not if your parallel processes need to avoid random delays. From hahn at mcmaster.ca Thu Jul 24 22:38:27 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: > to generate a Universal FNN. FNNs don't really shine until you have 3 or 4 > NICs/HCAs per compute node. depends on costs. for instance, the marginal cost of a second IB port on a nic seems to usually be fairly small. for instance, if you have 36 nodes, 3x24pt switches is pretty neat for 1 hop nonblocking. two switches in a 1-level fabric would get 2 hops and 3:1 blocking. if arranged in a triangle, 3x24 would get 1 hop 2:1, which might be an interesting design point. > Though, as others have mentioned, IB switch latency is pretty darn small, > so latency would not be the primary reason to use FNNs with IB. yeah, that's a good point - FNN is mainly about utilizing "zero-order" switching when the node selects which link to use, and shows the biggest advantage when it's slow or hard to do multi-level fabrics. > I wonder if anyone has built a FNN using IB... or for that matter, any > link technology > other than Ethernet? I'm a little unclear on how routing works on IB - does a node have something like an ethernet neighbor table that tracks which other nodes are accessible through which port? I think the real problem is that small IB switches have never really gotten cheap, even now, in the same way ethernet has. or IB cables, for that matter. regards, mark hahn. From andrew at moonet.co.uk Fri Jul 25 00:22:59 2008 From: andrew at moonet.co.uk (andrew holway) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> <20080725011719.GA9120@hpegg.niftyegg.com> Message-ID: > virtualization is a throughput thing. Mark, Please can you clarify what you mean by 'throughput' ta From eugen at leitl.org Fri Jul 25 04:00:42 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080724225550.GD6378@bx9.net> References: <981498.39566.qm@web54106.mail.re2.yahoo.com> <4888767E.90905@tamu.edu> <20080724130334.GB20718@student.math> <20080724225550.GD6378@bx9.net> Message-ID: <20080725110042.GL9875@leitl.org> On Thu, Jul 24, 2008 at 03:55:51PM -0700, Greg Lindahl wrote: > Fiber is a commodity. Perhaps you were looking for pricing close > enough to twisted pair copper? In any case, it's not just the cost per > length of cable, the endpoints for fiber are also more expensive. Right now you need GBICs, a splicing cassette, and a splicer (a 10 k$ device). 
The future looks very bright for polymer fiber, which can be processed with a simple sharp knife (100 MBit/s Ethernet kits + converters are reasonably cheap, 1 GBit/s is being developed -- 10 GBit/s might be rather challenging, unless there's photonic crystal technology). -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From peter.st.john at gmail.com Fri Jul 25 06:56:20 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: On 7/24/08, Mark Hahn wrote: > > > that makes it sound like inter-node networks in general are doomed ;) > while cores-per-node is increasing, users love to increase cores-per-job. > It is my sacred duty to rescue hypercube topology. Cool Preceeds Coolant :-) Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080725/d9290b9f/attachment.html From hahn at mcmaster.ca Fri Jul 25 07:04:57 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> <20080725011719.GA9120@hpegg.niftyegg.com> Message-ID: >> virtualization is a throughput thing. > > Mark, Please can you clarify what you mean by 'throughput' sorry, I don't whether the use of that term is widespread or not. what I mean is that with some patterns of use, the goal is just to jam through as many serial jobs per day, or to transfer as many GBps over a link as possible. these are operations that can be overlapped, and which are not, individually, latency-sensitive. to me, throughput computing is a lot like handling fungible commodities: jobs by the ton. being lat-tolerant is nice, since it means the system can schedule differently. for instance, if the serial jobs spend any time with the cpu idle (blocked on IO for instance), you can profitably overcommit your cpus (run slightly more processes than cpus). you can gain by overlapping. similarly, virtualization is all about overlapping low duty-cycle jobs. it does bring something new to the table: being able to provision a node with a completely new environment without dealing with the time overhead of booting on bare metal. it's unclear to me whether that's a big deal - I cringe at the thought of offering our users their own choice of OS and distro. using VM's would isolate jobs better, so that they couldn't see that they were, for instance, sharing a node, but I don't think it would greater insulate against performance intrusions (for instance, if someone is consuming all the memory bandwidth, it'll still be noticed.) virtualization is a pretty basic part of "cloud computing" grids, though, where you specifically want to mask users from each other, and where, by virtue of being internet apps, processes do a lot of waiting. regards, mark hahn. 
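A toy model of the overcommit point above (mine, not Mark's, and it ignores queueing and memory pressure entirely): if a serial job only keeps a CPU busy for some fraction of its wall time, running a few such jobs per core raises utilization until the core saturates.

def core_utilization(busy_fraction, jobs_per_core):
    # Upper bound: assumes the idle (I/O-wait) periods of the jobs interleave perfectly.
    return min(1.0, busy_fraction * jobs_per_core)

busy = 0.7   # assumed CPU duty cycle of one job; measure your own workload instead
for k in (1, 2, 3):
    u = core_utilization(busy, k)
    print("%d job(s)/core: core ~%3.0f%% busy, ~%.2fx the jobs/day of 1 job/core" % (k, u * 100, u / busy))
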
From dnlombar at ichips.intel.com Fri Jul 25 07:29:42 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> <20080725011719.GA9120@hpegg.niftyegg.com> Message-ID: <20080725142942.GA6030@nlxdcldnl2.cl.intel.com> On Thu, Jul 24, 2008 at 10:18:40PM -0700, Mark Hahn wrote: > > This reminds me to ask about all the Xen questions.... Virtual machines > > (sans dynamic migration) seem to address the inverse of the problem that > > MPI and other computational clustering solutions address. Virtual machines > > assume that the hardware is vastly more worthy than the OS and application > > where Beowulf style clustering exists because the hardware is one N'th what is necessary > > to get to the solution. > > I don't agree. virtualization is a big deal because so many servers > run at low duty cycles (utilization). VM lets you overlap them in time > while preserving the fiction that they're on separate machines. this > is perfect for latency-tolerant operations (like anything involving > humans...). virtualization is a throughput thing. That fiction also permits the guests to run disparate OS stacks and permits software to live on when the host that ran it falls over. Finally, the fiction permits moving guests from one physical host to another. All good qualities for suitably chosen apps. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From hahn at mcmaster.ca Fri Jul 25 07:40:13 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: > It is my sacred duty to rescue hypercube topology. Cool Preceeds Coolant :-) I agree HC's are cool, but I think they fit only a narrow ecosystem: where you don't mind lots of potentially long wires, since higher dimensional fabrics are kind of messy in our low-dimensional universe. also, HC's assume intelligent routing on the vertices, so you've got to make the routing overhead low relative to the physical hop latency. it does seem like there is some convergence to using rings onchip, fully connected graphs within a node and fat trees inter-node. one unifying factor is that these are all point-to-point topologies... From larry.stewart at sicortex.com Fri Jul 25 08:50:06 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: <4889F62E.7070708@sicortex.com> Mark Hahn wrote: >> It is my sacred duty to rescue hypercube topology. Cool Preceeds >> Coolant :-) > > I agree HC's are cool, but I think they fit only a narrow ecosystem: > where you don't mind lots of potentially long wires, since higher > dimensional > fabrics are kind of messy in our low-dimensional universe. 
also, HC's > assume intelligent routing on the vertices, so you've got to make the > routing overhead low relative to the physical hop latency. > > it does seem like there is some convergence to using rings onchip, > fully connected graphs within a node and fat trees inter-node. > one unifying factor is that these are all point-to-point topologies... Hypercubes give log diameter, which is good, but when you grow the machine you have to add more ports to each node, which is not so good once you run out of pins. Other topologies, such as Kautz and deBruijn graphs, give log diameter as well, but with a fixed number of ports per node, so you can put the node on one chip and still build systems of greater or lesser size without having to respin. You can route arbitrary size Kautz graphs on a fixed number of layers, so when these somewhat wacky topologies go on chip you can still route it on N metal layers. The point about long wires is well taken, but I think it is the price of low diameter. -- -Larry / Sector IX From peter.st.john at gmail.com Fri Jul 25 09:19:30 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: My plan (like an ant contemplating eating an elephant but...) is for the self-adapting application to optimize not just itself (which it does already) but it's platform (which is imaginable in a complex network). So yes, I want intelligence at the node; I want a node to decide rationally that certain tasks for certain compuational categories of applications should be sent to certain other nodes on the basis of their proximity, current and maybe anticipated workload, and ram/disk/cores/interconnectivity configuation, if that is mixed. But it's just a vague plan atm. On 7/25/08, Mark Hahn wrote: > > It is my sacred duty to rescue hypercube topology. Cool Preceeds Coolant >> :-) >> > > I agree HC's are cool, but I think they fit only a narrow ecosystem: > where you don't mind lots of potentially long wires, since higher > dimensional > fabrics are kind of messy in our low-dimensional universe. also, HC's > assume intelligent routing on the vertices, so you've got to make the > routing overhead low relative to the physical hop latency. > > it does seem like there is some convergence to using rings onchip, > fully connected graphs within a node and fat trees inter-node. > one unifying factor is that these are all point-to-point topologies... > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080725/4dbcee8b/attachment.html From peter.st.john at gmail.com Fri Jul 25 09:24:55 2008 From: peter.st.john at gmail.com (Peter St. John) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <4889F62E.7070708@sicortex.com> References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> <4889F62E.7070708@sicortex.com> Message-ID: I imagine a hybrid topology of certain sized subclusters connected internally with a right topology for their size, and the subclusters connected to each other with some other topology, etc. 
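To put rough numbers on the degree/diameter trade-off Larry describes a couple of messages up, here is a small sketch of my own; the Kautz sizing below is the standard (d+1)*d**(k-1) count for a Kautz graph of degree d and diameter k.

import math

def hypercube(n_nodes):
    dim = int(math.ceil(math.log(n_nodes, 2)))   # round up to the next power of two
    return {"nodes": 2 ** dim, "ports_per_node": dim, "worst_case_hops": dim}

def kautz(degree, diameter):
    # Fixed port count per node, whatever the machine size.
    return {"nodes": (degree + 1) * degree ** (diameter - 1),
            "ports_per_node": degree, "worst_case_hops": diameter}

print(hypercube(1000))               # ~1024 nodes: 10 ports per node, 10 hops worst case
print(kautz(degree=3, diameter=6))   # 972 nodes: 3 ports per node, 6 hops worst case
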
The way cores on a chip are connected is different obviously from the way chips on a board, or boards on a backplane, or boxes to routers, or routers to metarouters...and I need hybrid-ness so that the optimization/self-reconfiguring algorithm can do something with its platform. So maybe just some scale component or subcluster would be hypercube, some FNN, some tree, I don't know. I want enough wires with enough nodes so that my application can tell **me** what topology works best for a compuational category of applications. Peter On 7/25/08, Lawrence Stewart wrote: > > Mark Hahn wrote: > >> It is my sacred duty to rescue hypercube topology. Cool Preceeds > >> Coolant :-) > > > > I agree HC's are cool, but I think they fit only a narrow ecosystem: > > where you don't mind lots of potentially long wires, since higher > > dimensional > > fabrics are kind of messy in our low-dimensional universe. also, HC's > > assume intelligent routing on the vertices, so you've got to make the > > routing overhead low relative to the physical hop latency. > > > > it does seem like there is some convergence to using rings onchip, > > fully connected graphs within a node and fat trees inter-node. > > one unifying factor is that these are all point-to-point topologies... > > Hypercubes give log diameter, which is good, but when you grow the machine > you have to add more ports to each node, which is not so good once you run > out of pins. > > Other topologies, such as Kautz and deBruijn graphs, give log diameter > as well, > but with a fixed number of ports per node, so you can put the node on > one chip > and still build systems of greater or lesser size without having to respin. > > You can route arbitrary size Kautz graphs on a > fixed number of layers, so when these somewhat wacky topologies > go on chip you can still route it on N metal layers. > > The point about long wires is well taken, but I think it is the price of > low diameter. > > > -- > -Larry / Sector IX > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080725/3d4f4ccd/attachment.html From henning.fehrmann at aei.mpg.de Fri Jul 25 09:26:24 2008 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] [rpciod] in D state Message-ID: <20080725162624.GA18108@gretchen.aei.uni-hannover.de> Hello everybody, I observed the following problem: Usually, on the nodes we have 4 rpciod processes running: [rpciod/0] - [rpciod/3] I assume, the squared brackets mean that these are kernel processes. >From time to time one of them changes in to the 'D' (Uninterruptible sleep) mode. Once it happens on a particular node it is also impossible to mount any nfs exports on this node, which seems logical since the nfs client uses the portmapper. Stopping the automounter, nfs-common, portmapper did not help. I was also unable to unload the nfs module. I tried to debug the portmapper by setting the kernel parameter sunrpc.nfs_debug to 65536 but this creates a lot of junk. Has somebody an idea, what might kill the rpciod process and one could avoid this? 
Cheers, Henning Fehrmann From niftyompi at niftyegg.com Fri Jul 25 10:52:18 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> Message-ID: <20080725175218.GA4220@hpegg.niftyegg.com> On Thu, Jul 24, 2008 at 06:39:00PM -0400, Mark Hahn wrote: > >> If you need 144 ports a single switch will be be more cost effective > > you'd think so - larger switches let you factor out lots of separate > little power supplies, etc. not to mention transforming lots of cables > into compact, reliable, cheap backplanes. but I haven't seen chassis > switches actually wind up cheaper. > > of course, IB hardware prices seem to be extremely fuzzy (heavily > discounted from list). > >> than a gaggle of 24 ports gathered together into a tangled hairy cable ball with >> 144 ports exposed. > > once very nice thing about the leaf/trunk fat-tree approach is that your > switches can be distributed in node racks. so any individual rack has a > fairly managable bundle of cables coming out of it. Good point. And with double data rate links (QDR some day) the reach of a DDR cable is shorter so yes distributing smaller switches in a set of racks can help in another way. Optical links may bring things back toward larger switches. In some cases updating the set of core switches to DDR and leaving the outer switches at SDR can improve cross sectional bandwidth. Also for upgrades to DDR smaller switches deployed some now some later lowers the budget burn rate. All good stuff. -- T o m M i t c h e l l Looking for a place to hang my hat. From hahn at mcmaster.ca Fri Jul 25 11:27:30 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: <20080725175218.GA4220@hpegg.niftyegg.com> References: <48885416.1030805@obs.unige.ch> <200807240942.57699.kilian@stanford.edu> <20080724174240.GA4349@hpegg.niftyegg.com> <20080724200734.GA4513@hpegg.niftyegg.com> <20080725175218.GA4220@hpegg.niftyegg.com> Message-ID: > Optical links may bring things back toward larger switches. optical increases costs, though. we just put in a couple long DDR runs using Intel Connects cables, which work nicely, but are noticably more expensive than copper ;) although I give DDR and QDR due respect, I don't find many users (we have a lot, and varied) who are overly concerned about bandwidth, even as core-per-node increases. yes, most of them would notice if we downgraded to gigabit, but myri 2g seems to be enough for 4c/node. those that notice the difference between our myri 2g and quadrics clusters are probably responding to latency more than BW anyway. From timattox at open-mpi.org Fri Jul 25 11:35:22 2008 From: timattox at open-mpi.org (Tim Mattox) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] How to configure a cluster network In-Reply-To: References: <48885416.1030805@obs.unige.ch> <4888B8D5.4000308@obs.unige.ch> Message-ID: Hi Mark, Thanks for helping keep the FNN meme alive while I've been "away". :-) On Fri, Jul 25, 2008 at 1:38 AM, Mark Hahn wrote: >> to generate a Universal FNN. FNNs don't really shine until you have 3 or >> 4 >> NICs/HCAs per compute node. > > depends on costs. for instance, the marginal cost of a second IB port on a > nic seems to usually be fairly small. 
for instance, if you have 36 nodes, > 3x24pt switches is pretty neat for 1 hop nonblocking. > two switches in a 1-level fabric would get 2 hops and 3:1 blocking. > if arranged in a triangle, 3x24 would get 1 hop 2:1, which might be an > interesting design point. Yes, of course the choice of FNN design parameters depends on cost, and that 2-port HCAs are common for IB, so that should be considered. My comment about FNN's shining at the 3 or 4 NIC/node range is because of the jump in node count you can support with a given switch size. With only 2 NICs/node, the triangle pattern is pretty much all you can get, which allows you to connect 50% more nodes than your switch size (36 nodes w/24-port switches). While, at 4 NICs/node, a Universal FNN with 24-port switches can connect 72 nodes, 3x the switch size. Now, the cost/node of the network goes up (relative to the 2-NIC/node FNN), since you have twice as many wires , NICs and switch-ports (per node). >> Though, as others have mentioned, IB switch latency is pretty darn small, >> so latency would not be the primary reason to use FNNs with IB. > > yeah, that's a good point - FNN is mainly about utilizing "zero-order" > switching when the node selects which link to use, and shows the biggest > advantage when it's slow or hard to do multi-level fabrics. My perspective on what is the best or most important aspect of a FNN has shifted over the years. I honestly think it really depends on the goals of the cluster in question. For some, the latency reduction is key. For others it is the guaranteed bandwidth between pairs of nodes (since no communication link is shared between disjoint node pairs, communication patterns that are permutations pass conflict free). And for some it is the potential cost savings to get "good" connectivity for more nodes than a single switch can handle. And another potential benefit is that you can engineer the FNN to place more bandwidth between specified node pairs. This latter benefit turned into my dissertation on Sparse FNNs, which directly exploit a priori knowledge of expected communication patterns. It is still yet to be shown in a practical installation that a Sparse FNN is the right choice (politically or otherwise). I don't know of any implementations beyond our KASY0 machine from 2003. >> I wonder if anyone has built a FNN using IB... or for that matter, any >> link technology >> other than Ethernet? > > I'm a little unclear on how routing works on IB - does a node have something > like an ethernet neighbor table that tracks which other nodes are > accessible through which port? Ah, well, having never built an IB based FNN, I don't know the very low level details of what would be required, but from what I understand about how IB routing works, it would simply be a matter of setting up the proper routing tables. AFAIK, the Open MPI IB implementation would figure it out automatically, as long as the disjoint IB fabrics had unique IDs (equivalent to subnet address & mask for ethernet) (GIDs?). > I think the real problem is that small IB switches have never really gotten > cheap, even now, in the same way ethernet has. or IB cables, > for that matter. Yeah, that is true. Though, how much more do the larger switches cost? What really counts is the ratio of small switch to large switch cost, assuming you are trying to save money with a FNN, and that cables are not ludicrously expensive. Though, not every HPC installation is as monetarily limited as taxpayers might hope. (Oh, did I say that out loud?) 
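To make the flat-neighborhood property concrete, here is a minimal sketch (Python) that checks a wiring plan: every pair of nodes must share at least one switch, and no switch may exceed its port count. The "triangle" example below (36 nodes, 2 NICs each, 3 x 24-port switches) is the arrangement mentioned earlier in the thread; it is an illustration, not the output of any real FNN design tool:

from itertools import combinations

def triangle_wiring(nodes_per_pair=12):
    # three switches A, B, C; each node attaches to one of the three pairs
    switches = ("A", "B", "C")
    wiring, node = {}, 0
    for pair in combinations(switches, 2):        # (A,B), (A,C), (B,C)
        for _ in range(nodes_per_pair):
            wiring[f"node{node:02d}"] = set(pair)
            node += 1
    return wiring

def check_fnn(wiring, ports_per_switch=24):
    # flat neighborhood: every pair of nodes shares at least one switch
    for a, b in combinations(wiring, 2):
        if not wiring[a] & wiring[b]:
            return False, f"{a} and {b} share no switch"
    # and no switch needs more ports than it has
    for sw in {s for nics in wiring.values() for s in nics}:
        used = sum(1 for nics in wiring.values() if sw in nics)
        if used > ports_per_switch:
            return False, f"switch {sw} needs {used} ports"
    return True, "flat neighborhood OK"

print(check_fnn(triangle_wiring()))   # -> (True, 'flat neighborhood OK')

Running it prints (True, 'flat neighborhood OK'); a missing pair or an oversubscribed switch is reported instead.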
Oh, another topic of discussion is how do many-core nodes change the design space for cluster networks? For instance, does the network on Ranger have enough bandwidth on a per core basis? As far as I can tell, each node has 16 cores, yet each node only has one IB link? That is some serious oversubscription if the cores are not talking locally. > regards, mark hahn. > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmattox@gmail.com || timattox@open-mpi.org I'm a bright... http://www.the-brights.net/ From walid.shaari at gmail.com Sat Jul 26 08:33:11 2008 From: walid.shaari at gmail.com (Walid) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] high %system utiliziation on infiniband nodes Message-ID: Hi, I have two nodes Interconnected using Infiniband, and using Intel-MPI over dapl1.2.7 from OFED 1.3.1 compiled localy on the same build, when there is interconnect communication i can see on one of the nodes that i monitoring have a high cpu utiliztion (%system) that exceeds 60%. the mpi job is helloworld/pallas runing over two nodes, 8 cores each (16 processes in total) a snapshot of mpstat -P ALL on one node 06:22:20 PM CPU %user %nice %system %iowait %irq %soft %idle intr/s 06:22:22 PM all 30.25 0.00 69.75 0.00 0.00 0.00 0.00 1768.50 06:22:22 PM 0 30.00 0.00 70.00 0.00 0.00 0.00 0.00 566.50 06:22:22 PM 1 30.50 0.00 69.00 0.00 0.00 0.00 0.00 201.00 06:22:22 PM 2 30.50 0.00 69.50 0.00 0.00 0.00 0.00 0.00 06:22:22 PM 3 29.50 0.00 70.50 0.00 0.00 0.00 0.00 0.00 06:22:22 PM 4 28.50 0.00 71.00 0.00 0.00 0.00 0.00 0.00 06:22:22 PM 5 30.00 0.00 70.00 0.00 0.00 0.00 0.00 0.00 06:22:22 PM 6 31.00 0.00 69.50 0.00 0.00 0.00 0.00 1000.50 06:22:22 PM 7 32.00 0.00 68.00 0.00 0.00 0.00 0.00 0.00 now i get the same behaviour on RHEL5.0/5.1 and RHEL4.6, using Infiniband or ethernet, so is this normal, to me it does not, or at least i have never seen such behaviour before? the node is a DELL PE1950 regards Walid -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080726/998be018/attachment.html From iioleynik at gmail.com Wed Jul 23 19:56:35 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Building new cluster - estimate Message-ID: I am in process of upgrading computational facilities of my lab and considering of building/purchasing 40 node cluster. Before contacting vendors I would like to get some understanding how much would it cost. The major considerations/requirements: 1. newest Intel quad-core CPUs (Opteron quad-core cpus are out of question due to ridiculous pricing), good balance of price/performance 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our computational needs (running LAMMPs molecular dynamics and VASP DFT codes) 3. 48U rack (preferably with good thermal management) I used newegg plus some extra info to get pricing for IB. 
This is what I got:

Single node configuration
------------------------------------------------------------
- 2x Intel Xeon E5420 Harpertown 2.5 GHz quad-core CPU : 2x$350=$700
- Dual LGA 771 Intel 5400 Supermicro motherboard : $430
- Kingston 8 GB (4x2GB) DDR2-667 FB-DIMM memory : $360
- WD Caviar 750 GB SATA HD : $110
- Mellanox InfiniHost single-port 4X IB SDR 10Gb/s PCI-e card : $125 (http://www.colfaxdirect.com/store/pc/viewPrd.asp?idproduct=12)
- 1U case including power supply, fans : $150
------------------------------------------------------------
$1,875/node

Cluster estimate:
------------------------
40 nodes : 40x$1,875=$75,000
2x 24-port 4X (10Gb/s) 1U SDR IB Flextronics switches : 2x$2,400=$4,800 (http://www.colfaxdirect.com/store/pc/viewPrd.asp?idcategory=7&idproduct=13)
IB cables : 40x$65=$2,600
48U rack cabinet, PDUs, power cables : $3,000
------------------------------------------------------------
TOTAL: $85,400

I would like to get some advice from the experts on this list: is this pricing realistic? Are there any flaws in the configuration?

In principle, we have some experience in building and managing clusters, but with a 40-node system it would make sense to get a good cluster integrator to do the job. Can people share their recent experiences and recommend reliable vendors to deal with?

Many thanks,

Ivan Oleynik
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080723/08e8c4d7/attachment.html

From himikehawaii1 at yahoo.com Thu Jul 24 05:19:18 2008
From: himikehawaii1 at yahoo.com (MDG)
Date: Wed Nov 25 01:07:30 2009
Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband?
In-Reply-To: <4887F302.8060909@tamu.edu>
Message-ID: <981498.39566.qm@web54106.mail.re2.yahoo.com>

I am wiring my home for a high-speed intranet with an internet gateway. I had planned on 10/100/1000 and Cat 6 cables, but on top of the merger of entertainment, my 50,000-book digital library (and growing), and statistical modeling of econometrics and companies including hedging (Monte Carlo simulations), I will also be doing video and audio editing and some web site hosting.

With 40/100 Gigabit Ethernet being talked about, as well as InfiniBand, should I just wire for one of those instead? It is senseless to cut a wall open twice. The storage area for the centralized computers and data storage (the nodes will also have some) is already wired, with cooling vents cut and installed to dump excess heat into the building (it is a condo) and an exhaust system; the room can also be closed and kept air-conditioned with the heat dumps turned off.

My question is the wiring: with the work I do, my 2 terabytes are full, I am bringing 3 more online and expect much more, so it only makes sense to look at the backbone to see if it will be a bottleneck.

What are your feelings: 10/100/1000, 40 Gigabit Ethernet, 100 Gigabit Ethernet, or InfiniBand? I can run the Cat 6 and just change switches and routers later as needed, but it is far cheaper to put in the wiring growth path now. What do you recommend? We will be running anywhere from 6 cores at the start up to 40 cores; the database will be a dedicated node, and if it is overloaded a second database server or NAS will be added.
i USE scsi SYSTEM REFERRABLY AS TRAINED THAT WAY BUT MAY ALSO LOOK AT rAID AT LEAST rAID 5 sata SYATES, WITH FAST DUAL or a QUAD iore, or multile Dual or Quad Cores in the groth path. ? whawould you sujest as homes will soon neeed a central data management vault where even game consoles feed the system instead of multiole cmputers everywhere. ? Later we will be doing the smae to a TESDA accreduted Private Philippone Technical Collegee with approximaeyly 150 nodes, and muliple servers and NAS systems, so plannong goes for both. and my home HPU may be the daily offsite, out of the cointry even,daily back up, I canm get guranteed bandwidgth so tey could actually use server here but that pushes it with internationak work in real time as the Philippines is far ferom haHawaii in rebilility.? And the Philippine Static Modem is tooslowfor that many to access in real time.? Thank y ? Mike --- On Wed, 7/23/08, Gerry Creager wrote: From: Gerry Creager Subject: Re: [Beowulf] Drive screw fixed with LocTite To: "Perry E. Metzger" Cc: beowulf@beowulf.org, "David Mathog" Date: Wednesday, July 23, 2008, 5:12 PM Perry E. Metzger wrote: > "David Mathog" writes: >> A vendor who shall remain nameless graced us with a hot swappable drive >> caddy in which one of the three mounting screws used to fasten the drive >> to the caddy had been treated with blue LocTite. This wasn't obvious >> from external inspection, but the telltale blue glop was on the threads >> when the screw finally let go and came out. It was beginning to look >> like power tools were going to be needed to get it out, and the screw >> head was pretty badly torn up after removal. > > I believe a touch from a soldering iron will usually loosen LocTite, > but that might also damage a drive, so be careful. Acetone or mineral spirits will also take care of locktite. Based on some rather harsh experience showed that the piddly little heat generated by a soldering iron won't really cause much damage. >> This is the first time I have encountered a drive screw on a removable >> drive which was, well, unremovable. Is this a trend or are we just >> dealing with a sadistic assembler? > > I've never seen it used with a drive, it is certainly not normal. > > Perry > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080724/6d10853a/attachment.html From schoenk at utulsa.edu Thu Jul 24 19:05:39 2008 From: schoenk at utulsa.edu (Schoenefeld, Keith) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Strange SGE scheduling problem In-Reply-To: References: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> Message-ID: <5E0BB54BEC5EBA44B373175A080E64010A7948BE@ophelia.ad.utulsa.edu> Yes, with top. I have one queue, with 12 machines, 8 slots per machine. 
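A more direct check than watching top is to read the affinity information from /proc. Below is a rough sketch (Python), assuming a Linux /proc that exposes Cpus_allowed in /proc/<pid>/status; the "vasp" name filter is only a placeholder for whatever the MPI binary is actually called:

import glob, re

def affinity_report(name_substring="vasp"):
    for status_path in glob.glob("/proc/[0-9]*/status"):
        pid = status_path.split("/")[2]
        try:
            status = open(status_path).read()
            stat = open(f"/proc/{pid}/stat").read()
        except OSError:
            continue                       # process exited while scanning
        comm = re.search(r"^Name:\s+(\S+)", status, re.M).group(1)
        if name_substring not in comm:
            continue
        allowed = re.search(r"^Cpus_allowed:\s+(\S+)", status, re.M)
        # field 39 of /proc/<pid>/stat is the CPU the task last ran on;
        # after stripping the "pid (comm)" prefix it sits at index 36
        after = stat.rsplit(")", 1)[1].split()
        last_cpu = after[36]
        print(f"pid {pid:>6} {comm:<12} last_cpu={last_cpu} "
              f"cpus_allowed={allowed.group(1) if allowed else '?'}")

affinity_report()

If two 4-slot jobs really are pinned to the same cores, their ranks will all show the same small set of CPU numbers and identical allowed masks.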
-- KS -----Original Message----- From: Reuti [mailto:reuti@staff.uni-marburg.de] Sent: Wednesday, July 23, 2008 4:35 PM To: Schoenefeld, Keith Cc: Beowulf ML Subject: Re: [Beowulf] Strange SGE scheduling problem Hi, Am 22.07.2008 um 23:54 schrieb Schoenefeld, Keith: > My cluster has 8 slots (cores)/node in the form of two quad-core > processors. Only recently we've started running jobs on it that > require > 12 slots. We've noticed significant speed problems running > multiple 12 > slot jobs, and quickly discovered that the node that was running 4 > slots > on one job and 4 slots on another job was running both jobs on the > same > processor cores (i.e. both job1 and job2 were running on CPU's #0-#3, > and the CPUs #4-#7 were left idling. The result is that the jobs were > competing for time on half the processors that were available. how did you check this? With `top`? You have one queue with 8 slots per machine? -- Reuti > In addition, a 4 slot job started well after the 12 slot job has > ramped > up results in the same problem (both the 12 slot job and the four slot > job get assigned to the same slots on a given node). > > Any insight as to what is occurring here and how I could prevent it > from > happening? We were are using SGE + mvapich 1.0 and a PE that has the > $fill_up allocation rule. > > I have also posted this question to the hpc_training-l@georgetown.edu > mailing list, so my apologies for people who get this email multiple > times. > > Any help is appreciated. > > -- KS > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From schoenk at utulsa.edu Thu Jul 24 19:07:18 2008 From: schoenk at utulsa.edu (Schoenefeld, Keith) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Strange SGE scheduling problem In-Reply-To: <48879C50.5000607@charter.net> References: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> <48879C50.5000607@charter.net> Message-ID: <5E0BB54BEC5EBA44B373175A080E64010A7948BF@ophelia.ad.utulsa.edu> This definitely looked promising, but unfortunately it didn't work. I both added the appropriate export lines to my qsub file, and then when that didn't work I checked the mvapich.conf file and confirmed that the processor affinity was disabled. I wonder if I can turn it on and make it work, but unfortunately the cluster is full at the moment, so I can't test it. -- KS -----Original Message----- From: Shannon V. Davidson [mailto:svdavidson@charter.net] Sent: Wednesday, July 23, 2008 4:02 PM To: Schoenefeld, Keith Cc: beowulf@beowulf.org Subject: Re: [Beowulf] Strange SGE scheduling problem Schoenefeld, Keith wrote: > My cluster has 8 slots (cores)/node in the form of two quad-core > processors. Only recently we've started running jobs on it that require > 12 slots. We've noticed significant speed problems running multiple 12 > slot jobs, and quickly discovered that the node that was running 4 slots > on one job and 4 slots on another job was running both jobs on the same > processor cores (i.e. both job1 and job2 were running on CPU's #0-#3, > and the CPUs #4-#7 were left idling. The result is that the jobs were > competing for time on half the processors that were available. > > In addition, a 4 slot job started well after the 12 slot job has ramped > up results in the same problem (both the 12 slot job and the four slot > job get assigned to the same slots on a given node). 
> > Any insight as to what is occurring here and how I could prevent it from > happening? We were are using SGE + mvapich 1.0 and a PE that has the > $fill_up allocation rule. > > I have also posted this question to the hpc_training-l@georgetown.edu > mailing list, so my apologies for people who get this email multiple > times. > Any insight as to what is occurring here and how I could prevent it from > happening? We were are using SGE + mvapich 1.0 and a PE that has the > $fill_up allocation rule. > This sounds like MVAPICH is assigning your MPI tasks to your CPUs starting with CPU#0. If you are going to run multiple MVAPICH jobs on the same host, turn off CPU affinity by starting the MPI tasks with the environment variable VIADEV_USE_AFFINITY=0 and VIADEV_ENABLE_AFFINITY=0. Cheers, Shannon > Any help is appreciated. > > -- KS > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > -- _________________________________________ Shannon V. Davidson Software Engineer Appro International 636-633-0380 (office) 443-383-0331 (fax) _________________________________________ From svdavidson at charter.net Fri Jul 25 09:22:45 2008 From: svdavidson at charter.net (Shannon V. Davidson) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Strange SGE scheduling problem In-Reply-To: <5E0BB54BEC5EBA44B373175A080E64010A7948BF@ophelia.ad.utulsa.edu> References: <5E0BB54BEC5EBA44B373175A080E64010A733879@ophelia.ad.utulsa.edu> <48879C50.5000607@charter.net> <5E0BB54BEC5EBA44B373175A080E64010A7948BF@ophelia.ad.utulsa.edu> Message-ID: <4889FDD5.5000907@charter.net> Schoenefeld, Keith wrote: > This definitely looked promising, but unfortunately it didn't work. I > both added the appropriate export lines to my qsub file, and then when > that didn't work I checked the mvapich.conf file and confirmed that the > processor affinity was disabled. I wonder if I can turn it on and make > it work, but unfortunately the cluster is full at the moment, so I can't > test it. > You may want to verify that the environment variable was actually passed down to the MPI task. To set environment variables for MPI jobs, I usually either specify the environment variable on the mpirun command line or in a wrapper script: mpirun -np 32 -hostfile nodes VIADEV_ENABLE_AFFINITY=0 a.out mpirun -np 32 -hostfile nodes run.sh a.out where run.sh sets up the local environment including environment variables. The second method is more portable to various shells and MPI versions. Shannon > -- KS > > -----Original Message----- > From: Shannon V. Davidson [mailto:svdavidson@charter.net] > Sent: Wednesday, July 23, 2008 4:02 PM > To: Schoenefeld, Keith > Cc: beowulf@beowulf.org > Subject: Re: [Beowulf] Strange SGE scheduling problem > > Schoenefeld, Keith wrote: > >> My cluster has 8 slots (cores)/node in the form of two quad-core >> processors. Only recently we've started running jobs on it that >> > require > >> 12 slots. We've noticed significant speed problems running multiple >> > 12 > >> slot jobs, and quickly discovered that the node that was running 4 >> > slots > >> on one job and 4 slots on another job was running both jobs on the >> > same > >> processor cores (i.e. both job1 and job2 were running on CPU's #0-#3, >> and the CPUs #4-#7 were left idling. The result is that the jobs were >> competing for time on half the processors that were available. 
>> >> In addition, a 4 slot job started well after the 12 slot job has >> > ramped > >> up results in the same problem (both the 12 slot job and the four slot >> job get assigned to the same slots on a given node). >> >> Any insight as to what is occurring here and how I could prevent it >> > from > >> happening? We were are using SGE + mvapich 1.0 and a PE that has the >> $fill_up allocation rule. >> >> I have also posted this question to the hpc_training-l@georgetown.edu >> mailing list, so my apologies for people who get this email multiple >> times. >> Any insight as to what is occurring here and how I could prevent it >> > from > >> happening? We were are using SGE + mvapich 1.0 and a PE that has the >> $fill_up allocation rule. >> >> > > This sounds like MVAPICH is assigning your MPI tasks to your CPUs > starting with CPU#0. If you are going to run multiple MVAPICH jobs on > the same host, turn off CPU affinity by starting the MPI tasks with the > environment variable VIADEV_USE_AFFINITY=0 and VIADEV_ENABLE_AFFINITY=0. > > Cheers, > Shannon > > >> Any help is appreciated. >> >> -- KS >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> > http://www.beowulf.org/mailman/listinfo/beowulf > >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080725/9537ae7d/attachment.html From himikehawaii1 at yahoo.com Fri Jul 25 22:28:39 2008 From: himikehawaii1 at yahoo.com (MDG) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080725110042.GL9875@leitl.org> Message-ID: <859578.1281.qm@web54106.mail.re2.yahoo.com> I agree that the high costs in fiber iw the end-points but when cutting open walls, as I am doing for Cat 6, and 10/100/1000 (or gigabit) is available and while maybe still being developed, the question is this. ? Just like cell phone technology leap frogged in Asia, they are way ahead on 3rd, 3.5 and the start of 4th generation cell technology, I see things there the USA cannot offer or if does it is too expensive to be worth while yet in Japan, Hong Kong and for that matter mainland China it is part of everyday usage. ? Now I am assuming that the lack of foresight, just look at our freeways whichwere planned with little usage growth, and were outdated by compkletetion, the same thying is already ahppened with 10megabit wired companies, some manage the 10/100, the smart at least in the last year left a groeth path, at least where tey took my advice, and wired CAT 6 backbones to allow 10/100/1000 without problem but as pointed out the price of te fiber cable itself is fallien to bear commodity prices, the endpoints well I still say Moore's Law will make endpoints fall fast also.? This is also because of the very topic of HPCs.? The distributed processing of an HPC is limited bu the slowest link; Intel did not but high-end PIV Xeons and Quad-Cores for fun, the enternal bus of the computers was not able to keep up so teir 2 and teir 3 cache had to make up the shortfall. ? I agree fiber end-points, just as fibre-channels drive adapters, are expensive, too expensive.? 
But so is installing something thst wil be onslete, if it is standarized which I do believe Gigabit has been, the problem is the same as the USA POTS (Plain Old Tele Phoe System)which happens to be copper wire and the fact that compaies do not want to update sunk costs if they do not have to; tjhus the slow adoption of Gigabit and hemce the likelyhood that it will be leapfrogged by 10/40/100 Ethernet or InfiniBand fiber/ ? I do not like to plan a system that has no growth path or is obsolete before itis installed.? It does not mean I have to install the endpoints ommediately,? can pull the Cat 6 for Gigabit? and the back up at the sametime.? the question is which is the more likely winner? ? InfiniBand was declared all but dead? not too long ago and it has been resurrected.? No one really thoygh anyone wold need quad=core computers, but they are here basically because of developing faster single core was getting harder aand harder, and they are distributed processing which is the bery idea of an HPC. ? Now if the enternal bus of modern PCs cannot keep up it becomes fairly obvious that a distributed betwork processing system will need as fast of infrastructure as possible and considering replacing this infrastructur is an expensive process it makes sense to plan in groth paths, at least to me, after all I can wait to use alternate paths till end-point prices fall, the question is which path is the most likely?? yes the typical fiber connection tales special equipment to splice amd connect, etc.? I just am trying to not have to cut the walls open again in a year. ? So without pointing out repeatly the hgh cost os the end-points does anyone habe any insight or something to contribute on what I see as a critical issue in HPC development.? I think we all know fiber is not as cheap as ethernet 10/100 or Cat 5 (or Cat 5e which some can squeeze 10/100/1000 )gigabit)speed from, and Cat 6 while maybe 203 times per yard/meter as expensive as Cat 5 it is far cheaper then cutting and repulling, expenses, even at a thrird the cable price labor kills youon expenses. ? Therefore my delimea, where will we be in 1 to 2 years and what sppeds. I know I am alrady looking for dual brodband input at home and pricing business packages.? Further as tele-commuting allows employers to off load physical space to wotk at home there will be another huge demand for speed in bandwidgth, I know I am not happy with mycable modem speed. When you combine that aong with entertainment and telephony (VoIP) we are already hitting the intranet (internal network speeds) and Internet speds are congested so the sale of dedicated bandwidtgh as well as connections ill only increase.? ? Therefore te best comment was I thnk Greg's on UTC (53).? But specifically has anyone done any study, thought or projections on the needs as well as how to prepare, and the methodology to do so, for the next steps?? I know as a business-person that any major project usually requires analysis of the useful life as well as benefits be it financial, a concern of mine such as NPV analysus for the MBAs out there, to infrastructure and equipment needs over the life or I should say projected useful life for the techs out there. ? So for the techs what and where do you think we are headed and i know Gigbit, 10/100/1000 is a short-term solution, but how shortterm?? Like I said labor is expensive, end-point debates will rageon as the new replaces the old.? I am just trying to develop something that can do the job for maybe 3 to 5 years.? 
So if you had to pull Cat 6 and a second system at the same time what would you pull and why? ? Mike --- On Fri, 7/25/08, Eugen Leitl wrote: From: Eugen Leitl Subject: Re: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? To: beowulf@beowulf.org Date: Friday, July 25, 2008, 1:00 AM On Thu, Jul 24, 2008 at 03:55:51PM -0700, Greg Lindahl wrote: > Fiber is a commodity. Perhaps you were looking for pricing close > enough to twisted pair copper? In any case, it's not just the cost per > length of cable, the endpoints for fiber are also more expensive. Right now you need GBICs, a splicing cassette, and a splicer (a 10 k$ device). The future looks very bright for polymer fiber, which can be processed with a simple sharp knife (100 MBit/s Ethernet kits + converters are reasonably cheap, 1 GBit/s is being developed -- 10 GBit/s might be rather challenging, unless there's photonic crystal technology). -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080725/e572e6b2/attachment.html From matt at technoronin.com Sat Jul 26 14:45:35 2008 From: matt at technoronin.com (Matt Lawrence) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: On Wed, 23 Jul 2008, Ivan Oleynik wrote: > In principle, we have some experience in building and managing clusters, but > with 40 node systems it would make sense to get a good cluster integrator to > do the job. Can people share their recent experiences and recommend reliable > vendors to deal with? I suggest that you need a minimum of 16GB/node (2GB/core) and possibly 32GB/node (4GB/node). You also should have a dedicated system for managing the cluster, a commodity PC with a lot of disk is adequate. That is the ideal place to host a DHCP server, tftp server (for for pxeboot), web/ftp/nfs server for installation and local mirror of the various repositories. This system should not be accessable by the users, just the admins. You will want to set up IPMI on all of the nodes. You want to avoid touching the harware or even going into the machine room whenever practical. Label both ends of each cable with source and destination. We have been using LSL-79 labels from wiremarkersplus.com. I am very fond of using velcro cable ties http://www.fastenation.com/class.php?id=9 for managing cables. I personally bought a spool of the 8" straps for use on my previous job and the folks at my current job bought a spool on my recommendation. Very useful. We have also been buying custom ethernet cables in 1' increments from reynco.com. Very little price difference and ir radically reduced the rat's nest of cabling. Color coding by function/network is handy as well. Also, black ethernet cables inside of a black rack within a data center that is really well lit can lead to all sorts of headaches. Release your inner anti-goth and use brightly colored cables. You will want to record the MAC address to physical location mapping sooner rather than later. 
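One way to capture that mapping while the cluster is still small is a short sketch like the following (Python). Passwordless ssh to every node, the eth0 interface name, and the rack_plan.csv format ("node01,rack1,U12") are all assumptions for the example:

import csv, subprocess

def collect_macs(plan_file="rack_plan.csv", out_file="mac_map.csv"):
    with open(plan_file) as plan, open(out_file, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["hostname", "rack", "slot", "mac"])
        for row in csv.reader(plan):
            if len(row) != 3:
                continue                              # skip blank/odd lines
            hostname, rack, slot = row
            try:
                mac = subprocess.run(
                    ["ssh", hostname, "cat", "/sys/class/net/eth0/address"],
                    capture_output=True, text=True, timeout=10, check=True
                ).stdout.strip()
            except (subprocess.SubprocessError, OSError):
                mac = "UNREACHABLE"
            writer.writerow([hostname, rack, slot, mac])
            print(hostname, rack, slot, mac)

collect_macs()

The resulting mac_map.csv is also a convenient starting point for the dhcpd host entries on the management node.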
Install blanking panels to cover unused spaces in the rack. This will improve airflow. Both Rocks and Warewulf seem to be excellent and have very friendly and supportive developers. Talking to them would probably be a good idea. A much longer message than I planned, hopefully it is of use. -- Matt It's not what I know that counts. It's what I can remember in time to use. From niftyompi at niftyegg.com Sat Jul 26 20:54:20 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Infiniband modular switches In-Reply-To: <20080725002223.GA14358@bx9.net> References: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> <487B8FEF.4030708@myri.com> <20080725002223.GA14358@bx9.net> Message-ID: <20080727035420.GA4239@hpegg.niftyegg.com> On Thu, Jul 24, 2008 at 05:22:24PM -0700, Greg Lindahl wrote: > On Mon, Jul 14, 2008 at 01:42:07PM -0400, Patrick Geoffray wrote: > > > AlltoAll of large messages is not a useless synthetic benchmark IMHO. > > AlltoAll is a real thing used by real codes, but do keep in mind that > there are many algorithms for AlltoAll with various message sizes and > network topologies, so it's testing both the raw interconnect and the > AlltoAll implementation. I don't know of the results you mention were > run with an optimal AlltoAll... do you? Is there a single "optimal AlltoAll"? I can imagine a handful of ways to build an AlltoAll but I suspect that various cards, system, transports, switches, topologies ... each will act differently on different processors and memory systems. Is there a collection of coded algorithms that can be built into the likes of OpenMPI? If so a simple site hook to benchmark then pick/linkto one over another could follow. -- T o m M i t c h e l l Looking for a place to hang my hat. From niftyompi at niftyegg.com Sat Jul 26 21:15:23 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <859578.1281.qm@web54106.mail.re2.yahoo.com> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> Message-ID: <20080727041523.GC4239@hpegg.niftyegg.com> On Fri, Jul 25, 2008 at 10:28:39PM -0700, MDG wrote: > > ..... > InfiniBand was declared all but dead not too long ago and it has been ..... > > > Fiber is a commodity. Perhaps you were looking for pricing close > > enough to twisted pair copper? In any case, it's not just the cost per > > length of cable, the endpoints for fiber are also more expensive. > > Right now you need GBICs, a splicing cassette, and a splicer > (a 10 k$ device). The future looks very bright for polymer fiber, > which can be processed with a simple sharp knife (100 MBit/s > Ethernet kits + converters are reasonably cheap, 1 GBit/s > is being developed -- 10 GBit/s might be rather challenging, > unless there's photonic crystal technology). Did I read this correctly? Are you cutting holes in the walls of a home.....? It seems to me that today's 'exotic' links should be confined to the machine room/ data closet. Consider the current maximum length of Infiniband DDR over Cu .... After that you can look at end point bandwidth to an office or entertainment room. It is not hard or expensive to pull high quality TV coax which can support multiple HDTV links and multiple cat6 for trunked data links leaving the machine room. Also all wires in the wall MUST be safety rated for such use. 
-- T o m M i t c h e l l Looking for a place to hang my hat. From eugen at leitl.org Sun Jul 27 12:28:07 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080727041523.GC4239@hpegg.niftyegg.com> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> Message-ID: <20080727192807.GE10566@leitl.org> On Sat, Jul 26, 2008 at 09:15:23PM -0700, Nifty niftyompi Mitch wrote: > Did I read this correctly? > Are you cutting holes in the walls of a home.....? Real homes have cable ducts -- I'm limited to drilling through walls. Unfortunately, due to lousy electric installation I'm stuck with severe ground loop issues (I catched that too late orelse I would have put in a heavy ground wire strung along the CAT 6/7). Galvanic separation would be a god-send -- but for the price tag, and unavailability of GBit polymer optical. > It seems to me that today's 'exotic' links should be confined > to the machine room/ data closet. Consider the current > maximum length of Infiniband DDR over Cu .... It's a home installation, not an unobtainium interconnect. > After that you can look at end point bandwidth to an office or > entertainment room. It is not hard or expensive to pull high quality > TV coax which can support multiple HDTV links and multiple cat6 for > trunked data links leaving the machine room. CAT 6 is great, except if you see sparks when plugging in RJ-45s. Also, most systems tend to shutdown spontaneously overnight when on such dirty network. > Also all wires in the wall MUST be safety rated for such use. -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From gerry.creager at tamu.edu Sun Jul 27 16:37:07 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080727041523.GC4239@hpegg.niftyegg.com> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> Message-ID: <488D06A3.5000502@tamu.edu> Er... Fiber's no longer exotic. I've got Cat6 in the walls of my house (although I had to do it as Eugen did by drilling holes and pulling with snakes in this house. In my garage workshop, I've got electrical in conduit, Cat6 in conduit and multimode in conduit. I can add more conduit and single mode if I need to later, easily, since the conduit resides on the wall surface (it IS a workshop, after all). We're designing our next/final house. I intend to have a data race with at least 4pr of single mode and multi-mode each, and probably a 4-in (10cm) conduit for copper data cables: expansion room. I can handle most anything I can currently imagine with that infrastructure. I'll also stand up wireless for commodity connectivity for visitors, laptops, game systems, etc., I'm leery of quick-splicing fiber for 10GBE but I'm not uncomfortable with that for gigabit or there abouts. I can order 4-pr single/multimode for something around $2/ft these days with terminations, in custom lengths. That's a lot of potential bandwidth. 
Home-run it to the machine room where my home cluster resides, and where the demarc is to the real data connectivity and I should be set. I've never been afraid to cut holes in walls for connectivity. I do find it easier to do before we've got all the junk in. In our current house, I got to do 3 days of wiring before we moved in, which helped a lot. I got all the cat6 pulled (telco is also on cat6 but on different runs from data; no, I'm not doing VoIP at home yet but that expansion is simple here). Nifty niftyompi Mitch wrote: > On Fri, Jul 25, 2008 at 10:28:39PM -0700, MDG wrote: >> > ..... >> InfiniBand was declared all but dead not too long ago and it has been > ..... >>> Fiber is a commodity. Perhaps you were looking for pricing close >>> enough to twisted pair copper? In any case, it's not just the cost per >>> length of cable, the endpoints for fiber are also more expensive. >> Right now you need GBICs, a splicing cassette, and a splicer >> (a 10 k$ device). The future looks very bright for polymer fiber, >> which can be processed with a simple sharp knife (100 MBit/s >> Ethernet kits + converters are reasonably cheap, 1 GBit/s >> is being developed -- 10 GBit/s might be rather challenging, >> unless there's photonic crystal technology). > > Did I read this correctly? > Are you cutting holes in the walls of a home.....? > > It seems to me that today's 'exotic' links should be confined > to the machine room/ data closet. Consider the current > maximum length of Infiniband DDR over Cu .... > > After that you can look at end point bandwidth to an office or > entertainment room. It is not hard or expensive to pull high quality > TV coax which can support multiple HDTV links and multiple cat6 for > trunked data links leaving the machine room. > > Also all wires in the wall MUST be safety rated for such use. > > > -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From niftyompi at niftyegg.com Sun Jul 27 16:51:38 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080727192807.GE10566@leitl.org> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> Message-ID: <20080727235138.GA4331@hpegg.niftyegg.com> On Sun, Jul 27, 2008 at 09:28:07PM +0200, Eugen Leitl wrote: > On Sat, Jul 26, 2008 at 09:15:23PM -0700, Nifty niftyompi Mitch wrote: > > > Did I read this correctly? > > Are you cutting holes in the walls of a home.....? > > Real homes have cable ducts -- I'm limited to drilling through walls. > Unfortunately, due to lousy electric installation I'm stuck with > severe ground loop issues (I catched that too late orelse > I would have put in a heavy ground wire strung along the CAT 6/7). > Galvanic separation would be a god-send -- but for the price > tag, and unavailability of GBit polymer optical. > ...... > > CAT 6 is great, except if you see sparks when plugging in RJ-45s. > Also, most systems tend to shutdown spontaneously overnight when > on such dirty network. > > > Also all wires in the wall MUST be safety rated for such use. > OK if you have severe ground loop issues you must rethink what you are doing! 
Ground loops generated by a bad power install can have almost unlimited currents and present a serious fire hazard. Get the problem fixed. Contact a qualified electrician and find out what the root cause is. AFAIK most network switches do provide MAC level isolation at the physical layer so I am confused and _concerned_ about your comment about seeing sparks at the RJ-46. Get the problem addressed promptly. -- T o m M i t c h e l l Looking for a place to hang my hat. From james.p.lux at jpl.nasa.gov Sun Jul 27 19:19:56 2008 From: james.p.lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080727192807.GE10566@leitl.org> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> Message-ID: <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> Quoting Eugen Leitl , on Sun 27 Jul 2008 12:28:07 PM PDT: > On Sat, Jul 26, 2008 at 09:15:23PM -0700, Nifty niftyompi Mitch wrote: > >> Did I read this correctly? >> Are you cutting holes in the walls of a home.....? > > Real homes have cable ducts -- I'm limited to drilling through walls. > Unfortunately, due to lousy electric installation I'm stuck with > severe ground loop issues (I catched that too late orelse > I would have put in a heavy ground wire strung along the CAT 6/7). > Galvanic separation would be a god-send -- but for the price > tag, and unavailability of GBit polymer optical. bear in mind that ordinary ethernet both coax and twisted pair is galvanically isolated. From hahn at mcmaster.ca Sun Jul 27 22:52:38 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: > considering of building/purchasing 40 node cluster. Before contacting > vendors I would like to get some understanding how much would it cost. The vendors have at least list prices available on their websites. > 1. newest Intel quad-core CPUs (Opteron quad-core cpus are out of question > due to ridiculous pricing), good balance of price/performance at least for code which is relatively cache-friendly. > 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our > computational needs (running LAMMPs molecular dynamics and VASP DFT codes) > 3. 48U rack (preferably with good thermal management) "thermal management"? servers need cold air in front and unobstructed exhaust. that means open or mesh front/back (and blanking panels). > - 2x Intel Xeon E5420 Hapertown 2.5 GHz quad core CPU : 2x$350=$700 > - Dual LGA 771 Intel 5400 Supermicro mb : > $430 this is an eatx-format board with the vendor requiring >= 550W PSU, right? you should definitely budget for the add-in IPMI card. > - Kingston 8 Gb (4x2Gb) DDR2 FB-DIMM DDR2 667 memory : $360 wouldn't a 5100-based board allow you to avoid the premium of fbdimms? > - WD Caviar 750 Gb SATA HD : > $110 I usually figure a node should have zero or as many disks as feasible. > - 1U case including power supply, fans : > $150 I suspect this isn't a 550W EATX chassis... > 48U rack cabinet, PDUs, power cables, : > $3000 seems like a lot - those must be pretty fancy controllable pdus. for smallish clusters like this, I'd feel entirely comfortable with just IPMI, omitting controllable pdus. (I would not omit ipmi in favor of controllable pdus.) 
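As an illustration of what plain IPMI buys you on a cluster this size, here is a sketch (Python) that fans ipmitool out across the nodes from the head node. The BMC hostnames, the lanplus interface, and the credentials are placeholders for the example, not anything from this thread (and in real use you would not put the password on the command line):

import subprocess

NODES = [f"node{i:02d}-ipmi" for i in range(1, 41)]   # hypothetical BMC names
USER, PASSWORD = "admin", "changeme"                  # placeholder credentials

def ipmi(host, *command):
    return subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", host, "-U", USER, "-P", PASSWORD,
         *command],
        capture_output=True, text=True, timeout=15
    )

for host in NODES:
    result = ipmi(host, "power", "status")
    print(f"{host}: {result.stdout.strip() or result.stderr.strip()}")
    # to power-cycle a wedged node instead:
    #   ipmi(host, "power", "cycle")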
> I would like to get some advice from experts from this list, is this pricing > realistic? Are there any flaws with configuration? the pricing is agressive; cost from a vendor like HP will be substantially higher. > In principle, we have some experience in building and managing clusters, but > with 40 node systems it would make sense to get a good cluster integrator to > do the job. Can people share their recent experiences and recommend reliable > vendors to deal with? HP is safe and solid and not cheap. for a small cluster like this, I don't think vendor integration is terribly important. regards, mark hahn. From john.hearns at streamline-computing.com Mon Jul 28 01:16:11 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: <1217232982.4907.15.camel@Vigor13> On Mon, 2008-07-28 at 01:52 -0400, Mark Hahn wrote: > > > 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our > > computational needs (running LAMMPs molecular dynamics and VASP DFT codes) > > 3. 48U rack (preferably with good thermal management) > > "thermal management"? servers need cold air in front and unobstructed > exhaust. that means open or mesh front/back (and blanking panels). > Agreed. However depending on the location if space is tight you could think of an APC rack with the heavy fan exhaust door on th rear, and vent the hot air. > > - 2x Intel Xeon E5420 Hapertown 2.5 GHz quad core CPU : 2x$350=$700 > > - Dual LGA 771 Intel 5400 Supermicro mb : > > $430 I'd recommend looking at the Intel Twin motherboard systems for this project. Leaves plenty of room for cluster head node, and RAID arrays, a UPS and switches. Supermicro have these motherboards with onboard Infiniband, so no need for extra cards. One thing you have to think about is power density - it is no use cramming 40 1U systems into a rack plus switches and head nodes - it is going to draw far too many amps. Think two times APC PDUs per cabinet at the very maximum. The Intel twins help here again, as they have a high efficiency PSU and the losses are shared between two systems. I'm not sure if we would still have to spread this sort of load between two racks - it depends on the calculations. You also need to put in some budget for power - importantly - air conditioning. > > In principle, we have some experience in building and managing clusters, but > > with 40 node systems it would make sense to get a good cluster integrator to > > do the job. Can people share their recent experiences and recommend reliable > > vendors to deal with? Our standard build would be an APC rack, IPMI in all compute nodes plus two networked APC PDUs. John Hearns From eugen at leitl.org Mon Jul 28 01:51:30 2008 From: eugen at leitl.org (Eugen Leitl) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> Message-ID: <20080728085130.GN10566@leitl.org> On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: > bear in mind that ordinary ethernet both coax and twisted pair is > galvanically isolated. 
This is strange, because I've seen (small) sparks and received (mild) shocks from both, in two different locations. As you say, http://www.apcmedia.com/salestools/FLUU-5T3TLT_R1_EN.pdf claims Ethernet is immune, yet I've read somewhere that Gigabit ethernet is more susceptible than Fast Ethernet. I've got (cheap) UPSen for almost all equipment, maybe they're the problem and not the switching power supplies. In any case I'll have an electrician diagnose the problem. Unfortunately, I anticipate his solution would involve pulling through a new large-crossection ground wire to several locations. It is at this point that lack of wall conduits will become quite painful. -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From apittman at concurrent-thinking.com Mon Jul 28 01:54:03 2008 From: apittman at concurrent-thinking.com (Ashley Pittman) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Infiniband modular switches In-Reply-To: <20080727035420.GA4239@hpegg.niftyegg.com> References: <9FA59C95FFCBB34EA5E42C1A8573784F013425E3@mtiexch01.mti.com> <487B8FEF.4030708@myri.com> <20080725002223.GA14358@bx9.net> <20080727035420.GA4239@hpegg.niftyegg.com> Message-ID: <1217235243.7048.17.camel@bruce.priv.wark.uk.streamline-computing.com> On Sat, 2008-07-26 at 20:54 -0700, Nifty niftyompi Mitch wrote: > On Thu, Jul 24, 2008 at 05:22:24PM -0700, Greg Lindahl wrote: > > On Mon, Jul 14, 2008 at 01:42:07PM -0400, Patrick Geoffray wrote: > > > > > AlltoAll of large messages is not a useless synthetic benchmark IMHO. > > > > AlltoAll is a real thing used by real codes, but do keep in mind that > > there are many algorithms for AlltoAll with various message sizes and > > network topologies, so it's testing both the raw interconnect and the > > AlltoAll implementation. I don't know of the results you mention were > > run with an optimal AlltoAll... do you? > > Is there a single "optimal AlltoAll"? > > I can imagine a handful of ways to build an AlltoAll but I suspect that > various cards, system, transports, switches, topologies ... each will > act differently on different processors and memory systems. Is there > a collection of coded algorithms that can be built into the likes of > OpenMPI? If so a simple site hook to benchmark then pick/linkto one > over another could follow. If only it were that simple. A basic AlltoAll is easy to implement but getting it to work well is difficult and getting a single algorithm which works well across a number of different topologies is extremely difficult. You forget two other variables, the message size and the size of the communicator, both of which can vary within the same job which effectively prevent there being a single optimum "site" algorithm. AlltoAll *is* the hardest MPI function to implement well and in my view it makes a good benchmark not just of the network but also of the MPI stack itself, there is a good chance that if AlltoAll works well on a given machine for a given job size then most other things will as well. Ashley Pittman. From rgb at phy.duke.edu Mon Jul 28 06:15:44 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? 
In-Reply-To: <20080728085130.GN10566@leitl.org> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> <20080728085130.GN10566@leitl.org> Message-ID: On Mon, 28 Jul 2008, Eugen Leitl wrote: > On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: > >> bear in mind that ordinary ethernet both coax and twisted pair is >> galvanically isolated. > > This is strange, because I've seen (small) sparks and received (mild) > shocks from both, in two different locations. Ground loop. Very dangerous. You go first...;-) rgb > > As you say, http://www.apcmedia.com/salestools/FLUU-5T3TLT_R1_EN.pdf > claims Ethernet is immune, yet I've read somewhere that Gigabit ethernet > is more susceptible than Fast Ethernet. I've got (cheap) UPSen for > almost all equipment, maybe they're the problem and not the switching > power supplies. > > In any case I'll have an electrician diagnose the problem. Unfortunately, > I anticipate his solution would involve pulling through a new large-crossection > ground wire to several locations. It is at this point that lack of wall > conduits will become quite painful. > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From Bogdan.Costescu at iwr.uni-heidelberg.de Mon Jul 28 07:17:20 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] [rpciod] in D state In-Reply-To: <20080725162624.GA18108@gretchen.aei.uni-hannover.de> References: <20080725162624.GA18108@gretchen.aei.uni-hannover.de> Message-ID: On Fri, 25 Jul 2008, Henning Fehrmann wrote: > I assume, the squared brackets mean that these are kernel processes. Kernel threads, yes. > From time to time one of them changes in to the 'D' (Uninterruptible > sleep) mode. Once it happens on a particular node it is also > impossible to mount any nfs exports on this node, which seems > logical since the nfs client uses the portmapper. Are you able to kill those kernel threads ? Something like: killall -9 rpciod I seem to remember that they are (re)started as needed, but my recollection of the NFS implementation on Linux is somehow hazy. Probably the best place to ask is the Linux NFS mailing list... -- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From james.p.lux at jpl.nasa.gov Mon Jul 28 07:19:10 2008 From: james.p.lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> <20080728085130.GN10566@leitl.org> Message-ID: <20080728071910.oi86kmmv8kc8sok4@webmail.jpl.nasa.gov> Quoting "Robert G. 
Brown" , on Mon 28 Jul 2008 06:15:44 AM PDT: > On Mon, 28 Jul 2008, Eugen Leitl wrote: > >> On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: >> >>> bear in mind that ordinary ethernet both coax and twisted pair is >>> galvanically isolated. >> >> This is strange, because I've seen (small) sparks and received (mild) >> shocks from both, in two different locations. > > Ground loop. Very dangerous. You go first...;-) > > rgb Very odd.. I'd be looking for an outright short from the cables to something (or, a LOT of capacitive coupling)... After all, the twisted pairs are isolated at BOTH ends.. Now, there is Power over Ethernet these days.. Basically uses each pair of wires as a single conductor (i.e. they feed the juice in at the center tap of the isolation transformer) but, again, that shouldn't be sparking/shocking. > >> >> As you say, http://www.apcmedia.com/salestools/FLUU-5T3TLT_R1_EN.pdf >> claims Ethernet is immune, yet I've read somewhere that Gigabit ethernet >> is more susceptible than Fast Ethernet. I've got (cheap) UPSen for >> almost all equipment, maybe they're the problem and not the switching >> power supplies. >> >> In any case I'll have an electrician diagnose the problem. Unfortunately, >> I anticipate his solution would involve pulling through a new >> large-crossection >> ground wire to several locations. It is at this point that lack of wall >> conduits will become quite painful. > Nope.. shouldn't require a separate grounding conductor, at least not along with your cabling. What you might want to do is see if your electrical safety ground (third pin/green wire ground) at the two ends is at a radically different voltage. You might have a miswired receptacle. You should be able to just drag a single conductor through the house and use a multimeter to measure the voltage between the ground pins, and it should be zero, or pretty darn close.. use the AC setting, and put a small (few K) load resistor across the meter, so you don't get fooled by electrostatic/electromagnetic coupling... which will induce several volts, at least into an open circuit. From henning.fehrmann at aei.mpg.de Mon Jul 28 07:57:09 2008 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] [rpciod] in D state In-Reply-To: References: <20080725162624.GA18108@gretchen.aei.uni-hannover.de> Message-ID: <20080728145709.GA11796@gretchen.aei.uni-hannover.de> On Mon, Jul 28, 2008 at 04:17:20PM +0200, Bogdan Costescu wrote: > On Fri, 25 Jul 2008, Henning Fehrmann wrote: > > >I assume, the squared brackets mean that these are kernel processes. > > Kernel threads, yes. > > >From time to time one of them changes in to the 'D' (Uninterruptible sleep) mode. Once it happens on a particular node it is also impossible to > >mount any nfs exports on this node, which seems logical since the nfs client uses the portmapper. > > Are you able to kill those kernel threads ? Something like: > > killall -9 rpciod No chance. > > I seem to remember that they are (re)started as needed, but my recollection of the NFS implementation on Linux is somehow hazy. Probably the best > place to ask is the Linux NFS mailing list... We found this: http://bugzilla.kernel.org/show_bug.cgi?id=10837 It might appear only in the 2.6.24 kernel. I'll try a later version. Unfortunately, this problem appears with a small probability density. So I have to wait and to observe whether it appears in a newer kernel. 
Cheers Henning From mathog at caltech.edu Mon Jul 28 12:41:23 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? Message-ID: Jim Lux wrote > Quoting "Robert G. Brown" , on Mon 28 Jul 2008 > 06:15:44 AM PDT: > > > On Mon, 28 Jul 2008, Eugen Leitl wrote: > > > >> On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: > >> > >>> bear in mind that ordinary ethernet both coax and twisted pair is > >>> galvanically isolated. > >> > >> This is strange, because I've seen (small) sparks and received (mild) > >> shocks from both, in two different locations. > > > > Ground loop. Very dangerous. You go first...;-) > > > > rgb > > > Very odd.. I'd be looking for an outright short from the cables to > something (or, a LOT of capacitive coupling)... Could this possibly be static electricity discharging? Is the humidity very low where this is being seen, and or, is the operator moving over carpet shortly before the spark is observed? I can't say that I've ever seen sparks leave an ethernet cable even here in Pasadena when the winter humidity is close to zero, but I have had sparks jump off my fingers as they passed near mounting screws on wall plates. In spark season I routinely get blasted by my car's door handle, and there's definitely no ground loop going on there. David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From gerry.creager at tamu.edu Mon Jul 28 13:17:19 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: References: Message-ID: <488E294F.8070009@tamu.edu> Back when the Earth was young, and the crust was still cooling, we ran serial connections between computers, over long distances and sometimes between power distributions. It wasn't uncommon to see ground loops lead to arcing. I don't see it as much now because I'm a little more careful about my grounds, and I bridge such problems with glass rather than copper. The potential is still very real. gerry David Mathog wrote: > Jim Lux wrote > >> Quoting "Robert G. Brown" , on Mon 28 Jul 2008 >> 06:15:44 AM PDT: >> >>> On Mon, 28 Jul 2008, Eugen Leitl wrote: >>> >>>> On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: >>>> >>>>> bear in mind that ordinary ethernet both coax and twisted pair is >>>>> galvanically isolated. >>>> This is strange, because I've seen (small) sparks and received (mild) >>>> shocks from both, in two different locations. >>> Ground loop. Very dangerous. You go first...;-) >>> >>> rgb >> >> Very odd.. I'd be looking for an outright short from the cables to >> something (or, a LOT of capacitive coupling)... > > Could this possibly be static electricity discharging? Is the humidity > very low where this is being seen, and or, is the operator moving over > carpet shortly before the spark is observed? > > I can't say that I've ever seen sparks leave an ethernet cable even here > in Pasadena when the winter humidity is close to zero, but I have had > sparks jump off my fingers as they passed near mounting screws on wall > plates. In spark season I routinely get blasted by my car's door > handle, and there's definitely no ground loop going on there. 
> > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From maurice at harddata.com Sat Jul 26 18:54:02 2008 From: maurice at harddata.com (Maurice Hilarius) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik) In-Reply-To: <200807261957.m6QJv6HE031997@bluewest.scyld.com> References: <200807261957.m6QJv6HE031997@bluewest.scyld.com> Message-ID: <488BD53A.2010907@harddata.com> "Ivan Oleynik" wrote: > Subject: [Beowulf] Building new cluster - estimate > To: beowulf@beowulf.org > .. > I am in process of upgrading computational facilities of my lab and > considering of building/purchasing 40 node cluster. Before contacting > vendors I would like to get some understanding how much would it cost. The > major considerations/requirements: > > 1. newest Intel quad-core CPUs (Opteron quad-core cpus are out of question > due to ridiculous pricing), good balance of price/performance > Hmm, what is "ridiculous about Opteron pricing? In HPC work one generally needs a XEON clocked about 20% to 25% faster than an Opteron for equivalent performance. Mainly due to the limitations of memory bandwidth of the XEONs. Current CPU pricing comparison: BX80574E5410A Intel Quad-Core Xeon E5410 / 2.33 GHz ( 1333 MHz ) - LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $322.00 BX80574E5420A Intel Quad-Core Xeon E5420 / 2.5 GHz ( 1333 MHz ) - LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $393.00 OS2350WAL4BGHWOF AMD Third-Generation Opteron 2350 / 2 GHz - Socket F (1207) - L3 2 MB - PIB/WOF $292.00 OS2352WAL4BGHWOF AMD Third-Generation Opteron 2352 / 2.1 GHz - Socket F (1207) - L3 2 MB - PIB/WOF $366.00 BX80574E5430A Intel Quad-Core Xeon E5430 / 2.66 GHz ( 1333 MHz ) - LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $557.00 BX80574E5440A Intel Quad-Core Xeon E5440 / 2.83 GHz ( 1333 MHz ) - LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $833.00 OS2354WAL4BGHWOF AMD Third-Generation Opteron 2354 / 2.2 GHz - Socket F (1207) - L3 2 MB - PIB/WOF $525.00 OS2356WAL4BGHWOF AMD Third-Generation Opteron 2356 / 2.3 GHz - Socket F (1207) - L3 2 MB - PIB/WOF $796.00 > 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our > computational needs (running LAMMPs molecular dynamics and VASP DFT codes) > Or Myrinet. Bandwidth is usually unimportant, but latency usually is. So called "memfree" IB cards are rarely any kind of bargain for this use. -- With our best regards, //Maurice W. Hilarius Telephone: 01-780-456-9771/ /Hard Data Ltd. FAX: 01-780-456-9772/ /11060 - 166 Avenue email:maurice@harddata.com/ /Edmonton, AB, Canada http://www.harddata.com// / T5X 1Y3/ / -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080726/e5244f2b/attachment.html From iioleynik at gmail.com Sun Jul 27 18:58:38 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:30 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <271mgAaVh3670S10.1217118153@cmsweb10.cms.usa.net> Message-ID: Joshua, Thanks for your response. 
I may be wrong but Barcelona at 2.3GHz is being offered at the same price as > Harpertown at 2.8GHz. > Yes, both opteron 2356 and Xeon E5440 are comparable in pricing (~ $700), but it is 0.5 GHz difference! I am going to run some tests, but our previous experience with 2.0GHz Barcelona was not very encouraging. > I think you can find Barcelona 2.3GHz from Dell for $1800 USD or around > that > with SDR. > Is this price with fully loaded unit (both cpu chips and decent amount of memory present)? I would like to get the link to the source of info, if available. Thanks, Ivan > ------ Original Message ------ > Received: Sat, 26 Jul 2008 01:01:58 PM PDT > From: "Ivan Oleynik" > To: beowulf@beowulf.org > Subject: [Beowulf] Building new cluster - estimate > > > I am in process of upgrading computational facilities of my lab and > > considering of building/purchasing 40 node cluster. Before contacting > > vendors I would like to get some understanding how much would it cost. > The > > major considerations/requirements: > > > > 1. newest Intel quad-core CPUs (Opteron quad-core cpus are out of > question > > due to ridiculous pricing), good balance of price/performance > > 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our > > computational needs (running LAMMPs molecular dynamics and VASP DFT > codes) > > 3. 48U rack (preferably with good thermal management) > > > > I used newegg plus some extra info to get pricing for IB. This is what I > > got: > > > > Single node configuration > > > > ------------------------------------------------------------------------------------------------------------------ > > - 2x Intel Xeon E5420 Hapertown 2.5 GHz quad core CPU : > 2x$350=$700 > > - Dual LGA 771 Intel 5400 Supermicro mb : > > $430 > > - Kingston 8 Gb (4x2Gb) DDR2 FB-DIMM DDR2 667 memory : $360 > > - WD Caviar 750 Gb SATA HD > : > > $110 > > -Melanox InfiniHost Single port 4x IB SDR 10Gb/s PCI-e card : $125 > ( > > http://www.colfaxdirect.com/store/pc/viewPrd.asp?idproduct=12) > > - 1U case including power supply, fans > : > > $150 > > > > ------------------------------------------------------------------------------------------------------ > > > > $1,875/node > > > > Cluster estimate: > > ------------------------ > > 40 nodes: > > > > : 40x$187=$75,000 > > 2x24-port 4X (10Gb/s) 1U SDR IB Flextronics Switches : > > 2x$2400=$4,800 > > > ( > http://www.colfaxdirect.com/store/pc/viewPrd.asp?idcategory=7&idproduct=13 > )< > http://www.colfaxdirect.com/store/pc/viewPrd.asp?idcategory=7&idproduct=13 > > > > IB > > cables > > : 40$65=$,2600 > > 48U rack cabinet, PDUs, power cables, : > > $3000 > > > > ----------------------------------------------------------------------------------------------------------- > > TOTAL: $85,400 > > > > I would like to get some advice from experts from this list, is this > pricing > > realistic? Are there any flaws with configuration? > > > > In principle, we have some experience in building and managing clusters, > but > > with 40 node systems it would make sense to get a good cluster integrator > to > > do the job. Can people share their recent experiences and recommend > reliable > > vendors to deal with? > > > > Many thanks, > > > > Ivan Oleynik > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf@beowulf.org > > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080727/98eeb9a0/attachment.html From iioleynik at gmail.com Sun Jul 27 19:00:16 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: Matt, Thanks for your advice. I suggest that you need a minimum of 16GB/node (2GB/core) and possibly > 32GB/node (4GB/core). > 8 GB/node is enough for the types of applications we are going to run on this cluster. Additional memory sticks can be added later if necessary. > > You will want to set up IPMI on all of the nodes. You want to avoid > touching the hardware or even going into the machine room whenever practical. > Why is IPMI so important? Because this is a small cluster, we can physically inspect the faulty nodes; not a big deal. I checked the pricing: IPMI is an extra $100/node, or $4,000 for 40 nodes = 2 extra compute nodes. We are on a tight budget, and every penny matters. That is why I sent this email to get advice concerning pricing. Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080727/7d1934be/attachment.html From himikehawaii1 at yahoo.com Mon Jul 28 00:45:14 2008 From: himikehawaii1 at yahoo.com (MDG) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080727235138.GA4331@hpegg.niftyegg.com> Message-ID: <385273.34120.qm@web54109.mail.re2.yahoo.com> Seems some people are not reading. 1) Fire code here requires all electrical wires to be in metal conduit, so "pulling" wire is not possible. 2) I never have had, nor ever implied I had, a safety or fire issue; I am just trying to anticipate needs in the (near) future, as several overseas businesses will also be using my home for offsite storage servers. Can we try and get on topic? Thank you to the person who, like me, planned ahead but had the luxury of doing it during construction. There will be a huge market, as the USA lags far behind and we are being leap-frogged by many emerging markets; just look at cell phones. Mike --- On Sun, 7/27/08, Nifty niftyompi Mitch wrote: From: Nifty niftyompi Mitch Subject: Re: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? To: "Eugen Leitl" Cc: "Nifty niftyompi Mitch" , Beowulf@beowulf.org Date: Sunday, July 27, 2008, 1:51 PM On Sun, Jul 27, 2008 at 09:28:07PM +0200, Eugen Leitl wrote: > On Sat, Jul 26, 2008 at 09:15:23PM -0700, Nifty niftyompi Mitch wrote: > > > Did I read this correctly? > > Are you cutting holes in the walls of a home.....? > > Real homes have cable ducts -- I'm limited to drilling through walls. > Unfortunately, due to lousy electric installation I'm stuck with > severe ground loop issues (I caught that too late, or else > I would have put in a heavy ground wire strung along the CAT 6/7). > Galvanic separation would be a god-send -- but for the price > tag, and unavailability of GBit polymer optical. > ...... > > CAT 6 is great, except if you see sparks when plugging in RJ-45s. > Also, most systems tend to shut down spontaneously overnight when > on such a dirty network. > > > Also all wires in the wall MUST be safety rated for such use. > OK, if you have severe ground loop issues you must rethink what you are doing! Ground loops generated by a bad power install can have almost unlimited currents and present a serious fire hazard. Get the problem fixed.
Contact a qualified electrician and find out what the root cause is. AFAIK most network switches do provide MAC level isolation at the physical layer, so I am confused and _concerned_ about your comment about seeing sparks at the RJ-45. Get the problem addressed promptly. -- T o m M i t c h e l l Looking for a place to hang my hat. _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080728/28dbf1f7/attachment.html From niftyompi at niftyegg.com Mon Jul 28 16:56:58 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080728085130.GN10566@leitl.org> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> <20080728085130.GN10566@leitl.org> Message-ID: <20080728235559.GA5071@hpegg.niftyegg.com> On Mon, Jul 28, 2008 at 10:51:30AM +0200, Eugen Leitl wrote: > On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: > > > bear in mind that ordinary ethernet both coax and twisted pair is > > galvanically isolated. > > This is strange, because I've seen (small) sparks and received (mild) > shocks from both, in two different locations. > > As you say, http://www.apcmedia.com/salestools/FLUU-5T3TLT_R1_EN.pdf > claims Ethernet is immune, yet I've read somewhere that Gigabit ethernet > is more susceptible than Fast Ethernet. I've got (cheap) UPSen for > almost all equipment, maybe they're the problem and not the switching > power supplies. > > In any case I'll have an electrician diagnose the problem. Unfortunately, > I anticipate his solution would involve pulling through a new large-crossection > ground wire to several locations. It is at this point that lack of wall > conduits will become quite painful. Tell us more about your UPS setup. Uninterruptible Power Supply boxes can do bad things with power and ground. Multiple UPSen are sometimes hard to install correctly; it is easy to have 'earth' float.... -- T o m M i t c h e l l Looking for a place to hang my hat. From john.hearns at streamline-computing.com Tue Jul 29 00:22:22 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <1217232982.4907.15.camel@Vigor13> Message-ID: <1217316153.5027.6.camel@Vigor13> On Mon, 2008-07-28 at 23:18 -0400, Ivan Oleynik wrote: > > Space is not tight. Computer room is quite spacious but air > conditioning is rudimental, no windows or water lines to dump the > heat. It looks like a big problem, therefore, consider to put the > system somewhere else on campus, although this is not quite > convenient. That's not so good. You're going to have to get the BTU rating of the existing air conditioning, and consider getting more unit(s) installed - if you have an external wall the facilities people can surely drill through it for the.
Give serious consideration to putting expensive and noisy kit like this elsewhere on campus if your facilities people have: a well cooled computer room / lots of spare amps / physical security - ie. tough steel doors / environmental monitoring Networks are fast these days, and with remote power switching you should not need to physically visit the machine that often. > Many thanks, this is very exciting opportunity. I can get 20 1-U units > in 42U rack, As I say, watch for the amount of amps you can provide per rack and the heat density. > a lot of space for thermal management and other infrastructure items. > Do you know any system integrators that can build 40-node cluster from > Supermicro twin units? That depends where you are physically. Have a look at From csamuel at vpac.org Tue Jul 29 04:34:00 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Strange SGE scheduling problem In-Reply-To: <5E0BB54BEC5EBA44B373175A080E64010A7948BF@ophelia.ad.utulsa.edu> Message-ID: <1475623273.322281217331240763.JavaMail.root@zimbra.vpac.org> ----- "Keith Schoenefeld" wrote: > This definitely looked promising, but unfortunately it didn't work. If you're really desperate you can hack the MVAPICH sources and remove the code that sets CPU affinity. Also if SGE supports CPU sets in the same way that Torque does then get it to use those, that will trump the setaffinity(). Alternatively could you switch to OpenMPI instead ? We found it gives far better error messages as an added bonus when we switched to it. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Tue Jul 29 04:40:58 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: Message-ID: <627174098.322311217331658556.JavaMail.root@zimbra.vpac.org> ----- "Ivan Oleynik" wrote: > I am going to run some tests, but our previous > experience with 2.0GHz Barcelona was not very > encouraging. A couple of points that we've found: 1) Use a mainline kernel, we've found benefit of that over stock CentOS kernels. 2) Make sure you don't disable ACPI on Barcelona as the K8 NUMA detection hack doesn't work with K10h and you'll find the kernel is faking NUMA.. But, as ever, it depends on what your code does - if it's not memory heavy and is more integer based then Intel will likely do better. cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From rgb at phy.duke.edu Tue Jul 29 04:46:30 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <488E294F.8070009@tamu.edu> References: <488E294F.8070009@tamu.edu> Message-ID: On Mon, 28 Jul 2008, Gerry Creager wrote: > Back when the Earth was young, and the crust was still cooling, we ran serial > connections between computers, over long distances and sometimes between > power distributions. It wasn't uncommon to see ground loops lead to arcing. > I don't see it as much now because I'm a little more careful about my > grounds, and I bridge such problems with glass rather than copper. 
> > The potential is still very real. The potential is very real, and even if the wires at both ends are "supposed" to not be touching anything even as "neutral" as the case ground, given the number of machines with network interfaces made by small shops in taiwan or the phillipines out of a stock chip but with their own local design team, who can doubt that there are ones where they do? Ground loops are generally murphy's law objects, and since they CAN happen, sooner or later they will. rgb > > gerry > > David Mathog wrote: >> Jim Lux wrote >> >>> Quoting "Robert G. Brown" , on Mon 28 Jul 2008 06:15:44 >>> AM PDT: >>> >>>> On Mon, 28 Jul 2008, Eugen Leitl wrote: >>>> >>>>> On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: >>>>> >>>>>> bear in mind that ordinary ethernet both coax and twisted pair is >>>>>> galvanically isolated. >>>>> This is strange, because I've seen (small) sparks and received (mild) >>>>> shocks from both, in two different locations. >>>> Ground loop. Very dangerous. You go first...;-) >>>> >>>> rgb >>> >>> Very odd.. I'd be looking for an outright short from the cables to >>> something (or, a LOT of capacitive coupling)... >> >> Could this possibly be static electricity discharging? Is the humidity >> very low where this is being seen, and or, is the operator moving over >> carpet shortly before the spark is observed? >> >> I can't say that I've ever seen sparks leave an ethernet cable even here >> in Pasadena when the winter humidity is close to zero, but I have had >> sparks jump off my fingers as they passed near mounting screws on wall >> plates. In spark season I routinely get blasted by my car's door >> handle, and there's definitely no ground loop going on there. >> >> David Mathog >> mathog@caltech.edu >> Manager, Sequence Analysis Facility, Biology Division, Caltech >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From hahn at mcmaster.ca Tue Jul 29 06:22:55 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: >> vendors have at least list prices available on their websites. > > I saw only one vendor siliconmechanics.com that has online integrator. > Others require direct contact of a saleperson. the price of the cluster should be dominated by the price of a node, and many sites offer web-configuration of nodes (HP, for instance). >> "thermal management"? servers need cold air in front and unobstructed >> exhaust. that means open or mesh front/back (and blanking panels). > > Yes, are there other options? Built-in airconditioning unit that will > exhaust the hot air through a pipe to dump the air outside computer room > other than heating it? there are racks with built-in heat-exchangers, yes. I was mainly cautioning against racks that are closed and have exhaust fans on the top, since they contradict the front-to-back airflow of servers. >> wouldn't a 5100-based board allow you to avoid the premium of fbdimms? 
> > May be I am wrong but I saw only FB-DIMMs options and assumed that we need > to wait for Nehalems for DDR3? 5100+ddr2 is perfectly viable. fbdimms, after all, just a wrapper/extender that introduces more latency with the claim of higher capacity (they contain ddr2 or ddr3 inside the memory-buffer interface.) but yes, if possible, I would wait for nehalem. unfortunately, it's hard to guess when nehalem will actually ship for servers - recent rumors claim an early release, but I can't guess whether this is only for uni-socket machines or not. >> - WD Caviar 750 Gb SATA HD >>> : >>> $110 >> >> I usually figure a node should have zero or as many disks as feasible. > > We prefer to avoid intensive IO over network, therefore, use local scratch. exactly, so more than one disk will provide better throughput. remember that one sata disk is about the same speed as 1 Gb interface. >> HP is safe and solid and not cheap. for a small cluster like this, I don't >> think vendor integration is terribly important. >> > Yes, it is important. Optimized cost is what matters. you asked for opinions: mine is that for small clusters, vendor integration is not important because anyone can build a small cluster without much effort. vendor support is more of a necessity for large clusters. From hahn at mcmaster.ca Tue Jul 29 06:28:26 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: > I check the pricing, IPMI is extra $100/node or $4000/40 nodes=2 extra IMO, your compute nodes will wind up $3k; spending a couple percent on managability is just smart. you're the one who asked for advice... From james.p.lux at jpl.nasa.gov Tue Jul 29 06:52:01 2008 From: james.p.lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: References: <488E294F.8070009@tamu.edu> Message-ID: <20080729065201.uol9ulvm044o0k0c@webmail.jpl.nasa.gov> Quoting "Robert G. Brown" , on Tue 29 Jul 2008 04:46:30 AM PDT: > On Mon, 28 Jul 2008, Gerry Creager wrote: > >> Back when the Earth was young, and the crust was still cooling, we >> ran serial connections between computers, over long distances and >> sometimes between power distributions. It wasn't uncommon to see >> ground loops lead to arcing. I don't see it as much now because I'm >> a little more careful about my grounds, and I bridge such problems >> with glass rather than copper. >> >> The potential is still very real. > > The potential is very real, and even if the wires at both ends are > "supposed" to not be touching anything even as "neutral" as the case > ground, given the number of machines with network interfaces made by > small shops in taiwan or the phillipines out of a stock chip but with > their own local design team, who can doubt that there are ones where > they do? Ground loops are generally murphy's law objects, and since > they CAN happen, sooner or later they will. Actually, if they're following the RS-232 spec, it WILL occur, because that spec requires, for instance, that pin 1 of the DB25 connector be connected to the chassis. If that chassis is connected to electrical safety ground (green wire/third prong) (as it should be), and the two ends are in different buildings, fed by different systems, the liklihood of significant potential difference is pretty high. (differentiate between pin 1, chassis ground, and pin 7, signal ground ). 
It's also why the RS232 spec calls for +/- 3V as the "deadband" in between the two levels. But, RS232 was never intended for distances over, say, 10 meters. That's what short haul modems were all about... (basically line drivers/receivers with galvanic isolation) Just this sort of thing is why the very first Ethernet (before it was even called that) called for galvanic isolation in the AUI. This is also why there are grounding rules in the National Electrical Code, especially dealing with connections between buildings/structures. From rgb at phy.duke.edu Tue Jul 29 07:33:01 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080729065201.uol9ulvm044o0k0c@webmail.jpl.nasa.gov> References: <488E294F.8070009@tamu.edu> <20080729065201.uol9ulvm044o0k0c@webmail.jpl.nasa.gov> Message-ID: On Tue, 29 Jul 2008, Jim Lux wrote: > But, RS232 was never intended for distances over, say, 10 meters. That's what > short haul modems were all about... (basically line drivers/receivers with > galvanic isolation) Although it was used, extensively to universally, over distances ten times that. Terminal servers were expensive, wire cheap, terminals were all there was to connect, so one ran UTP bundles all over the place. (I spent way too much time pulling said bundles and sorting out color coded pairs on both ends and crimping on the little poke-pins for an RS-232 shell, either size...). > Just this sort of thing is why the very first Ethernet (before it was even > called that) called for galvanic isolation in the AUI. > > This is also why there are grounding rules in the National Electrical Code, > especially dealing with connections between buildings/structures. Sure. And if the people who installed the network wiring and built the AUIs actually have read the spec, and if the manufacturing process actually didn't actually create a short, and if the wires themselves didn't get partially stripped by pulling them too hard through too small or too full or too sharply bent a grounded conduit, and if neither the computer or switch(es) have been hit by a lightning spike or heated to 90C in an AC fault so that their innards cooks, then the spec probably is enforced and works. Believably 99.9% of the time. Or maybe 99.99% of the time. It doesn't matter. Murphy is merciless, and to paraphrase Adams, one in ten thousand chances are practically a sure thing...;-) The point being that I'd never disbelieve somebody that saw arcing when making an ethernet connection just because it is oxymoronically "almost impossible" (which parses to mean "possible") -- I've seen it myself. Not even over long distances. Sometimes switches themselves are just plain faulty in their wiring, or have been blown by a nearby lightning strike. Things fall apart; the centre cannot hold; Mere anarchy is loosed upon the world, You have to love Yeats; he clearly understood Murphy. rgb > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From gerry.creager at tamu.edu Tue Jul 29 08:04:23 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: <488F3177.1050802@tamu.edu> Mark Hahn wrote: >> I check the pricing, IPMI is extra $100/node or $4000/40 nodes=2 extra > > IMO, your compute nodes will wind up $3k; spending a couple percent on > managability is just smart. you're the one who asked for advice... Buy the IPMI daughter cards. It's money well spent. -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From perry at piermont.com Tue Jul 29 09:33:16 2008 From: perry at piermont.com (Perry E. Metzger) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <20080728085130.GN10566@leitl.org> (Eugen Leitl's message of "Mon\, 28 Jul 2008 10\:51\:30 +0200") References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> <20080728085130.GN10566@leitl.org> Message-ID: <87tze84pmb.fsf@snark.cb.piermont.com> Eugen Leitl writes: > On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: > >> bear in mind that ordinary ethernet both coax and twisted pair is >> galvanically isolated. > > This is strange, because I've seen (small) sparks and received (mild) > shocks from both, in two different locations. Ground loops are a real phenomenon in UTP Ethernet. For example, *NEVER* run UTP between buildings. If the grounds in the two buildings are at a different relative potential, and they often are, very bad things can happen. The building complex I live in ran Cat 5 between buildings in underground ducts. They were very surprised when lightning strikes some distance away regularly blew out the switches. Changing to fiber eliminated the problem, of course. > In any case I'll have an electrician diagnose the problem. If you're seeing sparks, as you say, I suspect you do indeed have an AC supply problem. Ground loop, or something worse. (The Electrical Wiring FAQ describes several problems that qualify as "worse"...) -- Perry E. Metzger perry@piermont.com From Bogdan.Costescu at iwr.uni-heidelberg.de Tue Jul 29 09:37:02 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <627174098.322311217331658556.JavaMail.root@zimbra.vpac.org> References: <627174098.322311217331658556.JavaMail.root@zimbra.vpac.org> Message-ID: On Tue, 29 Jul 2008, Chris Samuel wrote: > 1) Use a mainline kernel, we've found benefit of that > over stock CentOS kernels. Care to comment on this statement ? 
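A related check that is easy to script is whether the running kernel is exporting the real NUMA topology of the node, which ties into the Barcelona/ACPI detection issue mentioned earlier in the thread. A rough sketch in Python that only reads Linux sysfs (it assumes the kernel exposes /sys/devices/system/node/nodeN/cpulist, which older kernels may not):

#!/usr/bin/env python
# Print the NUMA nodes the kernel believes it has, with their CPUs and
# local memory. A dual-socket Opteron box should show one node per socket;
# a single node covering all CPUs and memory suggests the NUMA detection
# fell back to a fake single node.
import glob
import os

def numa_nodes():
    nodes = []
    for path in sorted(glob.glob('/sys/devices/system/node/node[0-9]*')):
        with open(os.path.join(path, 'cpulist')) as f:
            cpus = f.read().strip()
        mem_kb = 0
        with open(os.path.join(path, 'meminfo')) as f:
            for line in f:
                if 'MemTotal' in line:
                    mem_kb = int(line.split()[-2])
        nodes.append((os.path.basename(path), cpus, mem_kb))
    return nodes

if __name__ == '__main__':
    for name, cpus, mem_kb in numa_nodes():
        print('%s: cpus %s, %.1f GB local memory' % (name, cpus, mem_kb / 1048576.0))

Nothing authoritative, but it is a quick way to tell whether a given kernel build is seeing two memory controllers or pretending there is only one.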
-- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From James.P.Lux at jpl.nasa.gov Tue Jul 29 10:10:21 2008 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf]Infrastruture planning for small HPC 40/100 gigabyet eyhernet or Infiniband? In-Reply-To: <87tze84pmb.fsf@snark.cb.piermont.com> References: <20080725110042.GL9875@leitl.org> <859578.1281.qm@web54106.mail.re2.yahoo.com> <20080727041523.GC4239@hpegg.niftyegg.com> <20080727192807.GE10566@leitl.org> <20080727191956.f75pi29o0sscocw8@webmail.jpl.nasa.gov> <20080728085130.GN10566@leitl.org> <87tze84pmb.fsf@snark.cb.piermont.com> Message-ID: <6.2.5.6.2.20080729095438.02c44310@jpl.nasa.gov> At 09:33 AM 7/29/2008, Perry E. Metzger wrote: >Eugen Leitl writes: > > On Sun, Jul 27, 2008 at 07:19:56PM -0700, Jim Lux wrote: > > > >> bear in mind that ordinary ethernet both coax and twisted pair is > >> galvanically isolated. > > > > This is strange, because I've seen (small) sparks and received (mild) > > shocks from both, in two different locations. > >Ground loops are a real phenomenon in UTP Ethernet. For example, >*NEVER* run UTP between buildings. If the grounds in the two buildings >are at a different relative potential, and they often are, very bad >things can happen. And for some interesting reasons... First off, there's an isolation transformer as part of every UTP Ethernet interface, which one might think would solve the problem.. http://www.freescale.com/files/microcontrollers/doc/app_note/AN2759.pdf So here's some specs.. 1500V isolation (ok, so you're probably not going to get outright breakdown).. and this is actually tested during manufacture, usually (HiPot testing).. -40dB differential to common mode isolation.. What's this mean in practical terms? let's just say that there's 120VAC on the pair (that's common mode)... the transformer will isolate that from the other side by 40dB.. a factor of 100 in voltage, so now you're looking at 1.2V as a differential mode signal into the receiver.. oops.. that's more than enough to screw up the connection. And, if the coupling to the two wires isn't the same, then you have a differential mode signal, which is coupled right on through the transformer. (granted, the isolation spec is at 1 MHz, who knows what it might be at 50 or 60 Hz) And, of course, if there is stray capacitance from UTP to the victim circuit, it could actually flow significant current.. 0.01 uF at 60 Hz is about 260Kohms.. you get around a milliamp leakage current, which, granted, won't make a spark, but imposed across a 10K input impedance for a receiver amplifier will certainly cause troubles. ESD is always an issue: http://ieeexplore.ieee.org/iel5/10111/32405/01513539.pdf?arnumber=1513539 is an interesting paper looking at coupling between cables and such And here's a ap note from Intel about transformerless interfaces ftp://download.intel.com/design/network/applnots/ap438.pdf It notes "We have developed a simple solution that can be used in a wide variety of such applications with the intent of simplifying the design cycle and reducing development time enabling products to enter the market in a shorter time frame than otherwise might be possible " On the other hand, they also warn:Magnetic-less LAN designs should not be done when the LAN signals must be routed through ables that are external to the system chassis. 
The isolation transformer in a magnetics module, provides some level of improved safety in the event that higher voltages or ESD gets onto the LAN cable. Magnetic-less LAN designs should only be done when the differential circuits or cables will be routed internal to the same chassis. So, "within the rack" in a cluster might be able to use these techniques. In any event, the design shown in the ap note basically uses 0.056 uF in series with each wire of the pair, with a 0.1 uF to "chassis"... That 0.1 uF is only 26K at 60Hz, so if you (foolishly) used one of these designs to connect between two buildings where the chassis are at, say, 50V differential, that's a BIG problem. >The building complex I live in ran Cat 5 between buildings in >underground ducts. They were very surprised when lightning strikes >some distance away regularly blew out the switches. Changing to fiber >eliminated the problem, of course. > > > In any case I'll have an electrician diagnose the problem. > >If you're seeing sparks, as you say, I suspect you do indeed have an >AC supply problem. Ground loop, or something worse. (The Electrical >Wiring FAQ describes several problems that qualify as "worse"...) > >-- >Perry E. Metzger perry@piermont.com From bill at cse.ucdavis.edu Tue Jul 29 10:14:07 2008 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <627174098.322311217331658556.JavaMail.root@zimbra.vpac.org> Message-ID: <488F4FDF.3050300@cse.ucdavis.edu> Bogdan Costescu wrote: > On Tue, 29 Jul 2008, Chris Samuel wrote: > >> 1) Use a mainline kernel, we've found benefit of that >> over stock CentOS kernels. > > Care to comment on this statement ? > 2.6.18 (RHEL-5.2) is currently almost 2 years old. One improvement since then that I use heavily is ECC scrubbing, I don't like to have RAID arrays without it, silent errors can accumulate otherwise. It's also created a ugly nest of backports inside and outside of redhat. So things like sky2 gigE adapters are ugly to support (and don't have a driver disk), and are especially hard to fix when you have to modify the installer (CD or PXE) to work. I've seen similar with intel e1000s (which are always changing), infinipath, areca cards, etc. There have also been tweaks for NUMA, quad core, and related. I'm guessing that's why, er, one of the largest new clusters went with Fedora (TAC?). In general I'd say that the new kernels do much better on modern hardware than the ugly situation of downloading a random RPM, or waiting for official support. Seems like quite a few companies (ati, 3ware, areca, intel, amd, and many others I'm sure) are trying hard to improve the mainline kernel drivers. I understand why RHEL doesn't change the kernel (stability, testing, etc.), but not sure it's the best fit for HPC type applications, especially with the pace of hardware changes these days. From iioleynik at gmail.com Mon Jul 28 19:58:19 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <271mgAaVh3670S10.1217118153@cmsweb10.cms.usa.net> <488E4B9D.1060609@cse.ucdavis.edu> Message-ID: Bill, Thank you for your comments. >> Yes, both opteron 2356 and Xeon E5440 are comparable in pricing (~ $700), >> but it is 0.5 GHz difference! >> > > Er, so, aren't you more concerned with performance than clockspeeds? I've > seen little if any correlation. 
> Yes, I care about performance, but our previous experience with running our mpi codes on TACC computers (Ranger, Barcelona 2.0 GHz) and Lonestar (Xeon 5100 2.66GHz) is not in favor of AMD. They have recently upgraded Ranger to 2.3 GHz, I am going to run tests and report the results. > > Keep in mind that Intel hasn't changed their memory system much in quite a > few > generations. > But FSB was bumped up substantially (from 400 MHz in 2002 to 1600 MHz now). > I am going to run some tests, but our previous experience with 2.0GHz >> Barcelona was not very encouraging. >> > > What tests did you run? 8 threads on each? Basically if you are cache > friendly the intel systems win, if not the opteron systems win. Over twice > the memory bandwidth can be quite handy. AMD still wins most 8 thread > benchmarks (spec CPU and spec web among others) and many application tests > that I've done. > These are the MPI jobs, 8 processes per 8 cores. I am not sure about the setup, but I thought it is worth to disable multithreading. Correct me if I am wrong. Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080728/0bcec9a4/attachment.html From iioleynik at gmail.com Mon Jul 28 20:02:49 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Re: cluster quote In-Reply-To: <200807282229.m6SMTqmv011570@asacomputers.com> References: <200807282229.m6SMTqmv011570@asacomputers.com> Message-ID: Sean, Yes, I am interested in 40 node, dual socked, Intel and AMD, specs were posted in my original posting. I am exploring the twin 1-U units (2 MBs with 4 sockets )from Supermicro with on-board IB cards. The config is basic, the cost what matters. Thanks, Ivan On Mon, Jul 28, 2008 at 6:30 PM, Sean wrote: > Ivan, > I just saw your posts in the Beowuld website. Is it possible that we can > quote for this opportunity? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080728/cd9496de/attachment.html From iioleynik at gmail.com Mon Jul 28 20:18:51 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <1217232982.4907.15.camel@Vigor13> References: <1217232982.4907.15.camel@Vigor13> Message-ID: John, Thanks for your comments. > > > > > > 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our > > > computational needs (running LAMMPs molecular dynamics and VASP DFT > codes) > > > 3. 48U rack (preferably with good thermal management) > > > > "thermal management"? servers need cold air in front and unobstructed > > exhaust. that means open or mesh front/back (and blanking panels). > > > Agreed. However depending on the location if space is tight you could > think of an APC rack with the heavy fan exhaust door on th rear, and > vent the hot air. > Space is not tight. Computer room is quite spacious but air conditioning is rudimental, no windows or water lines to dump the heat. It looks like a big problem, therefore, consider to put the system somewhere else on campus, although this is not quite convenient. > > > > - 2x Intel Xeon E5420 Hapertown 2.5 GHz quad core CPU : > 2x$350=$700 > > > - Dual LGA 771 Intel 5400 Supermicro mb > : > > > $430 > > I'd recommend looking at the Intel Twin motherboard systems for this > project. Leaves plenty of room for cluster head node, and RAID arrays, a > UPS and switches. 
> Supermicro have these motherboards with onboard Infiniband, so no need > for extra cards. > > One thing you have to think about is power density - it is no use > cramming 40 1U systems into a rack plus switches and head nodes - it is > going to draw far too many amps. Think two times APC PDUs per cabinet at > the very maximum. The Intel twins help here again, as they have a high > efficiency PSU and the losses are shared between two systems. I'm not > sure if we would still have to spread this sort of load between two > racks - it depends on the calculations. > > You also need to put in some budget for power - importantly - air > conditioning. > > Many thanks, this is very exciting opportunity. I can get 20 1-U units in 42U rack, a lot of space for thermal management and other infrastructure items. Do you know any system integrators that can build 40-node cluster from Supermicro twin units? Are there similar solutions for AMD cpus? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080728/928ef375/attachment.html From iioleynik at gmail.com Mon Jul 28 20:33:16 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: > > vendors have at least list prices available on their websites. > I saw only one vendor siliconmechanics.com that has online integrator. Others require direct contact of a saleperson. > "thermal management"? servers need cold air in front and unobstructed > exhaust. that means open or mesh front/back (and blanking panels). > Yes, are there other options? Built-in airconditioning unit that will exhaust the hot air through a pipe to dump the air outside computer room other than heating it? > wouldn't a 5100-based board allow you to avoid the premium of fbdimms? > May be I am wrong but I saw only FB-DIMMs options and assumed that we need to wait for Nehalems for DDR3? > > > - WD Caviar 750 Gb SATA HD >> : >> $110 >> > > I usually figure a node should have zero or as many disks as feasible. > We prefer to avoid intensive IO over network, therefore, use local scratch. > HP is safe and solid and not cheap. for a small cluster like this, I don't > think vendor integration is terribly important. > > Yes, it is important. Optimized cost is what matters. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080728/96f6f1e4/attachment.html From landman at scalableinformatics.com Tue Jul 29 13:11:42 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: <488F797E.4020807@scalableinformatics.com> Ivan Oleynik wrote: > vendors have at least list prices available on their websites. > > > I saw only one vendor siliconmechanics.com > that has online integrator. Others require direct contact of a saleperson. This isn't usually a problem if you have good spec's that they can work with for you. > > "thermal management"? servers need cold air in front and unobstructed > exhaust. that means open or mesh front/back (and blanking panels). > > > Yes, are there other options? Built-in airconditioning unit that will > exhaust the hot air through a pipe to dump the air outside computer > room other than heating it? You will pay (significantly) more per rack to have this. 
You seemed to indicate that bells and whistles are not wanted (e.g. "cost is king"). The hallmarks of good design for management of power/heat/performance/systems *all* will add (fairly non-trivial) premiums over your pricing. IPMI will make your life easier on management, though there is a cross-over where serial consoles/addressable and switchable PDUs make more sense. Of course grad students are "free", though the latency to get one into a server room at 2am may be higher than that of the IPMI and other solutions. > > > > wouldn't a 5100-based board allow you to avoid the premium of fbdimms? > > > May be I am wrong but I saw only FB-DIMMs options and assumed that we > need to wait for Nehalems for DDR3? Some vendors here can deliver the San Clemente based boards in compute nodes (DDR2). DDR3 can be delivered on non-Xeon platforms, though you lose other things by going that route. > > > > > - WD Caviar 750 Gb SATA HD > : > $110 > > > I usually figure a node should have zero or as many disks as feasible. > > > > We prefer to avoid intensive IO over network, therefore, use local scratch. We are measuring about 460 MB/s with NFS over RDMA from a node to our JackRabbit unit. SDR all the way around, with a PCIx board in the client. Measuring ~800 MB/s on OSU benchmarks, and 750 MB/s on RDMA bw tests in OFED 1.3.1. If you are doing IB to the nodes this should work nicely. Also, 10 GbE would work as well, though NFS over RDMA is more limited here. > > > HP is safe and solid and not cheap. for a small cluster like this, > I don't think vendor integration is terribly important. > > > Yes, it is important. Optimized cost is what matters. If cost is king, then you don't want IPMI, switchable PDUs, serial consoles/kvm over IP, fast storage units, ... Listening to the words of wisdom coming from the folks on this list, suggest that revising this plan, to incorporate at least some elements that make your life easier, is definitely in your interest. We agree with those voices. We are often asked to help solve our customers problems, remotely. Having the ability to take complete control (power, console, ...) of a node via a connection enables us to provide our customer with better support. Especially when they are a long car/plane ride away. I might suggest polling the people who build them for their research offline and ask them what things they have done, or wish they have done. You can always buy all the parts from Newegg and build it yourself if you wish. Newegg won't likely help you with subtle booting/OS load/bios versioning problems. Or help you identify performance bottlenecks under load. If this is important to you, ask yourself (and the folks on the list) what knowledgeable support and good design is worth. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From rgb at phy.duke.edu Tue Jul 29 14:08:31 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <271mgAaVh3670S10.1217118153@cmsweb10.cms.usa.net> <488E4B9D.1060609@cse.ucdavis.edu> Message-ID: On Mon, 28 Jul 2008, Ivan Oleynik wrote: > Bill, > > Thank you for your comments. > > >>> Yes, both opteron 2356 and Xeon E5440 are comparable in pricing (~ $700), >>> but it is 0.5 GHz difference! 
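To make that concrete: once the BMCs sit on a management network, remote control of 40 nodes is little more than a loop around ipmitool. A rough sketch, with made-up hostnames and credentials (do not take these as anyone's real naming scheme):

#!/usr/bin/env python
# Query or change chassis power on a set of compute nodes via their BMCs.
# Hostnames, user and password below are placeholders.
import subprocess
import sys

NODES = ['node%02d-bmc' % i for i in range(1, 41)]   # hypothetical BMC names
USER, PASSWORD = 'admin', 'changeme'                 # placeholder credentials

def ipmi(host, *args):
    cmd = ['ipmitool', '-I', 'lanplus', '-H', host,
           '-U', USER, '-P', PASSWORD] + list(args)
    return subprocess.call(cmd)

if __name__ == '__main__':
    # 'status', 'on', 'off' and 'cycle' are the usual chassis power verbs
    action = sys.argv[1] if len(sys.argv) > 1 else 'status'
    for node in NODES:
        ipmi(node, 'chassis', 'power', action)

The same lanplus session also gets you serial-over-LAN consoles, which is what turns the 2am drive into a 2am ssh session.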
>>> >> >> Er, so, aren't you more concerned with performance than clockspeeds? I've >> seen little if any correlation. >> > Yes, I care about performance, but our previous experience with running our > mpi codes on TACC computers (Ranger, Barcelona 2.0 GHz) and Lonestar (Xeon > 5100 2.66GHz) is not in favor of AMD. They have recently upgraded Ranger to > 2.3 GHz, I am going to run tests and report the results. > In both processor families there is a strong (nearly linear) correlation with clock for obvious reasons, but they don't have the same base or slope. In my experience, AMDs outperform Intel at equivalent clock, by ROUGHLY a scale factor that ranges from 1.5 to maybe 1.1 or 1.2 (depending on just what family is being compared to what family). But this experience is far from exhaustive, and because it is so variable it is very difficult, as you say, to make definitive statements based on clock alone. In fact, the only sensible thing to do is indeed run tests and compare on a family-by-family, program-by-program basis since outside of simple stream-like stuff there are differences per application and program pattern -- integer and float don't even necessarily scale the same way. rgb > > >> >> Keep in mind that Intel hasn't changed their memory system much in quite a >> few >> generations. >> > > But FSB was bumped up substantially (from 400 MHz in 2002 to 1600 MHz now). > > > >> I am going to run some tests, but our previous experience with 2.0GHz >>> Barcelona was not very encouraging. >>> >> >> What tests did you run? 8 threads on each? Basically if you are cache >> friendly the intel systems win, if not the opteron systems win. Over twice >> the memory bandwidth can be quite handy. AMD still wins most 8 thread >> benchmarks (spec CPU and spec web among others) and many application tests >> that I've done. >> > > These are the MPI jobs, 8 processes per 8 cores. I am not sure about the > setup, but I thought it is worth to disable multithreading. Correct me if I > am wrong. > > > Ivan > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From hahn at mcmaster.ca Tue Jul 29 15:28:35 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <1217232982.4907.15.camel@Vigor13> Message-ID: > Space is not tight. Computer room is quite spacious but air conditioning is > rudimental, no windows or water lines to dump the heat. It looks like a big if space is not a big deal, why are you even thinking about rack-mount? >> I'd recommend looking at the Intel Twin motherboard systems for this nothing special about Intel twins afaik - AMD twins are comparable. but it seems even sillier to go with twin systems since you say space is not tight, and you'd be just creating a hotter hot-spot to cool. (ultimately, of course, 40 nodes dissipate about the same amount regardless of formfactor, and you need to find out whether your room can extract around 40*400W, about 5 tons of cooling. and supply that 16 kw, of course. I haven't measured many dual-quads yet - the dissipation might be closer to 300W.) >> the very maximum. The Intel twins help here again, as they have a high >> efficiency PSU and the losses are shared between two systems. 
I'm not afaik, their efficiency is maybe 10% better than more routine hardware. doesn't really change the big picture. and high-eff PSU's are available in pretty much any form-factor. choosing lower-power processors (and perhaps avoiding fbdimms) will probably save more power than perseverating too much on the PSU... From niftyompi at niftyegg.com Tue Jul 29 16:39:45 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <271mgAaVh3670S10.1217118153@cmsweb10.cms.usa.net> Message-ID: <20080729233945.GA4514@hpegg.niftyegg.com> On Sun, Jul 27, 2008 at 09:58:38PM -0400, Ivan Oleynik wrote: > Sender: beowulf-bounces@beowulf.org > > Joshua, > Thanks for your response. > > I may be wrong but Barcelona at 2.3GHz is being offered at the same > price as > Harpertown at 2.8GHz. > > Yes, both opteron 2356 and Xeon E5440 are comparable in pricing (~ > $700), but it is 0.5 GHz difference! > I am going to run some tests, but our previous experience with 2.0GHz > Barcelona was not very encouraging. Run some memory bandwidth and cache interaction tests. A 0.5Ghz clock advantage with a slow memory interface may be a net loss. It no longer is 'sane' to count clocks per instruction or chart pipeline interactions by hand on paper so yes test with as many codes you you have handy (and as many compilers). -- T o m M i t c h e l l Looking for a place to hang my hat. From niftyompi at niftyegg.com Tue Jul 29 16:51:21 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <1217232982.4907.15.camel@Vigor13> References: <1217232982.4907.15.camel@Vigor13> Message-ID: <20080729235121.GB4514@hpegg.niftyegg.com> On Mon, Jul 28, 2008 at 09:16:11AM +0100, John Hearns wrote: > On Mon, 2008-07-28 at 01:52 -0400, Mark Hahn wrote: > > > > > > 2. reasonably fast interconnect (IB SDR 10Gb/s would suffice our > > > computational needs (running LAMMPs molecular dynamics and VASP DFT codes) > > > 3. 48U rack (preferably with good thermal management) > > > > "thermal management"? servers need cold air in front and unobstructed > > exhaust. that means open or mesh front/back (and blanking panels). > > > Agreed. However depending on the location if space is tight you could > think of an APC rack with the heavy fan exhaust door on th rear, and > vent the hot air. Vent the hot air? Heat management needs attention... if we are just discussing the rack airflow we are ok but it it involves the air flow of the building some planning is order. Venting hot air and pulling in hot Las Vegas daytime air, cold nighttime Las Vegas air, cold Minnesota winter or hot Minnesota summer air can be a problem. Plan on flexability and plan. -- T o m M i t c h e l l Looking for a place to hang my hat. From iioleynik at gmail.com Tue Jul 29 19:47:58 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <1217316153.5027.6.camel@Vigor13> References: <1217232982.4907.15.camel@Vigor13> <1217316153.5027.6.camel@Vigor13> Message-ID: John, > That's not so good. Youre going to have to get the BTU rating of the > existing air conditioning, and consider getting more unit(s) installed - > if you have an external wall the facilities people can surely drill > through it for the. 
> > Give serious consideration to putting expensive and noisy kit like this > elsewhere on campus if your facilities people have: > a well cooled computer room / lots of spare amps / physical security - > ie. tough steel doors / environmental monitoring > Networks are fast these days, and with remote power switching you should > not need to physically visit the machine that often. > > Yes, I decided to put my new cluster in dedicated server room on campus. Although it is inconvenient, rewamping air conditioning is the pain in the neck. > Do you know any system integrators that can build 40-node cluster from > > Supermicro twin units? > That depends where you are physically. Have a look at > I did not get the end of your message. I am physically in the US (USF, Tampa, FL). Thanks, Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/b727e9ad/attachment.html From iioleynik at gmail.com Tue Jul 29 19:51:51 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: > > > > wouldn't a 5100-based board allow you to avoid the premium of fbdimms? >>> >> >> May be I am wrong but I saw only FB-DIMMs options and assumed that we need >> to wait for Nehalems for DDR3? >> > > 5100+ddr2 is perfectly viable. fbdimms, after all, just a wrapper/extender > that introduces more latency with the claim of higher capacity (they > contain > ddr2 or ddr3 inside the memory-buffer interface.) > Some info re specific motherboards with Intel 5400 chip set that support DDR2 would be very welcome. > > but yes, if possible, I would wait for nehalem. unfortunately, it's hard > to guess when nehalem will actually ship for servers - recent rumors claim > an early release, but I can't guess whether this is only for uni-socket > machines or not. Can not wait - have to spend the money till the end of September. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/98b0f93e/attachment.html From iioleynik at gmail.com Tue Jul 29 19:56:02 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: I check the pricing, IPMI is extra $100/node or $4000/40 nodes=2 extra >> > > IMO, your compute nodes will wind up $3k; spending a couple percent on > managability is just smart. you're the one who asked for advice... > Agree, it looks like there is a concensus regarding IPMI, I will follow this practical advice. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/a292221d/attachment.html From iioleynik at gmail.com Tue Jul 29 20:16:02 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <488F797E.4020807@scalableinformatics.com> References: <488F797E.4020807@scalableinformatics.com> Message-ID: > > vendors have at least list prices available on their websites. >> >> >> I saw only one vendor siliconmechanics.com >> that has online integrator. Others require direct contact of a saleperson. >> > > This isn't usually a problem if you have good spec's that they can work > with for you. > Yes, I do have good spec's, see my original posting, although might consider AMD as well. 
Joe, can you provide a quote? > You will pay (significantly) more per rack to have this. You seemed to > indicate that bells and whistles are not wanted (e.g. "cost is king"). > Air conditioning problem has been solved, will put my new cluster in a proper place with enough power and BTUs to dissipate. The hallmarks of good design for management of > power/heat/performance/systems *all* will add (fairly non-trivial) premiums > over your pricing. IPMI will make your life easier on management, though > there is a cross-over where serial consoles/addressable and switchable PDUs > make more sense. Of course grad students are "free", though the latency to > get one into a server room at 2am may be higher than that of the IPMI and > other solutions. > Yes, will consider IPMI as people advise. > > > Some vendors here can deliver the San Clemente based boards in compute > nodes (DDR2). DDR3 can be delivered on non-Xeon platforms, though you lose > other things by going that route. > Would 5100 chipset work with 5400 Harpertown xeons? > If cost is king, then you don't want IPMI, switchable PDUs, serial > consoles/kvm over IP, fast storage units, ... > Yes, except of IPMI as people advised. > > Listening to the words of wisdom coming from the folks on this list, > suggest that revising this plan, to incorporate at least some elements that > make your life easier, is definitely in your interest. Yes, this is what I am doing after getting this excellent feedback from all of you. > > We agree with those voices. We are often asked to help solve our customers > problems, remotely. Having the ability to take complete control (power, > console, ...) of a node via a connection enables us to provide our customer > with better support. Especially when they are a long car/plane ride away. > We usually manage the clusters ourselves because don't have a resources in academia for expensive support contracts beyond standard 3 year hardware warranty. > > I might suggest polling the people who build them for their research > offline and ask them what things they have done, or wish they have done. > You can always buy all the parts from Newegg and build it yourself if you > wish. Newegg won't likely help you with subtle booting/OS load/bios > versioning problems. Or help you identify performance bottlenecks under > load. If this is important to you, ask yourself (and the folks on the list) > what knowledgeable support and good design is worth. Yes, agreed. Would like to get this feedback from people in academia, what things they wish have done if looking in the past. Thanks, Joe Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/c0601f74/attachment.html From iioleynik at gmail.com Tue Jul 29 20:26:02 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <1217232982.4907.15.camel@Vigor13> Message-ID: Mark, if space is not a big deal, why are you even thinking about rack-mount? > 40 nodes is too much. Even if room is spacious, we do not want to mess up with boxes as we did in the past. > > nothing special about Intel twins afaik - AMD twins are comparable. > but it seems even sillier to go with twin systems since you say space is > not tight, and you'd be just creating a hotter hot-spot to cool. 
> (ultimately, of course, 40 nodes dissipate about the same amount regardless > of formfactor, and you need to find out whether your room > can extract around 40*400W, about 5 tons of cooling. and supply that 16 > kw, of course. I haven't measured many dual-quads yet - the dissipation > might be closer to 300W.) > afaik, their efficiency is maybe 10% better than more routine hardware. > doesn't really change the big picture. and high-eff PSU's are available > in pretty much any form-factor. choosing lower-power processors (and > perhaps > avoiding fbdimms) will probably save more power than perseverating too much > on the PSU... > I checked specs for Supermicro SuperServer 6015TW-INF, it looks very attractive - built-in IB interface. I can see the only objection if two MBs in one 1-U create an additional heat stress inside the unit. If it is not the case, then everything else is irrelevant, because I took care of the good air conditioning and power supply for my new cluster. It also looks like the twins give some money saving. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/a4e26873/attachment.html From iioleynik at gmail.com Tue Jul 29 20:42:16 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <488E957B.9070905@cse.ucdavis.edu> References: <271mgAaVh3670S10.1217118153@cmsweb10.cms.usa.net> <488E4B9D.1060609@cse.ucdavis.edu> <488E957B.9070905@cse.ucdavis.edu> Message-ID: > > Yes, I care about performance, but our previous experience with running our >> mpi codes on TACC computers (Ranger, Barcelona 2.0 GHz) and Lonestar (Xeon >> 5100 2.66GHz) is not in favor of AMD. They have recently upgraded Ranger >> to >> 2.3 GHz, I am going to run tests and report the results. >> > > I believe you, it's not a particularly useful data point though. What > compilers on each (intel's compiler favors intel for instance). 8 threads > per core? I believe the 2.0 GHz ranger CPUs were hamstrung by the patch to > handle TLB in software. Were these microbenchmarks? Applications with > smaller (grid or timesteps) data? Applications will fullsize data? > Yes, intel compilers on both Intel and AMD. Again, I am not computer science expert, therefore, do standard things: compile the mpi code and run using mpirun which I assume should load 1 thread per core? My testing was done using our codes applied to problems with dimensions far beyond of what we usually run to get as much stress to the system as possible. But again, I will rerun the tests during this weekend and will report results. > > In any case the last 2 clusters I bought (I've got 10 or so running > currently) was single socket intel 45nm/12MB cache processors, and a dual > socket barcelona. Alas, one just shipped and one is scheduled to ship soon, > neither are onsite. > Any suggestion for a good vendor? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/1b0c3d49/attachment.html From iioleynik at gmail.com Tue Jul 29 21:00:12 2008 From: iioleynik at gmail.com (Ivan Oleynik) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik) In-Reply-To: <488EC70F.8090906@harddata.com> References: <200807261957.m6QJv6HE031997@bluewest.scyld.com> <488BD53A.2010907@harddata.com> <488EC70F.8090906@harddata.com> Message-ID: Maurice, Valuable information, thanks very much! > > > If you wanted it stacked that way (performance order), then this is more > closely aligned (ascending): > > BX80574E5410A Intel Quad-Core Xeon E5410 / 2.33 GHz ( 1333 MHz ) - > LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $322.00 > OS2350WAL4BGHWOF AMD Third-Generation Opteron 2350 / 2 GHz - Socket F > (1207) - L3 2 MB - PIB/WOF $292.00 > > OS2352WAL4BGHWOF AMD Third-Generation Opteron 2352 / 2.1 GHz - Socket F > (1207) - L3 2 MB - PIB/WOF $366.00 > BX80574E5420A Intel Quad-Core Xeon E5420 / 2.5 GHz ( 1333 MHz ) - LGA771 > Socket - L2 12 MB ( 2 x 6MB ) - Box $393.00 > > OS2354WAL4BGHWOF AMD Third-Generation Opteron 2354 / 2.2 GHz - Socket F > (1207) - L3 2 MB - PIB/WOF $525.00 > BX80574E5430A Intel Quad-Core Xeon E5430 / 2.66 GHz ( 1333 MHz ) - > LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $557.00 > > BX80574E5440A Intel Quad-Core Xeon E5440 / 2.83 GHz ( 1333 MHz ) - > LGA771 Socket - L2 12 MB ( 2 x 6MB ) - Box $833.00 > OS2356WAL4BGHWOF AMD Third-Generation Opteron 2356 / 2.3 GHz - Socket F > (1207) - L3 2 MB - PIB/WOF $796.00 > > Notice how the performance ramp is pretty close to identical to the price > ramp? > That is not accidental. > This is what I need to test using my codes. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080730/93094f73/attachment.html From hahn at mcmaster.ca Tue Jul 29 21:31:54 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: Message-ID: >> 5100+ddr2 is perfectly viable. fbdimms, after all, just a wrapper/extender >> that introduces more latency with the claim of higher capacity (they >> contain >> ddr2 or ddr3 inside the memory-buffer interface.) > > > Some info re specific motherboards with Intel 5400 chip set that support > DDR2 would be very welcome. why are you fixed on the 5400? it is, afaikt, fbdimm-specific, though the 5100 supports ddr2. From smulcahy at aplpi.com Tue Jul 29 23:13:32 2008 From: smulcahy at aplpi.com (stephen mulcahy) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <488F4FDF.3050300@cse.ucdavis.edu> References: <627174098.322311217331658556.JavaMail.root@zimbra.vpac.org> <488F4FDF.3050300@cse.ucdavis.edu> Message-ID: <4890068C.3000905@aplpi.com> Bill Broadley wrote: > In general I'd say that the new kernels do much better on modern > hardware than the ugly situation of downloading a random RPM, or waiting > for official support. Seems like quite a few companies (ati, 3ware, > areca, intel, amd, and many others I'm sure) are trying hard to improve > the mainline kernel drivers. > > I understand why RHEL doesn't change the kernel (stability, testing, > etc.), but not sure it's the best fit for HPC type applications, > especially with the pace of hardware changes these days. 
Hi Bill, My take on recent (2.6.x) mainline kernels was that there isn't as clear a distinction between production quality and developer quality kernels these days as there used to be in the previous even/odd production/developer kernels. From scanning the kernel releases, it looks like you'd want to stay a minor revision or two behind the bleeding edge if you want some stability. Has this been your experience or do you have extensive test facilities before rolling out mainline kernels onto production systems? Thanks, -stephen -- Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center, GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway) From bill at cse.ucdavis.edu Tue Jul 29 23:42:19 2008 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <4890068C.3000905@aplpi.com> References: <627174098.322311217331658556.JavaMail.root@zimbra.vpac.org> <488F4FDF.3050300@cse.ucdavis.edu> <4890068C.3000905@aplpi.com> Message-ID: <48900D4B.6060605@cse.ucdavis.edu> stephen mulcahy wrote: > > > Bill Broadley wrote: >> In general I'd say that the new kernels do much better on modern >> hardware than the ugly situation of downloading a random RPM, or >> waiting for official support. Seems like quite a few companies (ati, >> 3ware, areca, intel, amd, and many others I'm sure) are trying hard to >> improve the mainline kernel drivers. >> >> I understand why RHEL doesn't change the kernel (stability, testing, >> etc.), but not sure it's the best fit for HPC type applications, >> especially with the pace of hardware changes these days. > > Hi Bill, > > My take on recent (2.6.x) mainline kernels was that there isn't as clear > a distinction between production quality and developer quality kernels Yup, pretty much all the mainline kernel.org releases receive a fair bit of testing and percentage wise change very little, occasionally there is an exception like what happened in, er, I think it was 2.6.10 when they changed either the MMU or scheduler. > these days as there used to be in the previous even/odd > production/developer kernels. From scanning the kernel releases, it > looks like you'd want to stay a minor revision or two behind the > bleeding edge if you want some stability. Sure, although I'm not sure you mean 2.6.24 when 2.6.26 is out, or 2.4.26.1 when 2.6.26.3 is out. Seems pretty rare that any mainline kernel is outright unstable. Even when it is it's usually just a particular problem that effects a relatively small fraction of users.... something I'd hope would be exposed by relatively simple testing. With HPC type use if a kernel dies in product I'll revert, sure I like to run reliable clusters, but I'm usually abandoning the centos kernel because of a major win like a more reliable RAID. But sure, I'd recommend joining the kernel list if you run a kernel.org kernel to see if people start screaming bloody murder. I'd strongly recommend a mail reader that supports threads, it's basically impossible to read all of it. > Has this been your experience or do you have extensive test facilities > before rolling out mainline kernels onto production systems? Extensive test facilities... no definitely not. Enough to see that the centos kernels are completely broken on my hardware... often. 
Raid corruptions, dropped disks, horrible network performance, unsupported cards, poor memory performance, assuming wrong defaults for a CPU, missing PCI ids, disabled driver because someone somewhere on the planet made a broken motherboard, numa issues, CPU frequency issues, cpu temperature sensor issues, etc. But in the 10 clusters I run I usually make decisions for the file servers vs compute nodes differently and have a workload that i use to decide if it's good enough to try in small production runs. Not particularly comprehensive, but definitely tests the stuff I use heavily. After all I'm using something like less than 1% of the kernel, very few drivers and my hardware is identical (at least within a cluster). From john.hearns at streamline-computing.com Wed Jul 30 01:20:58 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: <488F797E.4020807@scalableinformatics.com> References: <488F797E.4020807@scalableinformatics.com> Message-ID: <1217406068.5212.3.camel@Vigor13> On Tue, 2008-07-29 at 16:11 -0400, Joe Landman wrote: > Ivan Oleynik wrote: > > vendors have at least list prices available on their websites. > > > > > > I saw only one vendor siliconmechanics.com > > that has online integrator. Others require direct contact of a saleperson. > Ivan, that is what good salespeople are FOR. You use them to gain knowledge of the latest and greatest chipsets/CPUs/doodahs. But you also use them to tap into the knowledge of their pre-sales engineers and delivery/integration manager. For instance, have you thought about a site survey? Any company worth its salt will come along and measure up height of doors, the delivery path from the loading dock - are there stairs? Are the lifts big enough to take a rack? Floor loading? Do you have enough Commando sockets? And once you have a quote from a company there's nothing to say you have to take it - unless you're following a formal tendering process of course, but I think in this case you are not. From john.hearns at streamline-computing.com Wed Jul 30 01:30:38 2008 From: john.hearns at streamline-computing.com (John Hearns) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: References: <1217232982.4907.15.camel@Vigor13> Message-ID: <1217406648.5212.10.camel@Vigor13> On Tue, 2008-07-29 at 18:28 -0400, Mark Hahn wrote: > > > > afaik, their efficiency is maybe 10% better than more routine hardware. > doesn't really change the big picture. and high-eff PSU's are available > in pretty much any form-factor. choosing lower-power processors (and perhaps > avoiding fbdimms) will probably save more power than perseverating too much > on the PSU... Points very well made Mark. I'm just passing on some recent good experiences of building clusters around the size that Ivan wants, and still would highly recommend the Intel twins. We've built many systems with these, for a range of applications and have have good experiences. As a for instance, here is one - admittedly 2.5 times bigger than the one Ivan is planning (right hand side of the page). As you say (and I've said further up thread) you have to pay attention to concentrating heat loads in one rack - hence this one is spread over three racks http://www.streamline-computing.com/ From mathog at caltech.edu Wed Jul 30 09:13:56 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? 
Message-ID: If one were to build nodes without ECC memory it would probably be a good idea to reboot them from time to time to clean out whatever bad bits might have accumulated. It then occurred to me that doing so would require a trip through the BIOS on every reboot, at least on every x86 based computer I'm familiar with. That is not a terrible thing, but it made me wonder if it is really necessary. Is there a way to configure a machine to reboot by having the OS pass control directly to the boot loader, and so skip the BIOS? An additional reason for being able to do this, although not so much on beowulf nodes, would be that, by loading a different boot loader configuration on the way down, one could choose which of several OS's to boot _before_ the reboot on a multi-boot computer. For instance, some multi-boot PCs I manage in a remote classroom boot to windows by default, so if I want to work on their linux systems I have to walk over there so as to be able to select that option from the boot menu. David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From hahn at mcmaster.ca Wed Jul 30 09:38:44 2008 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: > If one were to build nodes without ECC memory it would probably be a > good idea to reboot them from time to time to clean out whatever bad > bits might have accumulated. that's pretty cynical, but I suppose also true ;) > made me wonder if it is really necessary. Is there a way to configure a > machine to reboot by having the OS pass control directly to the boot > loader, and so skip the BIOS? An additional reason for being able to do well, there's kexec, which I believe has gotten more mainstream recently. From dnlombar at ichips.intel.com Wed Jul 30 10:33:05 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: <20080730173305.GA21941@nlxdcldnl2.cl.intel.com> On Wed, Jul 30, 2008 at 09:13:56AM -0700, David Mathog wrote: > If one were to build nodes without ECC memory it would probably be a > good idea to reboot them from time to time to clean out whatever bad > bits might have accumulated. It then occurred to me that doing so would > require a trip through the BIOS on every reboot, at least on every x86 > based computer I'm familiar with. Not since kexec was added to the kernel! kexec allows you to boot another kernel directly from Linux. I've also written shell scripts that allow you to use kexec to reboot from a grub configuration file or from a PXE server. > That is not a terrible thing, but it > made me wonder if it is really necessary. Is there a way to configure a > machine to reboot by having the OS pass control directly to the boot > loader, and so skip the BIOS? Intel's Rapid Boot Toolkit allows you to install a minimal BIOS that only gets the wires wiggling before handing control to an arbitrary payload. Is much faster than the normal payload, and allows complete control over the platform boot process. kboot is used as a sample payload to provide for a customized Linux boot, e.g., ssh directly into the pre-OS using keys you provided, boot from arbitrary fabric, filesystem, or remote storage. Infiscale has a payload that enables nodes to directly boot Perceus from Infiniband or Ethernet. 
> An additional reason for being able to do > this, although not so much on beowulf nodes, would be that, by loading a > different boot loader configuration on the way down, one could choose > which of several OS's to boot _before_ the reboot on a multi-boot > computer. Beyond using kexec as described above, grub directly supports this; lilo did too. > For instance, some multi-boot PCs I manage in a remote > classroom boot to windows by default, so if I want to work on their > linux systems I have to walk over there so as to be able to select that > option from the boot menu. Grub above may work quite well for that. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From mathog at caltech.edu Wed Jul 30 11:07:00 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? Message-ID: David Lombard wrote: > On Wed, Jul 30, 2008 at 09:13:56AM -0700, David Mathog wrote: > > It then occurred to me that doing so would > > require a trip through the BIOS on every reboot, at least on every x86 > > based computer I'm familiar with. > > Not since kexec was added to the kernel! That's exactly what I was thinking of for the Beowulf node problem. For instance: http://www.knoppix.net/forum/viewtopic.php?t=27192 > Beyond using kexec as described above, grub directly supports this; lilo > did too. I know how to do this by changing the configurations, but not how to specify a one time change that doesn't need to be manually undone later. Is either of these boot loaders capable of doing the logical equivalent of: grub-next-boot-only -default 3 (Override whatever default is in the config file, but just for the next boot.) Thanks, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From rgb at phy.duke.edu Wed Jul 30 11:27:22 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: On Wed, 30 Jul 2008, David Mathog wrote: > David Lombard wrote: > >> On Wed, Jul 30, 2008 at 09:13:56AM -0700, David Mathog wrote: >>> It then occurred to me that doing so would >>> require a trip through the BIOS on every reboot, at least on every x86 >>> based computer I'm familiar with. >> >> Not since kexec was added to the kernel! > > That's exactly what I was thinking of for the Beowulf node problem. > For instance: > > http://www.knoppix.net/forum/viewtopic.php?t=27192 > >> Beyond using kexec as described above, grub directly supports this; lilo >> did too. > > I know how to do this by changing the configurations, but not how to > specify a one time change that doesn't need to be manually undone later. > Is either of these boot loaders capable of doing the logical equivalent of: > > grub-next-boot-only -default 3 > > (Override whatever default is in the config file, but just for the next > boot.) There are several ways to accomplish this, and they can be automated. For example, run a script at boot time that runs a script like /etc/specialboot if it exists. Then put: #!/bin/sh # cp /boot/grub/grub.conf.default /boot/grub/grub.conf # cp /boot/grub/grub.conf.special /boot/grub/grub.conf in it. Copy grub.conf into the two files. Edit special to boot your special configuration, let the other one save out as the default/restore point. copy the special to grub.conf. Boot. Uncomment the script line that puts back the default and boot. 
Plus infinite permutations of the general idea. We used to have something very similar set up for hot installs. There may be better ways to do it now, but this is easy to understand and implement and will work. rgb > > Thanks, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From mathog at caltech.edu Wed Jul 30 11:56:28 2008 From: mathog at caltech.edu (David Mathog) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? Message-ID: Robert G. Brown wrote: > There are several ways to accomplish this, and they can be automated. > For example, run a script at boot time that runs a script like > /etc/specialboot if it exists. Then put: > > #!/bin/sh > > # cp /boot/grub/grub.conf.default /boot/grub/grub.conf > # cp /boot/grub/grub.conf.special /boot/grub/grub.conf These all work great for controlling from the linux side, not so well for controlling from the Windows site. I have seen methods like this for Windows, but they depend upon putting grub.conf in a location where Windows can write it. Ideally grub/lilo would look in the MBR, or some other block in the first track on the boot disk, for a "boot once special" flag and field. If the flag was set it would read the field and then clear the flag. Then the tool on pretty much any OS to enable this would just be: read a block from disk (from some known special location) set the flag and field write the block back to disk Easiest if it is the MBR, not hard though for any other block in the first track. Heck, if lilo and grub could coordinate they could even use the exact same flag and field for this purpose, so that only one tool would be needed to accomplish it. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From kevin.a.henry at us.army.mil Wed Jul 30 12:11:26 2008 From: kevin.a.henry at us.army.mil (Henry, Kevin A CTR USA ATEC) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? (UNCLASSIFIED) In-Reply-To: References: Message-ID: <5285676DCE242842A0244030FED9F63501F3AFF8@APGR010BEC80006.nae.ds.army.mil> Classification: UNCLASSIFIED Caveats: NONE >From the grub prompt one can do the following savedefault --default=3 --once I don't know if there is a command-line equivilent. Henry, Kevin A TEDT-AT-TTI Bldg. 350 400 Colleran Road APG, MD 21005 Email: kevin.a.henry@us.army.mil Phone: 410-278-0692 -----Original Message----- From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of David Mathog Sent: Wednesday, July 30, 2008 2:07 PM To: Lombard, David N Cc: beowulf@beowulf.org Subject: Re: [Beowulf] reboot without passing through BIOS? David Lombard wrote: > On Wed, Jul 30, 2008 at 09:13:56AM -0700, David Mathog wrote: > > It then occurred to me that doing so would > > require a trip through the BIOS on every reboot, at least on every x86 > > based computer I'm familiar with. > > Not since kexec was added to the kernel! 
That's exactly what I was thinking of for the Beowulf node problem. For instance: http://www.knoppix.net/forum/viewtopic.php?t=27192 > Beyond using kexec as described above, grub directly supports this; lilo > did too. I know how to do this by changing the configurations, but not how to specify a one time change that doesn't need to be manually undone later. Is either of these boot loaders capable of doing the logical equivalent of: grub-next-boot-only -default 3 (Override whatever default is in the config file, but just for the next boot.) Thanks, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf Classification: UNCLASSIFIED Caveats: NONE From rgb at phy.duke.edu Wed Jul 30 12:49:15 2008 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: On Wed, 30 Jul 2008, David Mathog wrote: > Robert G. Brown wrote: > >> There are several ways to accomplish this, and they can be automated. >> For example, run a script at boot time that runs a script like >> /etc/specialboot if it exists. Then put: >> >> #!/bin/sh >> >> # cp /boot/grub/grub.conf.default /boot/grub/grub.conf >> # cp /boot/grub/grub.conf.special /boot/grub/grub.conf > > These all work great for controlling from the linux side, not so well > for controlling from the Windows site. I have seen methods like this > for Windows, but they depend upon putting grub.conf in a location where > Windows can write it. Windows? What is this Windows? I don't do Windows...;-) Hmmm, I did think about this once upon a time when I lusted after nighttime use of a Windows cluster in a lab here as a linux cluster. I remember that there was an automatable solution, but I don't remember what it is. Windows can definitely execute commands on a timed basis, so that was no problem. I'll think about it and see if I can remember -- it was close to a decade ago that these fantasies occurred, before it became clear that I wasn't going to get permission. But since Linux is perfectly happy mounting /boot as root-writeable msdos or ntfs, what's wrong with this solution as a boot toggle between the two? XP Pro should be able to at least do a copy exactly like the one above as a .bat file. Or maybe somebody has a better way. Oh, one last way that is VERY easy is to set up dhcp to direct your boot. That is, boot ONLY via PXE/DHCP (or at the very least, put it as the first default boot target and boot from it if it says to). Tell dhcp.conf what kernel you want it to boot, and boot it, then change it serverside. There is a fallthrough solution like this as well -- put a pxe boot target on the server like "windows" that you type in by hand at the pxe/dhcp prompt (not the grub prompt), with the default after a timeout of falling through to linux. But you could do that with grub now if you wanted to, so I'm guessing that the idea is to do this unattended. Unattended suggests server side control, and you have (presumed) absolute control over the dhcp server on a time-dependent basis. Hope this helps -- I missed the early part of why you are doing this so I'm firing blind, but if you (re)describe what you're trying to accomplish I can try to be more directed. 
In my copious spare time -- I'm teaching physics summer school at Beaufort (if anybody wants to see my summer "house", check out: http://maps.google.com/maps?f=q&hl=en&geocode=&q=pivers%20island%20road%2C%20beaufort%20nc&jsv=121&sll=34.71919,-76.66006&sspn=0.04762,0.110292&num=10&iwloc=addr&iwstate1=saveplace&ie=UTF-8&sa=N&tab=il and zoom in to the max possible. My house is the southwest corner building on the island. That's me waving at you, fishing from the dock...:-). So I'm working 14-16 hour days. rgb > > Ideally grub/lilo would look in the MBR, or some other block in the > first track on the boot disk, for a "boot once special" flag and field. > If the flag was set it would read the field and then clear the flag. > Then the tool on pretty much any OS to enable this would just be: > > read a block from disk (from some known special location) > set the flag and field > write the block back to disk > > Easiest if it is the MBR, not hard though for any other block in the > first track. Heck, if lilo and grub could coordinate they could even use > the exact same flag and field for this purpose, so that only one tool > would be needed to accomplish it. > > Regards, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From csamuel at vpac.org Wed Jul 30 17:39:30 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Building new cluster - estimate In-Reply-To: Message-ID: <841759592.334291217464770684.JavaMail.root@zimbra.vpac.org> ----- "Bogdan Costescu" wrote: > On Tue, 29 Jul 2008, Chris Samuel wrote: > > > 1) Use a mainline kernel, we've found benefit of that > > over stock CentOS kernels. > > Care to comment on this statement ? a) We found that we got better performance out of the mainline kernels than the CentOS ones; we guess because they handle newer hardware better (RHEL is meant to aim for stability over performance) b) We can use XFS for scratch space rather than being tied to the RHEL One True Filesystem (ext3) which (in our experience) can't handle large amounts of disk I/O. YMMV! cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Wed Jul 30 21:10:55 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: <54747381.338551217477350458.JavaMail.root@zimbra.vpac.org> Message-ID: <1615878226.338571217477455796.JavaMail.root@zimbra.vpac.org> ----- "Robert G. Brown" wrote: > Hmmm, I did think about this once upon a time when I lusted > after nighttime use of a Windows cluster in a lab here as > a linux cluster. I remember that there was an automatable > solution, but I don't remember what it is. It's probably not hard to knock one up from scratch, you'd need an automated shutdown on the boxes in 'doze, make sure they have WOL and PXE set up, and then on the management node start a DHCP server configured to do diskless processing for them from cron and send them a WOL signal once it's up. 
As long as your scheduler has reservations for the lab hours to stop jobs overrunning then you should be able to shut them down nicely (again from cron, along with the DHCP server) ready for the 'doze users in the morning. Of course it'd get more complicated if you have multiple DHCP servers and PXE already in use on that network for those machines! -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From csamuel at vpac.org Wed Jul 30 21:34:14 2008 From: csamuel at vpac.org (Chris Samuel) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] Linux cluster authenticating against multiple Active Directory domains In-Reply-To: <1558421722.338611217477814134.JavaMail.root@zimbra.vpac.org> Message-ID: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> Here's a curly one.. We are helping a Uni set up a Linux cluster (CentOS 5 based) and we've found out that they have two separate Active Directory instances, one for staff and one for students. They want the cluster to be able to authenticate against both, as users might be on either service. They have assured us that we can just their ADSs as if they are LDAP servers, which is OK, but it looks like Linux doesn't really want to know about using multiple LDAP servers except in a failover/round-robin situation. Our current best guess is to get an LDIF dump of the users who are to be given access (signified by an LDAP attribute) and then load those into a local OpenLDAP or FDS server. We do have various other wacky ideas about using Samba 4, but I don't know if that can belong to multiple AD instances.. Unfortunately our contact at the institute who knows about their ADS config is tied up for the moment so we can't pick his brains and I was wondering if anyone else had run into this sort of issue and knows if it does have a solution ? cheers, Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From Bogdan.Costescu at iwr.uni-heidelberg.de Thu Jul 31 03:33:19 2008 From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed Nov 25 01:07:31 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: On Wed, 30 Jul 2008, David Mathog wrote: > If one were to build nodes without ECC memory Some years ago when I have done this, I have also instituted a schedule of running memory tests every several weeks-months. The time taken by these tests has made the BIOS loading time insignificant... if time was your main reason for this question. > Is there a way to configure a machine to reboot by having the OS > pass control directly to the boot loader, and so skip the BIOS? If your mainboard supports LinuxBIOS/coreboot, you could run that instead. 
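For the scheduled memory tests, a cron'd run of the userspace memtester
tool is the low-tech version; a sketch (size, schedule and log path are
examples only - memtester locks whatever amount you give it, so leave
headroom for the OS):

  # /etc/cron.d/memcheck -- monthly memory test on each node
  # size argument is in MB; pick most-but-not-all of the node's RAM
  0 3 1 * *  root  memtester 6144 1 >/var/log/memtester.last 2>&1 || logger -t memcheck "memtester reported errors"

It does not replace a proper offline memory test, but it catches the
grossly bad DIMMs without a reboot.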
-- Bogdan Costescu IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850 E-mail: bogdan.costescu@iwr.uni-heidelberg.de From landman at scalableinformatics.com Thu Jul 31 05:38:06 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed Nov 25 01:07:32 2009 Subject: [Beowulf] Linux cluster authenticating against multiple Active Directory domains In-Reply-To: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> References: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> Message-ID: <4891B22E.5080008@scalableinformatics.com> If you don't mind using commercial tools, have a look at Centrify. Also Centeris might work for this. Chris Samuel wrote: > Here's a curly one.. > > We are helping a Uni set up a Linux cluster (CentOS 5 > based) and we've found out that they have two separate > Active Directory instances, one for staff and one for > students. > > They want the cluster to be able to authenticate against > both, as users might be on either service. > > They have assured us that we can just their ADSs as > if they are LDAP servers, which is OK, but it looks > like Linux doesn't really want to know about using > multiple LDAP servers except in a failover/round-robin > situation. > > Our current best guess is to get an LDIF dump of > the users who are to be given access (signified > by an LDAP attribute) and then load those into a > local OpenLDAP or FDS server. > > We do have various other wacky ideas about using > Samba 4, but I don't know if that can belong to > multiple AD instances.. > > Unfortunately our contact at the institute who > knows about their ADS config is tied up for the > moment so we can't pick his brains and I was > wondering if anyone else had run into this sort > of issue and knows if it does have a solution ? > > cheers, > Chris -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615 From gdjacobs at gmail.com Thu Jul 31 05:57:05 2008 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Wed Nov 25 01:07:32 2009 Subject: [Beowulf] Linux cluster authenticating against multiple Active Directory domains In-Reply-To: <4891B22E.5080008@scalableinformatics.com> References: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> <4891B22E.5080008@scalableinformatics.com> Message-ID: <4891B6A1.9040801@gmail.com> Joe Landman wrote: > If you don't mind using commercial tools, have a look at Centrify. Also > Centeris might work for this. > > Chris Samuel wrote: >> They have assured us that we can just their ADSs as >> if they are LDAP servers, which is OK, but it looks >> like Linux doesn't really want to know about using >> multiple LDAP servers except in a failover/round-robin >> situation. >> >> Our current best guess is to get an LDIF dump of >> the users who are to be given access (signified >> by an LDAP attribute) and then load those into a >> local OpenLDAP or FDS server. Looks like Likewise nee Centeris has a FOSS version. From the blurb... "Supports multiple forests with one-way and two-way cross forest trusts" Apparently it's GPL, so legal compatibility shouldn't be an issue. It might be sufficient for your purposes, but check out the feature comparison to see. http://www.likewisesoftware.com/products/likewise_open/comparing_enterprise_and_open.php -- Geoffrey D. 
Jacobs From dnlombar at ichips.intel.com Thu Jul 31 06:43:29 2008 From: dnlombar at ichips.intel.com (Lombard, David N) Date: Wed Nov 25 01:07:32 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: <20080731134329.GA23066@nlxdcldnl2.cl.intel.com> On Wed, Jul 30, 2008 at 11:27:22AM -0700, Robert G. Brown wrote: > On Wed, 30 Jul 2008, David Mathog wrote: > > > David Lombard wrote: > > > >> On Wed, Jul 30, 2008 at 09:13:56AM -0700, David Mathog wrote: > >>> It then occurred to me that doing so would > >>> require a trip through the BIOS on every reboot, at least on every x86 > >>> based computer I'm familiar with. > >> > >> Not since kexec was added to the kernel! > > > > That's exactly what I was thinking of for the Beowulf node problem. > > For instance: > > > > http://www.knoppix.net/forum/viewtopic.php?t=27192 > > > >> Beyond using kexec as described above, grub directly supports this; lilo > >> did too. > > > > I know how to do this by changing the configurations, but not how to > > specify a one time change that doesn't need to be manually undone later. > > Is either of these boot loaders capable of doing the logical equivalent of: > > > > grub-next-boot-only -default 3 > > > > (Override whatever default is in the config file, but just for the next > > boot.) > > There are several ways to accomplish this, and they can be automated. > For example, run a script at boot time that runs a script like > /etc/specialboot if it exists. Then put: > > #!/bin/sh > > # cp /boot/grub/grub.conf.default /boot/grub/grub.conf > # cp /boot/grub/grub.conf.special /boot/grub/grub.conf According the the FM, there's a "grub-set-default" program that does the trick. It supposedly creates a "default" file in the grub directory, nominally /boot, that causes grub to behave differently. That's what I was alluding to in my first post. Sadly, no such program exists in my F7. GIYF teaches us that the new method is: # echo "savedefault --default=2 --once" | grub --batch # reboot where "2" is the choice for your next one-time boot. I haven't tried this; I did use the LILO method when it was the bootloader of choice... -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From maurice at harddata.com Tue Jul 29 21:34:50 2008 From: maurice at harddata.com (Maurice Hilarius) Date: Wed Nov 25 01:07:32 2009 Subject: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik) In-Reply-To: References: <200807261957.m6QJv6HE031997@bluewest.scyld.com> <488BD53A.2010907@harddata.com> <488EC70F.8090906@harddata.com> Message-ID: <488FEF6A.30003@harddata.com> Ivan Oleynik wrote: > Maurice, > > Valuable information, thanks very much! My pleasure. > > > > If you wanted it stacked that way (performance order), then this > is more closely aligned (ascending): > > .. > > Notice how the performance ramp is pretty close to identical to > the price ramp? > That is not accidental. > > > > This is what I need to test using my codes. That is the thing. No matter what anyone says, your codes are all that really count. Good luck with it. BTW< where a lot of people are jumping on the "Get IPMI " bandwagon, I suggest getting PDUs with remote IP controlled ports is more useful. More reliable, anyway. I have seen too many cases where IPMI jams up. If you set your machines BIOS to start on power up, it is trivial to stop and start machines with the PD U power, and that is definitely reliable. 
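Either way, what you end up with is a one-liner you can loop over the node
list; for the IPMI route it is roughly this (BMC hostnames and credentials
are made up):

  # query / cycle a node through its BMC
  ipmitool -I lanplus -H node23-ipmi -U admin -P secret chassis power status
  ipmitool -I lanplus -H node23-ipmi -U admin -P secret chassis power cycle

The switched PDUs are driven much the same way, usually over SNMP or an
ssh/telnet menu, just addressed by outlet number rather than by BMC.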
Plus , with a lot of those PDUs you can add thermal sensors and trigger power off on high temperature conditions. -- With our best regards, //Maurice W. Hilarius Telephone: 01-780-456-9771/ /Hard Data Ltd. FAX: 01-780-456-9772/ /11060 - 166 Avenue email:maurice@harddata.com/ /Edmonton, AB, Canada http://www.harddata.com// / T5X 1Y3/ / -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080729/252c2891/attachment.html From tim at timbury.net Wed Jul 30 20:21:09 2008 From: tim at timbury.net (Tim Kissane) Date: Wed Nov 25 01:07:32 2009 Subject: [Beowulf] reboot without passing through BIOS? In-Reply-To: References: Message-ID: <48912FA5.8060002@timbury.net> Robert G. Brown wrote: > On Wed, 30 Jul 2008, David Mathog wrote: > >> Robert G. Brown wrote: >> >>> There are several ways to accomplish this, and they can be automated. >>> For example, run a script at boot time that runs a script like >>> /etc/specialboot if it exists. Then put: >>> >>> #!/bin/sh >>> >>> # cp /boot/grub/grub.conf.default /boot/grub/grub.conf >>> # cp /boot/grub/grub.conf.special /boot/grub/grub.conf >> >> These all work great for controlling from the linux side, not so well >> for controlling from the Windows site. I have seen methods like this >> for Windows, but they depend upon putting grub.conf in a location where >> Windows can write it. > > Windows? What is this Windows? I don't do Windows...;-) > rgb > I'm a Linux fanatic, I admit it. So far there have been interesting suggestions, all of which (save the dhcp idea) assume grub or lilo as the boot manager. Why not use the NT boot loader to dual boot the box; it's config file is writable from Windows (obviously) and Linux with the ntfs driver. Then, automated scripts can be run from either OS to switch the default OS for a one shot deal or to set a new default. Not the usual solution from a Linux nut, but it's worked for me (back when I actually still ran Windows). My two coins of the realm. TimK -- Tim Kissane Timbury Computer Services http://timkissane.com/ http://tcs.timbury.com/ 732.604.3817 ========================= "More hideous crimes have been committed in the name of obedience than in the name of rebellion." -- CP Snow From lynesh at cardiff.ac.uk Thu Jul 31 01:16:11 2008 From: lynesh at cardiff.ac.uk (Huw Lynes) Date: Wed Nov 25 01:07:32 2009 Subject: [Beowulf] Linux cluster authenticating against multiple Active Directory domains In-Reply-To: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> References: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> Message-ID: <1217492171.3072.4.camel@w1199.insrv.cf.ac.uk> On Thu, 2008-07-31 at 14:34 +1000, Chris Samuel wrote: > Here's a curly one.. > > We are helping a Uni set up a Linux cluster (CentOS 5 > based) and we've found out that they have two separate > Active Directory instances, one for staff and one for > students. > > They want the cluster to be able to authenticate against > both, as users might be on either service. > > They have assured us that we can just their ADSs as > if they are LDAP servers, which is OK, but it looks > like Linux doesn't really want to know about using > multiple LDAP servers except in a failover/round-robin > situation. > Funnily enough we used to do something similar here. Falling through from the main campus LDAP (on an e-directory cluster) to the LDAP in Computer Science. 
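The shape of the PAM side is basically two pam_ldap entries marked
sufficient, each pointed at its own config file; a rough sketch (file
names and the surrounding stack are illustrative, not our exact setup):

  # /etc/pam.d/system-auth (fragment)
  auth     sufficient   pam_ldap.so config=/etc/ldap-campus.conf
  auth     sufficient   pam_ldap.so config=/etc/ldap-cs.conf
  auth     required     pam_deny.so

  account  sufficient   pam_ldap.so config=/etc/ldap-campus.conf
  account  sufficient   pam_ldap.so config=/etc/ldap-cs.conf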
It required some patches to nss_ldap to make it work properly and the
pam config was a little bit tricky, but it did work. I still have that
config up and running on some of my older machines so I can hunt down
the config and patches if it would be useful.

Thanks,
Huw

--
Huw Lynes                       | Advanced Research Computing
HEC Sysadmin                    | Cardiff University
                                | Redwood Building,
Tel: +44 (0) 29208 70626        | King Edward VII Avenue, CF10 3NB

From d.love at liverpool.ac.uk  Thu Jul 31 06:52:46 2008
From: d.love at liverpool.ac.uk (Dave Love)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Linux cluster authenticating against multiple Active Directory domains
In-Reply-To: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> (Chris Samuel's message of "Thu, 31 Jul 2008 14:34:14 +1000 (EST)")
References: <1558421722.338611217477814134.JavaMail.root@zimbra.vpac.org> <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org>
Message-ID: <87abfy40up.fsf@liv.ac.uk>

Chris Samuel writes:

> They have assured us that we can just use their ADSs as
> if they are LDAP servers, which is OK, but it looks
> like Linux doesn't really want to know about using
> multiple LDAP servers except in a failover/round-robin
> situation.

Having completely separate ADs for staff and students seems odd...

Why doesn't it work to have two `sufficient' cases of pam_ldap with
different `config' args pointing to different servers?

However, LDAP isn't an authentication protocol. Use Kerberos for
authentication. If two cases of pam_krb5 with different `realm' args
doesn't work (as it should with Russ Allbery's version in Debian), you
should be able to drop in a ~/.k5login for each user to authenticate
with a principal in the appropriate realm (Windows domain, or whatever
the correct AD terminology is). See the doc for whichever pam_krb5 you
have, or use http://www.eyrie.org/~eagle/software/pam-krb5/.

> Our current best guess is to get an LDIF dump of
> the users who are to be given access (signified
> by an LDAP attribute) and then load those into a
> local OpenLDAP or FDS server.

[Can't OpenLDAP just refer to the AD LDAPs?]

You could also set up your own Kerberos to do cross-realm
authentication to AD, but I doubt you need to.

From d.love at liverpool.ac.uk  Thu Jul 31 09:07:50 2008
From: d.love at liverpool.ac.uk (Dave Love)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Linux cluster authenticating against multiple Active Directory domains
In-Reply-To: <4891B6A1.9040801@gmail.com> (Geoff Jacobs's message of "Thu, 31 Jul 2008 07:57:05 -0500")
References: <1457489960.339001217478854977.JavaMail.root@zimbra.vpac.org> <4891B22E.5080008@scalableinformatics.com> <4891B6A1.9040801@gmail.com>
Message-ID: <87tze62g15.fsf@liv.ac.uk>

Geoff Jacobs writes:

> Joe Landman wrote:
>> If you don't mind using commercial tools, have a look at Centrify.

Centrify needs admin on the AD systems, and in my experience it doesn't
provide anything except grief, unless you want your systems to be
adminned from the Windows world. [It's proprietary, not just
commercial.] A recent GNU/Linux distribution you're likely to use will
provide all you need if you have to be an authentication and/or
directory client of the Windows world.

> Looks like Likewise nee Centeris has a FOSS version. From the blurb...
>
> "Supports multiple forests with one-way and two-way cross forest trusts"

Normal Kerberos clients will work cross-realm anyhow.

> Apparently it's GPL, so legal compatibility shouldn't be an issue.
That's actually an odd choice for (presumably) PAM and NSS modules
which you expect to be dynamically linked into programs with
non-GPL-compatible licences.

From kilian at stanford.edu  Thu Jul 31 13:00:45 2008
From: kilian at stanford.edu (Kilian CAVALOTTI)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] reboot without passing through BIOS?
In-Reply-To:
References:
Message-ID: <200807311300.46109.kilian@stanford.edu>

On Wednesday 30 July 2008 09:13:56 am David Mathog wrote:
> If one were to build nodes without ECC memory it would probably be a
> good idea to reboot them from time to time to clean out whatever bad
> bits might have accumulated. It then occurred to me that doing so
> would require a trip through the BIOS on every reboot, at least on
> every x86 based computer I'm familiar with. That is not a terrible
> thing, but it made me wonder if it is really necessary.

I may be totally missing the point, but doesn't the memory need to be
physically (as in electrically) reset in order to clean out those bad
bits? And doesn't this require a hard reboot, for the machine to be
power cycled, so that memory cells are reinitialized?

I mean, if the BIOS stage is skipped, as in kexec'ing a new kernel,
electrical initialization doesn't occur, and the bad bits will probably
stick there. Unless the kernel does this kind of scrubbing in its
initialization phase, which I don't know, I don't see any reason why
the memory would be cleaned of errors.

Another point I wonder about is whether a reboot would do any good for
non-ECC memory anyway. As far as I understand it, a memory error is
either a repeatable, hard one, like a bad chip, where a reboot won't
change anything since the hardware is faulty; or a transient, soft
error, where a bad value is read once, but where next reads are ok. So
unless there's a sort of accumulation somewhere in the soft case, I
don't really understand what a reboot could do about it?

If you've got some light to shed on this, I'd be interested.

Cheers,
--
Kilian

From mark.kosmowski at gmail.com  Thu Jul 31 13:17:26 2008
From: mark.kosmowski at gmail.com (Mark Kosmowski)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] reboot without passing through BIOS?
Message-ID:

> Message: 1
> Date: Thu, 31 Jul 2008 06:43:29 -0700
> From: "Lombard, David N"
> Subject: Re: [Beowulf] reboot without passing through BIOS?
> To: "Robert G. Brown"
> Cc: "Lombard, David N" ,
>     "beowulf@beowulf.org" , David Mathog
>
> Message-ID: <20080731134329.GA23066@nlxdcldnl2.cl.intel.com>
> Content-Type: text/plain; charset=us-ascii
>
> On Wed, Jul 30, 2008 at 11:27:22AM -0700, Robert G. Brown wrote:
> > On Wed, 30 Jul 2008, David Mathog wrote:
> >
> > > David Lombard wrote:
> > >
> > >> On Wed, Jul 30, 2008 at 09:13:56AM -0700, David Mathog wrote:
> > >>> It then occurred to me that doing so would
> > >>> require a trip through the BIOS on every reboot, at least on every x86
> > >>> based computer I'm familiar with.
> > >>
> > >> Not since kexec was added to the kernel!
> > >
> > > That's exactly what I was thinking of for the Beowulf node problem.
> > > For instance:
> > >
> > > http://www.knoppix.net/forum/viewtopic.php?t=27192
> > >
> > >> Beyond using kexec as described above, grub directly supports this; lilo
> > >> did too.
> > >
> > > I know how to do this by changing the configurations, but not how to
> > > specify a one time change that doesn't need to be manually undone later.
> > > Is either of these boot loaders capable of doing the logical equivalent of:
> > >
> > > grub-next-boot-only -default 3
> > >
> > > (Override whatever default is in the config file, but just for the next
> > > boot.)
> >
> > There are several ways to accomplish this, and they can be automated.
> > For example, run a script at boot time that runs a script like
> > /etc/specialboot if it exists. Then put:
> >
> > #!/bin/sh
> >
> > # cp /boot/grub/grub.conf.default /boot/grub/grub.conf
> > # cp /boot/grub/grub.conf.special /boot/grub/grub.conf
>
> According to the FM, there's a "grub-set-default" program that does
> the trick. It supposedly creates a "default" file in the grub directory,
> nominally /boot, that causes grub to behave differently. That's what I
> was alluding to in my first post. Sadly, no such program exists in my F7.
>
> GIYF teaches us that the new method is:
>
> # echo "savedefault --default=2 --once" | grub --batch
> # reboot
>
> where "2" is the choice for your next one-time boot.
>
> I haven't tried this; I did use the LILO method when it was the bootloader
> of choice...
>
> --

In KDE 3.5.x under OpenSUSE 10.x (and, presumably, 11.0) one can choose
which bootloader option to reboot to when reboot is selected.

Do the Windows requirements include 3-D graphics for the default boot?
If not, would it be possible to boot into Linux and provide a virtual
Windows environment?

Or, set up the default boot to be Linux, then run a script* at Linux
boot to set the one-time next boot to be Windows. This way, every time
Windows reboots Linux would start and every time Linux reboots Windows
would start.

* Exactly how to implement such a script is beyond the scope of my
current expertise, though I am more confident that this is possible,
mayhaps even easily possible, than I am that quantum mechanics is a
valid descriptor of the natural world.

Having posted this, I will be rather embarrassed if this is the thread
that began as the non-ECC memory periodic refresh thread and not the
Windows-by-day / Linux-by-night thread.

Mark Kosmowski

From csamuel at vpac.org  Thu Jul 31 22:26:28 2008
From: csamuel at vpac.org (Chris Samuel)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Linux cluster authenticating against multiple Active Directory domains
In-Reply-To: <1791598739.6271217568021062.JavaMail.root@mail.vpac.org>
Message-ID: <1147379837.6461217568388320.JavaMail.root@mail.vpac.org>

----- "Chris Samuel" wrote:

> We are helping a Uni set up a Linux cluster (CentOS 5
> based) and we've found out that they have two separate
> Active Directory instances, one for staff and one for
> students.

Thanks to *everyone* who responded, very kind of you all to take the
time!

We will look into the various suggestions, but the major issue we've
just found out about is that they use the same algorithm to create
usernames in both AD systems, and so all you need is a staff member and
a student with the same name and you have a collision.

My gut feeling is that this pretty much rules out using their AD
system, but I'd love some more sage advice about whether any of the
systems are able to cope with that situation?

I'll reply to a couple of the points that people have brought up
separately, but I did want to thank everyone first so those I don't
reply to don't feel I'm ignoring them! :-)

cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing P.O.
Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

From csamuel at vpac.org  Thu Jul 31 22:28:44 2008
From: csamuel at vpac.org (Chris Samuel)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Linux cluster authenticating against multiple Active Directory domains
In-Reply-To: <1217492171.3072.4.camel@w1199.insrv.cf.ac.uk>
Message-ID: <1050108012.6491217568524892.JavaMail.root@mail.vpac.org>

----- "Huw Lynes" wrote:

Bore da Huw,

> Funnily enough we used to do something similar here. Falling through
> from the main campus LDAP (on an e-directory cluster) to the LDAP in
> Computer Science.

Do you have clashes in user names between the two LDAPs? If so, how do
you deal with that?

> It required some patches to nss_ldap to make it work properly and the
> pam config was a little bit tricky, but it did work.

Yeah, we'd looked at some of the NSS stuff and realised it would need
patching.. :-(

> I still have that config up and running on some of my older
> machines so I can hunt down the config and patches if it
> would be useful.

That would be awesome, if nothing else it would tell us how feasible
it's going to be for this system!

Diolch yn fawr,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

From csamuel at vpac.org  Thu Jul 31 22:37:12 2008
From: csamuel at vpac.org (Chris Samuel)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Linux cluster authenticating against multiple Active Directory domains
In-Reply-To: <810152863.6581217568750370.JavaMail.root@mail.vpac.org>
Message-ID: <371991977.6661217569032542.JavaMail.root@mail.vpac.org>

----- "Dave Love" wrote:

> Having completely separate ADs for staff and students seems odd...

Yeah, I think they're wishing they'd not done that now.. :-)

> Why doesn't it work to have two `sufficient' cases
> of pam_ldap with different `config' args pointing
> to different servers?

My information is that it's NSS that's more the problem here rather
than PAM, because of the assumptions it makes.

> However, LDAP isn't an authentication protocol. Use
> Kerberos for authentication.

We'd prefer to steer clear of Kerberos; it introduces arbitrary job
limitations through ticket lifetimes that are not tolerable for HPC
work. Say you submit a job that is in the queue for a week and then
will run for 3 months - we don't know if the AD admins will permit the
creation of a 4 month ticket "just in case"..

There's also the fact that Torque doesn't have GSSAPI support in the
mainline versions yet, and what I hear about the GSSAPI branch implies
that it is just for testing and development at present.

cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

From csamuel at vpac.org  Thu Jul 31 22:39:35 2008
From: csamuel at vpac.org (Chris Samuel)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Linux cluster authenticating against multiple Active Directory domains
In-Reply-To: <87tze62g15.fsf@liv.ac.uk>
Message-ID: <2114458976.6761217569175748.JavaMail.root@mail.vpac.org>

----- "Dave Love" wrote:

> Geoff Jacobs writes:
>
> > Apparently it's GPL, so legal compatibility shouldn't
> > be an issue.
> That's actually an odd choice for (presumably) PAM and
> NSS modules which you expect to be dynamically linked
> into programs with non-GPL-compatible licences.

I dunno, you can hardly say that a program that uses PAM is a
derivative work of a GPL'd module when it will work perfectly well with
any old module.

Of course a BSD licensed one would be ideal, but not in their
business's interest I suspect. :-)

cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

From csamuel at vpac.org  Thu Jul 31 22:41:51 2008
From: csamuel at vpac.org (Chris Samuel)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)
In-Reply-To: <488FEF6A.30003@harddata.com>
Message-ID: <1284627153.6851217569311940.JavaMail.root@mail.vpac.org>

----- "Maurice Hilarius" wrote:

> No matter what anyone says, your codes are all that
> really count.

Indeed!

> BTW, where a lot of people are jumping on the "Get IPMI"
> bandwagon, I suggest getting PDUs with remote IP controlled
> ports is more useful.

Well, it depends on what you're trying to do; if it's to get
the system and CPU temperatures then a PDU isn't much cop.. :)

> I have seen too many cases where IPMI jams up.

Yeah, same here. :-(

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

From bill at cse.ucdavis.edu  Thu Jul 31 23:07:39 2008
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)
In-Reply-To: <1284627153.6851217569311940.JavaMail.root@mail.vpac.org>
References: <1284627153.6851217569311940.JavaMail.root@mail.vpac.org>
Message-ID: <4892A82B.4080702@cse.ucdavis.edu>

Chris Samuel wrote:
> ----- "Maurice Hilarius" wrote:
>
>> No matter what anyone says, your codes are all that
>> really count.
>
> Indeed!
>
>> BTW, where a lot of people are jumping on the "Get IPMI"
>> bandwagon, I suggest getting PDUs with remote IP controlled
>> ports is more useful.
>
> Well, it depends on what you're trying to do; if it's to get
> the system and CPU temperatures then a PDU isn't much cop.. :)

True, but then again lm_sensors can collect fan speeds and temperatures.

>> I have seen too many cases where IPMI jams up.
>
> Yeah, same here. :-(
>

From csamuel at vpac.org  Thu Jul 31 23:07:24 2008
From: csamuel at vpac.org (Chris Samuel)
Date: Wed Nov 25 01:07:32 2009
Subject: [Beowulf] Re: Building new cluster - estimate (Ivan Oleynik)
In-Reply-To: <4892A82B.4080702@cse.ucdavis.edu>
Message-ID: <198335234.7431217570844350.JavaMail.root@mail.vpac.org>

----- "Bill Broadley" wrote:

> True, but then again lm_sensors can collect fan speeds and
> temperatures.

Indeed, but getting it working and calibrated can be, umm,
interesting..

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
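For what it's worth, the raw readings that lm_sensors reports are exposed under /sys/class/hwmon on 2.6 kernels, so a quick look at node temperatures doesn't need a calibrated sensors.conf at all. A minimal sketch follows; the labels and the exact sysfs layout vary by driver, and the numbers are the same uncalibrated values the sensors command would show.

    #!/bin/sh
    # Dump every hwmon temperature input, in degrees C.
    # temp*_input values are in millidegrees; temp*_label may not exist,
    # in which case fall back to the attribute name.
    for f in /sys/class/hwmon/hwmon*/temp*_input \
             /sys/class/hwmon/hwmon*/device/temp*_input
    do
        [ -r "$f" ] || continue
        label_file="${f%_input}_label"
        if [ -r "$label_file" ]; then
            label=$(cat "$label_file")
        else
            label=$(basename "$f")
        fi
        echo "$label: $(( $(cat "$f") / 1000 )) C"
    done

Run on each node from a parallel shell, that gives a poor man's temperature survey without touching IPMI or the PDUs.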