Mar 02, 2016

I get the feeling it’s time for someone to set the record straight on OpenHPC. Social media and the press have voiced strong sentiments that OpenHPC is Intel and OpenPOWER is IBM. While each vendor did start its own “open” initiative, the two address different challenges and are not as much at odds as you might think.

First, I should state that these are strictly my own opinions and I’m not entirely impartial. I am not a founding or current member of either organization, but I have been actively watching both since their inception. I have fairly strong opinions on each, but I also expect to be working with both. In other words, OpenHPC and OpenPOWER could work together!

Logo of the OpenHPC initiative

I say this because OpenHPC really does appear to be opening up. When the OpenHPC git repo was put up, the naysayers pointed out that all existing commits were from Intel. However, this repo is merely a set of build scripts and unit tests for existing open-source tools. OpenHPC is built using the open-source packages that HPC sites have been using for years (SLURM, Warewulf, OpenMPI, MVAPICH, etc). The building blocks of OpenHPC have been open for years.

Intel is pushing this initiative along through the generous contributions of Intel employees, but they are not locking others out. For instance, Intel’s Chief Evangelist of Software Products has openly invited IBM to join OpenHPC (and has stated it would be an “extreme disappointment” if the foundation didn’t feel open enough for such a move).

Furthermore, there’s at least one person looking into a port of OpenHPC to ARM64. I doubt Intel will assist too much in such an effort, but there is a clear precedent of cooperation (that individual has been given access to the OpenHPC build system).

Only time will tell, but I expect OpenHPC and OpenPOWER to significantly impact HPC in the coming years. The end of 2015 was very exciting for HPC, with clear support from President Obama and one of the most lively HPC conferences on record. I’m very excited for 2016.

Jan 02, 2015

I just finished watching Particle Fever, which describes the ~30-year path that physicists endured before the confirmation of the Higgs boson particle. Thousands of people spent years of excruciatingly painstaking efforts to confirm one aspect of our reality. Yet there were setbacks (some taking years) and the collider won’t even be operating at full power until 2015 (although the original schedule called for full-power operation in 2008)…

Candidate Higgs boson event in CERN CMS detector


I know (from both colleagues and personal experience) that the efforts from the IT and computational folks backing up these experiments are no less painstaking and mundane. Keeping a single computer operating correctly can be a pain. Keeping hundreds or thousands operating correctly (along with the incredible diversity of dodgy scientific software packages) is basically impossible.

Continue reading »

Mar 31, 2014

I’ve been thinking a lot about the ways we automate tasks and abstract away difficult/complicated aspects of our lives. That’s what all the “progress” of the last 100 years has been – better ways to save labor and still get the same tasks accomplished. Our species is growing increasingly efficient.

I’m sure the HPC industry has also seen improvements in personal efficiency. Certainly we can get the compute portions of our work done much more quickly. But do you still find yourself fighting many of the same systems/software issues you faced years ago? I feel our industry still has a long way to go as far as making the everyday user’s life simpler.

Continue reading »

Dec 31, 2013

I’ve spent a year reading Clean Code: A Handbook of Agile Software Craftsmanship. That’s not to say I’ve been reading it once a week or had so little time I only managed to read a single book. Instead, I’ve been slowly making my way through it – sometimes only reading a single page at a time. It has made me a better coder and I’m certain I’ll be reading it again.

Clean Code Textbook Cover

The premise is: bad code can operate, but only clean code will work long-term. As code develops and matures, effort must be made to clean and improve it. If your code starts out ugly (or becomes ugly over time), things will inevitably fall apart. Bad code requires too much effort and too many resources to maintain. Businesses and organizations can be weighed down or destroyed by bad code.

Continue reading »

Nov 30, 2013

Computers are complex systems, which makes them difficult to predict. Oftentimes, the hardware layers are fairly sophisticated, with the software adding even more factors – many more than a person can fit in their head. That’s why unit tests, integration tests, compatibility tests, performance tests, etc. are so important. It’s also why leadership compute facilities (e.g., ORNL Titan, TACC Stampede) have such onerous acceptance tests. Until you’ve verified that an installed HPC system is fully functioning (compute, communication, I/O, reliability, …), it’s pretty likely something isn’t functioning.

Stampede InfiniBand Topology


The Stampede cluster at TACC contains over 320 56Gbps FDR InfiniBand switches. Including the node-to-switch and switch-to-switch cables, over 11,520 cables are installed. How much testing would you perform before you said “everything is working”?

Continue reading »

Oct 09, 2013

Due to a lapse in government funding, National Science Foundation staff will not be receiving or responding to email until further notice. We sincerely regret this inconvenience and look forward to responding to you once we reopen.

Updates regarding government operating status and resumption of normal operations can be found at

In cases of imminent threat to life or property, please call the Office of the Inspector General at 1-800-428-2189.

Aug 31, 2013

Every Linux/Unix user ought to be familiar with the old rm -Rf * gag. Or the more subtle rm -Rf files/ * issue, in which a misplaced space results in the removal of all files and directories. An administrator is going to use the rm utility a hundred times a day. How can they remain efficient while ensuring a simple mistake doesn’t result in downtime and serious data loss?
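One common safeguard is to move files aside instead of unlinking them. As a sketch (the wrapper name and trash path below are my own examples, not a standard utility):

```shell
# The classic hazard: a stray space makes this delete everything,
# not just the contents of files/:
#   rm -Rf files/ *
#
# A "soft delete" wrapper: move targets into a trash directory
# instead of removing them, so a slip of the finger is recoverable.
trash() {
    local trash_dir="${HOME}/.trash"
    mkdir -p "$trash_dir"
    # Append a nanosecond timestamp so repeated names don't collide.
    local f
    for f in "$@"; do
        mv -- "$f" "$trash_dir/$(basename -- "$f").$(date +%s%N)"
    done
}
```

A lighter-weight option on GNU systems is `alias rm='rm -I'`, which prompts once before removing more than three files or before removing recursively – far less intrusive than `-i`, but still a speed bump in front of a disaster.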

Continue reading »

Jul 31, 2013

I’ve been carrying on a personal challenge for a while – attempting to remain in the 90th percentile in any activity I take on. To some extent, I think it’s in my nature to compete with others. If nothing else, it’s a good way to be certain I’m actually applying myself.

Some activities are harder than others to gauge, but I find this system provides a good metric. It’s exceptionally hard to be the best at anything, but if you’re in the top 10% then you can be comfortable knowing that you’re doing reasonably good work. If you’re not in the top 10%, then you know you need to improve your methods and/or put in more effort.

Eliot Eshelman Top 10% of Profiles on LinkedIn

Continue reading »

Jun 30, 2013

Dealing with hordes of e-mail is a challenge a lot of us face. Sometimes, I think we slog through without taking a moment to consider improvements. Making yourself more efficient can be worth it, though:

Chart to determine if an optimization effort is worth the time.

I’m not really discussing SPAM filtering. At this point, I think the big guys are already doing a good job taking care of that nuisance. Google Apps with GMail serves me well.

However, people get a lot of “bacon” – e-mail you did technically subscribe to, but may not read regularly. These are messages from companies you’ve purchased from, LinkedIn updates, professional association newsletters & journals, etc. Not things you’d like to delete, but also not items that should demand your immediate attention.

I was able to clear most of my inbox using a single word:

Unsubscribe
Because we’re assuming these are legitimate senders (not spammers), they will adhere to standards by including a link to unsubscribe from the mailings. If you’re feeling obsessive, you might also include “Opt Out” in your filter.
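The same idea can be approximated outside of a webmail filter. As a sketch (the function name and the Maildir-style one-message-per-file layout are my own assumptions):

```shell
# List message files that contain an unsubscribe link or opt-out text.
# Anything it flags is a candidate for a "bacon" folder rather than
# the inbox.
find_bacon() {
    # -r recurse, -i ignore case, -l print only matching file names
    grep -ril -e 'unsubscribe' -e 'opt out' "$1"
}
```

Running it against a mail directory gives a quick census of how much of your backlog is really just newsletters and promotions.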

This one change has made me quite happy at work.