From the bad things and bad people, you learn the right way and right direction towards the successful life.
― Ehsan Sehgal
Computational science is an extremely powerful set of disciplines for conducting scientific investigations. Its end results are usually grounded in the physical sciences and engineering, but they depend on a chain of expertise spanning much of modern science. Doing computational science well depends completely on all of these disparate disciplines working in concert. A big area of focus these days is the supercomputers being used. The predicate for acquiring these immensely expensive machines is the improvement in scientific and engineering product arising from their use. While this should be true, getting across the finish line requires a long chain of activities to be done correctly.
Let’s take a look at all the things we need to do right. Computer engineering and computer science are closest to the machines needed for computational science. These disciplines make these exotic computers accessible and useful for domain science and engineering. A big piece of this work is computer programming and software engineering. The computer program is a way of expressing mathematics in a form the computer can operate on. Writing efficient and correct computer programs is a difficult endeavor all by itself. Mathematics is the language of physics and engineering and essential for the conduct of computing. Mathematics is the middle layer of work between computers and their practical utility. It is a deeply troubling and ironic trend that applied mathematics is disappearing from computational science. As the bridge between the computer and its practical use, it forms the basis for conducting and believing the computed results. Instead of being an area of increased focus, applied math is disappearing into either the maw of computer programming or domain science and engineering. It is being lost as a separate contributor. Finally, we have the end result in science and engineering. Quite often we lose sight of the fact that computers and computing are mere tools, and that these tools must follow specific rules to yield quality, reliable results. Too often the computer is treated like a magic wand.
Another common thread in this horribleness is the increasing tendency for science and engineering to be marketed. The press release has given way to the tweet, but the sentiment is the same. Science is marketed to the masses, who have no taste for the details necessary for high quality work. The deep problem is that this lack of focus and detail is creeping back into science itself. Aspects of scientific and engineering work that used to be utterly essential are becoming increasingly optional. Much of this essential intellectual labor is associated with the hidden aspects of an investigation: the mathematics, checks for correctness, assessment of error, engagement with preceding work, honest doubts about results, and alternative means of investigation. This sort of deep work has been crowded out by flashy graphics, movies, and undisciplined demonstrations of vast computing power.
Some of the terrible things we discuss here are simply bad science and engineering; they would be awful with or without a computer being involved. Other things come from a lack of understanding of how to add computing to an investigation in a quality-focused manner. The failure to recognize the multidisciplinary nature of computational science is often at the root of the awful things I will now describe.
Fake is the new real, You gotta keep a lot a shit to yourself.
― Genereux Philip
Without further ado, here are some terrible things to look out for. Each item on the list is accompanied by links to full blog posts expanding on the topic.
- If you follow high performance computing online (institutional sites, Facebook, Twitter), you might believe that the biggest calculations on the fastest computers are the very best science. You are sold the idea that these massive calculations have the greatest impact on the bottom line. This is absolutely not the case. These calculations are usually one-off demonstrations with little or no technical value. Almost everything of enduring value happens on the computers used by the rank and file to do the daily work of science and engineering. These press release calculations are simply marketing. They almost never have the pedigree or hard-nosed quality work necessary for good science and engineering. – https://wjrider.wordpress.com/2016/11/17/a-single-massive-calculation-isnt-science-its-a-tech-demo/, https://wjrider.wordpress.com/2017/02/10/it-is-high-time-to-envision-a-better-hpc-future/
- The second thing you come across is the notion that a calculation with a larger, finer mesh is better than one with a coarser mesh. In a naïve, pedestrian analysis this would seem utterly axiomatic. The truth is that computational modeling is an assembly of many things all working in concert. This is another example of proof by brute force. In the best circumstances the proposition would hold, but most modeling hardly takes place under the best conditions. The proposition is that the fine mesh allows one to include all sorts of geometric detail, so the computational world looks more like reality. This is assumed a priori. What isn’t usually discussed is where the real challenge in the modeling lies. Is geometric detail actually driving the uncertainty? What is the biggest challenge, and is the modeling focused there? – https://wjrider.wordpress.com/2017/07/21/the-foundations-of-verification-solution-verification/, https://wjrider.wordpress.com/2017/03/03/you-want-quality-you-cant-handle-the-quality/, https://wjrider.wordpress.com/2014/04/04/unrecognized-bias-can-govern-modeling-simulation-quality/
- In concert with these two horrible trends, you often see results presented as the product of a single massive calculation that magically unveils the mysteries of the universe. This is computing as a magic wand, and it has very little to do with science or engineering. It simply does not happen that way. Real science and engineering take hundreds or thousands of calculations. There is an immense amount of background work needed to create high quality results. A great deal of modeling is associated with bounding uncertainty, or bounding the knowledge we possess. A single calculation is incapable of this sort of rigor and focus. If you see a single massive calculation offered as the sole evidence of work, you should smell bullshit and call it out. – https://wjrider.wordpress.com/2016/11/17/a-single-massive-calculation-isnt-science-its-a-tech-demo/
- One of the key elements in modern computing is the complete avoidance of any discussion of how the equations in the code are being solved. The notion is that this detail has no importance. On the one hand, this is evidence of progress; our methods for solving equations are pretty damn good. Still, the methods and the code itself are immensely important details and constitute part of the effective model. There seems to be a mentality that the methods and codes are so good that this sort of thing can be ignored: all one needs is a sufficiently fine mesh, and the results are pristine. This is almost always false. What this almost willful ignorance shows is a lack of sophistication. The methods are immensely important to the results, and we are a very long way from being able to justify the sort of ignorance of this detail that is now rampant. The powers that be want you to believe that the method fades into unimportance because the computers are so fast. Don’t fall for it. – https://wjrider.wordpress.com/2017/05/19/we-need-better-theory-and-understanding-of-numerical-errors/, https://wjrider.wordpress.com/2017/05/12/numerical-approximation-is-subtle-and-we-dont-do-subtle/
- The George Box maxim, that all models are wrong but some are useful, is essential to keep in mind. This maxim is almost uniformly ignored by the high-performance computing bullshit machine. The politically correct view is that super-fast computers will solve the models so accurately that we can stop doing experiments. The truth is that eventually, if we do everything correctly, the models will be solved with great accuracy and their incorrectness will be made evident. I strongly suspect that we are already there in many cases; the models are being solved too accurately, and the real answer to our challenges is building new models. Model building as an enterprise is being systematically disregarded in favor of chasing faster computers. We need far greater balance and a focus on building better models worthy of the computers they are being solved on. We need to build the models needed for better science and engineering, befitting the work we need to do. – https://wjrider.wordpress.com/2017/09/01/if-you-dont-know-uncertainty-bounding-is-the-first-step-to-estimating-it/
- Calculational error bars are an endangered species. We never see them in practice even though we know how to compute them. They should simply be a routine element of modern computing. They are almost never demanded by anyone, and their absence never precludes publication. It certainly never precludes a calculation being promoted as marketing for computing. If I were cynically minded, I might even say that error bars, when used, work against marketing the calculation. The implicit message in the computing marketing is that the calculations are so accurate that they are basically exact, with no error at all. If you don’t see error bars or some explicit discussion of uncertainty, you should view the calculation as flawed, and potentially as simple bullshit. – https://wjrider.wordpress.com/2017/07/07/good-validation-practices-are-our-greatest-opportunity-to-advance-modeling-and-simulation/, https://wjrider.wordpress.com/2017/09/22/testing-the-limits-of-our-knowledge/, https://wjrider.wordpress.com/2017/04/06/validation-is-much-more-than-uncertainty-quantification/
- One way for a calculation to seem really valuable is to declare that it is a direct numerical simulation (DNS). Sometimes this is an utterly valid designation. The other term that follows DNS around is “first principles.” Each of these terms seeks to endow the calculation with a legitimacy it may or may not deserve. One of the biggest problems with DNS is the general lack of evidence for quality and legitimacy. A broad swath of the technical world seems to be OK with treating DNS as equivalent to (or even better than) experiments. This is tremendously dangerous to the scientific process. DNS and first-principles calculations are still based on solving a model, and models are always wrong. This doesn’t mean that DNS isn’t useful, but its utility needs to be proven and bounded by uncertainty. – https://wjrider.wordpress.com/2017/11/02/how-to-properly-use-direct-numerical-simulations-dns/
- Most press releases are rather naked in their implicit assertion that the bigger computer gives a better answer. This is treated as completely axiomatic, so no evidence is provided to underpin the assertion. Usually some colorful graphics or beautifully rendered movies accompany the calculation; their coolness is all the proof we need. This is not science or engineering, even though this mode of delivery dominates the narrative today. – https://wjrider.wordpress.com/2017/01/20/breaking-bad-priorities-intentions-and-responsibility-in-high-performance-computing/, https://wjrider.wordpress.com/2014/09/19/what-would-we-actually-do-with-an-exascale-computer/, https://wjrider.wordpress.com/2014/10/03/colorful-fluid-dynamics/
- Modeling is the use of mathematics to connect reality to theory and understanding. Mathematics is translated into methods and algorithms implemented in computer code. It is ironic that the mathematics forming the bridge between the physical world and the computer is increasingly ignored by science. Applied mathematics has been a tremendous partner for physics, engineering, and computing throughout the history of computational science. This partnership has waned in priority over the last thirty years. Less and less applied math is called upon; it is being replaced by computer programming or by domain science and engineering. Our programs seem to think that the applied math part of the problem is basically done. Nothing could be further from the truth. – https://wjrider.wordpress.com/2014/10/16/what-is-the-point-of-applied-math/, https://wjrider.wordpress.com/2016/09/27/the-success-of-computing-depends-on-more-than-computers/
- A frequent way of describing a computation is to let the mesh define the solution. Little else is given about the calculation, such as the equations being solved or how those equations are being approximated. Frequently, the fact that the solutions are approximations is left out entirely, because it damages the accuracy narrative of massive computing. The designed message is that the massive computer is so powerful that the solution to the equations is effectively exact, and the equations themselves describe reality without error. All of this is in service of the claim that computing can replace experiments or real-world observations. The entire narrative is anathema to science and engineering, doing each a great disservice. – https://wjrider.wordpress.com/2015/07/03/modeling-issues-for-exascale-computation/
- Computational science is often described in terms that are not consistent with the rest of science. We act like it is somehow fundamentally different. Computers are just tools for doing science, allowing us to solve models of reality far more generally than analytical methods can. With all of this power comes a lot of tedious detail needed to do things with quality. This quality comes from the skillful execution of the entire chain of activities described at the beginning of this post. These details all need to be done right to get good results. One of the biggest problems in the current computing narrative is ignorance of the huge set of activities bridging a model of reality and the computer itself. The narrative wants to ignore all of this because it diminishes the sense that these computers are magical in their abilities. The power isn’t magic; it is hard work. Success is not a foregone conclusion, and everyone should ask for evidence rather than take anyone’s word for it. – https://wjrider.wordpress.com/2016/12/22/verification-and-validation-with-uncertainty-quantification-is-the-scientific-method/
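Several of the items above turn on the claim that we know how to compute calculational error bars yet rarely bother. To make that concrete, here is a minimal sketch of solution verification via Richardson extrapolation with a Roache-style safety factor; the mesh values, refinement ratio, and safety factor are hypothetical stand-ins for illustration, not data from any real calculation.

```python
# A minimal sketch of solution verification: estimate the observed order of
# convergence from three meshes, then turn the mesh-to-mesh difference into
# an error bar for the fine-mesh answer. All numbers below are hypothetical.
import math

def observed_order(f_coarse, f_medium, f_fine, r):
    """Observed order of convergence from three meshes with constant refinement ratio r."""
    return math.log(abs((f_coarse - f_medium) / (f_medium - f_fine))) / math.log(r)

def error_bar(f_medium, f_fine, p, r, safety=1.25):
    """Richardson-style error estimate for the fine-mesh value, inflated by a
    safety factor in the spirit of Roache's grid convergence index."""
    return safety * abs(f_fine - f_medium) / (r**p - 1.0)

# Hypothetical scalar results (say, a drag coefficient) on meshes refined by 2x:
f_coarse, f_medium, f_fine = 1.120, 1.060, 1.030
r = 2.0
p = observed_order(f_coarse, f_medium, f_fine, r)
err = error_bar(f_medium, f_fine, p, r)
print(f"observed order = {p:.2f}, fine-mesh value = {f_fine:.3f} +/- {err:.4f}")
```

This is the routine arithmetic the post is talking about: three calculations and a few lines of algebra, not a heroic effort, yet it is almost never reported.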
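The claim that the numerical method matters, not just the mesh, is also easy to demonstrate. The toy comparison below (my own illustration, not drawn from the linked posts) advects a sine wave exactly one period on the same grid with two classic schemes; the first-order and second-order methods return markedly different errors from identical meshes.

```python
# Same mesh, same equation, different methods: periodic linear advection of a
# sine wave over one full period, where the exact answer is the initial data.
import math

def advect(u0, c, nsteps, scheme):
    """March u_t + a*u_x = 0 (a > 0) on a periodic grid with CFL number c."""
    u = list(u0)
    for _ in range(nsteps):
        um = u[-1:] + u[:-1]   # periodic left neighbor u_{i-1}
        up = u[1:] + u[:1]     # periodic right neighbor u_{i+1}
        if scheme == "upwind":  # first-order accurate, strongly diffusive
            u = [ui - c * (ui - li) for ui, li in zip(u, um)]
        else:                   # Lax-Wendroff, second-order accurate
            u = [ui - 0.5 * c * (ri - li) + 0.5 * c * c * (ri - 2.0 * ui + li)
                 for ui, li, ri in zip(u, um, up)]
    return u

n, c = 64, 0.5
u0 = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
nsteps = int(n / c)  # the wave travels exactly one period back to u0
err_upwind = sum(abs(a - b) for a, b in zip(advect(u0, c, nsteps, "upwind"), u0)) / n
err_lw = sum(abs(a - b) for a, b in zip(advect(u0, c, nsteps, "lax-wendroff"), u0)) / n
print(f"upwind L1 error: {err_upwind:.4f}, Lax-Wendroff L1 error: {err_lw:.4f}")
```

The upwind error is more than an order of magnitude larger on the identical mesh, which is the whole point: a press release touting mesh size while staying silent about the method is hiding a detail that controls the answer.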
Taking the marketing narrative at its word is injurious to high quality science and engineering. The narrative seeks to defend the idea that buying these super expensive computers is worthwhile and magically produces great science and engineering, and that the path to advancing the impact of computational science flows predominantly through computing hardware. This is simply a deeply flawed and utterly naïve perspective. Great science and engineering are hard work and never a foregone conclusion. Getting high quality results depends on spanning the full range of disciplines associated with computational science, adaptively, as evidence and results demand. We should always ask hard questions of scientific work and demand hard evidence for its claims. Press releases and tweets are renowned for being cynical advertisements lacking all rigor and substance.
One reason for elaborating upon things that look superficially great but are really terrible is cautionary. The current approach allows shitty work to be viewed as successful because it receives lots of attention. The habit of selling horrible, low-quality work as success destroys progress and undermines the accomplishment of truly high-quality work. We all need to be able to recognize these horrors and strenuously reject them. If we start to police ourselves effectively, perhaps this plague can be driven back and progress can flourish.
The thing about chameleoning your way through life is that it gets to where nothing is real.
― John Green