What’s measured improves

― Peter F. Drucker

In every area of endeavor, standards of excellence are important. Numerical methods are no different. Every area of study has a standard set of test problems on which researchers can demonstrate and study their work. These test problems end up being used not just to communicate work, but also to test whether work has been reproduced successfully, or to compare methods. **Where the standards are sharp and refined, the testing of methods has a degree of precision and results in actionable consequences. Where the standards are weak, expert judgment reigns and progress is stymied.** In shock physics, the Sod shock tube (Sod 1978) is such a standard test. The problem is effectively a “Hello World” problem for the field, but it suffers from weak standards of acceptance focused on expert opinion of what is good and bad, without any unbiased quantitative standard being applied. Ultimately, this weakness in accepted standards contributes to the stagnant progress we are witnessing in the field. It also allows a rather misguided focus and assessment of capability to persist unperturbed by results (standards and metrics can energize progress, https://wjrider.wordpress.com/2016/08/22/progress-is-incremental-then-it-isnt/).

Sod’s shock tube is an example of a test problem being at the right time in the right place. It was published right at the nexus of progress in hyperbolic PDEs, but before the breakthroughs were well publicized. The article introduced a single problem applied to a large number of methods, all of which performed poorly in one way or another. **The methods were an amalgam of old and new, demonstrating the generally poor state of affairs for shock capturing methods in the late 1970’s.** Since its publication it has become the opening ante for a method to demonstrate competence in computing shocks. The issues with this problem were highlighted in an earlier post, https://wjrider.wordpress.com/2016/08/18/getting-real-about-computing-shock-waves-myth-versus-reality/, where a variety of myths about computing shocks are examined.

This problem is a very idealized shock problem in one dimension that is amenable to semi-analytical solution. As a result, an effectively exact solution may be obtained via the solution of nonlinear equations. **The evaluation of the exact solution appropriately for comparison with numerical solutions is itself slightly nontrivial. The analytical solution needs to be properly integrated over the mesh cells to represent the correct integrated control volume values (moreover, this integration needs to be done for the correct conserved quantities).** Comparison is usually done via the primitive variables, which may be derived from the conserved variables using standard techniques (I wrote about this a little while ago, https://wjrider.wordpress.com/2016/08/08/the-benefits-of-using-primitive-variables/). A shock tube is the flow that results when two semi-infinite slabs of gas at different state conditions are held separately and then allowed to interact, creating a self-similar flow. This flow can contain all the basic compressible flow structures: shocks, rarefactions, and contact discontinuities.
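As a sketch of what that integration entails, the Python fragment below (my own illustration, not code from any paper) averages a pointwise solution over finite-volume cells with Gauss-Legendre quadrature. Here `exact_state` is a hypothetical stand-in, a smooth dummy profile, where a real semi-analytic Riemann solution would be substituted.

```python
import numpy as np

def exact_state(x, t):
    # Placeholder profile standing in for the real semi-analytic solution;
    # it only mimics a density jump moving to the right.  Replace with an
    # exact Riemann solver evaluated at (x, t) for actual error measurement.
    rho = 1.0 - 0.875 * (np.tanh(50.0 * (x - 0.5 - 0.3 * t)) + 1.0) / 2.0
    u = np.zeros_like(x)
    E = 2.5 * rho
    return np.stack([rho, rho * u, E])   # conserved variables (rho, rho*u, E)

def cell_averages(edges, t, npts=5):
    """Average the conserved variables over each mesh cell with
    Gauss-Legendre quadrature, so the comparison is made against the
    same integrated control-volume values the scheme actually evolves."""
    xi, w = np.polynomial.legendre.leggauss(npts)   # nodes/weights on [-1, 1]
    mid = 0.5 * (edges[:-1] + edges[1:])
    half = 0.5 * np.diff(edges)
    # map the quadrature nodes into every cell: shape (npts, ncells)
    x = mid[None, :] + half[None, :] * xi[:, None]
    vals = exact_state(x.ravel(), t).reshape(3, npts, -1)
    # cell average = (1/2) * sum_q w_q f(x_q); the weights sum to 2 on [-1, 1]
    return np.einsum('q,vqc->vc', w, vals) / 2.0

edges = np.linspace(0.0, 1.0, 101)   # 100 cells, as in Sod's paper
avg = cell_averages(edges, t=0.2)
```

The same routine works for any mesh resolution, which is exactly what a refinement study needs.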

Specifically, Sod’s shock tube (https://en.wikipedia.org/wiki/Sod_shock_tube) has the following conditions: a one-dimensional domain is filled with an ideal gamma-law gas, γ = 1.4, and divided into two equal regions; on 0 ≤ x ≤ 1/2, ρ = 1, u = 0, p = 1; on 1/2 < x ≤ 1, ρ = 0.125, u = 0, p = 0.1. The flow is described by the compressible Euler equations (conservation of mass, ρ_t + (ρu)_x = 0, momentum, (ρu)_t + (ρu² + p)_x = 0, and energy, E_t + [u(E + p)]_x = 0), and an equation of state, p = (γ − 1)ρe. **After time zero the flow develops a self-similar structure with a right-moving shock followed by a contact discontinuity, and a left-moving rarefaction (expansion fan). This is the classical Riemann problem.** The solution may be found through semi-analytical means by solving a nonlinear equation defined by the Rankine-Hugoniot relations (see Gottlieb and Groth for a wonderful exposition on this solution via Newton’s method).
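For concreteness, here is a minimal sketch of that Newton iteration in Python, using the standard two-branch wave function (shock branch from the Rankine-Hugoniot relations, rarefaction branch from the isentropic relations); the function names are my own, and this is an illustration rather than anyone’s production solver.

```python
import math

G = 1.4  # ratio of specific heats for Sod's problem

def f_and_df(p, rho_k, p_k):
    """Wave function f_K(p) and its derivative: the velocity change across
    the left or right wave; shock branch if p > p_k, else rarefaction."""
    a_k = math.sqrt(G * p_k / rho_k)           # sound speed of the side state
    if p > p_k:  # shock: Rankine-Hugoniot branch
        A = 2.0 / ((G + 1.0) * rho_k)
        B = (G - 1.0) / (G + 1.0) * p_k
        f = (p - p_k) * math.sqrt(A / (p + B))
        df = math.sqrt(A / (p + B)) * (1.0 - 0.5 * (p - p_k) / (p + B))
    else:        # rarefaction: isentropic branch
        f = 2.0 * a_k / (G - 1.0) * ((p / p_k) ** ((G - 1.0) / (2.0 * G)) - 1.0)
        df = (1.0 / (rho_k * a_k)) * (p / p_k) ** (-(G + 1.0) / (2.0 * G))
    return f, df

def star_pressure(rho_l, u_l, p_l, rho_r, u_r, p_r, tol=1e-12):
    """Newton iteration for the pressure between the two nonlinear waves."""
    p = 0.5 * (p_l + p_r)                      # simple initial guess
    for _ in range(50):
        fl, dfl = f_and_df(p, rho_l, p_l)
        fr, dfr = f_and_df(p, rho_r, p_r)
        dp = (fl + fr + (u_r - u_l)) / (dfl + dfr)
        p -= dp
        if abs(dp) < tol * p:
            break
    return p

p_star = star_pressure(1.0, 0.0, 1.0, 0.125, 0.0, 0.1)
# For Sod's data the iteration converges to p* ≈ 0.30313
```

Everything downstream, the star velocity, the wave speeds, and the full self-similar profile, follows from this pressure in closed form.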

The crux of the big issue with how this problem is utilized is that the analytical solution is not used for anything more than display in plots comparing it with numerical solutions. The quality of numerical solutions is then only assessed qualitatively. This is a huge problem that directly inhibits progress. **This is a direct result of having no standard beyond expert judgment on quality. It leads to the classic “hand waving” argument for the quality of solutions.** Actual quantitative differences are not discussed as part of the accepted standard. The expert can deftly focus on the parts of the solution they want to and ignore the parts that might be less beneficial to their argument. Real problems can persist and effectively be ignored (such as the very dissipative nature of some very popular high-order methods). Under this lack of a standard, relatively poorly performing methods can retain a high level of esteem while better performing methods are effectively ignored.

With all these problems, why does this state of affairs persist year after year? The first thing to note is that the standard of expert judgment is really good for experts. **The expert can rule by asserting their expertise, creating a bit of a flywheel effect.** For experts whose favored methods would be exposed by better standards, it allows their continued use with relative impunity. The experts are then gatekeepers for publications and standards, which tends to further the persistence of this sad state of affairs. The lack of any standard simply energizes the status quo and drives progress into hiding.

The key thing that has allowed this absurdity to exist for so long is the loss of accuracy associated with discontinuous solutions. **For nonlinear solutions of the compressible Euler equations, high-order accuracy is lost in shock capturing. As a result, the designed order of accuracy for a computational method cannot be measured with a shock tube solution.** Consequently, one of the primary aims of verification is not achieved using this problem. One must always remember that order of accuracy is the confluence of two aspects, the method and the problem. Those stars need to align for the order of accuracy to be delivered.

Order of accuracy is almost always shown in results for other problems where no discontinuity exists. Typically a mesh refinement study, error norms, and the order of accuracy are provided as a matter of course. **The same data is (almost) never shown for Sod’s shock tube. For discontinuous solutions the order of accuracy is sub-linear (less than one).** Ideally, the nonlinear features of the solution (shocks and expansions) converge at first order, and the linearly degenerate features (shears and contacts) converge at less than first order, depending on the details of the method (see the paper by Banks, Aslam and Rider (me)). At the core of the acceptance of not showing the error or convergence for shocked problems is the lack of differentiation between methods due to similar convergence rates for all of them (if they converge!). The relative security offered by the Lax-Wendroff theorem further emboldens people to ignore things (the weak solution guaranteed by it has to be entropy satisfying to be the right one!).
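Computing the observed rate from such a refinement study is nearly a one-liner; the small sketch below uses illustrative, made-up error numbers (not data from any paper) to show how a discontinuity drags the rate down toward one regardless of the design order.

```python
import numpy as np

def observed_order(errors, ratio=2.0):
    """Observed convergence rate between successive refinements:
    p = log(E_coarse / E_fine) / log(r), for a fixed refinement ratio r."""
    e = np.asarray(errors, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(ratio)

# Synthetic L1 errors under successive mesh halving, the kind of sequence a
# shock tube produces: the rates hover near (or below) one, not the design order.
errs = [0.020, 0.011, 0.0058, 0.0030]
rates = observed_order(errs)
print(rates)
```

Reporting this short table of errors and rates alongside the usual plots would already be a large step up from current practice.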

This is because the primal point of verification cannot be satisfied, but other aspects are still worthwhile (or even essential) to pursue. **Verification is also about error estimation, and when the aims of order verification cannot be achieved, this becomes a primary concern.** What people do not report, and the aspect that is missing from the literature, is the relatively large difference in error levels between methods, and the practical impact of these differences. For most practical problems, the design order of accuracy cannot be achieved. *These problems almost invariably converge at the lower order, but the level of error from a numerical method is still important, and may vary greatly based on details. In fact, the details and error levels actually have greater bearing on the utility of the method and its efficacy pragmatically under these conditions.*

For all these reasons the current standard and practice with shock capturing methods are doing a great disservice to the community. **The current practice inhibits progress by hiding deep issues and failing to expose the true performance of methods. Interestingly, the source of this issue extends back to the inception of the problem by Sod.** I want to be clear that Sod wasn’t to blame; none of the methods available to him were acceptable. Within 5 years very good methods arose, but the manner of presentation chosen originally persisted. Sod only showed qualitative pictures of the solution at a single mesh resolution (100 cells), and relative run times for the solution. This manner of presentation has persisted to the modern day (nearly 40 years almost without deviation). One can travel through the archival literature and see this pattern repeated over and over in an (almost) unthinking manner. The bottom line is that it is well past time to do better and set about using a higher standard.

At a bare minimum we need to start reporting errors for these problems. This ought not to be enough, but it is an absolute minimum requirement. The problem is that the precise measurement of error is prone to vary with details of implementation. This puts the onus on the full expression of the error measurement, itself an uncommon practice. It is not commonly appreciated that the difference between methods is actually substantial. **For example, in my own work with Jeff Greenough, the error level for the density in Sod’s problem between fifth-order WENO and a really good second-order MUSCL method differs by a factor of two in favor of the second-order method! (see Greenough and Rider 2004; the data is given in the tables below from the paper).** This is exactly the sort of issue the experts are happy to resist exposing. Beyond this small step forward, the application of mesh refinement with convergence testing should be standard practice. In reality we would be greatly served by looking at the rate of convergence feature-by-feature. We could cut the problem into regions and measure the error and rate of convergence separately for the shock, rarefaction and contact. This would provide a substantial amount of data that could be used to measure the quality of solutions in detail and spur progress.

Two tables of data from Greenough and Rider 2004 displaying the density error for Sod’s problem (PLMDE = MUSCL).
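The feature-by-feature measurement suggested above is straightforward to sketch. In the fragment below the region boundaries and the test data are hypothetical placeholders; in a real comparison the region edges would be read off the exact wave positions at the comparison time.

```python
import numpy as np

def regional_l1_errors(x, dx, num, exact, regions):
    """L1 error restricted to named sub-intervals, so that the shock,
    contact, and rarefaction can each be scored on their own."""
    out = {}
    for name, (a, b) in regions.items():
        m = (x >= a) & (x < b)
        out[name] = np.sum(np.abs(num[m] - exact[m]) * dx[m])
    return out

# Hypothetical region edges bracketing each wave at the comparison time;
# the real edges come from the exact wave speeds.
regions = {'rarefaction': (0.25, 0.50),
           'contact':     (0.60, 0.75),
           'shock':       (0.75, 0.90)}

# Toy data on 100 uniform cells: a piecewise-constant "exact" profile and a
# numerical solution carrying a uniform 1e-2 pointwise error.
x = np.linspace(0.005, 0.995, 100)     # cell centers
dx = np.full(100, 0.01)
exact = np.where(x < 0.7, 1.0, 0.125)
num = exact + 0.01
errs = regional_l1_errors(x, dx, num, exact, regions)
```

Tabulating these regional errors over a refinement sequence gives a per-feature convergence rate, which is exactly the kind of detail current practice hides.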

We still use methods quite commonly that do not converge to the right solution for discontinuous problems (mostly in “production” codes). Without convergence testing this sort of pathology goes undetected. For a problem like Sod’s shock tube, it can still go undetected because the defect is relatively small. Usually it is only evident when the testing is on a more difficult problem with stronger shocks and rarefactions. **Even then it is something that has to be looked for, showing up as reduced convergence rates, or the presence of a constant un-ordered term in the error structure, E(h) = A h^α + B, instead of the standard E(h) = A h^α.** This subtlety is usually lost in a field where people don’t convergence test at all unless they expect full order of accuracy for the problem.
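One way to look for that un-ordered constant is to fit the three-level error model E(h) = A·h^α + B, which a geometric refinement sequence determines in closed form. The sketch below is my own illustration with synthetic numbers, not data from any paper.

```python
import math

def fit_error_model(e1, e2, e3, r=2.0):
    """Fit E(h) = A*h^alpha + B through three errors at h, h/r, h/r^2.
    Successive differences cancel B, so (e1-e2)/(e2-e3) = r^alpha; a
    nonzero floor B flags error that never converges away, the signature
    of a scheme limiting to the wrong weak solution."""
    d1 = e1 - e2
    d2 = e2 - e3
    alpha = math.log(d1 / d2) / math.log(r)
    # back out the un-ordered floor from the two finest levels
    B = (e3 * d1 - e2 * d2) / (d1 - d2)
    return alpha, B

# Synthetic errors manufactured from E = 0.5*h + 0.01 at h = 0.1, 0.05, 0.025:
# the fit recovers alpha = 1 and the hidden floor B = 0.01.
alpha, B = fit_error_model(0.060, 0.035, 0.0225)
```

A floor B comparable to the finest-grid error is a red flag that ordinary single-rate convergence plots will not show.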

Now that I’ve thrown a recipe for improvement out there to consider, I think it’s worthwhile to defend expert judgment just a bit. Expertise has its role to play in progress. There are aspects of science that are not prone to measurement; science is still a human activity with tastes and emotion. This can be a force for good and bad, and the need for dispassionate measurement is there as a counterweight to the worst instincts of mankind. **Expertise can be used to express a purely qualitative assessment that can make the difference between something that is merely good and something great. Expert judgment can see through complexity to distill results into a form with greater meaning.** Expertise is more of a tiebreaker than the deciding factor. The problem today is that current practice means all we have is expert judgment, and this is a complete recipe for the status quo and an utter lack of meaningful progress.

*The important outcome from this discussion is crafting a path forward that makes the best use of our resources. Apply appropriate and meaningful metrics to the performance of methods and algorithms to make progress, or the lack of it, concrete. Reduce, but retain, the use of expertise and apply it to the qualitative aspects of results. The key to doing better is striking an appropriate balance. We don’t have it now, but getting to an improved practice is actually easy. This path is only obstructed by the experts’ tendency to hold onto their stranglehold.*

An expert is someone who knows some of the worst mistakes that can be made in his subject, and how to avoid them.

― Werner Heisenberg

Historical montage of Sod shock tube results from Sod 1978, Harten 1983, Suresh and Huynh 1997, and Jiang and Shu 1996. First is Sod’s result for perhaps the best-performing method from his paper (just expert judgment on my part, LOL).

Harten, Ami. “High resolution schemes for hyperbolic conservation laws.” *Journal of Computational Physics* 49, no. 3 (1983): 357-393.

Suresh, A., and H. T. Huynh. “Accurate monotonicity-preserving schemes with Runge–Kutta time stepping.” *Journal of Computational Physics* 136, no. 1 (1997): 83-99.

Jiang, Guang-Shan, and Chi-Wang Shu. “Efficient Implementation of Weighted ENO Schemes.” *Journal of Computational Physics* 126, no. 1 (1996): 202-228.

Sod, Gary A. “A survey of several finite difference methods for systems of nonlinear hyperbolic conservation laws.” *Journal of Computational Physics* 27, no. 1 (1978): 1-31.

Gottlieb, J. J., and C. P. T. Groth. “Assessment of Riemann solvers for unsteady one-dimensional inviscid flows of perfect gases.” *Journal of Computational Physics* 78, no. 2 (1988): 437-458.

Banks, Jeffrey W., T. Aslam, and W. J. Rider. “On sub-linear convergence for linearly degenerate waves in capturing schemes.” *Journal of Computational Physics* 227, no. 14 (2008): 6985-7002.

Greenough, J. A., and W. J. Rider. “A quantitative comparison of numerical methods for the compressible Euler equations: fifth-order WENO and piecewise-linear Godunov.” *Journal of Computational Physics* 196, no. 1 (2004): 259-281.

Katate Masatsuka

said: Hi Bill,

Interesting article, which reminds me of a computer project in Bram’s CFD-II class in 1996, where we were asked to compute discrete L1 errors to compare various algorithms for two-fluid shock tube problems. I remember I wrote a code to compute exact cell-averaged solutions on a given grid in order to compute the errors (not very important for 2nd-order solutions, though).

For high-order methods with polynomials of high degrees, the errors perhaps should be computed in integral norms.

Hiro