― Philip Houston

In the constellation of numerical analysis theorems the Lax equivalence theorem may have no equals in its importance. It is simple and its impact is profound on the whole business of numerical approximations. The theorem basically implies that if you provide a consistent approximation to the differential equations of interest and it is stable, the solution will converge. The devil of course is in the details. Consistency is defined by having an approximation with ordered errors in the mesh or time discretization, which implies that the approximation is at least first-order accurate, if not better. A key aspect of this that is overlooked is the necessity to have mesh spacing sufficiently small to achieve the defined error where failure to do so renders the solution erratically convergent at best.

The importance of this theorem in the basis for our commitment to high performance computing cannot be understated. The generally implicit assumption that more computer power is the path to better calculation is incredibly widespread. The truth and limits of this assumption are important to understand. If things are working well, it is a good assumption, but it should not be taken for granted. A finer mesh is almost a priori assumed to be better. This basic premise is rarely if ever challenged, but it should be. Under ideal circumstances it is a fine assumption, but it is not challenged nearly as often as it should be. Part of this narrative is the description of when it should be challenged and the nature of circumstances where verification achieves outright conflict with the utility of the model. We end with the prospect that the verified model might only be accurate and verified in cases where the model is actually invalid.

For example we can carry out a purely mathematical investigation of the convergence of a numerical solution without any regard for its connection to a models utility. Thus we can solve a problem using time and length scales that would render the underlying model as ridiculous and satisfy the mathematical conditions we are examining. Stepping away from this limited perspective to look at the combination of convergence with model utility is necessary. In practical terms the numerical error and convergence where a model is valid is essentially important to understand. Ultimately we need accurate numerical solutions in regimes where a model is valid. More than just being accurate, we need to understand what the numerical errors are in these same regions. Too often we see meshes refined to a ridiculous degree without regard to whether the length scales in the mesh render the model invalid.

I have always thought the actions of men the best interpreters of their thoughts.

― John Locke

A second widespread and equally troubling perspective is the wrong-headed application of the concept of grid independent solutions. Achieving a grid-independent solution is generally viewed as a good to great thing. It is if the conditions are proper. It is not under almost identical conditions. The difference between the good almost great thing, and the awful almost horrible thing is a literal devil in the details, but essential to the conduct of excellence in computational science. One absolutely must engage in the quantitative interrogation of the results. Without a quantitative investigation, the good grid independent results could actually be an illusion.

If the calculation is convergent, the grid independence actually means the calculation is not sensitive to the grid in a substantive manner. This requires that the solution have small errors and the grid independence is the effect of those errors being ordered and negligible. In fact it probably means that you’re engaged in overkill and could get away with a coarser (cheaper) grid without creating any substantial side effects. On the other hand a grid independent solution could mean that the calculation is complete garbage numerically because the grid doesn’t affect the solution. It is not convergent. These two cases are virtually identical in practice and only revealed by doing detailed quantitative analysis of the results. In the end grid independence isn’t either necessary, or sufficient for quality computations. Yet at the same time it is often the advised course of action in conducting computations!

Stability then becomes the issue where you must assure that the approximations produce bounded results under the appropriate control of the solution. Usually the stability is defined as a character of the time stepping approach and requires that the time step be sufficiently small to provide stability. A lesser-known equivalence theorem is due to Dahlquist and applies to integrating ordinary differential equations and applies to multistep methods. From this work the whole aspect of zero stability arises where you have to assure that a non-zero time step size gives stability in the first place. More deeply, Dahlquist’s version of the equivalence theorem applies to nonlinear equations, but is limited to multistep methods where as Lax’s applies to linear equations.

Here is the theorem in all its glory (based on wikipedia entries).

The Lax Equivalence Theorem:

For the numerical solution of partial differential equations. It states that for a consistent finite difference method for a well-posed linear initial value problem, the method is convergent if and only if it is stable. Published in 1956, discussed in a seminar by Lax in 1954.

The Dahlquist Equivalence Theorem:

Now suppose that a consistent linear multistep method is applied to a sufficiently smooth differential equation and that the starting values $y_0, y_1, y_2, \ldots y_n$ all converge to the initial value $y_0$ as $h \rightarrow 0$. Then, the numerical solution converges to the exact solution as $h \rightarrow 0$ if and only if the method is zero-stable.

Something to take serious note of regarding these theorems is the concept of numerical stability was relatively new and novel in 1956. Both theorems helped to lock the concept of stability into being central to numerical analysis. As such both theorems have place as one of the cornerstones of modern numerical analysis. The discovery of the issue of stability was first addressed by Von Neumann for PDEs in the late 1940’s, and independently explored by Dahlquist for ODEs in the early 1950’s. Nonetheless it should be noted that these theorems were published when numerical stability was a new concept and not completely accepted and understood for its importance.

Given that we can very effectively solve nonlinear equations using the principles applied by Lax’s theorem is remarkable. Even though the theorem doesn’t formally apply to the nonlinear case the guidance is remarkably powerful and appropriate. We have a simple and limited theorem that produces incredible consequences for any approximation methodology that we are applying to partial differential equations. Moreover the whole this was derived in the early 1950’s and generally thought through even earlier. The theorem came to pass because we knew that approximations to PDEs and their solution on computers do work. Dahlquist’s work is founded on a similar path; the availability of the computers shows us what the possibilities are and the issues that must be elucidated. We do see a virtuous cycle where the availability of computing capability spurs on developments in theory. This is an important aspect of healthy science where different aspects of a given field push and pull each other. Today we are counting on hardware advances to push the field forward. We should be careful that our focus is set where advances are ripe, its my opinion that hardware isn’t it.

The equivalence theorem also figures prominently in the exercise of assuring solution quality through verification. This requires the theorem to be turned ever so slightly on its head in conducting the work. Consistency is not something that can be easily demonstrated; instead verification focuses on convergence, which may be. We can control the discretization parameters, mesh spacing or time step size to show how the solution changes with respect to them. Thus in verification, we use the apparent convergence of solutions to imply both stability and consistency. Stability is demonstrated by fiat in that we have a bounded solution to use to demonstrate convergence. Having a convergent solution with ordered errors then provides consistency. The whole thing barely hangs together, but makes for the essential practice in numerical analysis.

At this time we are starting to get at the practical limitations of this essential theorem. One way to get around the issues is the practice of hard-nosed code verification as a tool. With code verification you can establish that the code-discretization produces the correct solution to an exact solution to the PDEs. In this way the consistency of the approximation can become firmly established. Part of the remaining issue is the fact that the analytical exact solution is generally far nicer and better behaved that the general solutions you will be applying the code to. Still this is a vital and important step in the overall practice. Without firm code verification as a foundation, the practice of verification without exact solutions is simply untethered. Here we have firmly established the second major limitation of the equivalence theorem, the first being it’s being limited to linear equations.
One of the very large topics in V&V that is generally overlooked is models and their range of validity. All models are limited in terms of their range of applicability based on time and length scales. For some phenomena this is relatively obvious, e.g., multiphase flow. For other phenomena the range of applicability is much more subtle. Among the first important topics to examine is the satisfaction of the continuum hypothesis, the capacity of a homogenization or averaging to be representative. The degree of satisfaction of homogenization is dependent on the scale of the problem and degrades as phenomenon becomes smaller scale. For multiphase flow this is obvious as the example of bubbly flow shows. As the number of bubbles becomes smaller any averaging becomes highly problematic. It argues that the models should be modified in some fashion to account for the change in scale size.

I’ll get at another extreme example of the nature of scale size and the capacity of standard models to apply. Take the rather pervasive principle associated with the second law of thermodynamics. As the size of the system being averaged becomes smaller and smaller eventually violations of this law can be occasionally observed. This clearly means that natural and observable fluctuations may cause the macroscopic laws to be violated. The usual averaged continuum equations do not support this behavior. It would imply that the models should be modified for appropriate scale dependence where such violations in the fundamental laws might naturally appear in solutions.

Another more pernicious and difficult issues are homogenization assumptions that are not so fundamental. Consider the situation where a solid is being modeled in a continuum fashion. When the mesh is very large, the solid comprised of discrete grains can be modeled by averaging over these grains because there are so many of them. Over time we are able to solve problems with smaller and smaller mesh scales. Ultimately we now solve problems where the mesh size approaches the grain size. Clearly under this circumstance the homogenization used for averaging will lose its validity. The structural variations in the homogenized equations are removed and should become substantial and not be ignored as the mesh size becomes small. In the quest for exascale computing this issue is completely and utterly ignored. Some areas of study for high performance computing consider these issues carefully most notably climate and weather modeling where the mesh size issues are rather glaring. I would note that these fields are subjected to continual and rather public validation.

Let’s get to some other deeper limitations of the power of this theorem. For example the theorem gets applied to the models without any sense of the validity of the model. Without consideration for validity, the numerical exploration of convergence may be explored with vigor and be utterly meaningless. The notion of validation is important in considering whether verification should be explored in depth. As such one gets to the nature of V&V as holistic and iterative. As such the overall exploration should always look back at results to determine whether everything is consistent and complete. If the verification exercise takes the model outside its region of validity, the verification error estimation may be regarded as suspect.

The theorem is applied to the convergence of the model’s solution in the limit where the “mesh” spacing goes to zero. Models are always limited in their applicability as a function of length and time scale. The equivalence theorem will be applied and take many models outside their true applicability. An important thing to wrangle in the grand scheme of things is whether models are being solved and convergent in the actual range of scales where it is applicable. A true tragedy would be a model that is only accurate and convergent in regimes where it is not applicable. This may actually be the case in many cases most notably the aforementioned multiphase flow. This calls into question the nature of the modeling and numerical methods used to solve the equations.

Huge volumes of data may be compelling at first glance, but without an interpretive structure they are meaningless.

― Tom Boellstorff

The truth is we have an unhealthy addiction to high performance computing as a safe and trivial way to make progress computationally. We continually move forward with complete faith that a larger calculations are intrinsically better without doing the legwork to demonstrate that they are. We avoid focus on other routes to better solutions in modeling, methods and algorithms in favor of simply porting old models, methods and algorithms to much larger computers. Without having a balanced program with appropriate intellectual investment and ownership of the full breadth of computational science, our approach is headed for disaster. We have a significant risk of producing a computational science program that completely fools itself. We fail to do due diligence activities in favor of simply making the assumption that a finer mesh computed on bigger computers is a priori better than taking another route to improvement.

The difficulty lies not so much in developing new ideas as in escaping from old ones.

― John Maynard Keynes

Lax, Peter D., and Robert D. Richtmyer. “Survey of the stability of linear finite difference equations.” Communications on Pure and Applied Mathematics 9.2 (1956): 267-293.

Dahlquist, Germund. “Convergence and stability in the numerical integration of ordinary differential equations.” Mathematica Scandinavica 4 (1956): 33-53.