Mathematics is the door and key to the sciences.

— Roger Bacon

images-2It is time to return to great papers of the past. The past has clear lessons about how progress can be achieved. Here, I will discuss a trio of papers that came at a critical juncture in the history of numerically solving hyperbolic conservation laws. In a sense, these papers were nothing new, but provided a systematic explanation and skillful articulation of the progress at that time. In a deep sense these papers represent applied math at its zenith, providing a structural explanation along with proof to accompany progress made by others. These papers helped mark the transition of modern methods from heuristic ideas to broad adoption and common use. Interestingly, the depth of applied mathematics ended up paving the way for broader adoption in the engineering world. This episode also provides a cautionary lesson about what holds higher order methods back from broader acceptance, and the relatively limited progress since.

The three papers I will focus on are:

Harten, Ami. “High resolution schemes for hyperbolic conservation laws.” Journal of computational physics 49, no. 3 (1983): 357-393.

Harten, Ami. “On a class of high resolution total-variation-stable finite-difference schemes.” SIAM Journal on Numerical Analysis 21, no. 1 (1984): 1-23.

Sweby, Peter K. “High resolution schemes using flux limiters for hyperbolic conservation laws.” SIAM journal on numerical analysis 21, no. 5 (1984): 995-1011.

The first two are by the late Ami Harten providing a proof of the monotone behavior seen with the heuristic methods existing at that time. The proofs provided some confidence to many that had been lacking from the truly innovative, but largely heuristic invention of the methods. The third paper by Peter Sweby provided a clear narrative and an important graphical tool for understanding these methods and displaying limiters, the nonlinear mechanism that produced the great results. The “Sweby diagram” was the reduction of these complex nonlinear methods to a nonlinear function. The limiter was then a switch between two commonly used classical methods. The diagram produced a simple way of seeing whether any given limiter was going to give second-order non-oscillatory results. Together these three papers paved the way for common adoption of these methods.

Mathematics is the art of giving the same name to different things.

– Henri Poincaré

In the 1970’s three researchers principally invented these nonlinear methods, Jay Boris, Bram Van Leer, and Vladimir Kolgan.  Of these three Boris and Van Leer achieved fame and great professional success. The methods were developed heuristically and worked very well. Each of these methods explicitly worked to overcome Godunov’s barrier theorem that says a second-order linear method cannot be monotone. Both made the methods nonlinear through adapting the approximation based on the local structure of the solution. Interestingly Boris and Van Leer were physicists, Kolgan was an engineer (Van Leer went on to work extensively in engineering). Kolgan was a Russian in the Soviet Union and died before his discovery could take its rightful place next to Boris and Van Leer (Van Leer has gone to great effort to correct the official record).

[Mathematics] is security. Certainty. Truth. Beauty. Insight. Structure. Architecture. I see mathematics, the part of human knowledge that I call mathematics, as one thing—one great, glorious thing. Whether it is differential topology, or functional analysis, or homological algebra, it is all one thing. … They are intimately interconnected, they are all facets of the same thing. That interconnection, that architecture, is secure truth and is beauty. That’s what mathematics is to me.

― Paul R. Halmos

The problem with all these methods was a lack of mathematical certainty on the quality of results along with proofs and structured explanations of their success. This made the broader community a bit suspicious of the results. In a flux corrected transport (FCT, Boris’ invention) commemorative volume this suspicion is noted. At conferences, there were questions raised about the results that implied that the solutions were faked. The breakthrough with these new methods was that good, too good to be true. Then the explanations came and made a strong connection to theory. The behavior seen in the results had a strong justification in mathematics, and the trust in the methodology grew. Acceptance came on the heals of this trust and widespread adoption.

Harten and others continued to search for even better methods after introducing TVD schemes. The broad category of essentially non-oscillatory (ENO) methods was invented. It has been a broad research success, but never experienced the wide spread adoption that these other methods enjoyed. Broadly speaking, the TVD methods are used in virtually every production code for solving hyperbolic conservation laws. In the physics world, many use Van Leer’s approach and engineering uses Harten-Sweby’s formalism broadly. FCT is used somewhat in the physics world, but its adoption is far less common. Part of the reason for this disparity comes down to the power of mathematical proof and the faith it gives. The lack of success of follow-on methods to get adoption and have success comes from the lack of strong theory with its requisite confidence. Faith, confidence and systematic explanation are all provided by well executed applied mathematics.

What is TVD the theory and how does it work?

(Note: WordPress’ Latex capability continues to frustrate, I cannot get them to typeset so if you can read TeX the equations will make sense)

In a nutshell, TVD is a way of extending the behavior of monotone methods (upwind for the purposes of this discussion) to high-order nonlinear methods. Upwind methods have the benefit of positive coefficients in their stencil. If we write this down for a scalar advection equation, u_t + a u_x = 0 , we get the following form, $u_j^{n+1} = u_j^n – C_{j-1/2} \left( u_j^n – u_{j-1}^n \right) + D_{j+1/2} \left(u_{j+1}^n – u_j^n \right) $. The key for the methods is the positivity of the functions  C_{j-1/2} \ge 0 and D_{j+1/2} \ge 0. For example, an upwind method will give constants for these functions, $latex  C_{j-1/2}  = a \Delta t/\Delta x = \nu $ and D_{j+1/2} = 0 for a > 0. The coefficient is the famous CFL (Courant-Friedrichs-Lewy) number. For the TVD methods, these functions become nonlinear functions of the solution itself, but satisfy the inequalities. Harten had done other work that connected monotone methods to entropy satisfying (i.e., physically relevant solutions), which then implies that TVD methods would be a route to similar results (this would seem to be true, but definitive proofs are lacking). Still the connections are all there and close enough to provide faith in the methodology. This is where Sweby’s work comes in and provides a crucial tool for broad acceptance of this methodology.

200px-LimiterRegionWhat Sweby did was provide a wonderful narrative description of TVD methods, and a graphical manner to depict them. In the form that Sweby described, TVD methods were a nonlinear combination of classical methods: upwind, Lax-Wendroff and Beam Warming. The limiter was drawn out of the formulation and parameterized by the ratio of local finite differences. The limiter is a way to take an upwind method and modify it with some part of the selection of second-order methods and satisfy the inequalities needed to be TVD. This technical specification took the following form, $ C_{j-1/2}  = \nu \left( 1 + 1/2\nu(1-\nu) \phi\ledt(r_{j-1/2}\right) \right) $ and D_{j+1/2} =1/2\nu(1-\nu) \phi\left(r_{r+1/2}\right) for a > 0 and $r_{j-1/2} = \frac{ u_{j}^{n} – u_{j-1}^{n} }{ u_{j-1}^{n}  – u_{j-2}^{n}} $. This produced a beautiful and simple diagram that usefully displayed how any given method compared to others. This graphical means was probably the essential step for broad acceptance (my opinion, but for visual people it was essential and a lot of technical folks are visual).

Beyond the power of applied mathematics, other aspects of the technical problem have contributed to the subsequent lack of progress. The biggest issue is the quantum leap in performance from first- to second-order accuracy. The second order methods produce results that seem turbulent because first-order methods produce a truncation error that laminarizes flows. The second-order method produces results for complex problems that have the look and feel of real flows (this may also be quantitatively true, but the jury is out). Important flows are turbulent, high energy with very large Reynolds numbers. First-order schemes cannot produce these realistically at all. Second-order methods can, and for this reason the new schemes unleashed utility upon the World. With these methods, the solutions took on the look, feel and nature of reality. For this reason, these schemes became essential for codes.

The second reason is the robustness of these methods. First-order monotone methods like upwind are terribly robust. These methods produce physically admissible solutions and do not fail often. Codes run problems to completion. The reason is their extremely dissipative nature. This makes them very attractive for difficult problems and almost guarantees a solution for the calculation. The same dissipation also destroys almost every structure in the solution and smears out all the details that matter. You get answer, but an answer that is fuzzy and inaccurate. These first order methods end up being as extremely expensive when accuracy is desired. Harten’s TVD methods provided a systematic connection of the new second-order methods to the old reliable first-order methods. The new methods were almost as reliable as the first-order methods, but got rid of much of the smearing dissipation that plagued them. Having a structured and expertly produced explanation for the behavior of these methods with clear connections to things people already knew produced rapid adoption by practitioners.

Mathematics is the cheapest science. Unlike physics or chemistry, it does not require any expensive equipment. All one needs for mathematics is a pencil and paper.

― George Pólya

The follow-on efforts with higher than second-order methods have lacked these clear wins. It is clear that going past second-order does not provide the same sort of quantum leap in results. The clear connection and expectations of robustness is also lacking. The problems do not stop there. The essentially non-oscillatory methods select the least oscillatory local approximation, which also happens to be quite dissipative by its very nature. Quite often the high-order method is actually not threatening oscillations at all yet a less accurate approximation is chosen needlessly reducing accuracy. Furthermore, the adaptive approximation selection can preferentially choose unstable approximation in an evolutionary sense, which can result in catastrophe. The tendency to produce the worst of both Worlds has doomed their success and broad adoption. Who wants dissipative and fragile? No one! No production code would make these choices, ever!

Recent efforts have sought to rectify this shortcoming. Weighted ENO methods (WENO) have provided far less intrinsically dissipative methods that also enhance the accuracy. These methods are still relatively dissipative compared to the best TVD methods and invoke their expensive approximations needlessly in regions of the solution where the nonlinear mechanisms are unnecessary. Efforts have produced positivity preserving methods that avoid the production of inherently unphysical results with high-order methods. These developments are certainly a step in the right direction. The current environment of producing new legacy codes is killing any other the energy to stewart these methods into broad adoption. The expense, overly dissipative nature and relatively small payoff all stand in the way.

What might help in making progress past second-order methods?

The first thing to note is that TVD methods are mixed in their order of accuracy. They are second-order in a very loose sense and only when one takes the most liberal norm for computations (L1 for you nerds out there). For the worst-case error, TVD methods are still first-order (L-infinity, and multiple dimensions). This is a pretty grim picture until one also realizes that for nonlinear PDEs with general solutions, first-order accuracy is all you get anyway unless you are willing to track all discontinuities. These same conditions hold for high-order methods we might like to adopt. The accuracy from the new methods is always quite limited and puts a severe constraint on the efficiency of the methods, and a challenge to development and progress. The effort that it takes to get full accuracy for nonlinear problems is quite large, and if this accuracy is not realized, the effort is not worth it. We do know that some basic elements of high-order methods yield substantial benefits, but these benefits are limited (an example are high-order edge values used in the piecewise parabolic method – PPM).

I asked myself, what worked so well for TVD? To me there is a clear and unambiguous connection to what worked in the past. The past was defined by the combination of upwind, Lax-Wendroff, and Beam-Warming methods. These methods along with largely ad hoc stabilization mechanisms provided the backbone of production codes preceding the introduction of these methods. Now TVD schemes form the backbone of production codes. It would seem that new higher order methods should preserve this sort of connection. ENO and WENO methods did not do this, which partially explains their lack of adoption. My suggestion would be a design of methods where one uses a high-order method that can be shown to be TVD, or the high-order method closest to a chosen TVD scheme. This selection would be high-order accurate by construction, but would also produce oscillations at third-order. This is not the design principle that ENO methods use where the unproven assertion is oscillations at the order of approximation. The tradeoff between these two principles is larger potential oscillations with less dissipation and a more unambiguous connection to the backbone TVD methods.

1. Everyone is entitled to their opinion about the things they read (or watch, or listen to, or taste, or whatever). They’re also entitled to express them online.

2. Sometimes those opinions will be ones you don’t like.

3. Sometimes those opinions won’t be very nice.

4. The people expressing those may be (but are not always) assholes.

5. However, if your solution to this “problem” is to vex, annoy, threaten or harrass them, you are almost certainlya bigger asshole.

6. You may also be twelve.

7. You are not responsible for anyone else’s actions or karma, but you are responsible for your own.

8. So leave them alone and go about your own life.

[Bad Reviews: I Can Handle Them, and So Should You(Blog post, July 17, 2012)]

 John Scalzi

pileofshitMy own connection to this work is a nice way of rounding out this discussion. When I started looking at modern numerical methods, I started to look at the selection of approaches. FCT was the first thing I hit upon and tried. Compared to the classical methods I was using, it was clearly better, but its lack of theory was deeply unsatisfying. FCT would occasionally do weird things. TVD methods had the theory and this made is far more appealing to my technically immature mind. After the fact, I tried to project FCT methods onto the TVD theory. I wrote a paper documenting this effort. It was my first paper in the field. Unknowingly, I walked into a veritable mine field and complete shit show. All three of my reviewers were very well-known contributors to the field (I know it is supposed to be anonymous, and the shit show that unveiled itself, unveiled the reviewers too).

The end result was that the paper was never published. This decision occurred five years after it was submitted, and I had simply moved on. My first review was from Ami Harten who basically said this paper is awesome and publish it. He signed the review and sent me some lecture notes on the same topic. I was over the moon, and did call Ami and talk briefly. Six months later my second review came in. It was as different as possible from Ami’s. It didn’t say this exactly, but in a nutshell, it said the paper was a piece of shit. It still remains the nastiest and most visceral review I’ve ever gotten. It was technically flawless on one hand and thoroughly unprofessional in tone on the other. My third review came a year later and was largely editorial in nature. I revised the paper and resubmitted. While all this unfolded Ami died, and the journal it was submitted to descended into chaos partially due to the end of the cold war and its research largess. When it emerged from chaos, I decided to publish the work was largely pointless and not worth the effort.

Some commentary about why this shit show happened is worth explaining. It is all related to the holy war between two armed camps that arose via the invention of these methods and who gets the credit. The paper was attempting to bridge the FCT and TVD worlds, and stepped into the bitter fighting around previous publications. In retrospect, it is pretty clear that FCT was first, and others like Kolgan and Van Leer came after. Their methodologies and approaches were also fully independent, and the full similarity was not clear at the time. While the fullness of time sees these approaches are utterly complementary, at the time of development it was seen as a competition. It was definitely not a collaborative endeavor, and the professional disagreements were bitter. They poisoned the field and people took sides viewing the other side with vitriolic fury. A friend and associate editor of the Journal of Computational Physics quipped that this was one of the nastiest sub-communities in the Journal, and why did I insist on working in this area. It is also one of the most important areas in computational physics working on a very difficult problem. The whole field also hinges upon expert judgement and resists a firm quantitative standard of acceptance.

What an introduction to the field and its genuinely amazing that I continue to work in it at all. If I didn’t enjoy the technical content so much, and not appreciated the importance of the field, I would have run. Perhaps greater success professionally would have followed such a departure. In the long run this resistance and the rule of experts works to halt progress.

If you can’t solve a problem, then there is an easier problem you can solve: find it.

― George Pólya

Kolgan, V. P. “Application of the principle of minimum values of the derivative to the construction of finite-difference schemes for calculating discontinuous gasdynamics solutions.” TsAGI, Uchenye Zapiski 3, no. 6 (1972): 68-77.

Boris, Jay P., and David L. Book. “Flux-corrected transport. I. SHASTA, a fluid transport algorithm that works.” Journal of computational physics 11, no. 1 (1973): 38-69.

Van Leer, Bram. “Towards the ultimate conservative difference scheme. II. Monotonicity and conservation combined in a second-order scheme.” Journal of computational physics 14, no. 4 (1974): 361-370.\

Van Leer, Bram. “Towards the ultimate conservative difference scheme. V. A second-order sequel to Godunov’s method.” Journal of computational Physics 32, no. 1 (1979): 101-136.

Harten, Ami, Bjorn Engquist, Stanley Osher, and Sukumar R. Chakravarthy. “Uniformly high order accurate essentially non-oscillatory schemes, III.” Journal of computational physics 71, no. 2 (1987): 231-303.

Harten, Ami, and Stanley Osher. “Uniformly high-order accurate nonoscillatory schemes. I.” SIAM Journal on Numerical Analysis 24, no. 2 (1987): 279-309.

Harten, Amiram, James M. Hyman, Peter D. Lax, and Barbara Keyfitz. “On finite‐difference approximations and entropy conditions for shocks.” Communications on pure and applied mathematics 29, no. 3 (1976): 297-322.