Everything that looks too perfect is too perfect to be perfect.

― Dejan Stojanovic

This post is going to delve directly into this blog’s name, how to regularize a singularity, but in this case we are talking about an artificial one. When one is writing a computer program to solve differential equations it is easy to introduce a discontinuity into how the program operates. This is particularly true when you’re implementing an elaborate model or numerical method. In the process of taking care of special cases or making a program robust for general use, logic is employed. Should a circumstance arise that causes the program to fail, it can be detected and avoided by making a logical change in the operation. The most common way to do this uses logic in the program through an “if” statement or some sort of switch. When the “if” triggers on a floating-point value, the impact on the solution can be subtle and creates a host of practical issues.

Ability to find the answers is more important than ability to know the answers.

― Amit Kalantri

As computer program development becomes more rigorous testing of various sorts becomes important and valuable to quality of the work. One form of the testing is regression testing. Here the program is run through a series of usually simple problems with well-defined answers. If the program’s solution changes in some way that is unexpected, the testing should pick it up and alert the development team. In addition this testing often runs across a bunch of different computers to make sure the answers are the same, and the program works properly on all of them. For basic quality assessment and control, regression testing is essential. It is one of the hallmarks of serious, professional code development. The logic and if statements introducing discontinuous behavior into the code based on floating point numbers can wreck havoc with the testing! We can end up with a situation where the tests produce much different results because of infinitesimal changes in the numbers at some point in the calculation.

You might be asking how this can happen? This all seems rather disturbing, and it is. It is simple matter whenever the logical decision is made on the basis of a floating-point number. Consider a simple, but common bit of (pseudo) computer code,

if (value > 0.0) then

newvalue = value1

else

newvalue = value2

endif

which is a very simple test that one might see with upwind finite differences. In some cases a logical switch like this might trigger an elaborate mathematical expression, or even call a very different function or subroutine. Consider the case where the “value” is very near zero, but not exactly. In this case small differences in the quantity “zero” will trigger completely different evaluations of the logic. For special values (especially zero) this happens all the time. If the solution is dependent upon a certain sequence any regression test depending on this test could differ based on inconsequential numerical differences. As programs become more complex these branches and differences explode. Code development teams relying upon regression testing end up chasing this sort of problem over and over. It becomes a huge drain on the productivity and quality.

These problems can be replicated with a set of standard functions. The logic above can be replaced by a single statement using a “sign” function,

newvalue = sign(value) * value1 + 0.5*(1.0 – sign(value)) * value2

which gives exactly the same result as the if test in the previous paragraph. It is also prone to exactly the same problems in practical testing. These issues are the tip of a proverbial iceberg. It isn’t just regression testing that suffers from these issues, if the method for solution involves solving a nonlinear equation that goes through the above logic, the solution can stall and stagnate causing solution accuracy to suffer. The same switches can produce breaks in symmetry or bifurcation of solutions near critical points. Next, I will describe ways of implementing the sign function to alleviate these problems. It turns out that there is a whole family of functions that can replace the discontinuous behavior with something continuous, and the sign function can be used to construct other functions with the same switching behavior built in. I’ve written about some of these functions before in a different context where discontinuous logical functions were replaced by differentiable functions for the purpose of conducting modified equation analysis that rely upon valid Taylor series expansions.

Here are a couple of times I’ve hit upon this topic before: https://wjrider.wordpress.com/2016/06/07/the-marvelous-magical-median/, https://wjrider.wordpress.com/2015/08/17/evolution-equations-for-developing-improved-high-resolution-schemes-part-2/, https://wjrider.wordpress.com/2016/06/22/a-path-to-better-limiters/. This post might be a good postscript to these because the techniques here can cure some of the practical ills remaining for these rather powerful methods. We see issues with solving nonlinear equations where limiters are used in discretizations, various symmetry breaking effects, and extreme sensitivity to initial conditions. As I200px-LimiterRegionwill touch upon at the very end of this post, Riemann solvers-numerical flux functions can also benefit from this, but some technicalities must be proactively dealt with.

Slow is smooth; smooth is fast.

― Jack Coughlin

Using the sign function we can systematically remove the switching behavior that plagues regression testing, nonlinear solutions, symmetry preservation, and extreme sensitivity to initial conditions.

For me, the first function to start this was the hyperbolic tangent function. The basic behavior of this function acts to create a switch between two states based on the argument and steepness of the transition, \mbox{softsign}(x) = \tanh(a x) as a becomes larger, the function approaches the idealized step function. It turns out that there are a number of smooth functional representations of the sign function including \mbox{softsign}(x) = \mbox{erf}(a x), \mbox{softsign}(x) = x/\left(a + \|x\| \right), and \mbox{softsign}(x) = x/\sqrt{a + x^2}. There are many others that can be derived as well as other more exotic functions. These functions are used in other fields to remove the discontinuity from a switching function sign(https://en.wikipedia.org/wiki/Sigmoid_function ).

These functions provide a flexible foundation to build upon. As an initial example take the definition of the absolute value, \|x\| = sign(x)x (https://en.wikipedia.org/wiki/Sign_function ). This can be rearranged in a number of useful forms, sign(x) = x/\|x\| = \|x\|/x . We can see that a simple smoothed version of the absolute value is \mbox{softabs}(x) = \mbox{softsign}(x)x. We can now build an entire family of softened or smoothed functions that can be differentiated (they are C_\infty). Each of the classical versions of these functions cannot be differentiated everywhere and create a host of problems in practical programs. Another common switching function are “min” and “max”. We can rewrite both functions as \min(a,b) = \frac{1}{2}(a+b) - \frac{1}{2}\|a-b\|, and \max(a,b) = \frac{1}{2}(a+b) + \frac{1}{2}\|a-b\|. The modified smooth versions are relatively obvious, \mbox{softmin}( (a,b) = \frac{1}{2}(a+b) - \frac{1}{2}softabs(a-b), and \mbox{softmax}(a,b) = \frac{1}{2}(a+b) + \frac{1}{2}\mbox{softabs}(a-b). From this basic set of function we can build the backbone of the limiters, the minmod function and the marvelous magical median function. What we have removed in the process is the discontinuous switching process that can wreck havoc with finite precision arithmetic.abs

We note that there is a separate version of the softmin and softmax functions used in ladders-or-a-tightropesome optimization solutions (https://www.johndcook.com/blog/2010/01/13/soft-maximum/, https://en.wikipedia.org/wiki/Softmax_function, ). This uses a combination of exponentials and logrithmns to provide a continuously differentiable way to take the maximum (or minimum) of a set of arguments. My naming convention “soft” comes from being introduced to the ideas in this blog post. This separates the idea from a “hard” max where the arguments switch based on the precision of the floating-point numbers as opposed to being continuous. For completeness the softmax uses the following expression \mbox{softmax}(a,b) = \log\left(\exp(n a) + \exp(n b) \right)/n, which may be expanded to additional arguments without complications. By the same token we can define a “softmin” as \mbox{softmin}(a,b) = -\log\left( \exp(-n a) + \exp(-n b) \right)/n that can be similarly be expanded to more arguments. In both cases the parameter n controls the sharpness of the smoothed version of the standard function, the larger the value the closer the function is to the standard function.

Using our previous definitions of “softmax” we can derive a new version of “softabs”.   We rearrange \mbox{softmax}(a,b) = \frac{1}{2}(a+b) +\frac{1}{2} \mbox{softabs}(a-b) to derive a \mbox{softabs}(a). We start with the observation that \mbox{softmax}(a,-a) =\frac{1}{2}(a-a) +\frac{1}{2} \mbox{softabs}(a+a) therefore \mbox{softabs}(a) = \mbox{softmax}(a,-a). We find that this version of the absolute value has much different properties than the previous softened version in some important ways. The key thing about this version of absolute value is always greater in value than the classical absolute value function. This would turn out be useful for use with Riemann solvers by not violating the entropy condition. With appropriate wavespeed estimates the entropy condition will be satisfied (wavespeed estimates are out of scope for this post). By the same token this absolute value is not valuable for limiters because of the same property!

Ultimately we want to understand whether these functions alter the basic accuracy-consistency or stability properties of the numerical methods based on using the classical functions. The answer to this question is subtle, but can be answered via analysis and numerical experiment. Not to belabor the details, but we use series expansions and discover that with appropriate regularization of the smoothed functions, we can use them to replace the classical functions and not undermine the accuracy of the discretization. This has been confirmed for the softened version of the “minmod” limiter. A downside of the softened limiters are small deviations from idealistic monotonticity preserving behavior.

Finally as eluded to early in the post, we can also use these functions to modify Riemann solvers. The first code example can form the logical basis for upwind bias with a finite difference by choosing a one-sided difference based upon the sign of the characteristic velocity. When Riemann solvers are examined we see that either “if” statements are used or when full flux functions are used absolute values (the flux is in a general sense a characteristic quantity multiplied by the characteristic velocity), the absolute value of the characteristic velocity introduces the sign convention the “if” statement provides.

The lingering problem with this approach is the concept of entropy violating sodxapproximations. This issue can easily be explained by looking at the smooth sign function compared with the standard form. Since the dissipation in the Riemann solver is proportional to the characteristic velocity, we can see that the smoothed sign function is everywhere less than the standard function resulting in less dissipation. This is a stability issue analogous to concerns around limiters where these smoothed functions are slightly more permissive. Using the exponential version of “softabs” where the value is always greater than the standard absolute value can modulate this permissive nature.

Let us study things that are no more. It is necessary to understand them, if only to avoid them.

― Victor Hugo

Advertisements