"All models are wrong, but some are useful" is the familiar statement attributed to statistician George Box. More recently, a YouTuber whom I follow released this video along the same lines. In my work, I think more often of a mantra I have come to recognize over the years, which is a slightly modified version of the same idea: try first to be useful, and then to be accurate. Either way you say it, it is a foundational point which I would like to expand on a bit here.
An example of the utility of this idea is the sine wave. Sine of x? What is the sine of x? How do you calculate the "sine" of a number? Leaving the geometry aside, we know you can calculate it using a series. But the full function is an infinite series, so the only saving grace is that we only need some specific level of accuracy within some range of x values; that is the only thing making sine a tractable problem. Figure 1 below shows a sine wave as a black dashed line, along with the curves of the series approximations using between one and seven terms. Over a small range, y = x does pretty well, and as you add more terms the approximation stays close to sine over a larger range of x values; that's plain to see. The other side of this coin is the computational cost. Simple as it may be, the two-term approximation costs roughly twice what y = x does. For sufficiently small values of x, it's not worth paying twice for little to no improvement in the approximation. Likewise, the four-term approximation costs roughly twice as much again as the two-term one, and so on. It's an important consideration, and it really compounds when you start thinking about large 3D simulations with large scale differences in them.
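For the curious, here is a minimal sketch (not the script behind Figure 1) of how those truncated series compare against the real thing; the [0, π] range and the one-to-seven term counts are just the ones discussed above.

```python
import math
import numpy as np

def sine_series(x, n_terms):
    """Partial sum of the Maclaurin series for sin(x): x - x^3/3! + x^5/5! - ..."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(n_terms):
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
    return total

# Compare accuracy as the number of terms grows, over a modest range of x.
x = np.linspace(0.0, np.pi, 200)
exact = np.sin(x)
for n in range(1, 8):  # one term (y = x) up to seven terms
    err = np.max(np.abs(sine_series(x, n) - exact))
    print(f"{n} term(s): max error on [0, pi] = {err:.2e}")
```

The cost side of the coin is visible right in the loop: each additional term means another power, another division, and another accumulation.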

This trade-off is easy to see in the example above. In the real world, it's a bit easier to lose the forest for the trees and end up with an un-meshable, un-manageable model that bogs down the design process with long lead times. When it comes to delivering results with engineering utility, I'd propose the chart below as an extrapolation of the lesson from the sine wave approximation case. Funnily enough, when you can deliver a simplified model result in less than 24 hours, you look like a genius, but if the process drags out… you just end up looking like a moron who can't get the modeling right. It's important, in the capacity of an engineering manager, to know what your folks can do, so you avoid perpetually using half-baked models on a long timeline, which nobody wants. Somebody I know was negatively affected by this, *cough*, *cough*.
The Tyranny of Large Scale Differences
An ever-evolving theme in fluids and simulation is people coming up with ways to deal with the large differences in scale present in a given situation. Theory tells us that the ratio of largest to smallest feature size in a turbulent flow goes like Re^(3/4), which drives a three-dimensional mesh count that scales like Re^(9/4) – scaling which is the defining characteristic of what can and can't be simulated using DNS. Similar scaling considerations are important across a range of categories. Even in RANS simulations with conjugate heat transfer, there are important scale differences: the highest rate of heat transport will almost certainly be the advective component that the flow carries with it. Then there are conduction effects in the solid, which are markedly slower and need to be treated with a separate pseudo-time step (if you're using p-v-coupled pseudo-transient, that is) in order to get a solution in this lifetime. And then, finally, there is a conduction component in the fluid as well. Good codes will allow you to separate the advective component from these other two for steady simulations. After all, in steady cases, as long as you end up with balanced equations, it doesn't matter how you got there.
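To make that mesh-count scaling concrete, here is a rough back-of-the-envelope sketch; the reference point (10^6 cells at Re = 10^4) is an arbitrary assumption for illustration, not a number from any particular case.

```python
def dns_cell_count(re, re_ref=1.0e4, cells_ref=1.0e6):
    """Rough DNS mesh-count estimate from the Re^(9/4) scaling argument.

    The (re_ref, cells_ref) pair is a purely illustrative reference point:
    assume some flow at Re = 1e4 needs about 1e6 cells, then see how the
    requirement grows with Reynolds number.
    """
    return cells_ref * (re / re_ref) ** (9.0 / 4.0)

for re in (1e4, 1e5, 1e6, 1e7):
    print(f"Re = {re:.0e}: ~{dns_cell_count(re):.2e} cells")
```

Every factor of ten in Reynolds number buys you roughly a factor of 180 in cell count, which is why DNS runs out of road so quickly.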
Really, in the world of fluids, one of the major contributors to modeling complexity is a large difference between the largest and smallest features in the model. Modeling a complex product like an airplane or a car, then, poses the same kind of problem as turbulence does in the fluids world: at a certain point, the smaller geometric details must be modeled rather than fully captured by the simulation. That is an important distinction: modeling something versus fully resolving ("capturing") it. More capturing and less modeling can bring more accuracy, but the cost curve really takes off as more physics and detail are captured in a CFD model.
Boussinesq: A Man After Our Own Hearts
If you are involved in fluids work, you will already have some familiarity with the RANS formulation of the governing equations, which is the basis for essentially all turbulence modeling we do today. Another impactful figure in modeling physics, rather than capturing it, is Joseph Boussinesq. Boussinesq made several contributions that we lean on to make our lives easier today. I am still amazed at the insight that both Reynolds and Boussinesq had, and at the utility of the simplifications they put forth nearly 100 years before the advent of numerical simulations. Boussinesq's legacy to us is a set of approximations which you can read about (buoyancy approximation here, eddy viscosity assumption here), and one or both of them are daily factors in the majority of fluid simulations done today. Truly, it is astonishing that Boussinesq was thinking about the simplest way to approximate the effects of turbulence in the flow equations and, in 1877, put forth his eddy viscosity concept, which, with the contributions of later researchers, reduces the number of turbulence-related transport equations from six to two, one, or even zero.
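For reference, the eddy viscosity hypothesis in its standard incompressible form looks like this; the notation below is the modern textbook one, not Boussinesq's original:

```latex
% Boussinesq eddy-viscosity hypothesis (standard incompressible form):
% the six independent Reynolds stresses are modeled with one scalar, \mu_t.
-\rho\,\overline{u_i' u_j'}
  = \mu_t \left( \frac{\partial \bar{u}_i}{\partial x_j}
               + \frac{\partial \bar{u}_j}{\partial x_i} \right)
  - \frac{2}{3}\,\rho\,k\,\delta_{ij}
```

One scalar field, μ_t, stands in for six independent Reynolds stress components, which is exactly the kind of trade of accuracy for utility this post is about.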
Reynolds and Boussinesq were searching for order in the turbulent chaos, but the main contributions I think of them for are not about accuracy; they are about finding incredible resource-saving ways to get useful approximations of real physics. We know now, though these two could not have known, that their contributions would make flow simulations economical much earlier than they otherwise would have been. For example, a simulation that can be done as DNS today could be done (albeit as an approximation) decades ago as a RANS simulation.
Appropriate Complexity Growth
These days there is a continuous stream of new and useful features in CFD codes that reduce computational cost or increase the realism of simulations. Shell conduction comes to mind as one of them. But, at the end of the day, the best you can do is treat everything as a hand calculation until you can't justify it anymore, at which stage it becomes a 1D/lumped-capacitance simulation until you can't justify that, and then a 2D (or, OK, 3D) thermal simulation until you can't justify that… then a steady CFD case, and so on. At the end of the HBO Chernobyl series, the main nuclear physicist says that "every lie we tell incurs a debt to the truth," and it rings true there and for simulation as well. With the proper knowledge of the physics and the tools, you know how much of this debt you can get away with in a given simulation and maximize utility according to what you're trying to do. The cost of not making appropriate simplifications is high – very high.
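As a small illustration of the first rung or two of that ladder, here is a hedged sketch of a lumped-capacitance check and hand calculation; all of the property values are placeholders for illustration, not numbers from any real part.

```python
import math

# Lumped-capacitance cooling of a solid part: the kind of hand calculation
# worth exhausting before reaching for a 1D, 2D/3D, or CFD model.
# All property values below are placeholder assumptions for illustration.
h = 25.0         # convective coefficient, W/(m^2 K)
k_solid = 200.0  # thermal conductivity of the solid, W/(m K)
rho = 2700.0     # density, kg/m^3
cp = 900.0       # specific heat, J/(kg K)
volume = 1.0e-4  # m^3
area = 6.0e-2    # m^2
L_c = volume / area  # characteristic length, m

biot = h * L_c / k_solid
if biot < 0.1:
    # Lumped model is defensible: temperature decays exponentially toward ambient.
    tau = rho * cp * volume / (h * area)   # time constant, s
    T_inf, T0, t = 25.0, 200.0, 600.0      # ambient (C), initial temp (C), time (s)
    T = T_inf + (T0 - T_inf) * math.exp(-t / tau)
    print(f"Bi = {biot:.4f}, tau = {tau:.0f} s, T({t:.0f} s) = {T:.1f} C")
else:
    print(f"Bi = {biot:.4f} >= 0.1 -- internal gradients matter; step up in fidelity.")
```

The Biot-number check is the "can I still justify this?" question in miniature: stick with the cheap model while it holds, and step up in fidelity when it doesn't.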