Archiv des Autors: mga010

Adaptive Integral in Python

Continuing my efforts to learn Python for Math, I wrote a function for adaptive integration. It uses 10-point Gauss quadrature. To check the accuracy, the algorithm uses a bipartition of the interval. So each step takes 30 evaluations of the function, one for the complete step and two for the check. If the accuracy is not high enough, the step size will decreased. Otherwise, the function will try to double the step size in the next step.

Here is the complete code.

def legendre (n):
    """ Get the Legendre Polynomials
    n : maximal degree
    Return: list of Lenegdre Polynomals
    p0 = np.poly1d([1])
    if n<=0:
        return [p0]
    p1 = np.poly1d([1,0])
    px = p1
    if n==1:
        return [p0,p1]
    for i in range(1,n):
        p = ((2*i+1)*px*p1 - i*p0) / (i+1)
        p0 = p1
        p1 = p
    return v

def make_gauss(n):
    """ Compute the Coefficiens for Gauss Integration
    n : Number of points
    return: x,a
    x : Gauss popints
    a : Gauss coefficients
    v = legendre(n)
    x = np.roots(v[n])
    A = np.array([v[i](x) for i in range(n)])
    w = np.zeros(n)
    for i in range(n):
        p = np.polyint(v[i])
        w[i] = p(1)-p(-1)
    return x,np.linalg.solve(A,w)

gauss_x5,gauss_a5 = make_gauss(5)

def gauss5 (f,a,b):
    """ Gauss Integrate with 5 points
    f : function of one variable
    a,b : Inteterval bounds
    returns: Integral
    return np.sum(f((a+b)/2+(b-a)/2*gauss_x5)*gauss_a5)*(b-a)/2

gauss_x10,gauss_a10 = make_gauss(10)

def gauss10 (f,a,b):
    """ Gauss Integrate with 10 points
    f : function of one variable
    a,b : Inteterval bounds
    returns: Integral
    return np.sum(f((a+b)/2+(b-a)/2*gauss_x10)*gauss_a10)*(b-a)/2

def integrate (f,a,b,n=1,eps=1e-14):
    """ Adaptive Integral using 10 point Gauss
    f : function of one variable
    a,b : Interval bounds
    n=1 : Initial step size is (b-a)/n
    eps : Desired absolute accuracy
    returns: Integral
    h = (b-a)/n
    x = a
    I = gauss10(f,x,x+h)
    I1 = gauss10(f,x,x+h/2)
    I2 = gauss10(f,x+h/2,x+h)
    res = 0.
    while x<b-eps:
        ## print(h)
        if np.abs(I-(I1+I2))>h/(b-a)*eps:
            h = h/2
            I = I1;
            I1 = gauss10(f,x,x+h/2)
            I2 = gauss10(f,x+h/2,x+h)
            x = x+h
            if x+h>=b-eps:
            res = res+I1+I2
            I = gauss10(f,x,x+h)
            I1 = gauss10(f,x,x+h/2)
            I2 = gauss10(f,x+h/2,x+h)
    return res

This algorithm is usually fast enough and very accurate. The implementation is by no means optimal. E.g., if a factor of 2 is used in the step size, a lot of computations are redone. Moreover, it might not be optimal to try to double the step size in each step. However, it works for interactive use, and that is what it is intended for.

Let me try to explain the code a bit. The function legendre() computes the Legendre polynomials using the recursion formula. These polynomials are orthogonal with respect to the weight w=1. The function returns a list of the polynomials. It makes use of the polynomial handling in Python (in old style).

The function make_gauss() computes the zeros of the Legendre polynomials using a function from Python, and coefficients a[k], such that the following is exact for all polynomials p up to degree n-1:

\(\sum_{k=1}^n a_k p(x_k) = \int\limits_{-1}^1 p(t) \,dt\)

It sets up n linear equations from this requirement using the Legendre polynomials we just computed as test cases and solves the equations. To compute the correct integral for polynomials, there is a function in Python.

Now, we can define the functions for Gauss integration with 5 and 10 points, and use it for our adaptive integration.

For the demonstration, I created a very easy to use function fplot() which plots a function adaptively using the code of the last posting.

def fplot (f,a,b,n=100,nmax=100000):
    """ Simple plot of a function
    a,b : interval bounds
    n=100 : initital step size (a-b)/n
    nmax=100000 : minimal step size (a-b)/nmax
    x,y = adaptive_ev(f,a,b,n,nmax)
    fig,ax = plt.subplots(figsize=(8,8))

I now have put all this and the code of the last posting into a Python module. When using the Python notebook interface in a Browser, the file should be placed in the same directory where the „.ipynb“ file. The Python file can even be opened and edited in the Python notebook IDE.

Then it can be used as follows.

import renestools as rt
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

def f(x):
    return np.abs(x)


This is not easy to integrate numerically, since the function is not smooth. The result is exact to the last digit. Even the plot is not trivial. With the arbitrary step size, the point x=0 might not be met exactly, and the corner may look round.

Here is another example using the Gaussian distribution. All digits agree. Note, that the Gauss function is almost zero for large x.

def f(x):
    return np.exp(-x**2)



You can even make the integral to a function and plot it adaptively.

def F(x):
    return rt.integrate(f,0,x)/np.sqrt(np.pi)


Adaptive Plots in Matplotlib

I could not find a solution for adaptive plots in Matplotlib. So, I wrote a small script that can evaluate a function in adaptive step sizes. „Adaptive“ means that the step size adapts to the output of the plot. This is to prevent coarse line segments with corners in the plot.

Here is the simple code for a function of one variable.

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

def adaptive_ev (f,a,b,n=100):
    xc = a
    yc = f(xc)
    x = [xc]
    y = [yc]
    h = (b-a)/n
    uplim = (2*h)**2
    lowlim = uplim/6
    eps = (b-a)/100000.
    while xc<b:
        if xcnew>=b:
        d = h**2+(ycnew-yc)**2 
        if d>uplim and h>eps:
            h = h/2
            if d<lowlim/3:
                h = 2*h
            xc = xcnew
            yc = ycnew
    return np.array(x),np.array(y)

def func(x):
    return np.sin(1/x)

print(x.size,' points generated')
fig,ax = plt.subplots(figsize=(14,10))

To apply the function adaptive_ev(), you need a function of one variable, an interval and the target number of plot points. The function will double or half the step size as needed to keep the output under control.

This is a screenshot of the graph generated by the code above. The function is difficult to plot near 0. Thus, adaptive evaluate is the only chance to do that. The routine generated 4270 data points to create this plot.

Euler Math Toolbox has the adaptive evaluation built into the plot2d() routine. So, a similar code would look like this.

>aspect(1.2); plot2d("sin(1/x)",0.001,1,thickness=1.5,color=blue):

Admittedly, the anti-aliasing of Matplotlib is outstanding. I like this library a lot and recommend using it.

I also did a routine for curves in the plane, which are parameterized by two functions for the x- and y-coordinate.

def adaptive_ev2 (fx,fy,a,b,size=1,n=1000):
    tc = a
    xc = fx(tc)
    yc = fy(tc)
    x = [xc]
    y = [yc]
    h = size/n
    uplim = (2*h)**2
    lowlim = uplim/6
    eps = size/100000.
    while tc<b:
        if tcnew>=b:
        d = (xcnew-xc)**2+(ycnew-yc)**2 
        if d>uplim and h>eps:
            h = h/2
            if d<lowlim/3:
                h = 2*h
            xc = xcnew
            yc = ycnew
            tc = tcnew
        ## print((tc,xc,yc,h,d))
    return np.array(x),np.array(y)

def adaptive_ev2 (fx,fy,a,b,size=1,n=1000):
    tc = a
    xc = fx(tc)
    yc = fy(tc)
    x = [xc]
    y = [yc]
    h = size/n
    uplim = (2*h)**2
    lowlim = uplim/6
    eps = size/100000.
    while tc<b:
        if tcnew>=b:
        d = (xcnew-xc)**2+(ycnew-yc)**2 
        if d>uplim and h>eps:
            h = h/2
            if d<lowlim/3:
                h = 2*h
            xc = xcnew
            yc = ycnew
            tc = tcnew
        ## print((tc,xc,yc,h,d))
    return np.array(x),np.array(y)

def funcx(t):
    return np.exp(-t/10)*np.cos(t)
def funcy(t):
    return np.exp(-t/10)*np.sin(t)

print(x.size,' points generated')
fig,ax = plt.subplots(figsize=(10,10))

The code follows the same ideas. However, it uses a plot size to set the dimensions of the curve in the plane. You only need a guideline. The step size will be approximately 1/1000 of this size.

Close to 0, the plot should avoid generating too many points. But it should look smooth. This plot generates 14509 points. That is 14 times the given size of 1. But one should be aware that it winds 20 times around the origin!

Euler Math Toolbox does the same, if two functions are fed into plot2d(). It uses a function adaptive() to compute the data points, which works for vector valued functions. This function could be ported to Python easily.

function adaptive (f$: call, a: number, b: number, eps,  ..
dmin, dmax, dinit, factor)

## Default for eps : 0.01
## Default for dmin : 1e-05
## Default for dmax : 100000
## Default for dinit : 0.1
## Default for factor : 2

    if sn==0 then return {a,sy}; endif;
    if cols(sy)>1 then 
        error(f$|" does not produce a real or column vector"); 
    x=zeros(1,1000); y=zeros(rows(sy),1000); 
    n=1; h=dinit;
            if sn*(tn-b)>=0 then tn=b; endif;
            if isNAN(d) then break; endif;
            if d>eps and h>dmin then h=max(h/factor,dmin);
            elseif d<eps/factor then h=min(h*factor,dmax); break;
            else break;
        if n>cols(x) then
        x[:,n]=tn; y[:,n]=syn;
        if sn*(t-b)>=0 then break; endif;
    return {x[:,1:n],y[:,1:n]};

I find Python very aspiring for mathematics. The next step will be to write an adaptive integration using Gauss formulas. Stand by.

EMT, Python, Numpy, Sympy on Macs with M1

I am currently using an 11′ Mac Air Notebook with the M1 processor as my laptop. We have to admit that Apple has made a big step forward again with this architecture. I never used Macs before because they were expensive and did not offer enough benefit to justify leaving Windows. Admittedly, they always had a much better user experience than Windows, let alone Linux. What works, does work well! But I have been judging the systems by more technical features, like processor power, ease of administrator work, and support for programmers. And in that respect, Macs were a step back. So, I could never justify using a system that serves me worse at a higher cost.

Now, things have changed with the M1 architecture. The laptop I am using is a bargain for what it offers. The processor is very fast, the display excellent, and the battery life extraordinary. The build quality is far better than any notebook I had before, with a rugged, indestructible case that still feels light and slim. For my retirement, that is precisely what I need.

The purchase of this laptop also gives me the chance to test my math programs on that operating system. There is a Windows emulation for Macs, but it does not work well with the M1 processor. So, a program like Euler Math Toolbox will have to be ported which is a lot of work, most likely not worth the effort in the end. I am going to produce a simplified version nevertheless.

Java is working well, and that gave me the chance to fix my program C.a.R. (Compass and Ruler) for Macs. Indeed, there were some annoying quirks that needed to be fixed. It is now running very well. (By the way, the same applies to my editor, JE, which I still use.)

Until I have a working version, Euler Math Toolbox needs to replaced by other, free tools on Macs. For the numerical side, there is Scilab. It has a working version for Macs, and I found no obvious flaw. For the numerical side, there is no free solution. Maxima can only be installed in a very complicated way using a sort of emulator, and I still have to explore this method. More on this on another day.

But, there is Python. We have a numerical matrix library „numpy“ and a symbolic library „sympy“. There is also a very nice plot library „matplotlib“. All those are not as easy to use as Euler Math Toolbox. I’d say, they are clumsy in comparison. Nevertheless, I am going to explore these libraries and compare to EMT in a series of postings on this blog.

You can follow me without a Mac. So let us start with an example which requires symbolic computations. We want to compute the polynomial of degree n with the property

\(p_{n+1}(x)-p_{n+1}(x-1)=x^n, \quad p_n(0)=0\)

for all real x, and natural n (investigated by Bernoulli and Euler). By this equation and induction, the polynomial is defined for all natural numbers x. Thus, the polynomial is uniquely determined if it exists. And it does indeed exist. You just need to interpolate in the first n+2 natural numbers, and since the equation holds there, it holds everywhere.

We now have

\(p_{n+1}(N) = \sum\limits_{k=1}^N k^n.\)

We could use interpolation to determine the polynomials. It is an exercise to do this in any symbolic program. For a start, we use Maxima in Euler Math Toolbox to determine the case n=3.

>p &= sum(a[k]*x^k,k,0,4)
                       4       3       2
                   a  x  + a  x  + a  x  + a  x + a
                    4       3       2       1      0
>equ &= makelist((p with [x=k]) = sum(j^3,j,1,k),k,1,5)
        [a  + a  + a  + a  + a  = 1, 
          4    3    2    1    0
 16 a  + 8 a  + 4 a  + 2 a  + a  = 9, 
     4      3      2      1    0
 81 a  + 27 a  + 9 a  + 3 a  + a  = 36, 
     4       3      2      1    0
 256 a  + 64 a  + 16 a  + 4 a  + a  = 100, 
      4       3       2      1    0
 625 a  + 125 a  + 25 a  + 5 a  + a  = 225]
      4        3       2      1    0
>&factor(p with solve(equ)[1])

                               2        2
                              x  (x + 1)

We have created a polynomial with indexed parameters using the sum() function of Maxima. The command makelist() then creates a list of equations which our polynomial has to satisfy (due to the sum requirement above). This can be solved with solve(). We use the first solution to substitute the values for a[k] back into the polynomial and factor the result. We get the well-known formula for the sum of the cubes of the first x integers.

Let us switch to Python. In Windows, you need to install „Anaconda“ for the following. Then start the „Anaconda Navigator“ and „jupyter notebook“. There you enter the following.

from sympy import *
x, k = symbols('x k')
a = IndexedBase('a')
p = sum([a[k]*x**k for k in range(5)])
equ = [Eq(p.subs(x,k),sum([l**3 for l in range(1,k+1)])) for k in range(1,6)]
sol = solve(equ)
factor(p.subs([(a[k],sol[a[k]]) for k in range(5)]))

This is more difficult to understand than the EMT code above, but it follows the same ideas.

After importing everything from sympy, we need to define the symbols x and k, and the indexed symbol a[]. Then we use the sum function of Python on a list. The result is

x**4*a[4] + x**3*a[3] + x**2*a[2] + x*a[1] + a[0]

After that, we create a list of equations, just as in the Maxima code above. The problem is that we cannot use „==“. This would test the two sides for equality and evaluate to true or false. We need to use „Eq(a,b)“ instead.

[Eq(a[0] + a[1] + a[2] + a[3] + a[4], 1), Eq(a[0] + 2*a[1] + 4*a[2] + 8*a[3] + 16*a[4], 9), Eq(a[0] + 3*a[1] + 9*a[2] + 27*a[3] + 81*a[4], 36), Eq(a[0] + 4*a[1] + 16*a[2] + 64*a[3] + 256*a[4], 100), Eq(a[0] + 5*a[1] + 25*a[2] + 125*a[3] + 625*a[4], 225)]

Then we solve this equation, which yields a dictionary with solutions. Dictionaries are Python structures that contain entries and values. They print as follows.

{a[0]: 0, a[1]: 0, a[2]: 1/4, a[3]: 1/2, a[4]: 1/4}

We now use the subs() function to substitute the values of the parameters into the polynomial p and factor the result. In a jupyter notebook, it appears in pretty format automatically. Here is a screenshot.

So much for a start of the sequence of postings. Clearly, Euler Math Toolbox and Maxima re easier to use. But Python is available for Macs. On Linux, you can use EMT quite nicely with the modern Wine emulation. But you could also use Python.

Finally, as a mathematical side remark, you could differentiate the definition equation for our polynomials and get.

\(p_{n+1}'(x)-p_{n+1}'(x-1) = n x^{n-1}\)

Since our polynomials are uniquely defined by the equation, we get

\(p_n(x) = \dfrac{1}{n} (p_{n+1}'(x) -p_{n+1}'(0))\)

On the right hand side, we have a polynomial which satisfies the equation and has the property p(0)=0. We can use this to compute the next polynomial.

\(p_{n+1}(x) = n \int\limits_0^x p_n(t) \,dt + c x\)

To determine the constant c, we use p(1)=1 for our polynomials. Thus

\( p_{n+1}(x) = n \int\limits_0^x p_n(t) \,dt + (1- n\int\limits_0^1 p_n(t) \,dt) x\)

Let us make a function in Euler Math Toolbox from this formula. The function will call Maxima several times, and thus not be fast.

>function p(n) ...
$pn &= "x";
$if n>1 then
$  for k=1 to n-1;
$    q &= "subst(t,x,pn)";
$    pn &= "@k*integrate(q,t,0,x)+(1-@k*integrate(q,t,0,1))*x";
$  end;
$return &pn;
>p(4); &factor(%)
                               2        2
                              x  (x + 1)

In Euler functions, you need to use „…“ for symbolic expressions. Moreover, you can use numerical values via @… in symbolic expressions.

Of course, loops like this can also be programmed in Maxima direct. You will have to learn the syntax of Maxima for this.

Here is a solution in Python.

def p(n):
    t,x = symbols('t x')
    p = x
    if n<=1:
        return p
    for i in range(1,n):
        q = p.subs(x,t)
        p = i*integrate(q,(t,0,x)) + (1-i*integrate(q,(t,0,1)))*x
    return p

This is a good start.

But it turns out that the Anaconda Navigator does not run on Macs. It looks like it needs Rosetta2. Usually, a program asks if Rosetta should be installed. But Anaconda didn’t on my system.

After installing Anaconda nevertheless, you have a Python 3.9 environment. So you can run „python“ to start the Python interpreter, and „ipython“ to start the advanced Python terminal. For this test, I saved the following to a file „“ using an editor (I am using my JE, of course).

from sympy import *

def p(n):
    t,x = symbols('t x')
    p = x
    if n<=1:
        return p
    for i in range(1,n):
        q = p.subs(x,t)
        p = i*integrate(q,(t,0,x)) + (1-i*integrate(q,(t,0,1)))*x
    return p

Then running „python“ will run this script and output the result in pretty format.

 2        2 ⎛ 2        ⎞ ⎛   4      3    2          ⎞
x ⋅(x + 1) ⋅⎝x  + x - 1⎠⋅⎝2⋅x  + 4⋅x  - x  - 3⋅x + 3⎠

You can also try to run „jupyter notebook“ from the terminal. If things go well, your browser will open and you can start an ipython terminal, clicking on „New“. Use Option+Return to run any command.

Cambrige Entry Examination

I recently came across a video with the problem to find the sum of the solutions of

\(3^x – \sqrt{3}^{(x+4)} + 20 = 0\)

Maybe anybody who is intimate with this kind of computations sees the trick. You can set

\(y = \sqrt{3}^x,\)

and get the equivalent equation


Solving this, you get

\(y = \sqrt{3}^x = 3^{x/2} = \dfrac{9 \pm 1}{2}.\)


\(x_1+x_2 = 2 \log_3(20) = \dfrac{2 \ln(20)}{\ln(3)}.\)

That’s not very exciting.

But why are they asking to find the sum of the solutions? If you hear that, you might think of the Vieta’s fact that the sum and the product of the solutions of a quadratic equation are the coefficients. Using this, you get

\(y_1y_2 = \sqrt{3}^{(x_1+x_2)} = 20\)


That’s were I decided to write about this problem here. E.g., you might change the original problem to

\(3^x – \sqrt{3}^{(x+4)} + 9 = 0\)

and get the much more pleasing solution

\(x_1 + x_2 = 2 \log_3(9) = 4.\)

Nice trick! But it is dangerous. Applying the trick to

\(3^x – \sqrt{3}^{(x+4)} + 81 = 0\)

yields the false sum 8.

In fact, this equation does not have a solution at all! Unless you allow complex numbers, and are careful with complex logarithms. In that case, you get multiple solutions.

So be careful with tricks!

Integrals and Monte Carlo Methods

I am teaching surface integrals right now. Searching for some interesting applications away from physics, I came up with mean values along surfaces. After all, mean or „expected“ values are a center part of probability theory. It is tempting to try to verify the analytic results by Monte Carlo methods. This can be done nicely in Euler Math Toolbox.

E.g., an integral on a finite interval [a,b] can be simulated this way. We generate random numbers in [a,b] uniformly, and compute the mean of the values of the function. The expected mean value is the integral of the function divided by the length of the interval.

>&integrate(log(x),x,0,2), %()
                              2 log(2) - 2
>N = 10 000 000;
>x = random(N)*2;

Syntax notes for EMT: The blanks in integer numbers can be generated by pressing Ctrl-Space. This generates a hard space which is skipped when scanning the number. The random() function generates uniform numbers in [0,1] which we need to be scaled properly to fall into [0,2].

The accuracy of these Monte Carlo methods are like 1/√N. So, we cannot expect more than 4 digits, even with ten million samples.

As another example, the area of the circle can be simulated by throwing random numbers into the unit cube and counting the number of hits inside the circle. We expect a proportion of pi/4 there.

>N = 10 000 000;
>x = 2*random(N)-1; y=2*random(N)-1;

Syntax notes for EMT: Applying the boolean „≤“ to a vector generates a vector of zeros and ones. Summing this vector up counts the „true“ cases.

This rejection method is also a simple way to create random numbers inside the unit circle. It is not as bad as it sounds, since more than every second pair of numbers hits inside the circle.

For a further test, let us compute the integral of r^2 on the unit circle (which is proportional to the energy in a rotating disk). The integral is the mean value of r^2 on the unit circle times the area of the unit circle.

>N = 10 000 000;
>x = 2*random(N)-1; y=2*random(N)-1;
>r = sqrt(x^2+y^2);
>i = nonzeros(r<=1);
>&2*pi*integrate(r^3,r,0,1), %()

Syntax notes for EMT: The nonzeros() function generates a vector of indices of nonzero elements. We can use that vector to select the hits from the vector r.

For the correct integral, we used the formula for rotational invariant functions on circles.

As a side remark, there is a direct method to generate random numbers inside the unit circle. For this, we need to select a random angle between 0 and 2pi. The radius must be selected in proportion to the length of the circle with this radius, i.e. with density 0≤r≤1. The following does the trick. We explain it after the code.

>function circrandom (N) ...
$  t = random(N)*2*pi;
$  r = sqrt(1-random(N));
$  return {r*cos(t),r*sin(t)}
>{x,y} = circrandom(10000);

This will generate the following points, which look quite nicely distributed in the unit circle.

Why does this work? The reason is that any probability distribution can be simulated by inverting its accumulated distribution, in this case

\(F(R) = p(r \ge R) = \int\limits_R^1 2r \, dr = 1-r^2\)

Here, we take 2r because this generates a probability distribution with density proportionally to r. Then we create a random number between 0 and 1 and apply the inverted function.

Using this trick, we can do another Monte Carlo simulation of the integral.

>{x,y} = circrandom(N);
>r = sqrt(x^2+y^2);
>i = nonzeros(r<=1);

Syntax notes on EMT: Note that circrandom() returns two values. It is possible to assign these two values to two variables.

Finally, we try an integral on a surface. We want to know the mean geographical width of a point in the Northern Hemisphere.

We try a simple rejection method. For this, we generate random numbers in the unit cube and reject those with z<0, r>1 or r<1/2. The remaining points, projected to the surface, are randomly distributed on the Northern Hemisphere.

>N = 10 000 000;
>x = random(N)*2-1; y = random(N)*2-1; z=random(N)*2-1;
>r = sqrt(x^2+y^2+z^2);
>i = nonzeros(z>0 && r<1 && r>1/2);
>&2*pi*integrate(asin(z),z,0,1) / (2*pi) | factor, ...
>   %(), %->"°"
                                 pi - 2

The analytic surface integral is done by parameterizing the surface using

\(g(z,t) = [\sqrt{1-z^2}\cos(t),\sqrt{1-z^2} \sin(t),z]\)

We can compute the Gram determinant for this parameterization, which is actually 1 (or use a geometric reasoning with the same result). Thus, for any function on the hemisphere H which depends on the height z only, we get

\(\int_H f(x,y,z) \,dS = 2 \pi \int\limits_0^1h(z) \,dz\)

The rejection trick works on symmetric surfaces only, where we can easily do an orthogonal projection and know the distance to the surface. Otherwise, we have to distribute the random points on the parameters in accordance with the distribution of the surface measure. This can be very tricky.

Corona, Impfen und Ethik

Dieser Blogeintrag muss auf Deutsch sein. Genauso wie das Wort „Querdenker“, das nun sein unschuldige Anmutung von kritischem Denken verloren hat und mehr an Flacherdler erinnert als an Kant und Schopenhauer. Wir schätzen alle das Denken, wenn es stattfindet, und wir tolerieren andere Gedanken. Das bedeutet, wir halten sie aus. Schwerer auszuhalten sind die Konsequenzen dieser Gedanken. Und noch schwerer auszuhalten ist, wenn das Querdenken zum Querglauben mutiert, der einfach nur noch behauptet und für Argumente nicht mehr zu erreichen ist. Wir müssen die Welt schon so sehen, wie sie ist, bevor wir anfangen über sie nachzudenken.

Schauen wir einmal hin.

  • Das Corona-Virus ist ein neuartiges Virus, gegen das wir fast keine Abwehrkräfte besitzen. Das bedeutet, dass fast jeder, der mit dem Virus in Kontakt kommt, erkrankt. Es ist nicht möglich, sich durch Stärkung der Abwehrkräfte, eine spezielle Ernährung oder eine gesunde Lebensführung vor der Ansteckung zu schützen.
  • Das Virus hat eine optimale Strategie zur Verbreitung gefunden. Es ist weder zu tödlich wie das Ebola-Virus, noch verwendet es einen selteneren Übertragungsweg wir das HIV-Virus. Auch der Zeitraum, in dem es keine Symptome zeigt und dennoch ansteckend ist, ist lang genug. Zudem nimmt es bei einem Großteil der Infizierten einen fast unbemerkten Verlauf, wird aber dennoch weitergetragen.
  • Das Virus killt insbesondere die Schwachen und Alten. Die Risikofaktoren umfassen Dinge, die nicht zu beeinflussen sind, wie Krebsbehandlungen, Diabetes aller Typen und Alter. Es trifft aber auch Starke und Junge, oft mit überraschender Härte und Langzeitwirkung.
  • Die Behandlung der schweren Erkrankungsfälle ist enorm aufwendig. Ein Intensivbett ist eine Bett mit intensivem Personalbedarf. Noch mehr gilt das für künstliche Beatmung. Wer darüber mehr erfahren will, sollte im Netz oder im TV nicht weit suchen müssen.
  • Eine Erkrankung hat häufig Spätfolgen. Wir wissen nicht genau, wie häufig. Aber es ist weitaus drastischer als bei einer Grippe. Selbst Windpocken können da nicht mithalten.
  • Das Virus mutiert. Je langsamer es bekämpft wird, umso mehr wird es mutieren. Es werden Virusvarianten kommen, die sich besser durchsetzen können, also noch infektiöser sind. Auch die Impfungen werden vom Virus umgangen werden. Je langsamer und lascher gehandelt wird, desto schlimmer wird die Situation.
  • Viele Menschen fallen auf das Vorsorgeparadoxon herein. Nachdem Maßnahmen gewirkt haben, gibt es immer wieder Besserwisser, die behaupten, dass der ganze Alarmismus unnötig war. War doch nicht so schlimm, oder? Beim Coronavirus wissen wir aber genau, wie es ausgeht, wenn man nichts tut. Das wurde schon ausprobiert, und wir müssen es nicht nochmal probieren.

Was schließen wir daraus? Am besten macht man sich Gedanken, wie man selbst die Krise gemanagt hätte, wenn man diese Verantwortung hätte tragen müssen. Es gibt mehrere unmögliche Szenarien, die alle schon durchgedacht wurden.

  • Wir lassen alles einfach laufen. Das bedeutet, wir opfern die Alten und Schwachen, zugunsten des Restes der Bevölkerung. Obwohl das ethisch, politisch und praktisch völlig undenkbar ist, wird eine Variante dieser „Lösung“ immer wieder vorgetragen. Die Sterberate wäre in der Tat „nur“ um maximal 1/3 höher. Aber die große Anzahl der Erkrankten, um die man sich nicht kümmern könnte und die still und alleine leiden müssten, wäre unerträglich. Wer besucht schon einen ansteckenden Todkranken? Auch die Wirtschaft würde zusammenbrechen, ebenso wie jeder Zusammenhalt in der Bevölkerung. Die Amish in den USA scheinen genau so die Krise zu „meistern“. Genaue Details über die Interna sind nicht bekannt. Man stirbt im Schoße der Gemeinschaft. Ich frage mich aber, was die Amish mit Blinddarmentzündungen oder Unfällen bei Kindern machen. Ebenfalls nichts? In Deutschland wäre das strafbar.
  • Wir isolieren die Schwachen und Alten vollständig und lassen ansonsten alles laufen. Diese Variante geht davon aus, dass im Rest der Bevölkerung das Virus nach eine gewissen Zeit verschwindet. Sonst müsste man ja die Alten ewig isolieren. Es kann sein, dass das der Fall ist oder auch nicht. Das Schnupfen- oder Influenza-Virus verschwindet ja auch nicht. Es mutiert und kommt jedes Jahr wieder. Außerdem geht diese Strategie von der unethischen Vorstellung aus, dass sich Risikogruppen, auch junge, aus dem Leben zurückziehen, bis alles vorbei ist.
  • Wir machen die Grenzen zu. Das haben wir in der Tat gemacht, allerdings nicht vollständig und präventiv, sondern je nach Lage. Es ist in dieser heutigen Welt nicht mehr möglich, sich als Land zu isolieren. Diese Isolation müsste vollständig und klinisch sein, wenn sie wirken soll. Denn das Virus ist zu leicht zu übertragen. Man würde also an den Grenzen Güter umladen und notwendige Geschäftsreisen mit langen Quarantänelagern verbinden. Das alles ist nicht denkbar. Selbst wenn der politische Wille einer faschistoiden Regierung so etwas zustande brächte, wäre es wohl immer schon zu spät. Unsere Welt ist inzwischen viel zu dicht besiedelt und die Verbindungen zu global, um eine Radikalkur durchzustehen.
  • Wir isolieren einfach nur die Kranken. Das tun wir schon. Positiv Getestete müssen auch jetzt in Quarantäne. Das geht erst, nachdem die Kranken von ihrer Erkrankung wissen. Davor können sie das Virus weitergeben. Diese Lösung funktioniert daher einfach nicht. Eine zivil gestaltete Quarantäne lässt sich auch nicht so einfach überwachen.

Man kann diese Gedanken beliebig weiterspinnen. Man kommt letztlich immer auf genau die Maßnahmen, die nun getroffen wurden. Einfach, weil es nicht anders geht. Nicht aus diktatorischer Grausamkeit.

  • Kontaktbeschränkungen und Abstandsregeln bis hin zum Shutdown sowie Tests. Diese Maßnahmen verzögern natürlich nur die Ausbreitung des Virus. Damit gewinnt man lediglich Zeit. Es entlastet die Kliniken und ihr Personal und ermöglicht einen menschlichen Umgang mit den Schwerkranken. Außerdem kann man durch eine verstärkte Anwendung an Orten, wo es nötig ist, schutzbedürftige Gruppen vor dem Virus bewahren. Die Maßnahmen werfen natürlich Sand in das Getriebe der Wirtschaft. Die ethische Abwägung sollte aber doch eindeutig sein.
  • Impfungen. Impfungen schützen den Einzelnen vor schweren Verläufen und extremen Leiden. Sie helfen davor nicht sicher, aber mit einer Wahrscheinlichkeit, die viel, viel höher ist, als die, an Impffolgen leiden zu müssen. Sie sind in diesem Punkt besser als die meisten anderen Arzneien. Noch wichtiger ist aber, mit Impfungen die Anderen schützen. Nach einer Impfung ist die Viruslast bei den allermeisten Menschen sehr viel niedriger, so dass folglich Ansteckungen sehr viel seltener sind. Impfungen sind das Mittel, um Pandemiewellen zu brechen. Ethisch sind sie geboten, weil sich manche Menschen nicht impfen können oder bei ihnen die Impfungen nicht wirken. Werden übrigens Impfverweigerer im Ernstfall auch das Intensivbett verweigern? Verweigern Sie ihren Kindern die Impfung gegen Polio, Masern oder Tetanus?

Ich sehe keine wirkliche Alternative zu den getroffenen Maßnahmen und plädiere im Extremfall für eine Impfpflicht, sollte das Virus weiter an Fahrt gewinnen.

Don’t talk about „Chances“!

I came across a YouTube video featuring another well-known paradoxical problem that is complete bogus. The video is done in a very stressful style with a sort of speaking that I can’t stand very long. I am still not sure, but I think the author just wanted to confuse.

The problem is rather simple. Assume you pick two strangers on the streets of New York and ask them to play a game: They should open their purses and count the money they are carrying. The one with the smaller sum wins the money of the one with the larger sum.

The confusion is now spread by claiming that each of the two strangers must be interested in the deal because the possible loss is always less than the possible win by the very nature of the game.

That’s nonsense. The argument falls apart if you think of a person with an empty purse, who would be overjoyed to play this game, while someone who just picked up a large amount of money from the bank should be very hesitant to make the deal.

The reason for the confusion is that it does not make sense to talk about „expected“ outcome without the assumption of an a priori distribution. I have written about this before in the two-envelopes-problem. This is just a variation.

The best way to avoid this error is to think of a simulation of the experiment on a computer. For this, you would have to fix a distribution which determines the amount of money that you put into the purses of the two strangers. With that knowledge, you can compute the expected outcome of the experiment. Without that knowledge, it does not make sense to talk about „expectation“. – My Latex files are none of your business!

I have been using the ArXiv only once till now for a paper on Optimization which is too small to be published elsewhere. I thought it might be a good idea to upload my script on complex functions of the class I did 2020/2021. But I can’t do that.

The ArXiv wants my Latex code, my images and all files I have. Uploading the PDF is rejected because the PDF is detected to have been done by Latex. The server wants to produce an own version of the PDF from my files. I don’t agree to that procedure.

The final version of the script is the PDF. I do not want anybody else to produce an own copy of my work. I have no problems sending source code to colleagues to be used in their own classes, but not to the public. Even though this script will be completely free and open, I do not want to give away the control over its content.

Moreover, the script failed to compile on the server anyway. I tried it to see if the idea even works. But it does not as expected. I have my own organization of the files, and even changed that organization for the test. Of course, everything compiles here without problems.

My Latex files are none of your business, ArXiv.

Simulating and solving Penney’s game

I recently stumbled across some YouTube videos about Penney’s game. Here is one of them. The problem was actually new to me, like almost all problems in mathematics. There simply are too many. It is easier to come up with a problem than a a solution.

According to Wikipedia, Walter Penney published the problem in the Journal of Recreational Mathematics in 1969. The article seems to be hard to get, so simply let us start an analysis ourselves. After all, that sounds like a lot of fun, and I have written about a similar problem in this blog before.

Assume we toss coins with outcome 0 and 1 at equal probability. So we get a sequence of tosses like


We are studying the following game between two players:

  • Player A selects a triplet of tosses (like 0,0,1), and player B another triplet (like 1,0,0).
  • The player whose triplet appears first in the sequence of tosses wins.

Given the sequence above and the selected examples of triplets above, player B would win. His triplet 1,0,0 appeared immediately after four tosses.

If both players select their triplet secretly this is obviously a fair game.

There is a counterintuitive fact here: It is an illusion that the choice of the triplet matters. The average waiting time for any triplet is equal, and selecting (1,0,1) is no better than (1,1,1).

To understand this, we need to see the tossing as a sequence of triplets. In the example above the sequence is the following:

\((0,1,0) \to (1,0,0) \to (0,0,1) \to (0,1,1) \to (1,1,1) \to \ldots\)

All triplets have the same chance (namely 1/8) to appear at the start of the sequence. And any triplet T has exactly two other triplets S1, S2 which can lead to this triplet. Since S1 and S2 have the same chance (namely 1/8) to appear in the n-th position of the sequence and T is selected with probability 1/2 from each, the chance for T to appear in the (n+1)-th position is also 1/8.

After this first excursion into the mathematics of random walks, we need to come back to the third rule of Penney’s Game.

  • The player B selects his triplet after seeing the selected triplet of player A.

Now, there is no reason why player B shouldn’t have an advantage. After all he can simply take the same triplet and get a draw. If he was forced to take another triplet he might even be at a disadvantage. But, indeed, he has a big advantage of at least 2:1. A wrong selection of player A can make his chances even worse.

Let us try to understand this. This is a case of non-transitive ordering.

  • Let us write S>T if the triplet S is more likely to appear before triplet T in the toss sequence when the tosses are started with a random triplet.

Then it is not true that

\(T \ge S \,{\rm and}\, S \ge U \Longrightarrow T \ge U\)

If it were true, the ordering would be complete and player A could select the best triplet, or at least a triplet that cannot be beaten by the selection of player B.

It turns out that for each selected triplet T of player A, there is a selection S which is better than T (S>T). This makes the game a win for player B.

This is counterintuitive again. A consequence is that there is a sequence of triplets with the property

\(T_1 < T_2, \quad T_2 < T_3, \ldots \quad , T_{n-1}<T_n, \quad T_n < T_1\)

In this game, the simple rule to find a better triplet is the following:

  • If player A selects T=(X,Y,Z), player B selects S=(YN,X,Y), where YN is the opposite of Y, i.e., YN=0 if Y=1 and YN=1 if Y=0.

E.g., if player A selects (0,1,0) or (0,1,1), player B selects (0,0,1). For another example, if player A selects (1,1,0) or (1,1,1) player B selects (0,1,1).

How do we understand this? We use a general mathematical model.

A way to understand this is by considering random walks on a directed graph. The graph in this game has 8 vertices, the 8 possible triplets. It has 16 edges, i.e., pairs of vertices, with associated probabilities 1/2. Given any triplet, there are two possible tosses which lead to two possible edges. (Note that some edges go from the triplet to itself!)

E.g., there are two edges leading away from (1,1,0),

\((1,1,0) \to (1,0,0), \quad (1,1,0) \to (1,0,1)\)

or, from (1,1,1)

\((1,1,1) \to (1,1,0), \quad (1,1,1) \to (1,1,1)\)

One mathematical model for uses an adjacency matrix M, containing transition probabilities from vertex to vertex.

  • We set M(i,j) to the probability to go from triplet j to triplet i (i.e., to reach triplet i from triplet j) in one toss. So we have a matrix M of transition probabilities.

Now imagine starting a lot of experiments by walking from a start vertex to subsequent vertices with the given probabilities. The experiments select the start vertex according to a probability vector P, i.e., vertex i is selected with probability P(i).

  • If the distribution of vertices in the experiments is P at a step. Then the matrix multiplication MP will compute the distribution of probabilities at the next step.

Let us do a test in Euler Math Toolbox. First, we generate the adjacency matrix M for our game. Then we start at one specific node and compute the distribution after three steps.

>function makeM () ...
$M = zeros(8,8);
$for i1=0 to 1;
$  for i2=0 to 1;
$    for i3=0 to 1;
$      for j1=0 to 1;
$        for j2=0 to 1;
$          for j3=0 to 1;
$            i=4*i1+2*i2+i3;
$            j=4*j1+2*j2+j3;
$            if i2==j1 and i3==j2 then
$              M[j+1,i+1]=1/2;
$            endif;
$          end;
$        end;
$      end;
$    end;
$  end;
$return M;
>shortest M
    0.5      0      0      0    0.5      0      0      0 
    0.5      0      0      0    0.5      0      0      0 
      0    0.5      0      0      0    0.5      0      0 
      0    0.5      0      0      0    0.5      0      0 
      0      0    0.5      0      0      0    0.5      0 
      0      0    0.5      0      0      0    0.5      0 
      0      0      0    0.5      0      0      0    0.5 
      0      0      0    0.5      0      0      0    0.5 
>p=zeros(8)'; p[1]=1; p'
 [1,  0,  0,  0,  0,  0,  0,  0]
>p=M.p; p'
 [0.5,  0.5,  0,  0,  0,  0,  0,  0]
>p=M.p; p'
 [0.25,  0.25,  0.25,  0.25,  0,  0,  0,  0]
>p=M.p; p'
 [0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125]

The function to generate the matrix M is not the most elegant one. It might be better to use a recursion than a loop. After we generate the matrix, we start with a „distribution“ which puts everything into the first triplet (actually that is (0,0,0)). After only three steps, we see that the probability to reach each triplet is equal.

Let us now model Penney’s game.

We have two triplets T and S (vertex numbers i and j) which lead nowhere. I.e., if those states are met we have found a case where T or S are hit.

We modify the adjacency matrix accordingly and delete all edges leading away from T and S and connect T and S to themselves with probability 1. This is a modified adjacency matrix M(i,j). We now have to look at the following limit.

\(P = \lim_{n \to \infty} M_{i,j}^n P_0\)


\(P_0 = (1/8,\ldots,1/8)^T\)

which is our start distribution.

Let us simulate that in Euler Math Toolbox. We select the triplets T=(0,1,0) (third element of p) and S=(0,0,1) (second element of p) which according to our rule should be superior.

>M[,2]=0; M[2,2]=1;
>M[,3]=0; M[3,3]=1;
>shortest M
    0.5      0      0      0    0.5      0      0      0 
    0.5      1      0      0    0.5      0      0      0 
      0      0      1      0      0    0.5      0      0 
      0      0      0      0      0    0.5      0      0 
      0      0      0      0      0      0    0.5      0 
      0      0      0      0      0      0    0.5      0 
      0      0      0    0.5      0      0      0    0.5 
      0      0      0    0.5      0      0      0    0.5 
>p=ones(8)'/8; p'
 [0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125]
>p=M.p; p'
 [0.125,  0.25,  0.1875,  0.0625,  0.0625,  0.0625,  0.125,  0.125]
>p=M.p; p'
 [0.09375,  0.34375,  0.21875,  0.03125,  0.0625,  0.0625,  0.09375,
>loop 1 to 100; p=M.p; end;
 [0,  0.666667,  0.333333,  0,  0,  0,  0,  0]

Indeed, the two selected triplets catch all the distribution, and S=(0,0,1) is indeed twice as good as T=(0,1,0). I.e., player B will win twice as often.

Things get worse for T=(0,0,0) and S=(1,0,0). S is seven times as good as T! There are other combinations which are three times as good.

Due to the structure of the problem, the limit P will have only non-zero entries for T and S. The probability to be in any other triplet will tend to 0. T and S will catch everything.

But can we compute the winning chance for S without a simulation? For that we define

  • w(i) as the chance to reach S before T when starting at triplet i.

We are using the following properties of w.

  • w(i) = 1 for the vertex S with number i.
  • w(j) = 0 for the vertex T with number j.
  • w satisfies

\(w = M_{i,j}^T \, w\)

We are searching for a specific eigenvector. The following function of EMT computes w given and adjacency matrix M and i,j, and returns the probability to reach vertex i before vertex j.

>function getw (M,i,j) ...
$if i==j then return 0; endif;
$Mt[,i]=0; Mt[i,i]=1;
$Mt[,j]=0; Mt[j,j]=1;
$V = eigenspace(Mt',1);
$c = V[[i,j]]\[1;0];
$return sum((V.c)')/cols(Mt);
>M = makeM();
% Let us try our simulated examples.
>fraction getw(M,2,3), fraction getw(M,5,1)

The simulated example from above works, as well as another example which takes T=(0,0,0) and S=(1,0,0).

For the next step, we compute all probabilities that the triplet number i appears before the triplet number j when starting from a random triplet.

>function solvePG () ...
$M = makeM();
$for i=1 to 8;
$  for j=1 to 8;
$    W[i,j]=getw(M,i,j);
$  end;
$return W;
>fracformat(7); W, defformat;
      0    1/2    2/5    2/5    1/8   5/12   3/10    1/2 
    1/2      0    2/3    2/3    1/4    5/8    1/2   7/10 
    3/5    1/3      0    1/2    1/2    1/2    3/8   7/12 
    3/5    1/3    1/2      0    1/2    1/2    3/4    7/8 
    7/8    3/4    1/2    1/2      0    1/2    1/3    3/5 
   7/12    3/8    1/2    1/2    1/2      0    1/3    3/5 
   7/10    1/2    5/8    1/4    2/3    2/3      0    1/2 
    1/2   3/10   5/12    1/8    2/5    2/5    1/2      0 

The non-transitivity of the problem means that in each column there is a value >1/2. We can select this value and print the corresponding triplet in human readable form.

>function printtriplet (i) ...
$s = ")"; i=i-1;
$if mod(i,2) then s=",1"+s; else s=",0"+s; endif;
$i = floor(i/2);
$if mod(i,2) then s=",1"+s; else s=",0"+s; endif;
$i = floor(i/2);
$if mod(i,2) then s="(1"+s; else s="(0"+s; endif;
$return s;
>function printPG (W) ...
$for j=1 to 8;
$  w=W[,j]';
$  e=extrema(w);
$  printtriplet(j) + " -> " + printtriplet(e[4]),
% We get the well known solution of the game.
 (0,0,0) -> (1,0,0)
 (0,0,1) -> (1,0,0)
 (0,1,0) -> (0,0,1)
 (0,1,1) -> (0,0,1)
 (1,0,0) -> (1,1,0)
 (1,0,1) -> (1,1,0)
 (1,1,0) -> (0,1,1)
 (1,1,1) -> (0,1,1)

This is exactly the well-known rule that you find also in the Wikipedia article.

Understand Bayesian arguments!

I just stumbled over another problem of probability theory on the channel MindYourDecisions. The video is rather old already, but mathematical problems never age.

„A nursery has 3 boys and a number of girls at one day, plus an additional birth at night. The next day a child from the day before is picked at random, and it is a boy. What is the probability that the child born at night is a boy?“

I hate the form that this question for a probability is posed. For the casual audience this does not make much sense. They will just answer 50%, and I would not be angry about this answer. It is sort of correct. You need a mathematical education to understand the problem to start with. Let me explain.

Everyone who has not met this kind of problems before will argue that the probability of the night child being a boy is 50%. That is true. The fact that the survey picks a boy the next day does not change this at all. I mean, the randomly picked boy was a boy. So what?

The problem is that the question is asking something more involved. It asks you to consider all possible random experiments. I.e., you assume a random child at night and a random pick. Then you discard those experiments that do not fit to the story.

An extreme case would be that there are no boys at daytime. Then a girl at night would not fit to the story, and all experiments that fit feature a boy at night. The Bayesian mathematics calls this a 100% probability for a boy.

For the actual case of three boys at daytime, let us assume g girls. I now follow the explanation of the video, but with different wordings and a completely different concept of thinking.

There are two cases now.

Case 1: We have a boy at night. We expect this case to happen in half of the experiments. The chance to pick a boy the next day is then 4/(4+g). We throw away all experiments where a girl is picked, i.e., g/(4+g) of them on average.

Case 2: We have a girl at night. Again this happens in half of the experiments on average. The chance to pick a boy next day is 3/(4+g). We throw away all experiments where a girl is picked.

Now imagine both cases together. In the experiments that we did not throw away we have an overhead of 4 to 3 for case 1. Thus, the Bayesian mathematics says that the chance for case one is 4/7.

There are other ways of writing this up. But they are only shortcuts of the above thought experiment. And for me, defining a precise experiment is the clearest way to think about such problems.