Schlagwort-Archive: EMT

Functions of Functions in EMT and Python

Because Euler Math Toolbox is designed to be an interactive way to do mathematics, its algorithms for numerical problems are easy to use, such as the solution of an equation, integration, minimization, or even to plot a function. For a demonstration and the adaptation to Python, we find the minimum of a one dimensional function.

The algorithm used is the Golden Ratio subdivision of an interval. It spares one function evaluation per step, and is guaranteed to yield an inclusion of the minimum for a convex function (assuming exact computations).

Here is an example:

>function f(x) := x^3-x
>xmin = fmin("f",0)
 0.577350265248
>plot2d("f",0,1); plot2d(xmin,f(xmin),>points,>add):

It is even easier to use an expression. Expressions are strings which are evaluated for the parameters x,y,z (by default), and thus work almost like functions.

 >fmin("x^3-x",0)
 0.577350265248

The same function can be coded in Python. I did a line-by-line translation of the code of EMT.

def fmin (f,a,b=None,d=0.01,dmax=1e+20,eps=1e-14):
    """ Compute the minimum of a function
    Using the Golden Cut.
    
    a: Start
    b: End of interval or None (end is searched first)
    d: Initial step size
    dmax: Maximal step size to search for b
    eps: Desired Accuracy
    
    Returns: Point of minimum.
    """
    if b == None:
        x0=a; y0=f(x0); 
        x1=x0+d; y1=f(x1)
        if y1>y0:
            d=-d
            x1=x0+d; y1=f(x1)
        while True:
            if not y1<y0:
                break
            d=2*d
            x0=x1; x1=x0+d
            y0=y1; y1=f(x1)
            if abs(d)>dmax:
                break
        a=x0-d 
        b=x1;
    x0=a; x3=b
    y0=f(x0); y3=f(x3)
    l=(3-np.sqrt(5))/2
    x1=x0+l*(x3-x0); x2=x3-l*(x3-x0)
    y1=f(x1); y2=f(x2)
    while True:
        if y1>y2:
            x0=x1; x1=x2; x2=x3-l*(x3-x0)
            y0=y1; y1=y2; y2=f(x2);
        else:
            x3=x2; x2=x1; x1=x0+l*(x3-x0)
            y3=y2; y2=y1; y1=f(x1)
        if np.abs(x0-x3)<eps:
            return x0

To use this for the example above is easy.

def f(x):
    return x**3-x

fmin(f,0) # -> 0.5773502652484417

Via lambda functions, an expression works just as well.

fmin(lambda x: x**3-x,0)

The main motivation for this posting is more complicated. We want to provide additional parameters to the function.

Euler Math Toolbox has two ways of doing this. The first way is via args(). It works as follows.

>function f(x,a) := x^3-a*x
>xmin = fmin("f",0;1)
 0.577350265248
>xmin = fmin("f",0;1.1)
 0.60553006474

The parameters after the semicolon are passed from „fmin“ to the function „f“ by calling it as „f(x,args())“.

The second way is with call collections. Those are collections of functions and parameters. They are available for expressions too, but need a named member for the additional parameters.

>fmin({{"f",1.1}},0)
 0.60553006474
>fmin({{"x^3-a*x",a=1.1}},0)
 0.60553006474

To achieve the same in Python, the easiest way is to use lambda functions.

def f(x,a):
    return x**3-a*x

a = 1.1
fmin(lambda x: f(x,a),0) # -> 0.6055300647407058

If that is used within another function, the lambda function will be made referring to the current variable „a“. You need not worry about a variable „a“ in the function „fmin“.

Because Python does not separate the additional parameters with a semicolon, the „args“ method feels a bit strange. But it is possible. Have a look at this code.

def h(x,a,b):
    return a*x**2+b

def evalf (f,x,*args):
    return f(x,*args)

evalf(h,3,4,1) # -> 9

The function „evalf“ needs another function as an argument, just as „fmin“. We want to pass more than one argument to this function „f“. If we call „evalf(h,3,4,1)“, those arguments are „(4,1)“. They are passed as a tuple to the variable „args“. Saying „*args“ will expand this tupel and pass „4,1“ to „f“, i.e. to „h“.

I made a private Python module with some functions like „fmin“ where I did not use the „args“ method. The reason is that I want to overwrite named parameters with arguments, e.g., saying „fmin(f,a,b)“ with interval bounds, where the default for „b“ is „None“. This is not possible if „*args“ is present.

Finally, an application of „fmin“.

We minimize a multidimensional function with gradient search. The search goes down the gradient until a minimum is found, and repeats from that point. The handling of the accuracy during the algorithms need to be improved for maximal performance. Currently, it computes the accurate minimum in each step.

The method is best used for convex function, or at least locally convex function. It is guaranteed to find a minimum then, provided that the set of points where the function is less than the current value is compact.

epsilon = 1e-14

def vmin (f,x,v,tmax=1,tmin=epsilon):
    """ Minimize f(x+t*v) assuming v is a direction of descent
    """
    t=fmin(lambda t:f(x-t*v),0);
    return (t,x-t*v)

def fminmult (f,Df,x):
    res = 1e20
    while True:
        t,xnew = vmin(f,x,Df(x))
        resnew = f(xnew)
        if resnew>=res:
            break
        res = resnew; x = xnew
    return x

def f(x,a):
    return (x[0]-a)**2+2*a*x[1]**2

def Df(x,a):
    return np.array([2*(x[0]-a),4*a*x[1]])

a = 4
fminmult(lambda x:f(x,a),lambda x:Df(x,a),np.array([1,1]))

# -> array([ 4.00000000e+00, -1.65436123e-24])

EMT, Python, Numpy, Sympy on Macs with M1

I am currently using an 11′ Mac Air Notebook with the M1 processor as my laptop. We have to admit that Apple has made a big step forward again with this architecture. I never used Macs before because they were expensive and did not offer enough benefit to justify leaving Windows. Admittedly, they always had a much better user experience than Windows, let alone Linux. What works, does work well! But I have been judging the systems by more technical features, like processor power, ease of administrator work, and support for programmers. And in that respect, Macs were a step back. So, I could never justify using a system that serves me worse at a higher cost.

Now, things have changed with the M1 architecture. The laptop I am using is a bargain for what it offers. The processor is very fast, the display excellent, and the battery life extraordinary. The build quality is far better than any notebook I had before, with a rugged, indestructible case that still feels light and slim. For my retirement, that is precisely what I need.

The purchase of this laptop also gives me the chance to test my math programs on that operating system. There is a Windows emulation for Macs, but it does not work well with the M1 processor. So, a program like Euler Math Toolbox will have to be ported which is a lot of work, most likely not worth the effort in the end. I am going to produce a simplified version nevertheless.

Java is working well, and that gave me the chance to fix my program C.a.R. (Compass and Ruler) for Macs. Indeed, there were some annoying quirks that needed to be fixed. It is now running very well. (By the way, the same applies to my editor, JE, which I still use.)

Until I have a working version, Euler Math Toolbox needs to replaced by other, free tools on Macs. For the numerical side, there is Scilab. It has a working version for Macs, and I found no obvious flaw. For the numerical side, there is no free solution. Maxima can only be installed in a very complicated way using a sort of emulator, and I still have to explore this method. More on this on another day.

But, there is Python. We have a numerical matrix library „numpy“ and a symbolic library „sympy“. There is also a very nice plot library „matplotlib“. All those are not as easy to use as Euler Math Toolbox. I’d say, they are clumsy in comparison. Nevertheless, I am going to explore these libraries and compare to EMT in a series of postings on this blog.

You can follow me without a Mac. So let us start with an example which requires symbolic computations. We want to compute the polynomial of degree n with the property

\(p_{n+1}(x)-p_{n+1}(x-1)=x^n, \quad p_n(0)=0\)

for all real x, and natural n (investigated by Bernoulli and Euler). By this equation and induction, the polynomial is defined for all natural numbers x. Thus, the polynomial is uniquely determined if it exists. And it does indeed exist. You just need to interpolate in the first n+2 natural numbers, and since the equation holds there, it holds everywhere.

We now have

\(p_{n+1}(N) = \sum\limits_{k=1}^N k^n.\)

We could use interpolation to determine the polynomials. It is an exercise to do this in any symbolic program. For a start, we use Maxima in Euler Math Toolbox to determine the case n=3.

>p &= sum(a[k]*x^k,k,0,4)
 
                       4       3       2
                   a  x  + a  x  + a  x  + a  x + a
                    4       3       2       1      0
 
>equ &= makelist((p with [x=k]) = sum(j^3,j,1,k),k,1,5)
 
        [a  + a  + a  + a  + a  = 1, 
          4    3    2    1    0
 16 a  + 8 a  + 4 a  + 2 a  + a  = 9, 
     4      3      2      1    0
 81 a  + 27 a  + 9 a  + 3 a  + a  = 36, 
     4       3      2      1    0
 256 a  + 64 a  + 16 a  + 4 a  + a  = 100, 
      4       3       2      1    0
 625 a  + 125 a  + 25 a  + 5 a  + a  = 225]
      4        3       2      1    0
 
>&factor(p with solve(equ)[1])

 
                               2        2
                              x  (x + 1)
                              -----------
                                   4

We have created a polynomial with indexed parameters using the sum() function of Maxima. The command makelist() then creates a list of equations which our polynomial has to satisfy (due to the sum requirement above). This can be solved with solve(). We use the first solution to substitute the values for a[k] back into the polynomial and factor the result. We get the well-known formula for the sum of the cubes of the first x integers.

Let us switch to Python. In Windows, you need to install „Anaconda“ for the following. Then start the „Anaconda Navigator“ and „jupyter notebook“. There you enter the following.

from sympy import *
x, k = symbols('x k')
a = IndexedBase('a')
p = sum([a[k]*x**k for k in range(5)])
print(p)
equ = [Eq(p.subs(x,k),sum([l**3 for l in range(1,k+1)])) for k in range(1,6)]
print(equ)
sol = solve(equ)
print(sol)
factor(p.subs([(a[k],sol[a[k]]) for k in range(5)]))

This is more difficult to understand than the EMT code above, but it follows the same ideas.

After importing everything from sympy, we need to define the symbols x and k, and the indexed symbol a[]. Then we use the sum function of Python on a list. The result is

x**4*a[4] + x**3*a[3] + x**2*a[2] + x*a[1] + a[0]

After that, we create a list of equations, just as in the Maxima code above. The problem is that we cannot use „==“. This would test the two sides for equality and evaluate to true or false. We need to use „Eq(a,b)“ instead.

[Eq(a[0] + a[1] + a[2] + a[3] + a[4], 1), Eq(a[0] + 2*a[1] + 4*a[2] + 8*a[3] + 16*a[4], 9), Eq(a[0] + 3*a[1] + 9*a[2] + 27*a[3] + 81*a[4], 36), Eq(a[0] + 4*a[1] + 16*a[2] + 64*a[3] + 256*a[4], 100), Eq(a[0] + 5*a[1] + 25*a[2] + 125*a[3] + 625*a[4], 225)]

Then we solve this equation, which yields a dictionary with solutions. Dictionaries are Python structures that contain entries and values. They print as follows.

{a[0]: 0, a[1]: 0, a[2]: 1/4, a[3]: 1/2, a[4]: 1/4}

We now use the subs() function to substitute the values of the parameters into the polynomial p and factor the result. In a jupyter notebook, it appears in pretty format automatically. Here is a screenshot.

So much for a start of the sequence of postings. Clearly, Euler Math Toolbox and Maxima re easier to use. But Python is available for Macs. On Linux, you can use EMT quite nicely with the modern Wine emulation. But you could also use Python.

Finally, as a mathematical side remark, you could differentiate the definition equation for our polynomials and get.

\(p_{n+1}'(x)-p_{n+1}'(x-1) = n x^{n-1}\)

Since our polynomials are uniquely defined by the equation, we get

\(p_n(x) = \dfrac{1}{n} (p_{n+1}'(x) -p_{n+1}'(0))\)

On the right hand side, we have a polynomial which satisfies the equation and has the property p(0)=0. We can use this to compute the next polynomial.

\(p_{n+1}(x) = n \int\limits_0^x p_n(t) \,dt + c x\)

To determine the constant c, we use p(1)=1 for our polynomials. Thus

\( p_{n+1}(x) = n \int\limits_0^x p_n(t) \,dt + (1- n\int\limits_0^1 p_n(t) \,dt) x\)

Let us make a function in Euler Math Toolbox from this formula. The function will call Maxima several times, and thus not be fast.

>function p(n) ...
$pn &= "x";
$if n>1 then
$  for k=1 to n-1;
$    q &= "subst(t,x,pn)";
$    pn &= "@k*integrate(q,t,0,x)+(1-@k*integrate(q,t,0,1))*x";
$  end;
$endif;
$return &pn;
$endfunction
>p(4); &factor(%)
 
                               2        2
                              x  (x + 1)
                              -----------
                                   4

In Euler functions, you need to use „…“ for symbolic expressions. Moreover, you can use numerical values via @… in symbolic expressions.

Of course, loops like this can also be programmed in Maxima direct. You will have to learn the syntax of Maxima for this.

Here is a solution in Python.

def p(n):
    t,x = symbols('t x')
    p = x
    if n<=1:
        return p
    for i in range(1,n):
        q = p.subs(x,t)
        p = i*integrate(q,(t,0,x)) + (1-i*integrate(q,(t,0,1)))*x
    return p
factor(p(4))

This is a good start.

But it turns out that the Anaconda Navigator does not run on Macs. It looks like it needs Rosetta2. Usually, a program asks if Rosetta should be installed. But Anaconda didn’t on my system.

After installing Anaconda nevertheless, you have a Python 3.9 environment. So you can run „python“ to start the Python interpreter, and „ipython“ to start the advanced Python terminal. For this test, I saved the following to a file „test.py“ using an editor (I am using my JE, of course).

from sympy import *

def p(n):
    t,x = symbols('t x')
    p = x
    if n<=1:
        return p
    for i in range(1,n):
        q = p.subs(x,t)
        p = i*integrate(q,(t,0,x)) + (1-i*integrate(q,(t,0,1)))*x
    return p
    
pprint(factor(p(10)))

Then running „python test.py“ will run this script and output the result in pretty format.

 2        2 ⎛ 2        ⎞ ⎛   4      3    2          ⎞
x ⋅(x + 1) ⋅⎝x  + x - 1⎠⋅⎝2⋅x  + 4⋅x  - x  - 3⋅x + 3⎠
─────────────────────────────────────────────────────
                          20    

You can also try to run „jupyter notebook“ from the terminal. If things go well, your browser will open and you can start an ipython terminal, clicking on „New“. Use Option+Return to run any command.

Simulating and solving Penney’s game

I recently stumbled across some YouTube videos about Penney’s game. Here is one of them. The problem was actually new to me, like almost all problems in mathematics. There simply are too many. It is easier to come up with a problem than a a solution.

According to Wikipedia, Walter Penney published the problem in the Journal of Recreational Mathematics in 1969. The article seems to be hard to get, so simply let us start an analysis ourselves. After all, that sounds like a lot of fun, and I have written about a similar problem in this blog before.

Assume we toss coins with outcome 0 and 1 at equal probability. So we get a sequence of tosses like

\(0,1,0,0,1,1,1,0,1,0,\ldots\)

We are studying the following game between two players:

  • Player A selects a triplet of tosses (like 0,0,1), and player B another triplet (like 1,0,0).
  • The player whose triplet appears first in the sequence of tosses wins.

Given the sequence above and the selected examples of triplets above, player B would win. His triplet 1,0,0 appeared immediately after four tosses.

If both players select their triplet secretly this is obviously a fair game.

There is a counterintuitive fact here: It is an illusion that the choice of the triplet matters. The average waiting time for any triplet is equal, and selecting (1,0,1) is no better than (1,1,1).

To understand this, we need to see the tossing as a sequence of triplets. In the example above the sequence is the following:

\((0,1,0) \to (1,0,0) \to (0,0,1) \to (0,1,1) \to (1,1,1) \to \ldots\)

All triplets have the same chance (namely 1/8) to appear at the start of the sequence. And any triplet T has exactly two other triplets S1, S2 which can lead to this triplet. Since S1 and S2 have the same chance (namely 1/8) to appear in the n-th position of the sequence and T is selected with probability 1/2 from each, the chance for T to appear in the (n+1)-th position is also 1/8.

After this first excursion into the mathematics of random walks, we need to come back to the third rule of Penney’s Game.

  • The player B selects his triplet after seeing the selected triplet of player A.

Now, there is no reason why player B shouldn’t have an advantage. After all he can simply take the same triplet and get a draw. If he was forced to take another triplet he might even be at a disadvantage. But, indeed, he has a big advantage of at least 2:1. A wrong selection of player A can make his chances even worse.

Let us try to understand this. This is a case of non-transitive ordering.

  • Let us write S>T if the triplet S is more likely to appear before triplet T in the toss sequence when the tosses are started with a random triplet.

Then it is not true that

\(T \ge S \,{\rm and}\, S \ge U \Longrightarrow T \ge U\)

If it were true, the ordering would be complete and player A could select the best triplet, or at least a triplet that cannot be beaten by the selection of player B.

It turns out that for each selected triplet T of player A, there is a selection S which is better than T (S>T). This makes the game a win for player B.

This is counterintuitive again. A consequence is that there is a sequence of triplets with the property

\(T_1 < T_2, \quad T_2 < T_3, \ldots \quad , T_{n-1}<T_n, \quad T_n < T_1\)

In this game, the simple rule to find a better triplet is the following:

  • If player A selects T=(X,Y,Z), player B selects S=(YN,X,Y), where YN is the opposite of Y, i.e., YN=0 if Y=1 and YN=1 if Y=0.

E.g., if player A selects (0,1,0) or (0,1,1), player B selects (0,0,1). For another example, if player A selects (1,1,0) or (1,1,1) player B selects (0,1,1).

How do we understand this? We use a general mathematical model.

A way to understand this is by considering random walks on a directed graph. The graph in this game has 8 vertices, the 8 possible triplets. It has 16 edges, i.e., pairs of vertices, with associated probabilities 1/2. Given any triplet, there are two possible tosses which lead to two possible edges. (Note that some edges go from the triplet to itself!)

E.g., there are two edges leading away from (1,1,0),

\((1,1,0) \to (1,0,0), \quad (1,1,0) \to (1,0,1)\)

or, from (1,1,1)

\((1,1,1) \to (1,1,0), \quad (1,1,1) \to (1,1,1)\)

One mathematical model for uses an adjacency matrix M, containing transition probabilities from vertex to vertex.

  • We set M(i,j) to the probability to go from triplet j to triplet i (i.e., to reach triplet i from triplet j) in one toss. So we have a matrix M of transition probabilities.

Now imagine starting a lot of experiments by walking from a start vertex to subsequent vertices with the given probabilities. The experiments select the start vertex according to a probability vector P, i.e., vertex i is selected with probability P(i).

  • If the distribution of vertices in the experiments is P at a step. Then the matrix multiplication MP will compute the distribution of probabilities at the next step.

Let us do a test in Euler Math Toolbox. First, we generate the adjacency matrix M for our game. Then we start at one specific node and compute the distribution after three steps.

>function makeM () ...
$M = zeros(8,8);
$for i1=0 to 1;
$  for i2=0 to 1;
$    for i3=0 to 1;
$      for j1=0 to 1;
$        for j2=0 to 1;
$          for j3=0 to 1;
$            i=4*i1+2*i2+i3;
$            j=4*j1+2*j2+j3;
$            if i2==j1 and i3==j2 then
$              M[j+1,i+1]=1/2;
$            endif;
$          end;
$        end;
$      end;
$    end;
$  end;
$end;
$return M;
$endfunction
>M=makeM;
>shortest M
    0.5      0      0      0    0.5      0      0      0 
    0.5      0      0      0    0.5      0      0      0 
      0    0.5      0      0      0    0.5      0      0 
      0    0.5      0      0      0    0.5      0      0 
      0      0    0.5      0      0      0    0.5      0 
      0      0    0.5      0      0      0    0.5      0 
      0      0      0    0.5      0      0      0    0.5 
      0      0      0    0.5      0      0      0    0.5 
>p=zeros(8)'; p[1]=1; p'
 [1,  0,  0,  0,  0,  0,  0,  0]
>p=M.p; p'
 [0.5,  0.5,  0,  0,  0,  0,  0,  0]
>p=M.p; p'
 [0.25,  0.25,  0.25,  0.25,  0,  0,  0,  0]
>p=M.p; p'
 [0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125]

The function to generate the matrix M is not the most elegant one. It might be better to use a recursion than a loop. After we generate the matrix, we start with a „distribution“ which puts everything into the first triplet (actually that is (0,0,0)). After only three steps, we see that the probability to reach each triplet is equal.

Let us now model Penney’s game.

We have two triplets T and S (vertex numbers i and j) which lead nowhere. I.e., if those states are met we have found a case where T or S are hit.

We modify the adjacency matrix accordingly and delete all edges leading away from T and S and connect T and S to themselves with probability 1. This is a modified adjacency matrix M(i,j). We now have to look at the following limit.

\(P = \lim_{n \to \infty} M_{i,j}^n P_0\)

with

\(P_0 = (1/8,\ldots,1/8)^T\)

which is our start distribution.

Let us simulate that in Euler Math Toolbox. We select the triplets T=(0,1,0) (third element of p) and S=(0,0,1) (second element of p) which according to our rule should be superior.

>M=makeM;
>M[,2]=0; M[2,2]=1;
>M[,3]=0; M[3,3]=1;
>shortest M
    0.5      0      0      0    0.5      0      0      0 
    0.5      1      0      0    0.5      0      0      0 
      0      0      1      0      0    0.5      0      0 
      0      0      0      0      0    0.5      0      0 
      0      0      0      0      0      0    0.5      0 
      0      0      0      0      0      0    0.5      0 
      0      0      0    0.5      0      0      0    0.5 
      0      0      0    0.5      0      0      0    0.5 
>p=ones(8)'/8; p'
 [0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125]
>p=M.p; p'
 [0.125,  0.25,  0.1875,  0.0625,  0.0625,  0.0625,  0.125,  0.125]
>p=M.p; p'
 [0.09375,  0.34375,  0.21875,  0.03125,  0.0625,  0.0625,  0.09375,
 0.09375]
>loop 1 to 100; p=M.p; end;
>p'
 [0,  0.666667,  0.333333,  0,  0,  0,  0,  0]

Indeed, the two selected triplets catch all the distribution, and S=(0,0,1) is indeed twice as good as T=(0,1,0). I.e., player B will win twice as often.

Things get worse for T=(0,0,0) and S=(1,0,0). S is seven times as good as T! There are other combinations which are three times as good.

Due to the structure of the problem, the limit P will have only non-zero entries for T and S. The probability to be in any other triplet will tend to 0. T and S will catch everything.

But can we compute the winning chance for S without a simulation? For that we define

  • w(i) as the chance to reach S before T when starting at triplet i.

We are using the following properties of w.

  • w(i) = 1 for the vertex S with number i.
  • w(j) = 0 for the vertex T with number j.
  • w satisfies

\(w = M_{i,j}^T \, w\)

We are searching for a specific eigenvector. The following function of EMT computes w given and adjacency matrix M and i,j, and returns the probability to reach vertex i before vertex j.

>function getw (M,i,j) ...
$if i==j then return 0; endif;
$Mt=M;
$Mt[,i]=0; Mt[i,i]=1;
$Mt[,j]=0; Mt[j,j]=1;
$V = eigenspace(Mt',1);
$c = V[[i,j]]\[1;0];
$return sum((V.c)')/cols(Mt);
$endfunction
>M = makeM();
% Let us try our simulated examples.
>fraction getw(M,2,3), fraction getw(M,5,1)
 2/3
 7/8

The simulated example from above works, as well as another example which takes T=(0,0,0) and S=(1,0,0).

For the next step, we compute all probabilities that the triplet number i appears before the triplet number j when starting from a random triplet.

>function solvePG () ...
$M = makeM();
$W=zeros(8,8);
$for i=1 to 8;
$  for j=1 to 8;
$    W[i,j]=getw(M,i,j);
$  end;
$end;
$return W;
$endfunction
>W=solvePG();
>fracformat(7); W, defformat;
      0    1/2    2/5    2/5    1/8   5/12   3/10    1/2 
    1/2      0    2/3    2/3    1/4    5/8    1/2   7/10 
    3/5    1/3      0    1/2    1/2    1/2    3/8   7/12 
    3/5    1/3    1/2      0    1/2    1/2    3/4    7/8 
    7/8    3/4    1/2    1/2      0    1/2    1/3    3/5 
   7/12    3/8    1/2    1/2    1/2      0    1/3    3/5 
   7/10    1/2    5/8    1/4    2/3    2/3      0    1/2 
    1/2   3/10   5/12    1/8    2/5    2/5    1/2      0 

The non-transitivity of the problem means that in each column there is a value >1/2. We can select this value and print the corresponding triplet in human readable form.

>function printtriplet (i) ...
$s = ")"; i=i-1;
$if mod(i,2) then s=",1"+s; else s=",0"+s; endif;
$i = floor(i/2);
$if mod(i,2) then s=",1"+s; else s=",0"+s; endif;
$i = floor(i/2);
$if mod(i,2) then s="(1"+s; else s="(0"+s; endif;
$return s;
$endfunction
>function printPG (W) ...
$for j=1 to 8;
$  w=W[,j]';
$  e=extrema(w);
$  printtriplet(j) + " -> " + printtriplet(e[4]),
$end;
$endfunction
% We get the well known solution of the game.
>printPG(W)
 (0,0,0) -> (1,0,0)
 (0,0,1) -> (1,0,0)
 (0,1,0) -> (0,0,1)
 (0,1,1) -> (0,0,1)
 (1,0,0) -> (1,1,0)
 (1,0,1) -> (1,1,0)
 (1,1,0) -> (0,1,1)
 (1,1,1) -> (0,1,1)

This is exactly the well-known rule that you find also in the Wikipedia article.

The Riemann Sphere

Since I am currently teaching complex function theory I am interested in visualizations of Möbius transformations as well as other analytic functions. The Riemann Sphere together with the Stereographic Projection is a good tool for this.

Above you see the basic idea. The points on the sphere are projected along a line through the north pole to the x-y-plane which represents the complex plane. The north pole associates to the point infinity. The formulas for the projection and its inverse are easy to compute with a bit of geometry. I have done so on this page using EMT and Maxima:

It is quite interesting to see what a Möbius transformation does to the Riemann sphere.

In this image, the north and south poles are mapped to the two black points. The other lines are the images of the circles of latitude and longitude. The image can be created by projecting the usual circles (with the poles in place) to the complex plane, applying the Möbius transformation, and projecting back to the sphere.

In fact, the projection and the Möbius transformation map circles or lines to circles or lines. They preserve angles. This is why the Stereographic Projection is often used for maps that should show the correct angles between paths. But, note that great circles that do not path to the pole are not mapped to lines, but to circles. So the shortest path on the surface of the Earth between two points is a circle on the projected map. Locally, this is only approximated by a line segment.

The speed of C, Java, Python versus EMT

I recently tested Python to see if it is good as a first programming language. The answer is a limited „yes“. Python has certainly a lot to offer. It is a mainstream language and thus there is a lot of support in the net, and there are tons of libraries for all sorts of applications. A modern version like Anaconda even installs all these tools effortlessly.

However, I keep asking myself if it isn’t more important to learn the basics before flying high. Or to put it another way: Should we really start to teach the usage of something like a machine learning toolbox before the student has understood data types and simple loops? This question is important because most high-level libraries take away the tedious work. If you know the right commands you can start problem-solving without knowing much of the underlying programming language.

I cannot decide for all teachers. It depends on the type of students and the type of class. But I can give you an idea of how much a high-level language like Python is holding you back if you want to implement some algorithm yourself. You have to rely on a well-implemented solution being available, or you will be lost.

So here is some test code for the following simple problem. We want to count how many pairs (i,j) of numbers exist such that i^2+j^2 is prime. For the count, we restrict i and j to 1000.

#include 
#include 

int isprime (int n)
{
	if (n == 1 || n == 2) return 1;
	if (n % 2 == 0) return 0;
	int i = 3;
	while (i * i <= n)
	{
		if (n % i == 0) return 0;
		i = i + 2;
	}
	return 1;
}

int count(int n)
{
	int c = 0;
	for (int i = 1; i <= n; i++)
		for (int j = 1; j <= n; j++)
			if (isprime(i * i + j * j)) c = c + 1;
	return c;
}

int main()
{
	clock_t t = clock();
	printf("%d\n",count(1000));
	printf("%g", (clock() - t) / 1000.0);
}

/*
Result was:
98023
0.132
*/

So this takes 0.132 seconds. That is okay for one million prime-number checks.

Below is a direct translation in Java. Surprisingly, this is even faster than C, taking 0.127 seconds. The reason is not clear. But I might not have switched on every optimization of C.

public class Test {

	static boolean isprime (int n)
	{
		if (n == 1 || n == 2) return true;
		if (n % 2 == 0) return false;
		int i = 3;
		while (i * i <= n)
		{
			if (n % i == 0) return false;
			i = i + 2;
		}
		return true;
	}

	static int count(int n)
	{
		int c = 0;
		for (int i = 1; i <= n; i++)
			for (int j = 1; j <= n; j++)
				if (isprime(i * i + j * j)) c = c + 1;
		return c;
	}

	static double clock ()
	{
		return System.currentTimeMillis();
	}
	
	public static void main (String args[])
	{
		double t = clock();
		System.out.println(count(1000));
		System.out.println((clock() - t) / 1000.0);
	}

	/*
	Result was:
	98023
	0.127
	*/	
	
}

Below is the same code in Python. It is more than 30 times slower. This is unexpected even to me. I have to admit that I have no idea if some compilation trick can speed that up. But in any case, it is the behavior that an innocent student will see. Python should be seen as an interactive interpreter language, not a basic language to implement fast algorithms.

def isprime (n):
    if n==2 or n==1:
        return True
    if n%2==0:
        return False
    i=3
    while i*i<=n:
        if n%i==0:
            return False
        i=i+2
    return True

def count (n):
    c=0
    for k in range(1,n+1):
        for l in range(1,n+1):
            if isprime(k*k+l*l):
                ## print(k*k+l*l)
                c=c+1
    return c

import time
sec=time.time()
print(count(1000))
print(time.time()-sec)

## Result was
## 98023
## 4.791218519210815

I have a version in Euler Math Toolbox too. It uses matrix tricks and is thus very short, using built-in loops in C. But it still cannot compete with the versions above. It is about 3 times slower than Python and 100 times slower than C. The reason is that the isprime() function is not implemented in C but in the EMT language.

>n=1000; k=1:n; l=k'; tic; totalsum(isprime(k^2+l^2)), toc;
 98023
 Used 13.352 seconds

Below is a version where I use TinyC to check for primes. The function isprimc() can only handle real numbers, no vectors. Thus I vectorize it with another function isprimecv(). This takes a bit of performance.

>function tinyc isprimec (x) ...
$  ARG_DOUBLE(x);
$  int n = (int)x;
$  int res = 1; 
$  if (n > 2)
$  {
$     if (n%2 == 0) res=0;
$     int i=3;
$     while (i*i<=n)
$     {
$         if (n%i==0) { res=0; break; }
$         i = i+2;
$     }
$  }
$  new_real(res);
$  endfunction
>function map isprimecv (x) := isprimec(x);
>n=1000; k=1:n; l=k'; tic; totalsum(isprimecv(k^2+l^2)), toc;
   98023
   Used 0.849 seconds

Mathematics of an Epidemic Disease

Since we are currently fighting the Corona epidemy I thought it might be a good idea to simulate the spreading of such a disease in Euler Math Toolbox. The numbers here are fictitious and can only deliver qualitative results.

The mathematical model I took is quite simple.

  • The disease has n1=7 days of incubation. During this time the infected person can infect others.
  • The disease has n2=7 days of severe sickness. I have assumed that there are no further infections during that time.
  • We start with the a-immune population plus 10 infected cases.
  • The chance for infection depends on the probability to contact an infected person during the incubation time and the number of contacts each day that could lead to an infection. I assume that a person has nc=0.2 to nc=0.3 contacts each day which could lead to an infection.
  • In the end, the population is either in the non-immune or the immune group.

So the total population is mapped into a vector x of groups as

  • x[1] : non-immune
  • x[2] to x[1+n1] : incubation
  • x[2+n1] to x[1+n1+n2] : sick
  • x[2+n1+n2] : immune

The formula for the number of new infections in x[2] (first day of incubation) is

\(x[2] = x[1] \left(1-(1-\frac{S}{P})^{n_c}\right)\)

where S is the number of incubated people and P is the total population. So 1-S/P is the chance that the contacted person is not infected, and thus the factor computes the chance for infection. Note that this factor is very small at the start since S/P is very small. Yet the exponential growth does its job over time.

Here is a typical plot of the epidemy with nc=0.3. This is equivalent to contacting 9 other people in a month in a way that can spread the disease. We see that almost 90% of the population has to go through the disease in this case.

Development with nc=0.3

Here is another plot with nc=0.25, equivalent to contacting 7.5 other people in a month.

Development with nc=0.25

That reduces the number of cases a lot and limits the peak number of sick persons from 37% to 22%. For the medical system, this is an enormous relief.

The code of Euler Math Toolbox to do these plots is listed below. You can cut and paste it into EMT to try yourself.

>n1=7; // 7 days of incubation time
>n2=7; // 7 days of sickness
>pm=0.01; // mortility rate
>pop=50000000; // population
>nc=0.3; // number of infectious contacts
>function oneday (x,n1,n2,pm,pop,nc) ...
$xn=x;
$xn[2]=floor(xn[1]*(1-(1-sum(x[2:1+n1])/pop)^nc)); // new infections
$xn[1]=x[1]-xn[2]; // remaining uninfected
$xn[3:2+n1+n2]=x[2:1+n1+n2]; // progress one day 
$xn[-1]=xn[-1]+x[-1];
$return xn;
$endfunction
>x=zeros(2+n1+n2); x[1]=pop; x[2]=10; // start with 10 cases
>function sim (x,n1,n2,pm,pop,nc) ...
$X=x;
$repeat
$   xn=oneday(x,n1,n2,pm,pop,nc);
$   X=X_xn;   
$   until sum(xn[2:1+n1+n2])==0;
$   x=xn;
$end;
$return X;
$endfunction
>X=sim(x,n1,n2,pm,pop,nc);
>Xs=X[,1]|sum(X[,2:2+n1])|sum(X[,2+n1:1+n1+n2])|X[,-1];
>plot2d(Xs'/pop,color=[green,orange,red,blue]); ...
>labelbox(["immune-","infect+","sick","immune+"], ...
>  colors=[green,orange,red,blue],x=0.4,y=0.2,w=0.35):

I do have the same code in Python.

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

n1=7 # 7 days of incubation
n2=7 # 7 days of sickness
pm=0.01 # mortility rate
pop=50000000 # population
nc=0.3 # number of infectious contacts

def oneday (x,n1,n2,pm,pop,nc):
    xn=x.copy()
    xn[1]=np.floor(x[0]*(1-(1-sum(x[1:1+n1])/pop)**nc)) # new infections
    xn[0]=x[0]-xn[1] # remaining uninfected
    xn[2:2+n1+n2]=x[1:1+n1+n2] # progress one day 
    xn[-1]=xn[-1]+x[-1]
    return xn

x=np.zeros(2+n1+n2)
x[0]=pop
x[1]=10 # start with 10 cases

def sim (x,n1,n2,pm,pop,nc):
    X=x
    while True:
        xn=oneday(x,n1,n2,pm,pop,nc)
        X=np.vstack((X,xn))
        if np.sum(xn[2:1+n1+n2])==0:
            break
        x=xn;
    return X;

X=sim(x,n1,n2,pm,pop,nc)

x1 = X[:,0]
x2 = np.sum(X[:,1:1+n1],axis=1)
x3 = np.sum(X[:,1+n1:n1+n2],axis=1)
x4 = X[:,-1]

fig,ax = plt.subplots(figsize=(14,8))
ax.grid(True)
ax.plot(x1,label='immune-')
ax.plot(x2,label='infect+')
ax.plot(x3,label='sick')
ax.plot(x4,label='immune+')

plt.legend();

We can also understand the problem a bit better if we study the number of new infections in x[2], the first day of incubation. It starts with our chosen number of initial infections. But those 10 induce 3 new infections and the 13 new take another 3. Then we have 16 and they infect 4, and so on exponentially.

New Infections

If the chance S/P of an infected contact is small we have

\(\left(1-(1-\frac{S}{P})^{n_c}\right) \sim \frac{n_c S}{P}\)

Thus

\(x[2] \sim n_c S\)

exactly as observed.

Euclidea App

Geometry is not as popular nowadays as it was. So I was very surprised to learn about a game that features geometry puzzles. It poses problems of Euclidean geometry in the plane, e.g. constructing the tangent to a circle. To solve the problem fully, goals have to be completed. One goal (L) is to do it with the minimum number of steps with all tools available (which include composite tools like the middle perpendicular). Another goal (E) is to do it with the minimum number of elementary tools, which are circles and lines. The third goal (V) is to construct variations of the solution, like reflections of the solution. The goal (E) is often very difficult even for elementary constructions. I show two examples below.

The app is done by some Russian programmers. I could find no information on the authors at all. Actually, there are lots of Russian programmers that make money with their skills in different areas. Besides gaming, an example is flight simulation where Russia programmers made some of the best addons. The programmers are not necessarily skilled geometry geeks. There are old books buried somewhere in libraries that contain lots of geometry problems and solutions. Those are the reminders of the old times when geometry was still popular.

The app is extremely nicely done and shows dedication and skill with user interfaces. Some details could need perfection and I will talk about that later. The app is free. But you can buy more problems. It is a nice touch that you get these for free if you have solved all the previous goals. However, I am considering to donate to these programmers.

Here is an example of a problem. I present the problem with my own program C.a.R. This program could also pose problems with restricted toolsets, called assignments. However, there is no game feature and I never cared about the number of steps taken. It was a program to help to teach geometry in schools.

Your job is to construct the green line given the black lines. In the app, the given objects are not fat and the user cannot set colors. That makes it more difficult than it should be to keep track of the progress.

Of course, it is quite easy to do the job with two given tools, the line tool, and the perpendicular. Moreover, every geometry student should know the following solution.

It takes 5 elementary steps. It is a typical mathematical solution because it is based on another solution to another problem. Mathematics often does that. We know how to construct the middle perpendicular. Thus we know how to construct a perpendicular to a line through a point on the line. It remains to see the tangent as perpendicular to the radius. But the goal was to do it with three elementary steps, not 5. My second solution was the following.

It takes 4 elementary steps still and is based on another well-known geometry fact, the Thales circle. Observe the brown circle (2) to see why this works. It is still not the optimal solution. I now tell you a secret: You can find all solutions on Youtube. Nowadays, going to Youtube is the first step that students take.

The following is the solution with three steps. Unfortunately, I was given that solution before I could find it. Why it works is by no means obvious.

The only proof I could find is a computation with angles.

By basic equivalences, we get that the angles with equal color are equal. By the circle chord theorem, the angle of a chord of a circle with respect to a point on the circle is half the angle with respect to the midpoint of the circle, thus black=2*blue. We also have black+red=90°. Thus black+2*blue=90°.

There should be a simpler proof to this, but I have not yet found it.

All in all, this is an extremely well-done app. I would like to use colors while constructing. I would also like to use the given segments as lines. This is important in the following example, where you have to construct the center of the inscribed circle with only 6 elementary steps. Note that the usual construction takes 4 elementary steps for each angle bisector.

That is already very complicated. The first step is to construct one angle bisector using the blue circles. The second bisector is found with the mysterious red circle which is constructed by one of the blue circles. Its second intersection with the same blue circle yields the second angle bisector. I found this by trial and error. After all, you can construct the angle bisector in C.a.R. so see where it should be. In the app, you can show the targets too. Did I tall you that you can also find the solution on Youtube?

Does anybody have an easy proof?

A Geometry Problem

The internet, and especially Youtube, are a vast resource for mathematics now. You find all sorts of problems, explanations, and tutorials, usually done by very talented people. I like to dig around in that pile and soon find some pearls like the following one. The problem has been presented by a guy named Presh Talwalkar. Some of his problems I knew already, but this one was new to me.

The problem is to determine the area of the quadrilateral with the question mark. Of course, no angles or side lengths are given. It took me a while to figure out a simple solution.

The first idea is to put all this in algebraic terms. To make things simpler, we would put the lower-left corner to (0,0) and the lower-right one to (0,a), and the third one to (b,c). The points on the left and right sides of the triangle can then be determined using algebraic geometry (depending on a,b,c) and thus the area in question. This looks tedious but is a really nice exercise. Do not forget the determinant formula for the area of a triangle. Let’s do it!

Euler Math Toolbox (EMT) happens to have a geometry package which can use Maxima to compute symbolically with geometric objects. It turns out to get more involved than expected. Trying to do this by hand seems to be really hard work.

>load geometry
 Numerical and symbolic geometry.
>A &= [0,0]; B &= [a,0]; C &= [b,c];
>P1 &= u*C+(1-u)*A; P2 &= v*C+(1-v)*B;
>&solve(areaTriangle(A,P2,B)=6,v), P2v &= expand(P2 with %)
 
                                    12
                               [v = ---]
                                    a c
 
 
                           12 b   12      12
                          [---- - -- + a, --]
                           a c    c       a
 
>&solve(areaTriangle(A,P1,B)=7,u), P1u &= expand(P1 with %)
 
                                    14
                               [u = ---]
                                    a c
 
 
                                14 b  14
                               [----, --]
                                a c   a
 
>lv &= lineThrough(A,P2v); 
>lu &= lineThrough(P1u,B); 
>D &= lineIntersection(lv,lu)
 
                      2
                   7 a  c + 84 b - 84 a     84 c
                  [--------------------, -----------]
                       13 a c - 84       13 a c - 84
 
>sol &= solve(areaTriangle(A,D,B)=4,[a,b,c])[1]  ...
>  with [%rnum_list[1]=s,%rnum_list[2]=t]
 
                             168
                        [a = ---, b = s, c = t]
                             5 t
 
>&expand(areaTriangle(A,C with sol,P2v with sol)) - 3
 
                                   39
                                   --
                                   5
 
>&B with sol
 
                                 168
                                [---, 0]
                                 5 t
 
>&C with sol
 
                                 [s, t]
 
>&P1u with sol
 
                                5 s  5 t
                               [---, ---]
                                12   12
 
>&P2v with sol
 
                             108   5 s  5 t
                            [--- + ---, ---]
                             5 t   14   14
 

The problem is that there is more than one solution. The triangle is not determined by those three areas! However, the area in question is always the same. This is quite a surprising fact if we start algebraically as above. On the other hand, it is not surprising if we assume that the problem can be solved at all. For any transformation (x,y) -> (kx,y/k) keeps the areas intact. Moreover, any transformation (x,y) -> (x+c,y) keeps the given areas intact. So if we have one solution, we can effectively get another solution where C is placed anywhere.

I took the values into C.a.R. (with sliders for s and t, and B,C,D,P1,P2 according to the expressions above) and here is one such triangle.

 

The computations seem to work, and moving the sliders shows that 7.8 is indeed the solution to our problem.

Isn’t there an easier and more intuitive solution? I have not watched the linked video yet. But upon discussion with a friend, the solution occurred to me. Have a look at the following sketch.

The trick is to look at the triangle ABP1 and to see BP1 as its base side. Then divide the triangle into ADP1 and ABD. Those triangles have the same height upon BP1 and since the area is height times base over 2, we get (solving for the height)

\(\dfrac{4}{DB} = \dfrac{3}{DP1}\)

Now the same can be done in the CBP1. Thus

\(\dfrac{y+2}{DB} = \dfrac{x}{DP1}\)

Thus

\(\dfrac{y+2}{4} = \dfrac{x}{3}\)

Now we do the same with the triangles over the baseline AP2.  We get in the same way

\(\dfrac{x+3}{4} = \dfrac{y}{2}\)

Now we have two equations for x and y. The solution is

\(x = \dfrac{21}{5}, \quad y = \dfrac{18}{5}\)

So

\(x+y = \dfrac{39}{5}\)

To find such solutions is a matter of practice, combined with trial and error, plus a grain of stubbornness. The brain needs to have enough math tricks to look for, simply. Then you need to look around if anything looks familiar to you.

Mathematische Software – Geogebra

This is the Android Version of GeoGebra. It is a lot more cumbersome to use than the desktop version, but you can get used to it. Although I will be criticizing it a bit below it is a very nice software.  It can do far more than an average student would ever need. If you do not know the power of GeoGebra, have a look at its home page. My own programs Euler Math Toolbox or C.a.R. and others in the same category may be more powerful on many areas and more versatile. But GeoGebra offers an all-in-one package with teacher support and a world community. Over the years, they completed the program and added missing features. They even included 3D constructions and augmented reality. For this, we had to use specialized programs like Archimedes 3D or Cabri 3D, both payware. Moreover, they added JavaScript support on webpages so that constructions can still be embedded after Java has been killed in the browser. This free package is to be recommended.

Instead of writing a review, let me point out the shortcomings of CAS with regard to the learning process. No, I won’t be arguing that software hinders the process of acquiring mathematical skills and math has to be done with pencil and paper. Rather the contrary is true. I will be arguing for more software usage. But it needs to be used in an intelligent way. For that, we need intelligent teaching.

Have a look at the example in the image above. It was the first example I tried on my Android device. We are discussing the function

\(f(x) = e^{5x} \, \sin(x)\)

We want to learn its behavior. It is very difficult to get a really good impression on the Android device. You can try to zoom in and out. But without further information, you will not be able to grasp its structure. The software uses the nicest feature of touch screens, the pinch zoom. So you can zoom right into the interesting region as in the image above. Even then, it looks as if the function was zero left of -1.5.

If you zoom out further you see the following.

One can only understand this image if one knows how the two factors look like and have ever studied a dampened oscillation before. So the plot does not really help without the mathematical background. But, on the other hand, if you have the background the plot can be a huge help in asserting the knowledge and confirming it.

Next, I tried to find the first local minimum on the negative axis. You can solve that numerically in the program by touching the graph in the minimum. The software will then display one of these black dots showing all special points of the plot, and you can read the coordinates below the plot. I think this is a very nice way of grasping math and something EMT cannot do that easily. Of course, you can do it on the command line.

>function f(x) &= exp(5*x)*sin(x)
 
                               5 x
                              E    sin(x)
 
>plot2d(f,-1,0.2);
>xm=solve(&diff(f(x),x),0.2)
 -0.19739555985
>plot2d(xm,f(xm),>points,>add);


But let us talk about the CAS aspect of this solution. GeoGebra produces a very interesting solution to this problem.

\(\{ x = 2 \tan^{-1}(\sqrt{26}+5), \, x=2 \tan^{-1}(-\sqrt{26}+5) \}\)

There is a switch to evaluate this numerically. If it is pressed four surprising values appear: -191.31°, -11.31°, 168.69°, 348.69° (rounded). The degrees can most likely be avoided by setting the program to radian mode. The values are correct.

>fzeros(&diff(f(x),x),-200°,360°); %->°
 [-191.31,  -11.3099,  168.69,  348.69]

What do we make of all this?

First of all, the computations, algebraic or numerical, do not make much sense without proper explanations and without the mathematical background. The zeros of the sine function and consequently of the function f simply are the multiples of 180°. Between each zero, the function has at least one extremum, alternatingly a minimal and maximal value. So much is easy to see with the naked eye. In fact, there is exactly one extremal point in each interval. This is harder to see, however. By computing the derivative, we get the extremal points as the solutions of

\(\tan(x) = – \frac{1}{5}\)

Every book about trigonometry contains a plot of the tangent function like the following.

I added the line y=-0.2. So you can easily see that the extremal points repeat in distances of 180°. Problem solved.

Why does GeoGebra produce such a complicated answer involving the square root of 26? That is a mathematical problem in itself and well above the capabilities of high school students. The reason may be connected to replacing sine with 1 minus cosine squared and solving a quadratic equation.

And why does it only show four of the infinitely many solutions? I do not know.

We learn from all this that numerical or algebraic software or plots can be useful. But without mathematical background they are useless.