Schlagwort-Archive: Problems

Cambrige Entry Examination

I recently came across a video with the problem to find the sum of the solutions of

\(3^x – \sqrt{3}^{(x+4)} + 20 = 0\)

Maybe anybody who is intimate with this kind of computations sees the trick. You can set

\(y = \sqrt{3}^x,\)

and get the equivalent equation

\(y^2-9y+20=0.\)

Solving this, you get

\(y = \sqrt{3}^x = 3^{x/2} = \dfrac{9 \pm 1}{2}.\)

Thus

\(x_1+x_2 = 2 \log_3(20) = \dfrac{2 \ln(20)}{\ln(3)}.\)

That’s not very exciting.

But why are they asking to find the sum of the solutions? If you hear that, you might think of the Vieta’s fact that the sum and the product of the solutions of a quadratic equation are the coefficients. Using this, you get

\(y_1y_2 = \sqrt{3}^{(x_1+x_2)} = 20\)

immediately.

That’s were I decided to write about this problem here. E.g., you might change the original problem to

\(3^x – \sqrt{3}^{(x+4)} + 9 = 0\)

and get the much more pleasing solution

\(x_1 + x_2 = 2 \log_3(9) = 4.\)

Nice trick! But it is dangerous. Applying the trick to

\(3^x – \sqrt{3}^{(x+4)} + 81 = 0\)

yields the false sum 8.

In fact, this equation does not have a solution at all! Unless you allow complex numbers, and are careful with complex logarithms. In that case, you get multiple solutions.

So be careful with tricks!

Simulating and solving Penney’s game

I recently stumbled across some YouTube videos about Penney’s game. Here is one of them. The problem was actually new to me, like almost all problems in mathematics. There simply are too many. It is easier to come up with a problem than a a solution.

According to Wikipedia, Walter Penney published the problem in the Journal of Recreational Mathematics in 1969. The article seems to be hard to get, so simply let us start an analysis ourselves. After all, that sounds like a lot of fun, and I have written about a similar problem in this blog before.

Assume we toss coins with outcome 0 and 1 at equal probability. So we get a sequence of tosses like

\(0,1,0,0,1,1,1,0,1,0,\ldots\)

We are studying the following game between two players:

  • Player A selects a triplet of tosses (like 0,0,1), and player B another triplet (like 1,0,0).
  • The player whose triplet appears first in the sequence of tosses wins.

Given the sequence above and the selected examples of triplets above, player B would win. His triplet 1,0,0 appeared immediately after four tosses.

If both players select their triplet secretly this is obviously a fair game.

There is a counterintuitive fact here: It is an illusion that the choice of the triplet matters. The average waiting time for any triplet is equal, and selecting (1,0,1) is no better than (1,1,1).

To understand this, we need to see the tossing as a sequence of triplets. In the example above the sequence is the following:

\((0,1,0) \to (1,0,0) \to (0,0,1) \to (0,1,1) \to (1,1,1) \to \ldots\)

All triplets have the same chance (namely 1/8) to appear at the start of the sequence. And any triplet T has exactly two other triplets S1, S2 which can lead to this triplet. Since S1 and S2 have the same chance (namely 1/8) to appear in the n-th position of the sequence and T is selected with probability 1/2 from each, the chance for T to appear in the (n+1)-th position is also 1/8.

After this first excursion into the mathematics of random walks, we need to come back to the third rule of Penney’s Game.

  • The player B selects his triplet after seeing the selected triplet of player A.

Now, there is no reason why player B shouldn’t have an advantage. After all he can simply take the same triplet and get a draw. If he was forced to take another triplet he might even be at a disadvantage. But, indeed, he has a big advantage of at least 2:1. A wrong selection of player A can make his chances even worse.

Let us try to understand this. This is a case of non-transitive ordering.

  • Let us write S>T if the triplet S is more likely to appear before triplet T in the toss sequence when the tosses are started with a random triplet.

Then it is not true that

\(T \ge S \,{\rm and}\, S \ge U \Longrightarrow T \ge U\)

If it were true, the ordering would be complete and player A could select the best triplet, or at least a triplet that cannot be beaten by the selection of player B.

It turns out that for each selected triplet T of player A, there is a selection S which is better than T (S>T). This makes the game a win for player B.

This is counterintuitive again. A consequence is that there is a sequence of triplets with the property

\(T_1 < T_2, \quad T_2 < T_3, \ldots \quad , T_{n-1}<T_n, \quad T_n < T_1\)

In this game, the simple rule to find a better triplet is the following:

  • If player A selects T=(X,Y,Z), player B selects S=(YN,X,Y), where YN is the opposite of Y, i.e., YN=0 if Y=1 and YN=1 if Y=0.

E.g., if player A selects (0,1,0) or (0,1,1), player B selects (0,0,1). For another example, if player A selects (1,1,0) or (1,1,1) player B selects (0,1,1).

How do we understand this? We use a general mathematical model.

A way to understand this is by considering random walks on a directed graph. The graph in this game has 8 vertices, the 8 possible triplets. It has 16 edges, i.e., pairs of vertices, with associated probabilities 1/2. Given any triplet, there are two possible tosses which lead to two possible edges. (Note that some edges go from the triplet to itself!)

E.g., there are two edges leading away from (1,1,0),

\((1,1,0) \to (1,0,0), \quad (1,1,0) \to (1,0,1)\)

or, from (1,1,1)

\((1,1,1) \to (1,1,0), \quad (1,1,1) \to (1,1,1)\)

One mathematical model for uses an adjacency matrix M, containing transition probabilities from vertex to vertex.

  • We set M(i,j) to the probability to go from triplet j to triplet i (i.e., to reach triplet i from triplet j) in one toss. So we have a matrix M of transition probabilities.

Now imagine starting a lot of experiments by walking from a start vertex to subsequent vertices with the given probabilities. The experiments select the start vertex according to a probability vector P, i.e., vertex i is selected with probability P(i).

  • If the distribution of vertices in the experiments is P at a step. Then the matrix multiplication MP will compute the distribution of probabilities at the next step.

Let us do a test in Euler Math Toolbox. First, we generate the adjacency matrix M for our game. Then we start at one specific node and compute the distribution after three steps.

>function makeM () ...
$M = zeros(8,8);
$for i1=0 to 1;
$  for i2=0 to 1;
$    for i3=0 to 1;
$      for j1=0 to 1;
$        for j2=0 to 1;
$          for j3=0 to 1;
$            i=4*i1+2*i2+i3;
$            j=4*j1+2*j2+j3;
$            if i2==j1 and i3==j2 then
$              M[j+1,i+1]=1/2;
$            endif;
$          end;
$        end;
$      end;
$    end;
$  end;
$end;
$return M;
$endfunction
>M=makeM;
>shortest M
    0.5      0      0      0    0.5      0      0      0 
    0.5      0      0      0    0.5      0      0      0 
      0    0.5      0      0      0    0.5      0      0 
      0    0.5      0      0      0    0.5      0      0 
      0      0    0.5      0      0      0    0.5      0 
      0      0    0.5      0      0      0    0.5      0 
      0      0      0    0.5      0      0      0    0.5 
      0      0      0    0.5      0      0      0    0.5 
>p=zeros(8)'; p[1]=1; p'
 [1,  0,  0,  0,  0,  0,  0,  0]
>p=M.p; p'
 [0.5,  0.5,  0,  0,  0,  0,  0,  0]
>p=M.p; p'
 [0.25,  0.25,  0.25,  0.25,  0,  0,  0,  0]
>p=M.p; p'
 [0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125]

The function to generate the matrix M is not the most elegant one. It might be better to use a recursion than a loop. After we generate the matrix, we start with a „distribution“ which puts everything into the first triplet (actually that is (0,0,0)). After only three steps, we see that the probability to reach each triplet is equal.

Let us now model Penney’s game.

We have two triplets T and S (vertex numbers i and j) which lead nowhere. I.e., if those states are met we have found a case where T or S are hit.

We modify the adjacency matrix accordingly and delete all edges leading away from T and S and connect T and S to themselves with probability 1. This is a modified adjacency matrix M(i,j). We now have to look at the following limit.

\(P = \lim_{n \to \infty} M_{i,j}^n P_0\)

with

\(P_0 = (1/8,\ldots,1/8)^T\)

which is our start distribution.

Let us simulate that in Euler Math Toolbox. We select the triplets T=(0,1,0) (third element of p) and S=(0,0,1) (second element of p) which according to our rule should be superior.

>M=makeM;
>M[,2]=0; M[2,2]=1;
>M[,3]=0; M[3,3]=1;
>shortest M
    0.5      0      0      0    0.5      0      0      0 
    0.5      1      0      0    0.5      0      0      0 
      0      0      1      0      0    0.5      0      0 
      0      0      0      0      0    0.5      0      0 
      0      0      0      0      0      0    0.5      0 
      0      0      0      0      0      0    0.5      0 
      0      0      0    0.5      0      0      0    0.5 
      0      0      0    0.5      0      0      0    0.5 
>p=ones(8)'/8; p'
 [0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125,  0.125]
>p=M.p; p'
 [0.125,  0.25,  0.1875,  0.0625,  0.0625,  0.0625,  0.125,  0.125]
>p=M.p; p'
 [0.09375,  0.34375,  0.21875,  0.03125,  0.0625,  0.0625,  0.09375,
 0.09375]
>loop 1 to 100; p=M.p; end;
>p'
 [0,  0.666667,  0.333333,  0,  0,  0,  0,  0]

Indeed, the two selected triplets catch all the distribution, and S=(0,0,1) is indeed twice as good as T=(0,1,0). I.e., player B will win twice as often.

Things get worse for T=(0,0,0) and S=(1,0,0). S is seven times as good as T! There are other combinations which are three times as good.

Due to the structure of the problem, the limit P will have only non-zero entries for T and S. The probability to be in any other triplet will tend to 0. T and S will catch everything.

But can we compute the winning chance for S without a simulation? For that we define

  • w(i) as the chance to reach S before T when starting at triplet i.

We are using the following properties of w.

  • w(i) = 1 for the vertex S with number i.
  • w(j) = 0 for the vertex T with number j.
  • w satisfies

\(w = M_{i,j}^T \, w\)

We are searching for a specific eigenvector. The following function of EMT computes w given and adjacency matrix M and i,j, and returns the probability to reach vertex i before vertex j.

>function getw (M,i,j) ...
$if i==j then return 0; endif;
$Mt=M;
$Mt[,i]=0; Mt[i,i]=1;
$Mt[,j]=0; Mt[j,j]=1;
$V = eigenspace(Mt',1);
$c = V[[i,j]]\[1;0];
$return sum((V.c)')/cols(Mt);
$endfunction
>M = makeM();
% Let us try our simulated examples.
>fraction getw(M,2,3), fraction getw(M,5,1)
 2/3
 7/8

The simulated example from above works, as well as another example which takes T=(0,0,0) and S=(1,0,0).

For the next step, we compute all probabilities that the triplet number i appears before the triplet number j when starting from a random triplet.

>function solvePG () ...
$M = makeM();
$W=zeros(8,8);
$for i=1 to 8;
$  for j=1 to 8;
$    W[i,j]=getw(M,i,j);
$  end;
$end;
$return W;
$endfunction
>W=solvePG();
>fracformat(7); W, defformat;
      0    1/2    2/5    2/5    1/8   5/12   3/10    1/2 
    1/2      0    2/3    2/3    1/4    5/8    1/2   7/10 
    3/5    1/3      0    1/2    1/2    1/2    3/8   7/12 
    3/5    1/3    1/2      0    1/2    1/2    3/4    7/8 
    7/8    3/4    1/2    1/2      0    1/2    1/3    3/5 
   7/12    3/8    1/2    1/2    1/2      0    1/3    3/5 
   7/10    1/2    5/8    1/4    2/3    2/3      0    1/2 
    1/2   3/10   5/12    1/8    2/5    2/5    1/2      0 

The non-transitivity of the problem means that in each column there is a value >1/2. We can select this value and print the corresponding triplet in human readable form.

>function printtriplet (i) ...
$s = ")"; i=i-1;
$if mod(i,2) then s=",1"+s; else s=",0"+s; endif;
$i = floor(i/2);
$if mod(i,2) then s=",1"+s; else s=",0"+s; endif;
$i = floor(i/2);
$if mod(i,2) then s="(1"+s; else s="(0"+s; endif;
$return s;
$endfunction
>function printPG (W) ...
$for j=1 to 8;
$  w=W[,j]';
$  e=extrema(w);
$  printtriplet(j) + " -> " + printtriplet(e[4]),
$end;
$endfunction
% We get the well known solution of the game.
>printPG(W)
 (0,0,0) -> (1,0,0)
 (0,0,1) -> (1,0,0)
 (0,1,0) -> (0,0,1)
 (0,1,1) -> (0,0,1)
 (1,0,0) -> (1,1,0)
 (1,0,1) -> (1,1,0)
 (1,1,0) -> (0,1,1)
 (1,1,1) -> (0,1,1)

This is exactly the well-known rule that you find also in the Wikipedia article.

A Simple Combinatorial Approximation Problem

This is certainly a well-known problem. But I pose it here because there is a geometric solution, and geometric solutions are always nice.

Assume you have two sets of n points

\(x_1,\ldots,x_n, \quad y_1,\ldots,y_n\)

The problem is to find a permutation p such that

\(\sum_{k=1}^n (x_k-y_{p(k)})^2\)

is minimized. Of course, we can assume that the x’s are sorted. And it is easy to guess that the minimum is achieved if the permutation sorts the y’s. But how to prove that?

Here is a simple idea. We claim that swapping two of the y’s into the right order makes the sum of squares smaller. I.e.,

\(x_1 < x_2, \, y_1<y_2 \Longleftrightarrow (x_1-y_1)^2+(x_2-y_2)^2 < (x_1-y_2)^2+(x_2-y_1)^2\)

If you try that algebraically the right hand side is equivalent to

\(-x_1y_1-x_2y_2 < -x_1y_2-x_2y_1\)

Putting everything one side and factorizing, you end up with

\((x_2-x_1)(y_2-y_1) > 0\)

and the equivalence is proved.

But there is a geometric solution. We plot the points into the plane. Two points are on the same side of the symmetry line between the x- and y-axis if they have the same order. And the connection to the mirrored point will always be larger.

We can extend this result to more norms. E.g., the sorting of the y’s will also minimize

\(\sum_{k=1}^n |x_k-y_{p(k)}|, \, \max_k |x_k-y_{p(k)}|\)

As long as the underlying norm is symmetric with respect to the variables, i.e.,

\(\|(\ldots,t,\ldots,s,\ldots)\|=\|(\ldots,s,\ldots,t,\ldots)\|\)

The prove for this can be based on the same geometrical argument.

Another Youtube Problem

Recently, I found the following problem on Youtube:

\(x+y+z=1, \\ x^2+y^2+z^2=2, \\ x^3+y^3+z^3=3, \\ x^n+y^n+z^n= \,?\)

I did not continue to watch the video. So I have no idea what solution is presented. I decided to treat that as a challenge and to observe my steps, including the failed ones.

The problem sounds like a normal set of three equations with three unknowns that we can solve by eliminating. After all, this is the standard technique that we learn in school. But it is actually not that easy. Eliminating x from the second equation and solving for y yields an expression in y which contains a square root. Inserting everything in the third equation results in a mess.

Let us try numerically. We use Euler Math Toolbox here.

>function f2(x,y) &= 2-x^2-y^2-(1-x-y)^2
 
                        2                2    2
                     - y  - (- y - x + 1)  - x  + 2
 
>a3 &:= 3
 3
>function f3(x,y) &= a3-x^3-y^3-(1-x-y)^3
 
                        3                3    3
                     - y  - (- y - x + 1)  - x  + 3
 
>plot2d("f2",levels=0,r=2);
>plot2d("f3",levels=0,>add):

We see, that the equations do not have a real solution. But they have, of course, a complex solution by the main theorem of Algebra.

The second idea is to try some tricks on the equations. E.g., we could multiply the first equation with x, y, and z and add the three results. Because of the first and the second equation, we get

\(xy+yz+xz=-\dfrac{1}{2}\)

That is a new equation. But is not clear how this is going to help.

After a while, the mathematical instinct kicks in to try something more generic and simple. Pattern matching inside the brain brings a known trick to the mind. Any sequence

\(1,x,x^2,x^3,\ldots\)

has an interesting property. If x is the zero of a polynomial

\(p(t)=t^3+at^2+bt+c\)

It satisfies the interesting property

\(x^n+ax^{n-1}+bx^{n-2}+cx^{n-3}=0\)

Thus the sequence above satisfies a recursion formula. Now, if x,y,z are the three zeros of the polynomial p(t) all three satisfy the same recursion formula, and so does the sum of the three zeros.

\(s_n = – (as_{n-1}+bs_{n-2}+cs_{n-3})\)

for

\(s_n = x^n+y^n+z^n\)

So we know from our three equations and the trivial equation

\(s_0 = x^0+y^0+z^0=3\)

that

\(3 = -(2a+b+3c)\)

But we also know that

\(p(t) = (t-x)(t-y)(t-z) = t^3 – (x+y+z) t^2 + (xy+yz+xz)t -xyz\)

Once we see that, we get a=-1 immediately from our problem. Having played around with tricks, we also know b=-1/2. From that, we now have

\(a=-1, \quad b=-\dfrac{1}{2}, \quad c=-\dfrac{1}{6}\)

We can verify that with Euler Math Toolbox.

For that, we compute the zeros of the polynomial and check the equations numerically. As expected, two of the zeros are complex.

>s=polysolve([-1/6,-1/2,-1,1])
 [ -0.215425+0.264713i,  -0.215425-0.264713i,  1.43085+0i  ]
>sum(s)
 1+0i 
>sum(s^2)
 2+0i 
>sum(s^3)
 3+0i 

The expression for the real zero of p(t) is terrible and involves a third root.

\(z=u^{1/3} + \dfrac{5}{u^{1/3}} + \dfrac{1}{3}, \quad u=\dfrac{\sqrt{13}}{9 \cdot 2^{2/3}}+\dfrac{11}{54}\)

Thus, we get a recursion for the sum and thus for the equation sign in our problem.

\(s_n = s_{n-1}+\dfrac{1}{2}s_{n-2}+\dfrac{1}{6}s_{n-3}\)

Check with EMT.

>fraction sequence("x[-1]+x[-2]/2+x[-3]/6",[3,1,2,3],10)
 [3,  1,  2,  3,  25/6,  6,  103/12,  221/18,  1265/72,  905/36]
>fraction real(sum(s^[0:9]')')
 [3,  1,  2,  3,  25/6,  6,  103/12,  221/18,  1265/72,  905/36]

Can we make a problem with an easier solution? Sure! We only have to start with simpler zeros.

\(x+y+z=0, \\ x^2+y^2+z^2=14, \\ x^3+y^3+z^3=-18, \\ x^n+y^n+z^n= \,?\)

The answer is now

\(s_n = 1 + 2^n + (-3)^n\)

This kind of mathematical thinking looks like a touch of genius. But it is not. Not at all. It is just the summary of long experience in mathematics. The tricks involved are well known. They are just combined in the right way. Mathematical problem solving is just pattern matching using the patterns that have been exercised before. Maybe some stubbornness is needed, and a lot of trial and error. But no genius.

It does help very much to have a numerical and symbolic program like EMT available to check the computations.

The Two Envelopes Problem

Recently I came across a Youtube video by „vsauce“ explaining the two envelope paradox. The problem is at least half a century old, and most likely a lot older. It can be found in a 1953 book by Maurice Kraitchik which is the oldest citation in Wikipedia.

Someone places money into one of two identical envelopes and double the money into the second. We select one of the two envelopes without knowing if we have the first or the second one. We open it. Should we keep that money or return it and opt for the other envelope?

(Let us assume that the value in the first envelope is a rational number. So we cannot conclude with certainty which envelope we have opened. If we selected from integer numbers we would know the envelope in the case of an even value.)

The naive analysis goes like this:

Let X be the amount we have.  Since we have chosen the envelopes at random, the other envelope contains X/2 and 2X with the same probability. So our expectation after switching is

\(\dfrac{X/2+2X}{2} = \dfrac{5}{4} X\)

So switching sounds like a winning strategy. It gets more striking if we consider switching before even opening the envelope. Switching back and forth drives the expected value to infinity. This is clearly nonsense.

This paradox is a confirmation of my principle:

To analyze a probabilistic problem we need an experiment that we could, at least hypothetically, implement in a computer. Equivalently, we need a probability space to work with.

The two envelopes problem does not satisfy this principle. The reason is that it is impossible to select a rational number at random from the set of all rational numbers. You cannot even select an integer evenly from all the integers. We need a probability distribution for that. Think of how you would test the switching strategy on a computer. You will have to tell the machine how to select a number. It can select a number with equal probability from all the numbers it can represent. But it cannot do so from all, infinitely many numbers.

Now that we know that the original problem makes no sense, we can start to analyze the problem by assuming an a priori distribution for the value M in the first envelope. By doing so, we get a bunch of very interesting questions. (By the way, it is always easy to create questions in mathematics. Only a few questions have nice answers, however.)

Let us start by selecting M evenly from 1 to n. Clearly, we switch if the opened envelope, containing X, is odd. We will keep the money only if X>n. For n which is a multiple of 4, we have the following cases:

  1. M<=n/2: We will switch no matter if X=M or X=2M. Thus the average outcome is 3/2 M.
  2. M>n/2: We get 2M in every case.

It is an exercise to compute that our average result is

\(E = \dfrac{15n+14}{16}\)

This is a blog about Euler Math Toolbox too. So let us simulate the strategy in EMT. In this case, a loop would be a good idea, maybe written in C or Python. However, it is also possible to do this elegantly with the matrix language

>n=10; N=1000000;
>M=intrandom(N,n); // select a number between 1 and n
>sel=intrandom(N,2); // select 1 or 2 envelope
>X1=M*sel; X2=M*(3-sel); // X1 is in the selected, X2 in the other
>sum((X1>n)*X1+(X1<=n)*X2)/N // X1 for X1>n, X2 else
 10.253055
>(15*n+14)/16 // expected outcome
 10.25

We could also try a distribution on the positive real numbers for M. A typical candidate is the exponential distribution. We decide ourselves to keep a value greater than some a>0.

In this case, we switch for all M<a/2 and M>a, getting 3/2 M on average. In half of the cases between a/2 and a, we get 2 M. In the other half we still get 3/2 M. Thus

\(E = \dfrac{3}{2}+\dfrac{1}{2} \int\limits_{a/2}^a x e^{-x} \, dx\)

The best choice is a = 4 ln(2) with

\(E \sim 1.68\)

We can simulate that too.

>n=10; N=1000000;
>M=randexponential(N); // select a number between 1 and n
>sel=intrandom(N,2); // select 1 or 2 envelope
>X1=M*sel; X2=M*(3-sel); // X1 is in the selected, X2 in the other
>a=4*ln(2); sum((X1>a)*X1+(X1<=a)*X2)/N // X1 for X1>n, X2 else
 1.68097393969

Note that switching all the time or never both expect E=3/2.

You might ask if there is a distribution for M which ensures that it does not matter to look inside the first envelope. But that is not the case. Such a distribution does not exist.m Knowing the distribution and looking at the first envelope will always yield an expected overhead for us.

Two Tilted Rectangles

I found this problem from the wonderful Math with bad Drawings site in my news feed. They cite it from Carolina Shearer. Her twitter account contains more such nice problems. It is the kind of problem which seems only adequate for advanced students. Sometimes, you can solve them by looking at it in the right way. Most of the time, you will start a lengthy computation. Often, you will notice in retrospect that the solution was quite easy and you could have guessed it.

For this problem, I did not see the solution. The problem, of course, is to fit two equal rectangles into a square in the shown manner. For a start, try to find a reason why the point 1/2 on the lower side solves the problem. You can use the general fact that a=b+c in the figure below.

 

I admit that I did the computations using Euler Math Toolbox. That works. However, there is an elementary fact that can be used to prove a+b=c. For a hint, consult the following image. If the green lines are equal the red lines are equal.

 

A Geometry Problem

The internet, and especially Youtube, are a vast resource for mathematics now. You find all sorts of problems, explanations, and tutorials, usually done by very talented people. I like to dig around in that pile and soon find some pearls like the following one. The problem has been presented by a guy named Presh Talwalkar. Some of his problems I knew already, but this one was new to me.

The problem is to determine the area of the quadrilateral with the question mark. Of course, no angles or side lengths are given. It took me a while to figure out a simple solution.

The first idea is to put all this in algebraic terms. To make things simpler, we would put the lower-left corner to (0,0) and the lower-right one to (0,a), and the third one to (b,c). The points on the left and right sides of the triangle can then be determined using algebraic geometry (depending on a,b,c) and thus the area in question. This looks tedious but is a really nice exercise. Do not forget the determinant formula for the area of a triangle. Let’s do it!

Euler Math Toolbox (EMT) happens to have a geometry package which can use Maxima to compute symbolically with geometric objects. It turns out to get more involved than expected. Trying to do this by hand seems to be really hard work.

>load geometry
 Numerical and symbolic geometry.
>A &= [0,0]; B &= [a,0]; C &= [b,c];
>P1 &= u*C+(1-u)*A; P2 &= v*C+(1-v)*B;
>&solve(areaTriangle(A,P2,B)=6,v), P2v &= expand(P2 with %)
 
                                    12
                               [v = ---]
                                    a c
 
 
                           12 b   12      12
                          [---- - -- + a, --]
                           a c    c       a
 
>&solve(areaTriangle(A,P1,B)=7,u), P1u &= expand(P1 with %)
 
                                    14
                               [u = ---]
                                    a c
 
 
                                14 b  14
                               [----, --]
                                a c   a
 
>lv &= lineThrough(A,P2v); 
>lu &= lineThrough(P1u,B); 
>D &= lineIntersection(lv,lu)
 
                      2
                   7 a  c + 84 b - 84 a     84 c
                  [--------------------, -----------]
                       13 a c - 84       13 a c - 84
 
>sol &= solve(areaTriangle(A,D,B)=4,[a,b,c])[1]  ...
>  with [%rnum_list[1]=s,%rnum_list[2]=t]
 
                             168
                        [a = ---, b = s, c = t]
                             5 t
 
>&expand(areaTriangle(A,C with sol,P2v with sol)) - 3
 
                                   39
                                   --
                                   5
 
>&B with sol
 
                                 168
                                [---, 0]
                                 5 t
 
>&C with sol
 
                                 [s, t]
 
>&P1u with sol
 
                                5 s  5 t
                               [---, ---]
                                12   12
 
>&P2v with sol
 
                             108   5 s  5 t
                            [--- + ---, ---]
                             5 t   14   14
 

The problem is that there is more than one solution. The triangle is not determined by those three areas! However, the area in question is always the same. This is quite a surprising fact if we start algebraically as above. On the other hand, it is not surprising if we assume that the problem can be solved at all. For any transformation (x,y) -> (kx,y/k) keeps the areas intact. Moreover, any transformation (x,y) -> (x+c,y) keeps the given areas intact. So if we have one solution, we can effectively get another solution where C is placed anywhere.

I took the values into C.a.R. (with sliders for s and t, and B,C,D,P1,P2 according to the expressions above) and here is one such triangle.

 

The computations seem to work, and moving the sliders shows that 7.8 is indeed the solution to our problem.

Isn’t there an easier and more intuitive solution? I have not watched the linked video yet. But upon discussion with a friend, the solution occurred to me. Have a look at the following sketch.

The trick is to look at the triangle ABP1 and to see BP1 as its base side. Then divide the triangle into ADP1 and ABD. Those triangles have the same height upon BP1 and since the area is height times base over 2, we get (solving for the height)

\(\dfrac{4}{DB} = \dfrac{3}{DP1}\)

Now the same can be done in the CBP1. Thus

\(\dfrac{y+2}{DB} = \dfrac{x}{DP1}\)

Thus

\(\dfrac{y+2}{4} = \dfrac{x}{3}\)

Now we do the same with the triangles over the baseline AP2.  We get in the same way

\(\dfrac{x+3}{4} = \dfrac{y}{2}\)

Now we have two equations for x and y. The solution is

\(x = \dfrac{21}{5}, \quad y = \dfrac{18}{5}\)

So

\(x+y = \dfrac{39}{5}\)

To find such solutions is a matter of practice, combined with trial and error, plus a grain of stubbornness. The brain needs to have enough math tricks to look for, simply. Then you need to look around if anything looks familiar to you.

EMT – Simulation of the Two Child Problem

The two child problem is a problem in probability theory with a solution that seems paradox on first sight. I wrote about that problem before. Let me repeat the explanation and do a simulation to convince everybody that the solution is really correct. You can find information on that problem on Wikipedia.

The first simple version starts with this story: You meet a man in a bar and he mentions his daughter. You ask him about his children and he answers that he has two kids. The question is: What is the probability of the other kid (the one he was not talking about) to be female?

The answer depends on some important details. But let us assume you have no information if the other kid is younger or older, or any other special information about the daughter and the man is just a random man from the population. These details will turn out to be important! Then the intuitive answer to our question 1/2 is wrong, or rather, it does not make sense. The correct answer is 1/3.

In my opinion, each probabilistic problem makes sense only if we can devise an experiment that would in theory or in practice simulate the problem. Our computation should predict the outcome of the experiment. Problems that cannot be simulated in any way are not interesting to me. This includes most problems in probability or measure theory which need the help of the axiom of choice.

But our problem can easily be simulated. And setting up the simulation gives great insights into the meaning of our question and the terms „probability“ and „randomly selected“. Let us do it in Euler Math Toolbox (EMT). What we do is a Monte-Carlo simulation. We need to make the following assumption: The man has two kids with random gender, one of which is female, and the probability for a kid to be either gender is 1/2 (no diverse genders in this posting). So we draw 1000000 pairs of kids with random gender. Then we count the proportion of two female kids among all pairs with at least one female kid.

>n = 1000000
 1000000
>K = intrandom(2,n,2)
 Real 2 x 1000000 matrix
 
             2             1             1             2     ...
             1             2             2             2     ...
>i = nonzeros(K[1]==1 || K[2]==1); ni = length(i); ni/n
 0.749596
>sum(K[1,i]==1 && K[2,i]==1) / ni
 0.334372115113

The syntax may seem cryptic, but it is intuitive if you understand the Matrix language. K contains a pair of kids in each column (n=1000000 columns). „K[1]==1“ returns a vector of 0/1 with 1 (true) on each position where the vector K[1] (the first row of K) is 1. I.e., a vector indicating the pairs where the first kid is female. „||“ is short for „or“, and nonzeros() returns the indices of the non-zero elements of a vector. Thus „ni“ is the number of pairs such that either kid is female. As expected, „ni/n“ is approximately 3/4. There are four cases, and one case (tow boys) is wrong.

In the final line, we count the numbers of pairs in the „i“-columns, where both kids are female, using sum() which sums up the ones and compare that to the total number of right cases „ni“. The answer is approximately 1/3.

This should not surprise us since there are three cases in the „i“-columns: (1,2), (2,1), (1,1). Only one of these cases is the correct one.

Why does the problem depend on the details? For this, we assume that it is Tuesday and the man in the bar has her birthday today. Surprisingly, this changes the problem completely! Even if we only know that the daughter is born on a Tuesday the problem changes drastically.

Let us start with the Tuesday problem. We simulate the same now by randomly selecting weekdays for the birthday of both kids.

>n = 1000000;
>K = intrandom(2,n,2);
>D = intrandom(2,n,7)
 Real 2 x 1000000 matrix
 
             7             4             6             4     ...
             7             2             5             5     ...
>i = nonzeros((K[1]==1 && D[1]==2) || (K[2]==1 && D[2]==2)); ni = length(i); ni/n
 0.137799
>sum(K[1,i]==1 && K[2,i]==1) / ni
 0.481679838025

The code did not change very much. But the „i“-columns now contain only columns with one Tuesday girl (2 means Tuesday above). The probability for this much lower, of course. We have 4*49=196 cases in total (4 gender pairs, 7 days for each). Of these, only 13 contain a Tuesday girl. There are 6 with the first kid a Tuesday girl and the other a girl born on another day, and 6 with the younger in the same way, and one with two Tuesday kids, plus 14 cases with one Tuesday kid and a boy. This is 27/496~0.13775.

Out of these 27 cases, we have 13 cases with two girls, as computed above. We have 13/27~0.48148. The simulation works as good as it can. The accuracy of a  Monte-Carlo simulation is only about 1/sqrt(n)=1/1000.

If we know that the birthday is today we can just take the probability as 1/2. In this case, the pick of the other child is almost independent of the pick of the first child in contrast to the original problem. And that is the true reason for the confusion. If we had known that the girl is the elder one the pick of the other child is independent and thus female with probability 1/2. But we only know that one of the kids is female. That changes the situation completely and both picks depend on each other.

I recently saw a Youtube video which I do not link here because it only adds to the confusion. The video could not explain why it makes a difference between knowing that the girl has a birthday, and knowing the precise birthday. This is easy to explain if you think of an experiment as above. Drawing a man with two kids, one of them being a daughter, is a different experiment than drawing a man with two kids, one of them is a daughter born on a Tuesday.

Ellipse Geometry – a Problem

I found the following nice problem in my Facebook account. Facebook, however, is a miracle to me and I am always unable to find a posting a second time. Unless you answer something silly like „Following“ it is lost. The best I could find was this link.

The problem is to prove:

The intersections of two perpendicular tangents to an ellipse form a circle.

Of course, this can be computed by Analytic Geometry and I carry this out below. It is no fun, however. E.g., you can find the radius of the circle if the claim is correct, take a point on the circle, compute the two tangents to the ellipse and check that they are perpendicular.

Let us find a more geometrical solution! At first, there is a well known „construction“ of ellipses folding a circular paper. You can do it with actual paper. Cut a circular paper. Mark a point A inside the circle. Then fold the paper, so that the boundary lies on A. I.e., the point P is reflected along the folding line to A. The line is the middle perpendicular of AP. If you do that often enough the folds will outline the green ellipse. There are even videos on Youtube showing this.

For the proof, you need that ellipses are the set of points where the sum of distances to A and M is constant. In our case, it is „obviously“ constantly equal to the radius of the circle. We have the following.

The set of all middle perpendiculars to a fixed point A and points P on a circle is the set of tangents to an ellipse.

Now you know how to get two perpendicular tangents. In the following construction, I did just that. The four blue lines are four middle perpendiculars. They are constructed by selecting a point P on the black circle, the line PA and a rectangular to that line. Then the blue lines are the four middle perpendiculars on A and the four intersections of our green lines with the circle.

We have to prove that

  • the midpoint Z on AM is the center of the rectangle
  • and that the rectangle has the same diameter, independent on the choice of P on the black circle.

Then the corners of the blue rectangle will always be on the same circle.

Now, we stretch the blue rectangle by a factor 2 from the center at A. The point Z will be stretched to the point M then. The blue rectangle will become a rectangle through the intersections as in the following construction.

It is now „quite obvious“ that M is the center of the blue rectangle. Thus Z was the center of the 1/2 times smaller rectangle. This proves our first claim.

For the second claim, we need to show that the length of the diagonal of the large blue rectangle does not depend on the choice of the red point. The diagonal has the length

\(\sqrt{c^2+d^2}\)

by Pythagoras. Now we have another claim.

The sum of the squares of the lengths of each two perpendicular secants of a fixed circle that meet in a fixed point inside the circle is constant.

To prove that have a look at the dashed rectangle with diagonal AM. Using this rectangle and Pythagoras you can „easily“ express the diagonal of the blue rectangle by the length of AM and the radius of the circle. This proves the second claim.

The images in this posting have been done with C.a.R. (Compass and Ruler), a Java program developed by the author. It allows beautiful exports of images and the automatic creation of polar sets for sets of lines. That feature was used to compute the ellipse in the second image. The first ellipse was done using the two focal points. C.a.R. has also ellipses defined by 5 points, or even by an equation or a parameterization.

I promised to compute the problem using Analytic Geometry. I am using the Computer Algebra system Maxima via my Euler Math Toolbox for this.

The first method computes two perpendicular tangents to the ellipse with the equation

\(x^2 + c^2 y^2 = 1\)

To find a tangent perpendicular to a vector v, we can maximize the expression

\(v_1 x + v_2 y\)

on the ellipse using the method of Lagrange. If the vector v has norm 1 the value of this maximum will be the distance of the tangent from the origin. The Lagrange equations fot his are

\(v_1 = 2 \lambda x, \quad v_2 = 2 \lambda c^2 y, \quad x^2+c^2 y^2 = 1\)

After solving this, we get

\(v_1 x + v_2 y = 2  \lambda (x^2+c^2y^2) = 2 \lambda\)

Then we do the same with the orthogonal vector, i.e., we maximize

\(-v_2 x + v_1 y\)

We then show that the sum of squares of these two values is constant.

In EMT and Maxima, this is the following code.

>&solve([v1=2*la*x,v2=2*la*c^2*y,x^2+c^2*y^2=1],[x,y,la]); ...
>  la1 &= la with %[1]
 
                                  2    2   2
                           sqrt(v2  + c  v1 )
                           ------------------
                                  2 c
 
>&solve([-v2=2*la*x,v1=2*la*c^2*y,x^2+c^2*y^2=1],[x,y,la]);  ...
>  la2 &= la with %[1]
 
                                 2   2     2
                           sqrt(c  v2  + v1 )
                           ------------------
                                  2 c
 
>&factor(la1^2+la2^2)
 
                            2         2     2
                          (c  + 1) (v2  + v1 )
                          --------------------
                                     2
                                  4 c

Thus the circle of intersections has the radius

\(r = \dfrac{\sqrt{1+c^2}}{c} = \sqrt{1 + \dfrac{1}{c^2}}\)

This is confirmed by the special case of tangents parallel to the axes.

There are several other methods. One is to construct the tangents using tangents to a circle. For this, the ellipse needs to be stretched by 1/c into y-direction. It will then become a circle. We need to compute points on the image of our circle with radius r under this mapping, then take the tangents to the mapped ellipse and map back. Below is the plan of such a construction.

The computations are very involved, however.

Another method to find both tangents is the following: We look at all lines through a given point (e.g. determined by their slope) and find the ones that intersect the ellipse only once. The product of the two slopes that solve this problem should be 1. Again, this is a complicated computation.