# Integrals and Monte Carlo Methods

I am teaching surface integrals right now. Searching for some interesting applications away from physics, I came up with mean values along surfaces. After all, mean or „expected“ values are a center part of probability theory. It is tempting to try to verify the analytic results by Monte Carlo methods. This can be done nicely in Euler Math Toolbox.

E.g., an integral on a finite interval [a,b] can be simulated this way. We generate random numbers in [a,b] uniformly, and compute the mean of the values of the function. The expected mean value is the integral of the function divided by the length of the interval.

>&integrate(log(x),x,0,2), %()

2 log(2) - 2

-0.61370563888
>N = 10 000 000;
>x = random(N)*2;
>mean(log(x))*2
-0.61394326247


Syntax notes for EMT: The blanks in integer numbers can be generated by pressing Ctrl-Space. This generates a hard space which is skipped when scanning the number. The random() function generates uniform numbers in [0,1] which we need to be scaled properly to fall into [0,2].

The accuracy of these Monte Carlo methods are like 1/√N. So, we cannot expect more than 4 digits, even with ten million samples.

As another example, the area of the circle can be simulated by throwing random numbers into the unit cube and counting the number of hits inside the circle. We expect a proportion of pi/4 there.

>N = 10 000 000;
>x = 2*random(N)-1; y=2*random(N)-1;
>sum(x^2+y^2<=1)/N
0.7854814
>pi/4
0.785398163397


Syntax notes for EMT: Applying the boolean „≤“ to a vector generates a vector of zeros and ones. Summing this vector up counts the „true“ cases.

This rejection method is also a simple way to create random numbers inside the unit circle. It is not as bad as it sounds, since more than every second pair of numbers hits inside the circle.

For a further test, let us compute the integral of r^2 on the unit circle (which is proportional to the energy in a rotating disk). The integral is the mean value of r^2 on the unit circle times the area of the unit circle.

>N = 10 000 000;
>x = 2*random(N)-1; y=2*random(N)-1;
>r = sqrt(x^2+y^2);
>i = nonzeros(r<=1);
>mean(r[i]^2)*pi
1.57088571913
>&2*pi*integrate(r^3,r,0,1), %()

pi
--
2

1.57079632679


Syntax notes for EMT: The nonzeros() function generates a vector of indices of nonzero elements. We can use that vector to select the hits from the vector r.

For the correct integral, we used the formula for rotational invariant functions on circles.

As a side remark, there is a direct method to generate random numbers inside the unit circle. For this, we need to select a random angle between 0 and 2pi. The radius must be selected in proportion to the length of the circle with this radius, i.e. with density 0≤r≤1. The following does the trick. We explain it after the code.

>function circrandom (N) ...
$t = random(N)*2*pi;$  r = sqrt(1-random(N));
$return {r*cos(t),r*sin(t)}$endfunction
>{x,y} = circrandom(10000);
>plot2d(x,y,>points,style=".",r=1.2):


This will generate the following points, which look quite nicely distributed in the unit circle.

Why does this work? The reason is that any probability distribution can be simulated by inverting its accumulated distribution, in this case

$$F(R) = p(r \ge R) = \int\limits_R^1 2r \, dr = 1-r^2$$

Here, we take 2r because this generates a probability distribution with density proportionally to r. Then we create a random number between 0 and 1 and apply the inverted function.

Using this trick, we can do another Monte Carlo simulation of the integral.

>{x,y} = circrandom(N);
>r = sqrt(x^2+y^2);
>i = nonzeros(r<=1);
>mean(r[i]^2)*pi
1.57153749805


Syntax notes on EMT: Note that circrandom() returns two values. It is possible to assign these two values to two variables.

Finally, we try an integral on a surface. We want to know the mean geographical width of a point in the Northern Hemisphere.

We try a simple rejection method. For this, we generate random numbers in the unit cube and reject those with z<0, r>1 or r<1/2. The remaining points, projected to the surface, are randomly distributed on the Northern Hemisphere.

>N = 10 000 000;
>x = random(N)*2-1; y = random(N)*2-1; z=random(N)*2-1;
>r = sqrt(x^2+y^2+z^2);
>i = nonzeros(z>0 && r<1 && r>1/2);
>mean(asin(z[i]/r[i]))
0.570688989539
>&2*pi*integrate(asin(z),z,0,1) / (2*pi) | factor, ...
>   %(), %->"°"

pi - 2
------
2

0.570796326795
32.7042204869°


The analytic surface integral is done by parameterizing the surface using

$$g(z,t) = [\sqrt{1-z^2}\cos(t),\sqrt{1-z^2} \sin(t),z]$$

We can compute the Gram determinant for this parameterization, which is actually 1 (or use a geometric reasoning with the same result). Thus, for any function on the hemisphere H which depends on the height z only, we get

$$\int_H f(x,y,z) \,dS = 2 \pi \int\limits_0^1h(z) \,dz$$

The rejection trick works on symmetric surfaces only, where we can easily do an orthogonal projection and know the distance to the surface. Otherwise, we have to distribute the random points on the parameters in accordance with the distribution of the surface measure. This can be very tricky.

# EMT – Simulation of the Two Child Problem

The two child problem is a problem in probability theory with a solution that seems paradox on first sight. I wrote about that problem before. Let me repeat the explanation and do a simulation to convince everybody that the solution is really correct. You can find information on that problem on Wikipedia.

The first simple version starts with this story: You meet a man in a bar and he mentions his daughter. You ask him about his children and he answers that he has two kids. The question is: What is the probability of the other kid (the one he was not talking about) to be female?

The answer depends on some important details. But let us assume you have no information if the other kid is younger or older, or any other special information about the daughter and the man is just a random man from the population. These details will turn out to be important! Then the intuitive answer to our question 1/2 is wrong, or rather, it does not make sense. The correct answer is 1/3.

In my opinion, each probabilistic problem makes sense only if we can devise an experiment that would in theory or in practice simulate the problem. Our computation should predict the outcome of the experiment. Problems that cannot be simulated in any way are not interesting to me. This includes most problems in probability or measure theory which need the help of the axiom of choice.

But our problem can easily be simulated. And setting up the simulation gives great insights into the meaning of our question and the terms „probability“ and „randomly selected“. Let us do it in Euler Math Toolbox (EMT). What we do is a Monte-Carlo simulation. We need to make the following assumption: The man has two kids with random gender, one of which is female, and the probability for a kid to be either gender is 1/2 (no diverse genders in this posting). So we draw 1000000 pairs of kids with random gender. Then we count the proportion of two female kids among all pairs with at least one female kid.

>n = 1000000
1000000
>K = intrandom(2,n,2)
Real 2 x 1000000 matrix

2             1             1             2     ...
1             2             2             2     ...
>i = nonzeros(K[1]==1 || K[2]==1); ni = length(i); ni/n
0.749596
>sum(K[1,i]==1 && K[2,i]==1) / ni
0.334372115113


The syntax may seem cryptic, but it is intuitive if you understand the Matrix language. K contains a pair of kids in each column (n=1000000 columns). „K[1]==1“ returns a vector of 0/1 with 1 (true) on each position where the vector K[1] (the first row of K) is 1. I.e., a vector indicating the pairs where the first kid is female. „||“ is short for „or“, and nonzeros() returns the indices of the non-zero elements of a vector. Thus „ni“ is the number of pairs such that either kid is female. As expected, „ni/n“ is approximately 3/4. There are four cases, and one case (tow boys) is wrong.

In the final line, we count the numbers of pairs in the „i“-columns, where both kids are female, using sum() which sums up the ones and compare that to the total number of right cases „ni“. The answer is approximately 1/3.

This should not surprise us since there are three cases in the „i“-columns: (1,2), (2,1), (1,1). Only one of these cases is the correct one.

Why does the problem depend on the details? For this, we assume that it is Tuesday and the man in the bar has her birthday today. Surprisingly, this changes the problem completely! Even if we only know that the daughter is born on a Tuesday the problem changes drastically.

Let us start with the Tuesday problem. We simulate the same now by randomly selecting weekdays for the birthday of both kids.

>n = 1000000;
>K = intrandom(2,n,2);
>D = intrandom(2,n,7)
Real 2 x 1000000 matrix

7             4             6             4     ...
7             2             5             5     ...
>i = nonzeros((K[1]==1 && D[1]==2) || (K[2]==1 && D[2]==2)); ni = length(i); ni/n
0.137799
>sum(K[1,i]==1 && K[2,i]==1) / ni
0.481679838025


The code did not change very much. But the „i“-columns now contain only columns with one Tuesday girl (2 means Tuesday above). The probability for this much lower, of course. We have 4*49=196 cases in total (4 gender pairs, 7 days for each). Of these, only 13 contain a Tuesday girl. There are 6 with the first kid a Tuesday girl and the other a girl born on another day, and 6 with the younger in the same way, and one with two Tuesday kids, plus 14 cases with one Tuesday kid and a boy. This is 27/496~0.13775.

Out of these 27 cases, we have 13 cases with two girls, as computed above. We have 13/27~0.48148. The simulation works as good as it can. The accuracy of a  Monte-Carlo simulation is only about 1/sqrt(n)=1/1000.

If we know that the birthday is today we can just take the probability as 1/2. In this case, the pick of the other child is almost independent of the pick of the first child in contrast to the original problem. And that is the true reason for the confusion. If we had known that the girl is the elder one the pick of the other child is independent and thus female with probability 1/2. But we only know that one of the kids is female. That changes the situation completely and both picks depend on each other.

I recently saw a Youtube video which I do not link here because it only adds to the confusion. The video could not explain why it makes a difference between knowing that the girl has a birthday, and knowing the precise birthday. This is easy to explain if you think of an experiment as above. Drawing a man with two kids, one of them being a daughter, is a different experiment than drawing a man with two kids, one of them is a daughter born on a Tuesday.

# Yahtzee Waiting Times

I recently was asked about waiting times in the game of Yahtzee. If you do not know the game it suffices to say that you throw 5 dice, and one of the goals is to get 5 dice with the same number, a Yahtzee. I wrote about waiting times a long time ago. But let me repeat the main points.

The trick is to use a number of states S0, S1, …, Sn, and probabilities P(i,j) to get from state Si to state Sj. S0 is the starting state, and Sn is the state we want to reach, in our case the Yahtzee. For a first attempt, we use the number 6s we have on out 5 dice. Then we have 6 states, S0 to S5. With a bit of combinatorics, we can compute the probabilities P(i,j) as

$$p_{i,j} = p^{j-i} (1-p)^{n-i-(j-i)} \binom{n-i}{j-i}$$

If we compute that with Euler Math Toolbox we get the following matrix P.

>p=1/6;
>i=(0:5)'; j=i';
% This is the matrix of probabilities to get from i sixes to j sixes.
0.4018776 0.4018776 0.1607510 0.0321502 0.0032150 0.0001286
0.0000000 0.4822531 0.3858025 0.1157407 0.0154321 0.0007716
0.0000000 0.0000000 0.5787037 0.3472222 0.0694444 0.0046296
0.0000000 0.0000000 0.0000000 0.6944444 0.2777778 0.0277778
0.0000000 0.0000000 0.0000000 0.0000000 0.8333333 0.1666667
0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 1.0000000

This matrix allows us to simulate or compute results about probabilities for 6-Yahtzees.  E.g., we can start with no 6 in S0. After one dice throw, the first row of P yields the distribution of the expected states. We can do dice throws by applying P to the right of our distribution vector.

>v=[1,0,0,0,0,0].P
0.4018776 0.4018776 0.1607510 0.0321502 0.0032150 0.0001286
% After two more throws, we get the following.
>v.P.P
0.0649055 0.2362559 0.3439886 0.2504237 0.0911542 0.0132721


This tells us that the chance to be in state S5 after three throws is just 1.3%.

How about the average time that we have to wait if we keep throwing the dice until we get a 6-Yahtzee? This can be solved by denoting the waiting time from state Si to S5 by w(i) and observe

$$w_i = p_{i,n} + \sum_{j \ne n} p_{i,j} (1+w_j)$$

where n=5 is the final state. Obviously, w(5)=0. Moreover, the sum of all probabilities in one row is 1. Thus we get

$$w_i – \sum_{j=1}^{n-1} w_j = 1$$

Let us solve this system in Euler Math Toolbox.

>B=P[1:5,1:5];
>w=(id(5)-B)\ones(5,1)
13.0236615
11.9266963
10.5554446
8.7272727
6.0000000


The average waiting time for a 6-Yahtzee is approximately 13 throws. If you already got 4 sixes, the average waiting time for the fifth is indeed 6.

We can interpret this in another way. Observe

$$w = (I-B)^{-1} \cdot 1 = (I+B+B^2+\ldots) \cdot 1$$

The sum converges because the norm of B is clearly less than 1. So we interpret this as a process

$$w_{n+1} = B \cdot (w_n + 1)$$

Suppose a waiting room. We have w(n,i) persons in state Si at time n. Now, one more person arrives at each state and we process the states according to our probabilities. On the long run, the number of persons in state S0 will become the waiting time to get out of the system into the final state.

Unfortunately, the probabilities are not that easy to compute when we ask for a Yahtzee of any kind. E.g., if we have 66432 we will keep the two sixes as 66. But we may throw 66555 and switch to keep 555. There is a recursion for the probability to get from i same dice to j same dice. I computed the following matrix which replaces the matrix P above.

 >P
0.0925926 0.6944444 0.1929012 0.0192901 0.0007716
0.0000000 0.5555556 0.3703704 0.0694444 0.0046296
0.0000000 0.0000000 0.6944444 0.2777778 0.0277778
0.0000000 0.0000000 0.0000000 0.8333333 0.1666667
0.0000000 0.0000000 0.0000000 0.0000000 1.0000000


Note that I have only the states S1 to S5 here. With the same method, I get the waiting times.

>B=PM[1:4,1:4];
>w=(id(4)-B)\ones(4,1)
11.0901554
10.4602273
8.7272727
6.0000000


Thus we need only about 11 throws to get any Yahtzee. The probability to get one in three throws is now about 4.6%.

>[1,0,0,0,0].PM.PM.PM
0.0007938 0.2560109 0.4524017 0.2447649 0.0460286