The Bayesian and the Frequentist – II

This is not immediately related to Bayesian statistics. But it is a good argument for the frequentistic approach. I met the problem in  a Youtube video, but the core of this problem is well known. The solutions are not always correct, nor is the linked video. Actually, I present two problems

Problem 1

You see two frogs and hear a croak. The croak can only come from a male. Then, what is the probability that one of the frogs is a female?

Problem 2

You meat a man, and he tells you that he has two kids, at least one of which is a boy. What is the probability of the other being a girl? And does the probability change if you know that the boy is born on a Tuesday?

Both problems are obviously only vaguely formulated. You need to make assumptions. E.g., in the following, let us assume that for each random frog or kid the probability of being male is 1/2. But, as we will see, the answer to Problem 1 depends on more assumptions. The intuitive answers to both problems tend to be completely wrong.

My point is that you need to imagine a Monte-Carlo simulation in both situations. If you cannot come up with an experiment any answer will be useless anyway. That is the heart of the frequentistic approach to statistics.

Let us start with Problem 2. So your simulation would assign genders to both kids by random, so BB, BF, FB and FF have the same probability 1/4. Note that there is BF and FB since we assign the gender to each kid separately. Then the simulation would discard the irrelevant case FF. We are left with BB, BF and FB with equal probability. Thus, 2/3 of the simulated cases contain a female. Here is a code for this in EMT.

>n=100000; Gender=(random(n,2)<0.5); Boys=sum(Gender)'; 
>sum(Boys>0 && Boys<2)/sum(Boys>0)
 0.665760978144

As usual this code is in the style of a Matrix language. If you are not familiar with this you need to write a loop. Hint: „Gender“ is a nx2 matrix with 0=girl and 1=boy. „Boys“ is a vector that contains the number of boys in each of the 100 thousand samples.

Now, what changes if we know that boy is born on Tuesday? In our simulation, we would have to assign birth dates to the boys. We make the assumption that each day has the same probability. We discard every pair that has no Tuesday born boy. Let us do that in EMT first.

>n=1000000; Gender=(random(n,2)<0.5); Boys=sum(Gender)'; 
>TuesdayBoys=(random(n,2)<1/7 && Gender==1); TBoys=sum(TuesdayBoys)';
>sum(TBoys>0 && Boys<2)/sum(TBoys>0)
 0.518327797038

Again, a loop may be more convenient for you if you are not familiar with the Matrix language of EMT.

If we think of the cases and their probabilities, we get the following cases with the probabilities

\(P(M_TM_T)=\dfrac{p^2}{4},\)

\(p(M_TM_O)=P(M_0M_T)=\dfrac{p(1-p)}{4},\)

\(P(M_TF)=P(FM_T)=\dfrac{p}{4}\)

using the obvious abbreviations (MT for a Tuesday boy, MO for any other boy, and F for a girl) and p=1/7. The cases are exclusive to each other. Thus the probability for a girl under these assumptions is

\(\dfrac{2p/4}{p^2/4+2p(1-p)/4+2p/4} = \dfrac{2}{4-p} = \dfrac{14}{27} \approx 0.5185\)

That agrees to our experiment in three digits. Random Monte-Carlo experiments like the one we performed are not very accurate. With a programming language, however, you can make much larger experiments.

Let us turn to Problem 1. Simulating the frogs means we have to decide for a probability of gender and for the probability of croaking. Moreover, we have to decide what to do with two croaking males.

Assume, you cannot distinguish one from two croaks. Then this is the same as in Problem 1 with a general probability p instead of 1/7. The extreme case is p=1 with the result 1/3 for a female. It is just the same case as in Problem 2 without the Tuesday information. The other extreme case is p close to 0. Then, if you hear a croak the probability for a female is close to 1/2.

The situation is quite different if you assume that there was only one croak. For p=1 this means there must be a female for sure. For p->0 you get 1/2 as before. I leave that to work out for you.

What about a Bayesian approach? We want to compute something like P(Female|Croak). But what do we know? We could use the usual Bayesian trick

\(P(F|C) = P(C|F) \dfrac{P(F)}{P(C)}.\)

We have to work out the probabilities on the right hand side, nevertheless. You can make all sorts of non-sense now. Clearly, P(F)=1/4. But the other probabilities are difficult to compute. So your are left with the definition of the conditional probability

\(P(F|C) = \dfrac{P(F \cap C)}{P(C)}\)

And that is exactly what we did above. There is no help from Bayes here.

20. Juli 2016 von mga010
Kategorien: Euler, Uncategorized | Schreibe einen Kommentar

Schreibe einen Kommentar

Pflichtfelder sind mit * markiert