Stratified Sampling in EMT

Let me demonstrate another trick for Monte-Carlo methods. For a better spread of random numbers over the complete range of the random variable, one can use stratified sampling. For this the interval [0,1] is evenly divided into k partitions. In each partition, a random variable is selected. Then the inverse distribution is applied to all k random variables. This yields k random variables, which should distribute more evenly across the range of the random variable we want to simulate.

Here is one run in EMT. We generate k normal distributed random variables.

>k=10; t=(1:k)/k; p=t-random(k)/k
 [0.0345216,  0.140289,  0.223475,  0.364381,  0.47122,  0.597257,
 0.613633,  0.747672,  0.894046,  0.993234]
 [-1.81814,  -1.07902,  -0.760509,  -0.346772,  -0.0722034,  0.246254,
 0.288799,  0.667183,  1.24834,  2.46948]

To make a function for this, we use the same technique as shown in a previous blog. We first generate a vector of j runs of the above stratification. We use redim() to make this a vector. Then we fill this vector to get n*m random variables. Then we apply redim() again to get a nxm random matrix.

>function normalstrat (n,m,k=10) ...
$  N=n*m; j=floor(N/k);
$  t=(1:k)/k; v=redim(invnormaldis(t-random(j,k)/k),1,j*k);
$  return redim(v|normal(N-j*k),n,m);

To test this, we generate 1000 times 1000 random numbers with distribution

\(e^X, \quad X \sim N(0,1)\)

Then we compute the mean values of all runs and compare the deviation of these mean values between the usual generator normal() and normalstrat().


This clear improvement comes at the cost of giving up our fast generator (based on the Box-Muller method) for a method which involves the inverse distribution. In fact the time factor is about 3. But we need to generate 9000 random numbers instead of 1000 to get the same accuracy. So the methods is efficient.

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht.

Diese Website verwendet Akismet, um Spam zu reduzieren. Erfahre mehr darüber, wie deine Kommentardaten verarbeitet werden.