Monte Carlo Methods
Above: 10 000 points pseudo-randomly generated within a square enclosing a circle of unit radius (in
Java). The spread of the points certainly looks random and no obvious pattern is apparent. From the
ratio of the number of points falling within the circle to points falling outside the circle, we can
estimate Pi.

As the name suggests, Monte Carlo methods have a lot to do with randomness and probability.
Monte Carlo methods are mathematical tools that use probability to derive meaningful
results, like an estimate for the value of Pi, or more novel results, such as integrals that cannot
be solved analytically. They are stochastic numerical methods that can give us
approximate answers to many mathematical problems.

The Java code (written in NetBeans 4.1) for the program I used to estimate Pi is given below:
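The original source did not survive here, so the following is a minimal sketch of the approach described (class and method names are my own, not the original program's):

```java
import java.util.Random;

public class PiEstimate {
    // Estimate Pi by sampling points uniformly in the square [-1, 1] x [-1, 1]
    // and counting the fraction that land inside the unit circle.
    static double estimatePi(long n, long seed) {
        Random rng = new Random(seed);
        long inside = 0;
        for (long i = 0; i < n; i++) {
            double x = 2 * rng.nextDouble() - 1;  // uniform in [-1, 1]
            double y = 2 * rng.nextDouble() - 1;
            if (x * x + y * y <= 1.0) inside++;   // point falls inside the circle
        }
        // area(circle) / area(square) = Pi / 4, so Pi is about 4 * inside / n
        return 4.0 * inside / n;
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(1_000_000, System.currentTimeMillis()));
    }
}
```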
My pocket calculator gives Pi = 3.141592654. Some results for the estimates derived are given below:
Number of points     Estimate of Pi     Computation time

100 000              3.13772            < 1 second
100 000              3.1488
100 000              3.14728
100 000              3.14164
100 000              3.1454

1 000 000            3.14118            about 2 seconds
1 000 000            3.142112
1 000 000            3.143588
1 000 000            3.138508
1 000 000            3.142688

1 000 000 000        3.141622           about 30 minutes
Mean of the five estimates with 1 000 000 points = 3.1416152.
As you can see, this method works fine for calculating Pi to about 4 decimal places. Obtaining more
accurate results would require a lot more computation, and we may encounter problems because
computers do not generate true random numbers, but rather pseudo-random numbers (see below).
Note that for illustration the code generates points over the area of the whole square and circle, but
we could simply restrict the random coordinates to the positive quadrant only (so each
random X and Y is between 0 and 1). The probability of a hit is Pi/4 either way, however, so in
practice this makes essentially no difference to the accuracy for a given number of points.
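The quadrant variant can be sketched as follows (again a hypothetical reconstruction, not the original code); only the sampling range changes, the Pi/4 ratio is the same:

```java
import java.util.Random;

public class PiQuadrant {
    // Quadrant variant: x and y are drawn from [0, 1], so every point lies in
    // the positive quadrant; the fraction inside the quarter-circle is still Pi/4.
    static double estimatePi(long n, long seed) {
        Random rng = new Random(seed);
        long inside = 0;
        for (long i = 0; i < n; i++) {
            double x = rng.nextDouble();  // uniform in [0, 1]
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) inside++;
        }
        return 4.0 * inside / n;
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(1_000_000, System.currentTimeMillis()));
    }
}
```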

Some examples of Monte Carlo methods for computing definite integrals are illustrated by the C#
(Visual Studio) code below. This code uses a small WinForms interface for ease of
experimenting with different values (I often find this quicker than repeatedly re-running
a console app with varying inputs).
An integral of a function is the area under its curve; a definite integral gives that area over a given
range of the curve. In this case we are taking the limits from x = 0 to some specified upper limit (input by
the user). The function is hard-coded, but I have entered three different functions and simply commented
out all but the one I want computed. The function is coded in the TestHit() function of the program, which
determines whether or not a randomly generated point falls below the function curve within the
interval of interest. You can add your own function to the code, and although the program was designed
initially for definite integrals with a lower limit fixed at zero, this can also be changed in the code (or
another textbox could be added to the WinForms interface to allow the user to set this value).

We can test this initially with a function that we can solve analytically, such as the exponential function
f(x) = e^(-ax), whose definite integral from 0 to u is (1 - e^(-au))/a.
The interface allows the user to enter the value of one coefficient (for the above example, a = 1),
the upper bound (here u = 1) and the number of sample points. The Monte Carlo program
gives the correct answer (1 - 1/e, about 0.632) to 3 decimal places with ten million sample points, to 2 decimal
places with 100 000 to one million sample points, and to one decimal place only with 1000 to 10 000 points.
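The original C# program is not reproduced here, so here is a rough Java sketch of the same hit-or-miss scheme (names are my own; the y <= f(x) comparison plays the role of the TestHit() check described above), using f(x) = e^(-ax) as the test function:

```java
import java.util.Random;

public class MonteCarloIntegral {
    // Hit-or-miss integration of f(x) = exp(-a*x) on [0, u]: sample points in
    // the rectangle [0, u] x [0, yMax] and count those falling under the curve.
    static double integrate(double a, double u, long n, long seed) {
        Random rng = new Random(seed);
        double yMax = 1.0;  // exp(-a*x) <= 1 on [0, u] for a >= 0
        long hits = 0;
        for (long i = 0; i < n; i++) {
            double x = u * rng.nextDouble();
            double y = yMax * rng.nextDouble();
            if (y <= Math.exp(-a * x)) hits++;  // point lies under the curve
        }
        // integral is roughly (fraction of hits) * (area of sampling rectangle)
        return (double) hits / n * (u * yMax);
    }

    public static void main(String[] args) {
        // Exact answer for a = 1, u = 1 is 1 - 1/e, about 0.63212
        System.out.println(integrate(1.0, 1.0, 10_000_000, System.nanoTime()));
    }
}
```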

Now we know the program and method work, at least in principle, we can try them on a function which cannot
be solved analytically, such as the error function, erf(x) = (2/sqrt(Pi)) times the integral of e^(-t^2)
from 0 to x. The error function has no closed-form solution by exact analytic
methods, but must instead be numerically approximated by a method like Monte Carlo. Tables give results
for some values of the upper bound, x. Using one of the tabulated values, say for x = 1, allows us to
compare our answer:
The Monte Carlo results for erf(1) obtained were:

Number of points     erf(1)

1000                 0.8780
10 000               0.8348
100 000              0.84160
1 000 000            0.84446
10 000 000           0.84303
100 000 000          0.84273 (mean over five trials: 0.84265)
1 000 000 000        0.842684

The computation time with 100 000 000 points was about 25 seconds and for 1 000 000 000 points
about 5 minutes.
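The erf calculation can be sketched in Java along the same lines (a reconstruction, not the original C# code): estimate the integral of e^(-t^2) over [0, x] by hit-or-miss sampling, then scale by 2/sqrt(Pi):

```java
import java.util.Random;

public class MonteCarloErf {
    // erf(x) = (2 / sqrt(Pi)) * integral of exp(-t^2) from 0 to x,
    // with the integral estimated by hit-or-miss sampling as before.
    static double erf(double x, long n, long seed) {
        Random rng = new Random(seed);
        long hits = 0;
        for (long i = 0; i < n; i++) {
            double t = x * rng.nextDouble();
            double y = rng.nextDouble();          // exp(-t^2) <= 1 on [0, x]
            if (y <= Math.exp(-t * t)) hits++;
        }
        double integral = (double) hits / n * x;  // sampling rectangle area is x * 1
        return 2.0 / Math.sqrt(Math.PI) * integral;
    }

    public static void main(String[] args) {
        System.out.println(erf(1.0, 10_000_000, System.nanoTime()));  // tabulated erf(1) is about 0.8427
    }
}
```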

Of course, we can use our Monte Carlo method to solve other integrals, including those whose
answers are unknown, but one always has to be careful about the degree of accuracy. Incrementally
increasing the number of sample points and re-running the calculation lets us see when each
decimal place has stabilised (i.e. when it no longer changes between runs).

Pseudo-random numbers

The sample programs given here all use the default random number generators provided with Java (the
java.util.Random class) and C#. These generators use a deterministic algorithm that does not generate
true random numbers, but rather pseudorandom numbers. They require an initial value or seed value.
If you start with the exact same seed then the sequence of numbers generated will always be the
same (and so is not truly random). Java's generator is a linear congruential generator (LCG), and LCGs utilise
the following equation:

    r(n+1) = (a * r(n) + c) mod m

where a is the multiplier, c the increment and m the modulus.
Note that this is an iterative formula, so given a number in the sequence, r(n), we can obtain the next
number, r(n+1). The value we use to get the series going in the first place is called the seed value,
r(0). It is possible to use a seed that is genuinely random, even though the sequence of numbers
coming from it will not be truly random.

By mod (modulus) m we mean that the pattern of numbers repeats after every m numbers (a clock
face works modulo 12). Thus, m sets the period of the sequence of numbers generated. Clearly we want m
to be very large! Java uses a 48-bit LCG, meaning that it holds 48 bits of internal state, with
period m = 2^48. (C#'s System.Random is not actually an LCG but a subtractive generator, though the
same general points about pseudorandomness apply to it.) The bits of least value, the lower bits, are
discarded as these tend to repeat more often: Java drops the 16 lowest bits and so returns a 32
bit random number (suitable for a standard 32 bit integer). Although the sequences do not repeat
until after many more numbers than we will use in most cases, there can be other problems. Certain
numbers may rarely appear (leading to gaps in the pattern of coverage in the Pi example) and the
pattern may tend to repeat to some extent over much smaller cycles. Indeed, some people say that
one should not use an LCG for Monte Carlo simulations; however, as we have seen, if all one wants to do
is solve an integral over a small interval to, say, 4 decimal places of accuracy, then there is little
problem. The main problem with calculations requiring higher accuracy, say 6 or more accurate
decimal places, is computation time. LCGs are good for many Monte Carlo purposes, although not all.
It also depends which language you are using: some languages ship quite old generators which are nowhere
near as good as Java's.
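To make the scheme concrete, java.util.Random's LCG can be reproduced in a few lines; the constants below are the ones given in the Java API documentation (the class and field names here are my own):

```java
import java.util.Random;

public class MiniLcg {
    // The 48-bit LCG used by java.util.Random (constants from the Java docs):
    // state(n+1) = (state(n) * 0x5DEECE66D + 0xB) mod 2^48,
    // and nextInt() returns the top 32 bits, discarding the 16 lowest.
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long ADDEND = 0xBL;
    static final long MASK = (1L << 48) - 1;  // mod 2^48

    long state;

    MiniLcg(long seed) {
        // java.util.Random scrambles the seed before use
        this.state = (seed ^ MULTIPLIER) & MASK;
    }

    int nextInt() {
        state = (state * MULTIPLIER + ADDEND) & MASK;
        return (int) (state >>> 16);  // keep the upper 32 bits
    }

    public static void main(String[] args) {
        // The hand-rolled LCG tracks java.util.Random exactly
        MiniLcg lcg = new MiniLcg(20);
        Random ref = new Random(20);
        for (int i = 0; i < 5; i++) {
            System.out.println(lcg.nextInt() + " == " + ref.nextInt());
        }
    }
}
```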

Seed values. By default Java seeds the random generator from the system time, traditionally the
time in milliseconds since the Unix epoch (1 January 1970). A custom seed can be used: it is sometimes
useful, for example in graphical applications, to be able to re-run a program and generate the same
set of pseudorandom numbers every time, in which case one could use the same fixed seed, say 20. In the
code above I explicitly seeded the generator with the system time (though this is the default anyway);
the system time in nanoseconds can also be used. If we were worried that a sequence of
pseudorandom numbers might exhibit some periodic behaviour within its main period, say with some
sort of pattern repeating every 100 million numbers, then we could reseed the generator every so
often to break up the pattern, which might prevent blank stripes appearing, for example, when filling a
square with random points. We could even generate our random numbers, store them in an array,
and then shuffle them. Certainly, in our example, there is no evidence of periodicity: the
pseudorandom numbers generated by the LCG are as good as random for our purpose.
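The effect of seeding can be demonstrated in a couple of lines: two generators given the same fixed seed march through identical sequences, while a time-based seed varies from run to run.

```java
import java.util.Random;

public class SeedDemo {
    public static void main(String[] args) {
        // Two generators with the same fixed seed produce identical sequences...
        Random a = new Random(20);
        Random b = new Random(20);
        System.out.println(a.nextDouble() == b.nextDouble());  // true

        // ...while a time-based seed gives a different sequence on each run
        Random c = new Random(System.nanoTime());
        System.out.println(c.nextDouble());
    }
}
```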

You can see from the picture at the top of the page that the spread of points, although
pseudorandom, approximates a random spread very well. Java does have another standard number
generator, the SecureRandom class, which produces cryptographically stronger random numbers. This is
useful for encryption, such as generating a random password; however, the numbers take longer to
generate. The time taken seems highly variable, but I find that this generator takes around 6-10 times
as long, and for our Pi calculation with 100 million sample points it really made no difference to the
precision of the estimate. SecureRandom does not use an LCG, but relies on cryptographic algorithms
built from long series of bit operations (such as XORs and bit shifts).
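Since SecureRandom extends java.util.Random, it can be dropped into the same code unchanged; a crude timing comparison (my own sketch; the exact slowdown varies by platform and the SecureRandom algorithm chosen) looks like this:

```java
import java.security.SecureRandom;
import java.util.Random;

public class SecureVsLcg {
    // Time how long a generator takes to produce n doubles.
    static long timeDoubles(Random rng, int n) {
        long start = System.nanoTime();
        double sink = 0;
        for (int i = 0; i < n; i++) sink += rng.nextDouble();
        if (sink < 0) System.out.println(sink);  // keep the loop from being optimised away
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // SecureRandom is a subclass of Random, so both fit the same signature
        System.out.println("Random:       " + timeDoubles(new Random(), n) + " ns");
        System.out.println("SecureRandom: " + timeDoubles(new SecureRandom(), n) + " ns");
    }
}
```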

It is possible to generate good pseudorandom bits using a simplified version of such a generator
coded inline (see external link), though this loses some of the convenience of the various methods
associated with the standard Java random number generators.