import numpy as np
import matplotlib.pyplot as plt
plt.rcParams.update({"text.usetex": True})
%config InlineBackend.figure_format = "svg"
from ipywidgets import interactive
Let us consider a random walk built from Bernoulli trials. We place a large number of particles at the origin, and at every time step each particle draws a random step from $\{-1, +1\}$ with equal probability and moves by that amount. Since the steps have zero mean, the intuition is that, on average, the particles stay at the origin. Let us see what the final distribution of their locations looks like.
def get_steps(N):
    # Draw N steps of -1 or +1 with equal probability:
    # randint(0, 2) gives 0 or 1, and 2*r - 1 maps this to -1 or +1.
    return np.random.randint(0, 1+1, size=(N,))*2 - 1
Np = 100000   # number of particles
Nt = 200      # number of time steps
x = np.zeros((Np, Nt))
for i in np.arange(1, Nt):
    steps = get_steps(Np)
    x[:, i] = x[:, i-1] + steps          # advance every particle by one step
    if np.mod(i, 20) == 0:               # plot the histogram of locations every 20 steps
        s = x[:, i]
        binsp = np.arange(np.min(s), np.max(s), 2.0)
        counts, bins = np.histogram(s, bins=binsp, density=True)
        plt.plot(bins[0:-1], counts)
plt.xlabel('$x$'); plt.ylabel('$p(x)$'); plt.title('Probability density function of locations')
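As a quick visual check (a minimal sketch using the arrays computed above), we can overlay the histogram of the final locations with a Gaussian of zero mean and variance equal to the number of steps taken; the two should agree closely.
# Sketch: compare the empirical density at the final time with a Gaussian whose
# variance equals the number of steps accumulated in the last column (Nt - 1).
s = x[:, -1]
binsp = np.arange(np.min(s), np.max(s), 2.0)
counts, bins = np.histogram(s, bins=binsp, density=True)
sigma2 = Nt - 1                                    # theoretical variance of the locations
xg = np.linspace(np.min(s), np.max(s), 400)
pg = np.exp(-xg**2/(2*sigma2))/np.sqrt(2*np.pi*sigma2)
plt.plot(bins[0:-1], counts, 'ok', label='empirical')
plt.plot(xg, pg, '-r', label='Gaussian')
plt.xlabel('$x$'); plt.ylabel('$p(x)$'); plt.legend()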
As can be seen, the distribution of the locations tends towards a bell-shaped curve; in fact, this curve is the normal distribution. That the distribution tends towards the normal distribution for large times is a consequence of the Central Limit Theorem: as long as the step sizes have a finite variance, the distribution of their sum eventually tends towards a normal distribution. Let us now sample the steps from a uniform distribution instead.
def get_steps(N):
    # Draw N steps uniformly from the interval (-1, 1).
    return np.random.uniform(-1, 1, size=(N,))
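A small aside (a quick check, not in the original flow): steps drawn uniformly from $(-1, 1)$ have variance $1/3$ rather than $1$, which will matter below when we look at how the variance of the locations grows with time.
# Sketch: the variance of a Uniform(-1, 1) step is (b - a)^2/12 = 1/3.
sample = np.random.uniform(-1, 1, size=(1000000,))
print(np.var(sample), 1/3)   # the two numbers should be close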
Np = 100000   # number of particles
Nt = 200      # number of time steps
x = np.zeros((Np, Nt))
for i in np.arange(1, Nt):
    steps = get_steps(Np)
    x[:, i] = x[:, i-1] + steps          # advance every particle by one step
    if np.mod(i, 20) == 0:               # plot the histogram of locations every 20 steps
        s = x[:, i]
        binsp = np.arange(np.min(s), np.max(s), 2.0)
        counts, bins = np.histogram(s, bins=binsp, density=True)
        plt.plot(bins[0:-1], counts)
plt.xlabel('$x$'); plt.ylabel('$p(x)$'); plt.title('Probability density function of locations')
Let us now see how the mean and the variance of the locations scale with time.
Let us also check whether the random step can be sampled from some other distribution.
def get_steps(N):
    #return np.random.randint(0, 1+1, size=(N,))*2 - 1
    # Draw N steps uniformly from (-sqrt(3), sqrt(3)), which has unit variance.
    return np.random.uniform(-(3**0.5), 3**0.5, size=(N,))
Np = 100000   # number of particles
Nt = 200      # number of time steps
x = np.zeros((Np, Nt))
for i in np.arange(1, Nt):
    steps = get_steps(Np)
    x[:, i] = x[:, i-1] + steps
    if np.mod(i, 20) == 0:
        s = x[:, i]
        binsp = np.arange(np.min(s), np.max(s), 2.0)
        counts, bins = np.histogram(s, bins=binsp, density=True)
        # Estimate the mean and the variance of the locations from the histogram.
        m = np.trapz(bins[0:-1]*counts, bins[0:-1])
        var = np.trapz((bins[0:-1] - m)**2*counts, bins[0:-1])
        plt.plot(i, m, 'ok', label="Mean" if i == 20 else None)
        plt.plot(i, var, 'sr', label="Variance" if i == 20 else None)
ta = np.arange(0, Nt)
plt.plot(ta, ta, '--r')   # reference line: variance equal to time
plt.legend()
ax = plt.gca(); ax.set_aspect(1)
The above plot clearly illustrates a very important property of this random walk: the mean remains at zero, while the variance grows linearly with time.
We see that the variance still grows linearly, which is quite remarkable: any step distribution with a finite variance will yield such a result, and the slope depends on the variance of the distribution from which the steps are drawn. Here, np.random.uniform is used to draw the steps. Can you tell why specifically $\pm\sqrt{3}$ is employed as the lower and upper limits of the uniform distribution to achieve the linear trend $\sim N$? What happens when you choose different upper and lower limits?
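To probe the last question empirically, here is a small sketch (not part of the original analysis; the variables Np_s, Nt_s and xs are introduced only for this check): it repeats the walk for a few choices of the limit $a$ in a uniform distribution on $(-a, a)$ and compares the fitted slope of the variance-versus-time curve with the variance of a single step.
# Sketch: for steps drawn from Uniform(-a, a), the variance of the locations
# still grows linearly with time; the fitted slope matches the variance of one step.
Np_s, Nt_s = 20000, 200
ts = np.arange(0, Nt_s)
for a in [1.0, 3**0.5, 2.0]:
    xs = np.zeros((Np_s, Nt_s))
    for i in np.arange(1, Nt_s):
        xs[:, i] = xs[:, i-1] + np.random.uniform(-a, a, size=(Np_s,))
    slope = np.polyfit(ts, np.var(xs, axis=0), 1)[0]              # slope of Var(x) versus t
    step_var = np.var(np.random.uniform(-a, a, size=(1000000,)))  # variance of one step
    print(f"a = {a:.3f}: fitted slope = {slope:.3f}, step variance = {step_var:.3f}")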
Yet another interesting case arises when the distribution of the step sizes does not have a finite variance. What should the limits of the uniform distribution be so that the variance of the locations shows the $\sim N$ behaviour? Note that when the variance of the step distribution is not 1, the scaling between the variance of the locations and $N$ is still linear, but with a different slope.
def get_steps(N, a, m):
    #return np.random.randint(0, 1+1, size=(N,))*2 - 1
    #return np.random.uniform(0, 12**0.5, size=(N,))
    # Draw N steps from a Pareto distribution with tail index a and scale (minimum value) m.
    return (np.random.pareto(a, size=(N,)) + 1)*m
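With tail index $a = 1.2$ the mean of these Pareto steps is finite, but their variance is not. A quick numerical sketch (not part of the original notebook) of what an infinite variance looks like in practice: the sample variance does not settle to a fixed value, but keeps being pushed up by the largest draws as the sample grows.
# Sketch: for Pareto steps with tail index a = 1.2 the sample variance does not
# converge; it is dominated by the largest draws and keeps growing with sample size.
for n in [10**3, 10**4, 10**5, 10**6]:
    draws = get_steps(n, 1.2, 1)
    print(f"n = {n:>7d}: sample variance = {np.var(draws):.1f}")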
Np = 100000   # number of particles
Nt = 200      # number of time steps
x = np.zeros((Np, Nt))
for i in np.arange(1, Nt):
    steps = get_steps(Np, 1.2, 1)        # tail index 1.2: finite mean, infinite variance
    x[:, i] = x[:, i-1] + steps
    if np.mod(i, 20) == 0:
        s = x[:, i]
        binsp = np.arange(np.min(s), np.max(s), 2.0)
        counts, bins = np.histogram(s, bins=binsp, density=True)
        plt.loglog(bins[0:-1], counts)   # log-log axes to expose the heavy tail
# Plot the trajectories of the first ten particles.
for j in np.arange(0, 10):
    plt.plot(x[j, :])
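The trajectories are dominated by occasional very large jumps. As a final sketch (again, not part of the original analysis), we can repeat the variance-versus-time measurement for this walk; because the empirical variance is driven by a handful of extreme particles, it is erratic and run-dependent, and it no longer follows a straight line of fixed slope.
# Sketch: variance of the Pareto-step walk versus time, with a linear reference
# scaled to match the value at t = 20. The empirical curve is dominated by a few
# extreme trajectories, so it is jumpy and varies from run to run.
ts = np.arange(0, Nt)
v = np.var(x, axis=0)
plt.plot(ts, v, 'sr', label='empirical variance')
plt.plot(ts, ts*v[20]/20.0, '--k', label='linear reference')
plt.xlabel('$t$'); plt.legend()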