random – Pseudorandom number generators

Purpose: Implements several types of pseudorandom number generators.
Available In: 1.4 and later

The random module provides a fast pseudorandom number generator based on the Mersenne Twister algorithm. Originally developed to produce inputs for Monte Carlo simulations, Mersenne Twister generates numbers with a nearly uniform distribution and a large period, making it well suited to a wide range of applications.

Generating Random Numbers

The random() function returns the next random floating point value from the generated sequence. All of the return values fall within the range 0 <= n < 1.0.

import random

for i in xrange(5):
    print '%04.3f' % random.random()

Running the program repeatedly produces different sequences of numbers.

$ python random_random.py

0.182
0.155
0.097
0.175
0.008

$ python random_random.py

0.851
0.607
0.700
0.922
0.496

To generate numbers in a specific numerical range, use uniform() instead.

import random

for i in xrange(5):
    print '%04.3f' % random.uniform(1, 100)

Pass minimum and maximum values, and uniform() adjusts the return values from random() using the formula min + (max - min) * random().

$ python random_uniform.py

6.899
14.411
96.792
18.219
63.386
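
As a rough check of that formula, the sketch below seeds the generator twice and compares a hand-written rescaling of random() with uniform(). The helper name scaled_random() and the seed value are illustrative assumptions, and the listing uses print() calls so that it should run unchanged on Python 2 and Python 3.

```python
import random


def scaled_random(low, high):
    # Hypothetical helper: rescale random() by hand using the
    # formula min + (max - min) * random().
    return low + (high - low) * random.random()


random.seed(42)
by_hand = [scaled_random(1, 100) for _ in range(5)]

random.seed(42)  # same seed, so random() replays the same sequence
built_in = [random.uniform(1, 100) for _ in range(5)]

for a, b in zip(by_hand, built_in):
    print('%07.3f  %07.3f' % (a, b))
```

On CPython the two columns are expected to match, because both start from the same underlying random() values.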

Seeding

random() produces different values each time it is called, and has a very large period before it repeats any numbers. This is useful for producing unique values or variations, but there are times when having the same dataset available to be processed in different ways is useful. One technique is to use a program to generate random values and save them to be processed by a separate step. That may not be practical for large amounts of data, though, so random includes the seed() function for initializing the pseudorandom generator so that it produces an expected set of values.

import random

random.seed(1)

for i in xrange(5):
    print '%04.3f' % random.random()

The seed value controls the first value produced by the formula used to produce pseudorandom numbers, and since the formula is deterministic it also sets the full sequence produced after the seed is changed. The argument to seed() can be any hashable object. The default is to use a platform-specific source of randomness, if one is available. Otherwise the current time is used.

$ python random_seed.py

0.134
0.847
0.764
0.255
0.495

$ python random_seed.py

0.134
0.847
0.764
0.255
0.495
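
The same behavior can be checked inside a single process. This is a minimal sketch; first_values() is a hypothetical helper and the seed values are arbitrary.

```python
import random


def first_values(seed, count=5):
    # Hypothetical helper: reseed the module-level generator,
    # then draw `count` values from it.
    random.seed(seed)
    return [random.random() for _ in range(count)]


run_a = first_values(1)
run_b = first_values(1)
run_c = first_values(2)

print('same seed matches:      %s' % (run_a == run_b))
print('different seed matches: %s' % (run_a == run_c))
```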

Saving State

Another technique useful for controlling the number sequence is to save the internal state of the generator between test runs. Restoring the previous state before continuing reduces the likelihood of repeating values or sequences of values from the earlier input. The getstate() function returns data that can be used to re-initialize the random number generator later with setstate().

import random
import os
import cPickle as pickle

if os.path.exists('state.dat'):
    # Restore the previously saved state
    print 'Found state.dat, initializing random module'
    with open('state.dat', 'rb') as f:
        state = pickle.load(f)
    random.setstate(state)
else:
    # Use a well-known start state
    print 'No state.dat, seeding'
    random.seed(1)

# Produce random values
for i in xrange(3):
    print '%04.3f' % random.random()

# Save state for next time
with open('state.dat', 'wb') as f:
    pickle.dump(random.getstate(), f)

# Produce more random values
print '\nAfter saving state:'
for i in xrange(3):
    print '%04.3f' % random.random()

The data returned by getstate() is an implementation detail, so this example saves the data to a file with pickle but otherwise treats it as a black box. If the file exists when the program starts, it loads the old state and continues. Each run produces a few numbers before and after saving the state, to show that restoring the state causes the generator to produce the same values again.

$ python random_state.py

No state.dat, seeding
0.134
0.847
0.764

After saving state:
0.255
0.495
0.449

$ python random_state.py

Found state.dat, initializing random module
0.255
0.495
0.449

After saving state:
0.652
0.789
0.094
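
When the state does not need to survive between runs, it can be kept in memory instead of a file. The following sketch assumes nothing beyond getstate() and setstate(); the variable names are illustrative.

```python
import random

random.seed(1)
random.random()                  # advance the generator a little

checkpoint = random.getstate()   # opaque snapshot; treat it as a black box
first_pass = [random.random() for _ in range(3)]

random.setstate(checkpoint)      # rewind to the snapshot
second_pass = [random.random() for _ in range(3)]

print(first_pass == second_pass)
```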

Random Integers

random() generates floating point numbers. It is possible to convert the results to integers, but using randint() to generate integers directly is more convenient.

import random

print '[1, 100]:'

for i in xrange(3):
    print random.randint(1, 100)

print
print '[-5, 5]:'
for i in xrange(3):
    print random.randint(-5, 5)

The arguments to randint() are the ends of the inclusive range for the values. The numbers can be positive or negative, but the first value must not be greater than the second.

$ python random_randint.py

[1, 100]:
3
47
72

[-5, 5]:
4
1
-3
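
For comparison, this is roughly what converting the output of random() by hand looks like. The helper randint_by_hand() is a hypothetical illustration, not part of the module, and the seed and sample size are arbitrary.

```python
import random


def randint_by_hand(low, high):
    # Hypothetical helper: scale random() to the inclusive range
    # [low, high]. The + 1 makes the upper end reachable.
    return low + int(random.random() * (high - low + 1))


random.seed(7)
manual = [randint_by_hand(-5, 5) for _ in range(1000)]
direct = [random.randint(-5, 5) for _ in range(1000)]

print('manual: min=%d max=%d' % (min(manual), max(manual)))
print('direct: min=%d max=%d' % (min(direct), max(direct)))
```

The manual version is easy to get wrong by one at the upper bound, which is a good reason to prefer randint().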

randrange() is a more general form of selecting values from a range.

import random

for i in xrange(3):
    print random.randrange(0, 101, 5)

randrange() supports a step argument, in addition to start and stop values, so it is fully equivalent to selecting a random value from range(start, stop, step). It is more efficient, because the range is not actually constructed.

$ python random_randrange.py

50
55
45
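
A small sketch of that equivalence follows. The list of allowed values is built here only to check the results, and the seed is arbitrary.

```python
import random

start, stop, step = 0, 101, 5
allowed = list(range(start, stop, step))   # built only for checking

random.seed(3)
from_randrange = [random.randrange(start, stop, step) for _ in range(10)]
from_choice = [random.choice(allowed) for _ in range(10)]

print(from_randrange)
print(from_choice)
print(all(v in allowed for v in from_randrange))
```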

Picking Random Items

One common use for random number generators is to select a random item from a sequence of enumerated values, even if those values are not numbers. random includes the choice() function for making a random selection from a sequence. This example simulates flipping a coin 10,000 times to count how many times it comes up heads and how many times tails.

import random

outcomes = { 'heads':0,
             'tails':0,
             }
sides = outcomes.keys()

for i in range(10000):
    outcomes[ random.choice(sides) ] += 1

print 'Heads:', outcomes['heads']
print 'Tails:', outcomes['tails']

There are only two outcomes allowed, so rather than use numbers and convert them, the words “heads” and “tails” are used with choice(). The results are tabulated in a dictionary using the outcome names as keys.

$ python random_choice.py

Heads: 5069
Tails: 4931
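
The same tally can be sketched with collections.Counter, which assumes Python 2.7 or later. This is an alternative to the dictionary approach above, not a replacement for it, and the seed is arbitrary.

```python
import random
from collections import Counter

sides = ['heads', 'tails']
random.seed(10)

# Counter tallies each outcome without pre-populating the keys.
outcomes = Counter(random.choice(sides) for _ in range(10000))

for side in sides:
    print('%s: %d' % (side.capitalize(), outcomes[side]))
```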

Permutations

A simulation of a card game needs to mix up the deck of cards and then “deal” them to the players, without using the same card more than once. Using choice() could result in the same card being dealt twice, so instead the deck can be mixed up with shuffle() and then individual cards removed as they are dealt.

import random
import itertools

def new_deck():
    return list(itertools.product(
            itertools.chain(xrange(2, 11), ('J', 'Q', 'K', 'A')),
            ('H', 'D', 'C', 'S'),
            ))

def show_deck(deck):
    p_deck = deck[:]
    while p_deck:
        row = p_deck[:13]
        p_deck = p_deck[13:]
        for j in row:
            print '%2s%s' % j,
        print

# Get a new deck, with the cards in order
deck = new_deck()
print 'Initial deck:'
show_deck(deck)

# Shuffle the deck to randomize the order
random.shuffle(deck)
print '\nShuffled deck:'
show_deck(deck)

# Deal 4 hands of 5 cards each
hands = [ [], [], [], [] ]

for i in xrange(5):
    for h in hands:
        h.append(deck.pop())

# Show the hands
print '\nHands:'
for n, h in enumerate(hands):
    print '%d:' % (n+1),
    for c in h:
        print '%2s%s' % c,
    print
    
# Show the remaining deck
print '\nRemaining deck:'
show_deck(deck)

The cards are represented as tuples with the face value and a letter indicating the suit. The dealt “hands” are created by adding one card at a time to each of four lists, and removing it from the deck so it cannot be dealt again.

$ python random_shuffle.py

Initial deck:
 2H  2D  2C  2S  3H  3D  3C  3S  4H  4D  4C  4S  5H
 5D  5C  5S  6H  6D  6C  6S  7H  7D  7C  7S  8H  8D
 8C  8S  9H  9D  9C  9S 10H 10D 10C 10S  JH  JD  JC
 JS  QH  QD  QC  QS  KH  KD  KC  KS  AH  AD  AC  AS

Shuffled deck:
 4C  3H  AD  JH  7D  3D  5C  6D  5D  7S  5S  KH  8S
 QC  5H  7C  4D  4S  2H  JD  KD  AH 10S  KC  6C  6H
 8H 10H  QD  AC  2S  7H  JC  9S  AS  8C  QH  9D  4H
 8D  JS  2D  3S  9C 10D  3C  6S  2C  QS  KS 10C  9H

Hands:
1:  9H  2C  9C  8D  8C
2: 10C  6S  3S  4H  AS
3:  KS  3C  2D  9D  9S
4:  QS 10D  JS  QH  JC

Remaining deck:
 4C  3H  AD  JH  7D  3D  5C  6D  5D  7S  5S  KH  8S
 QC  5H  7C  4D  4S  2H  JD  KD  AH 10S  KC  6C  6H
 8H 10H  QD  AC  2S  7H
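
Because shuffle() rearranges the list in place, a program that still needs the ordered deck has to shuffle a copy. The sketch below uses a reduced 16-card deck of string labels purely for illustration.

```python
import random

# A reduced deck of face cards, as plain strings, for illustration only.
deck = ['%s%s' % (rank, suit)
        for rank in ('J', 'Q', 'K', 'A')
        for suit in ('H', 'D', 'C', 'S')]

shuffled = deck[:]           # shuffle() works in place, so copy first
random.shuffle(shuffled)

# Deal five cards from the end of the shuffled copy.
hand = [shuffled.pop() for _ in range(5)]

print('hand:      %s' % ' '.join(hand))
print('remaining: %s' % ' '.join(shuffled))
print('original still ordered: %s' % (deck[0] == 'JH'))
```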

Many simulations need random samples from a population of input values. The sample() function generates samples without repeating values and without modifying the input sequence. This example prints a random sample of words from the system dictionary.

import random

with open('/usr/share/dict/words', 'rt') as f:
    words = f.readlines()
words = [ w.rstrip() for w in words ]

for w in random.sample(words, 5):
    print w
    

The algorithm for producing the result set takes into account the sizes of the input and the sample requested to produce the result as efficiently as possible.

$ python random_sample.py

pleasureman
consequency
docibility
youdendrift
Ituraean

$ python random_sample.py

jigamaree
readingdom
sporidium
pansylike
foraminiferan
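
On systems without a words file, the same properties can be checked with a small in-memory population. The word list here is made up for the example.

```python
import random

# Made-up population, used instead of the system dictionary.
population = ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot']
original = list(population)

picked = random.sample(population, 3)

print(picked)
print('no repeats:      %s' % (len(set(picked)) == len(picked)))
print('input unchanged: %s' % (population == original))
```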

Multiple Simultaneous Generators

In addition to module-level functions, random includes a Random class to manage the internal state for several random number generators. All of the functions described above are available as methods of the Random instances, and each instance can be initialized and used separately, without interfering with the values returned by other instances.

import random
import time

print 'Default initialization:\n'

r1 = random.Random()
r2 = random.Random()

for i in xrange(3):
    print '%04.3f  %04.3f' % (r1.random(), r2.random())

print '\nSame seed:\n'

seed = time.time()
r1 = random.Random(seed)
r2 = random.Random(seed)

for i in xrange(3):
    print '%04.3f  %04.3f' % (r1.random(), r2.random())

On a system with good native random value seeding, the instances start out in unique states. However, if there is no good platform random value generator, the instances are likely to have been seeded with the current time, and therefore produce the same values.

$ python random_random_class.py

Default initialization:

0.171  0.711
0.184  0.558
0.818  0.113

Same seed:

0.857  0.857
0.925  0.925
0.040  0.040
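
The independence of instances can also be shown against the module-level generator. In this sketch the seed values are arbitrary; reseeding and using the module-level functions between the two reads should leave r2 untouched.

```python
import random

r1 = random.Random(12345)
r2 = random.Random(12345)

a = [r1.random() for _ in range(3)]

# Disturb the module-level generator between the two reads.
random.seed(999)
random.random()

b = [r2.random() for _ in range(3)]   # r2 keeps its own state

print(a == b)
```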

To ensure that the generators produce values from different parts of the random period, use jumpahead() to shift one of them away from its initial state.

import random

r1 = random.Random()
r2 = random.Random()

# Force r2 to a different part of the random period than r1.
r2.setstate(r1.getstate())
r2.jumpahead(1024)

for i in xrange(3):
    print '%04.3f  %04.3f' % (r1.random(), r2.random())

The argument to jumpahead() should be a non-negative integer based on the number of values needed from each generator. The internal state of the generator is scrambled based on the input value, but not simply by incrementing it by the number of steps given.

$ python random_jumpahead.py

0.405  0.159
0.592  0.765
0.501  0.764

SystemRandom

Some operating systems provide a random number generator that has access to more sources of entropy that can be introduced into the generator. random exposes this feature through the SystemRandom class, which has the same API as Random but uses os.urandom() to generate the values that form the basis of all of the other algorithms.

import random
import time

print 'Default initialization:\n'

r1 = random.SystemRandom()
r2 = random.SystemRandom()

for i in xrange(3):
    print '%04.3f  %04.3f' % (r1.random(), r2.random())

print '\nSame seed:\n'

seed = time.time()
r1 = random.SystemRandom(seed)
r2 = random.SystemRandom(seed)

for i in xrange(3):
    print '%04.3f  %04.3f' % (r1.random(), r2.random())

Sequences produced by SystemRandom are not reproducible because the randomness comes from the system rather than from software state (in fact, seed() is ignored, and getstate() and setstate() raise NotImplementedError).

$ python random_system_random.py

Default initialization:

0.374  0.932
0.002  0.022
0.692  1.000

Same seed:

0.182  0.939
0.154  0.430
0.649  0.970
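
The lack of software state can be probed directly. This sketch relies on the standard library documentation's statement that SystemRandom ignores seed() and raises NotImplementedError from getstate(); the variable names are illustrative.

```python
import random

sr = random.SystemRandom()

sr.seed(1)                     # accepted, but ignored
first = [sr.random() for _ in range(3)]
sr.seed(1)
second = [sr.random() for _ in range(3)]

# With a seedable generator these would match; here a match would
# be an astronomically unlikely coincidence.
print('repeatable: %s' % (first == second))

try:
    sr.getstate()
except NotImplementedError:
    state_available = False    # there is no state to capture
else:
    state_available = True

print('state available: %s' % state_available)
```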

Non-uniform Distributions

While the uniform distribution of the values produced by random() is useful for a lot of purposes, other distributions more accurately model specific situations. The random module includes functions to produce values in those distributions, too. They are listed here, but not covered in detail because their uses tend to be specialized and require more complex examples.

Normal

The normal distribution is commonly used for non-uniform continuous values such as grades, heights, weights, etc. The curve produced by the distribution has a distinctive shape which has led to it being nicknamed a “bell curve.” random includes two functions for generating values with a normal distribution, normalvariate() and the slightly faster gauss() (the normal distribution is also called the Gaussian distribution).

The related function, lognormvariate(), produces pseudorandom values where the logarithm of the values is distributed normally. Log-normal distributions are useful for values that are the product of several random variables which do not interact.
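
A brief sketch of both functions follows. The parameters (mean 100, standard deviation 15), the seed, and the sample size are arbitrary choices for illustration.

```python
import math
import random

random.seed(5)
mu, sigma = 100.0, 15.0       # illustrative parameters

samples = [random.gauss(mu, sigma) for _ in range(10000)]
mean = sum(samples) / len(samples)
stdev = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
print('gauss: mean=%.1f stdev=%.1f' % (mean, stdev))

# For lognormvariate(mu, sigma), the logarithm of each value is
# normally distributed with mean mu and standard deviation sigma.
logs = [math.log(random.lognormvariate(0.0, 0.5)) for _ in range(10000)]
log_mean = sum(logs) / len(logs)
print('log of lognormvariate: mean=%.2f' % log_mean)
```

The sample mean and standard deviation should land close to the requested parameters, though not exactly on them.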

Approximation

The triangular distribution is used as an approximate distribution for small sample sizes. The “curve” of a triangular distribution has low points at known minimum and maximum values, and a high point at the mode, which is estimated based on a “most likely” outcome (reflected by the mode argument to triangular()).
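
As a small illustration, the sketch below draws from a triangular distribution with made-up bounds and mode. The mean of a triangular distribution is (low + high + mode) / 3, so the sample mean should be close to 4.0 for these inputs.

```python
import random

random.seed(11)
low, high, mode = 0.0, 10.0, 2.0     # illustrative values

samples = [random.triangular(low, high, mode) for _ in range(10000)]
mean = sum(samples) / len(samples)

print('min=%.2f max=%.2f mean=%.2f' % (min(samples), max(samples), mean))
```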

Exponential

expovariate() produces an exponential distribution useful for simulating arrival or interval time values in homogeneous Poisson processes such as the rate of radioactive decay or requests coming into a web server.

The Pareto, or power law, distribution matches many observable phenomena and was popularized by Chris Anderson’s book, The Long Tail. The paretovariate() function is useful for simulating allocation of resources to individuals (wealth to people, demand for musicians, attention to blogs, etc.).
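
The following sketch exercises both functions. The arrival rate of four requests per second and the Pareto shape parameter are hypothetical values chosen for the example.

```python
import random

random.seed(21)
rate = 4.0      # hypothetical: 4 requests per second on average

# expovariate() takes the rate (1 / desired mean), not the mean itself.
gaps = [random.expovariate(rate) for _ in range(10000)]
mean_gap = sum(gaps) / len(gaps)
print('mean gap: %.3f seconds (1/rate = %.3f)' % (mean_gap, 1.0 / rate))

alpha = 3.0     # hypothetical shape parameter
shares = [random.paretovariate(alpha) for _ in range(10000)]
print('smallest Pareto value: %.3f' % min(shares))
print('largest Pareto value:  %.3f' % max(shares))
```

Values from paretovariate() start at 1.0 and have a long tail of occasional large results.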

Angular

The von Mises, or circular normal, distribution (produced by vonmisesvariate()) is used for computing probabilities of cyclic values such as angles, calendar days, and times.
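
A rough sketch of drawing angles, with an illustrative mean angle and concentration, follows. The samples are averaged on the unit circle rather than arithmetically so that values near the wrap-around point are handled sensibly.

```python
import math
import random

random.seed(8)
mu = math.pi / 2      # illustrative mean angle (90 degrees)
kappa = 4.0           # illustrative concentration; larger is tighter

angles = [random.vonmisesvariate(mu, kappa) for _ in range(2000)]

# Circular mean: sum the unit vectors, then take the resulting angle.
x = sum(math.cos(a) for a in angles)
y = sum(math.sin(a) for a in angles)
mean_angle = math.atan2(y, x)

print('mean angle: %.2f radians (mu = %.2f)' % (mean_angle, mu))
```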

Sizes

betavariate() generates values with the Beta distribution, which is commonly used in Bayesian statistics and applications such as task duration modeling.

The Gamma distribution produced by gammavariate() is used for modeling the sizes of things such as waiting times, rainfall, and computational errors.

The Weibull distribution computed by weibullvariate() is used in failure analysis, industrial engineering, and weather forecasting. It describes the distribution of sizes of particles or other discrete objects.
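
The three functions can be sketched side by side. All parameter values below are arbitrary, and sample_mean() is a hypothetical helper. For reference, the mean of a Beta distribution is alpha / (alpha + beta), and the mean of the Gamma distribution as parameterized by gammavariate() is alpha * beta.

```python
import random

random.seed(13)


def sample_mean(values):
    # Hypothetical helper: arithmetic mean of a list of numbers.
    return sum(values) / len(values)


# betavariate(alpha, beta) returns values between 0 and 1.
beta_values = [random.betavariate(2.0, 5.0) for _ in range(10000)]

# gammavariate(alpha, beta): alpha is the shape, beta is the scale.
gamma_values = [random.gammavariate(2.0, 3.0) for _ in range(10000)]

# weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
weibull_values = [random.weibullvariate(1.0, 1.5) for _ in range(10000)]

print('beta mean:    %.3f' % sample_mean(beta_values))
print('gamma mean:   %.3f' % sample_mean(gamma_values))
print('weibull mean: %.3f' % sample_mean(weibull_values))
```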

See also

random
The standard library documentation for this module.
Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator
Article by M. Matsumoto and T. Nishimura from ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, January 1998, pp. 3-30.
Wikipedia: Mersenne Twister
Article about the pseudorandom generator algorithm used by Python.
Wikipedia: Uniform distribution
Article about continuous uniform distributions in statistics.