# python – Generate random numbers with a given (numerical) distribution

## python – Generate random numbers with a given (numerical) distribution

`scipy.stats.rv_discrete` might be what you want. You can supply your probabilities via the `values` parameter. You can then use the `rvs()` method of the distribution object to generate random numbers.

As pointed out by Eugene Pakhomov in the comments, you can also pass a `p` keyword parameter to `numpy.random.choice()`, e.g.

``````numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])
``````

If you are using Python 3.6 or above, you can use `random.choices()` from the standard library – see the answer by Mark Dickinson.

Since Python 3.6, theres a solution for this in Pythons standard library, namely `random.choices`.

Example usage: lets set up a population and weights matching those in the OPs question:

``````>>> from random import choices
>>> population = [1, 2, 3, 4, 5, 6]
>>> weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
``````

Now `choices(population, weights)` generates a single sample:

``````>>> choices(population, weights)
4
``````

The optional keyword-only argument `k` allows one to request more than one sample at once. This is valuable because theres some preparatory work that `random.choices` has to do every time its called, prior to generating any samples; by generating many samples at once, we only have to do that preparatory work once. Here we generate a million samples, and use `collections.Counter` to check that the distribution we get roughly matches the weights we gave.

``````>>> million_samples = choices(population, weights, k=10**6)
>>> from collections import Counter
>>> Counter(million_samples)
Counter({5: 399616, 6: 200387, 4: 200117, 1: 99636, 3: 50219, 2: 50025})
``````

#### python – Generate random numbers with a given (numerical) distribution

An advantage to generating the list using CDF is that you can use binary search. While you need O(n) time and space for preprocessing, you can get k numbers in O(k log n). Since normal Python lists are inefficient, you can use `array` module.

If you insist on constant space, you can do the following; O(n) time, O(1) space.

``````def random_distr(l):
r = random.uniform(0, 1)
s = 0
for item, prob in l:
s += prob
if s >= r:
return item
return item  # Might occur because of floating point inaccuracies
``````