# python – Generate random numbers with a given (numerical) distribution

`scipy.stats.rv_discrete` might be what you want. You can supply your probabilities via the `values` parameter. You can then use the `rvs()` method of the distribution object to generate random numbers.
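A minimal sketch of that approach, assuming SciPy and NumPy are installed. The values and probabilities are the ones used in the other examples on this page, and the `name` label is purely illustrative.

```python
import numpy as np
from scipy import stats

# Integer outcomes and their probabilities (the probabilities must sum to 1).
values = np.arange(1, 7)
probabilities = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

# `name` is only a label; `values` takes the (outcomes, probabilities) pair.
die = stats.rv_discrete(name="loaded_die", values=(values, probabilities))

# Draw ten samples; each one is an integer between 1 and 6.
samples = die.rvs(size=10)
print(samples)
```

The same distribution object can be reused for further `rvs()` calls, so the probabilities only need to be supplied once.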

As pointed out by Eugene Pakhomov in the comments, you can also pass a `p` keyword parameter to `numpy.random.choice()`, e.g.

```
import numpy
numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])
```
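To draw many values in one call, `size` can be passed as well. The sketch below also shows the same draw through NumPy's `Generator` interface (`numpy.random.default_rng`); the seed `12345` and the variable names are arbitrary choices for illustration.

```python
import numpy as np

faces = np.arange(1, 7)
probs = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

# Same call as above, but requesting 1000 draws at once.
legacy_samples = np.random.choice(faces, size=1000, p=probs)

# Generator-based API; the seed is arbitrary and only makes runs repeatable.
rng = np.random.default_rng(12345)
rng_samples = rng.choice(faces, size=1000, p=probs)

print(legacy_samples[:10], rng_samples[:10])
```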

If you are using Python 3.6 or above, you can use `random.choices()` from the standard library; see the answer by Mark Dickinson.

Since Python 3.6, there's a solution for this in Python's standard library, namely `random.choices`.

Example usage: let's set up a population and weights matching those in the OP's question:

```
>>> from random import choices
>>> population = [1, 2, 3, 4, 5, 6]
>>> weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
```

Now `choices(population, weights)` generates a single sample, returned in a list of length 1:

```
>>> choices(population, weights)
[4]
```

The optional keyword-only argument `k` allows one to request more than one sample at once. This is valuable because there's some preparatory work that `random.choices` has to do every time it's called, prior to generating any samples; by generating many samples at once, we only have to do that preparatory work once. Here we generate a million samples, and use `collections.Counter` to check that the distribution we get roughly matches the weights we gave.

```
>>> million_samples = choices(population, weights, k=10**6)
>>> from collections import Counter
>>> Counter(million_samples)
Counter({5: 399616, 6: 200387, 4: 200117, 1: 99636, 3: 50219, 2: 50025})
```
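To make "roughly matches" concrete, the counts can be turned into relative frequencies and printed next to the requested weights. This is only a sketch of one way to do the comparison; the variable names are illustrative.

```python
from collections import Counter
from random import choices

population = [1, 2, 3, 4, 5, 6]
weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

n = 10**6
counts = Counter(choices(population, weights, k=n))

# Compare each observed relative frequency with the weight that was requested.
for value, weight in zip(population, weights):
    frequency = counts[value] / n
    print(f"{value}: expected {weight:.3f}, observed {frequency:.3f}")
```

With a million samples, the observed frequencies should typically agree with the weights to about two or three decimal places.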

#### python – Generate random numbers with a given (numerical) distribution

An advantage of generating the list using the CDF is that you can use binary search. Preprocessing takes O(n) time and space, after which you can draw k numbers in O(k log n). Since normal Python lists are inefficient for storing numbers, you can use the `array` module.
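A minimal sketch of the CDF-plus-binary-search idea, using only the standard library. The names `make_sampler` and `sample` are hypothetical, and scaling by the final CDF entry is an added guard against the probabilities not summing to exactly 1.0.

```python
import random
from array import array
from bisect import bisect_right
from itertools import accumulate

def make_sampler(items, probs):
    """Return a function that draws k items using a precomputed CDF."""
    # O(n) preprocessing: cumulative sums stored compactly as C doubles.
    cdf = array("d", accumulate(probs))

    def sample(k):
        out = []
        for _ in range(k):
            # Scale by the total in case the sums drift slightly from 1.0.
            r = random.random() * cdf[-1]
            # First index whose cumulative probability exceeds r: O(log n).
            i = bisect_right(cdf, r)
            # Clamp the index as a guard against floating point rounding.
            out.append(items[min(i, len(items) - 1)])
        return out

    return sample

sample = make_sampler([1, 2, 3, 4, 5, 6], [0.1, 0.05, 0.05, 0.2, 0.4, 0.2])
print(sample(10))
```

The CDF is built once in `make_sampler`, so repeated calls to `sample` pay only the O(log n) lookup per draw.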

If you insist on constant space, you can do the following, which takes O(n) time per sample and O(1) extra space.

```
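import random

def random_distr(l):
    """Return one item from l, an iterable of (item, probability) pairs."""
    r = random.uniform(0, 1)
    s = 0  # running cumulative probability
    for item, prob in l:
        s += prob
        if s >= r:
            return item
    return item  # Might occur because of floating point inaccuracies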
```