# gotchas where Numpy differs from straight python?

Because `__eq__` does not return a bool, using numpy arrays in any kind of container prevents equality testing without a container-specific workaround.

Example:

```
>>> import numpy
>>> a = numpy.array(range(3))
>>> b = numpy.array(range(3))
>>> a == b
array([ True, True, True], dtype=bool)
>>> x = (a, 'banana')
>>> y = (b, 'banana')
>>> x == y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
```

This is a horrible problem. For example, you cannot write unit tests for containers which use `TestCase.assertEqual()` and must instead write custom comparison functions. Suppose we write a work-around function `special_eq_for_numpy_and_tuples`. Now we can do this in a unit test:

```
x = (array1, deserialized)
y = (array2, deserialized)
self.failUnless( special_eq_for_numpy_and_tuples(x, y) )
```

Now we must do this for every container type we might use to store numpy arrays. Furthermore, `__eq__` might return a bool rather than an array of bools:

```
>>> a = numpy.array(range(3))
>>> b = numpy.array(range(5))
>>> a == b
False
```

Now each of our container-specific equality comparison functions must also handle that special case.
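One way such a helper could look (a minimal sketch; the function name comes from the example above, and it leans on `numpy.array_equal`, which checks shapes before comparing elementwise and so sidesteps both quirks at once):

```python
import numpy

def special_eq_for_numpy_and_tuples(x, y):
    """Compare two tuples that may contain numpy arrays (sketch)."""
    if len(x) != len(y):
        return False
    for xi, yi in zip(x, y):
        if isinstance(xi, numpy.ndarray) or isinstance(yi, numpy.ndarray):
            # array_equal returns a single bool and handles mismatched
            # shapes itself, covering both the array-of-bools case and
            # the plain-bool special case.
            if not numpy.array_equal(xi, yi):
                return False
        elif xi != yi:
            return False
    return True
```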

Maybe we can patch over this wart with a subclass?

```
>>> class SaneEqualityArray(numpy.ndarray):
...     def __eq__(self, other):
...         return isinstance(other, SaneEqualityArray) and self.shape == other.shape and (numpy.ndarray.__eq__(self, other)).all()
...
>>> a = SaneEqualityArray( (2, 3) )
>>> a.fill(7)
>>> b = SaneEqualityArray( (2, 3) )
>>> b.fill(7)
>>> a == b
True
>>> x = (a, 'banana')
>>> y = (b, 'banana')
>>> x == y
True
>>> c = SaneEqualityArray( (7, 7) )
>>> c.fill(7)
>>> a == c
False
```

That seems to do the right thing. The class should also explicitly export elementwise comparison, since that is often useful.
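For instance, the subclass could re-expose the original behaviour under a separate method (`elementwise_eq` is a hypothetical name, not a numpy API):

```python
import numpy

class SaneEqualityArray(numpy.ndarray):
    def __eq__(self, other):
        # Collapse to a single bool, as above.
        return (isinstance(other, SaneEqualityArray)
                and self.shape == other.shape
                and numpy.ndarray.__eq__(self, other).all())

    def elementwise_eq(self, other):
        # Re-export the original elementwise comparison.
        return numpy.ndarray.__eq__(self, other)
```

Callers that want the array of bools then call `a.elementwise_eq(b)` instead of `a == b`.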

The biggest gotcha for me was that almost every standard operator is overloaded to distribute across the array.

Define a list and an array:

```
>>> l = range(10)
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> import numpy
>>> a = numpy.array(l)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Multiplication duplicates the Python list but distributes over the numpy array:

```
>>> l * 2
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a * 2
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
```

Addition and division are not defined on Python lists:

```
>>> l + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "int") to list
>>> a + 2
array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> l / 2.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'list' and 'float'
>>> a / 2.0
array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
```

NumPy sometimes treats lists like arrays, coercing them in mixed operations:

```
>>> a + a
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
>>> a + l
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
```


I think this one is funny:

```
>>> import numpy as n
>>> a = n.array([[1,2],[3,4]])
>>> a[1], a[0] = a[0], a[1]
>>> a
array([[1, 2],
       [1, 2]])
```

For Python lists on the other hand this works as intended:

```
>>> b = [[1,2],[3,4]]
>>> b[1], b[0] = b[0], b[1]
>>> b
[[3, 4], [1, 2]]
```

Funny side note: numpy itself had a bug in the `shuffle` function, because it used that notation 🙂 (see here).

The reason is that in the first case we are dealing with *views* of the array, so the values are overwritten in-place.
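To actually swap the rows, the right-hand side has to be copied before it is overwritten. Two ways this could be done (a sketch):

```python
import numpy

# Option 1: copy the views explicitly before assigning.
a = numpy.array([[1, 2], [3, 4]])
a[1], a[0] = a[0].copy(), a[1].copy()
# a is now [[3, 4], [1, 2]]

# Option 2: fancy indexing, which produces a copy on the
# right-hand side rather than a view.
b = numpy.array([[1, 2], [3, 4]])
b[[0, 1]] = b[[1, 0]]
# b is now [[3, 4], [1, 2]]
```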