
What is the difference between np.divide(x, y) and x/y in Python3?

I recently found a bug in my code that I was able to fix by replacing np.divide(x, y) with x / y.

I was under the impression that np.divide(x, y) was equivalent to x / y (it says as much in the numpy documentation).

Is this a bug in numpy or is it expected behaviour?

As I said, my immediate issue is solved, so I'm not too worried about finding a fix; I'm more curious to understand what's going on.

import numpy as np


x1 = np.array([[281], [15831], [30280], [975], [313], [739], [252], [10364], [21480], [1447], [315], [772], [95], [2710], [7408], [215], [111], [158], [0], [88], [21], [661], [0], [0], [0], [5], [4], [0], [12], [0], [0], [50], [28], [0], [0], [272]])
x2 = np.array([[499], [6315], [33800], [580], [208], [464], [384], [3127], [19596], [2319], [218], [1740], [217], [411], [4250], [223], [406], [267], [2], [0], [16], [0], [0], [0], [0], [8], [3], [0], [18], [0], [1], [0], [41], [0], [0], [0]])
x3 = np.array([[507], [6180], [34005], [555], [200], [451], [390], [3024], [19492], [2425], [211], [1848], [223], [396], [4097], [224], [406], [282], [2], [0], [16], [0], [0], [0], [0], [8], [3], [0], [19], [0], [2], [0], [45], [0], [0], [0]])
x4 = np.array([[507], [6178], [34017], [554], [200], [451], [391], [3022], [19486], [2439], [210], [1865], [223], [396], [4089], [224], [406], [284], [2], [0], [16], [0], [0], [0], [0], [8], [3], [0], [19], [0], [2], [0], [46], [0], [0], [0]])

not_zero = (x1 + x2) != 0
x = np.divide(2*(x1 - x2)**2, x1 + x2, where=not_zero)
r = (2*(x1[not_zero] - x2[not_zero])**2) / (x1[not_zero] + x2[not_zero])
print("n1 =",x.max(),"\tt1 =", r.max())

not_zero = (x2 + x3) != 0
x = np.divide(2*(x2 - x3)**2, x2 + x3, where=not_zero)
r = (2*(x2[not_zero] - x3[not_zero])**2) / (x2[not_zero] + x3[not_zero])
print("n2 =",x.max(),"\tt2 =", r.max())

not_zero = (x3 + x4) != 0
x = np.divide(2*(x3 - x4)**2, x3 + x4, where=not_zero)
r = (2*(x3[not_zero] - x4[not_zero])**2) / (x3[not_zero] + x4[not_zero])
print("n3 =",x.max(),"\tt3 =", r.max())

Output:

n1 = 8177.933351395286  t1 = 8177.933351395286
n2 = 873842.0           t2 = 6.501672240802676
n3 = 1322.0             t3 = 0.15566927013196877

Python version: 3.7.6 Numpy version: 1.17.0

Solution:

Passing where=mask without out is somewhat dangerous. With no target for the output, the function allocates an uninitialized array of the appropriate shape (as np.empty does) and then writes results only at the positions where the mask is True.

But np.empty isn't, well, empty. It hands back a block of memory whose contents were never initialized, so it holds whatever bytes happened to be there already. Wherever mask is False, your output is therefore arbitrary garbage. If those bytes happen to decode to a number bigger than the rest of your data, it ends up being your max value.
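Because uninitialized memory is not reproducible, here is a small sketch that makes the same effect visible in a deterministic way: the output array is pre-filled with a sentinel value (-1.0, chosen purely for illustration), so the slots that np.divide skips are easy to spot. The arrays a and b are made-up toy data, not the arrays from the question.

```python
import numpy as np

# Toy inputs (illustrative only); b contains zeros we want to skip.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 0.0, 4.0, 0.0])
mask = b != 0

# Pre-fill the output with a sentinel so untouched slots stand out.
out = np.full_like(a, -1.0)

# With where=mask, the ufunc writes only where mask is True;
# every other slot keeps whatever was already in `out`.
result = np.divide(a, b, where=mask, out=out)

print(result)
```

Without out, the slots that keep the sentinel here would instead keep whatever was in the freshly allocated buffer.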

You can filter out the garbage by indexing the result with not_zero again:

x = np.divide(2*(x1 - x2)**2, x1 + x2, where=not_zero)[not_zero]

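or initialize the output array yourself (with a float dtype, because true division produces floats and cannot be written into an integer array):

x = np.zeros_like(x1, dtype=float)
np.divide(2*(x1 - x2)**2, x1 + x2, where=not_zero, out=x)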
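As a rough sketch of the second approach, the pattern can be wrapped in a small helper. The name masked_divide and its fill parameter are hypothetical (not part of numpy), and the short arrays below are a truncated stand-in for the data in the question.

```python
import numpy as np

def masked_divide(num, den, fill=0.0):
    """Hypothetical helper: num / den where den != 0, `fill` elsewhere."""
    num = np.asarray(num)
    den = np.asarray(den)
    mask = den != 0
    # Allocate an initialized float output so skipped slots hold `fill`
    # rather than leftover memory contents.
    out = np.full(np.broadcast(num, den).shape, fill, dtype=float)
    np.divide(num, den, out=out, where=mask)
    return out

# Small stand-in data (illustrative only).
x2 = np.array([[499], [0], [16], [0]])
x3 = np.array([[507], [0], [16], [0]])

stat = masked_divide(2 * (x2 - x3) ** 2, x2 + x3)
print(stat.max())
```

Unlike indexing the result with the mask, this keeps the original 2-D shape, and the maximum is taken only over real quotients plus the chosen fill value.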