I used the following process to generate a NumPy array with size = (720, 720, 3). In principle, it should cost 720 * 720 * 3 * 8 bytes ≈ 12.4 MB (11.9 MiB). However, the call ans = memory_benchmark() costs 188 MB. Why does it cost so much more memory than expected? I think it should have the same cost as the line m1 = np.ones((720, 720, 3)).
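To double-check the expected footprint, you can ask NumPy directly (a quick sanity check using the same shape as in my code):

```python
import numpy as np

m1 = np.ones((720, 720, 3))   # default dtype is float64 (8 bytes per element)
print(m1.dtype, m1.nbytes)    # the array itself holds 720*720*3*8 = 12441600 bytes
print(m1.nbytes / 1024 ** 2)  # ~11.9 MiB, matching the profiler line for m1
```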
I have the following two environments; both show the same problem.
Environment 1: numpy=1.23.4, memory_profiler=0.61.0, python=3.10.6, macOS 12.6.1 (Intel, not M1)
Environment 2: numpy=1.19.5, memory_profiler=0.61.0, python=3.8.15, macOS 12.6.1 (Intel, not M1)
I profiled the memory as follows:
import numpy as np
from memory_profiler import profile

@profile
def memory_benchmark():
    m1 = np.ones((720, 720, 3))
    m2 = np.random.randint(128, size=(720, 720, 77, 3))
    a = m2[:, :, :, 0].astype(np.uint16)
    b = m2[:, :, :, 1].astype(np.uint16)
    ans = np.array(m1[b, a].sum(axis=2))
    m2 = None
    a = None
    b = None
    m1 = None
    return ans

@profile
def f():
    ans = memory_benchmark()
    print(ans.shape)
    print("finished")

if __name__ == '__main__':
    f()
(720, 720, 3)
finished
Line # Mem usage Increment Occurrences Line Contents
=============================================================
5 59.3 MiB 59.3 MiB 1 @profile
6 def memory_benchmark():
7 71.2 MiB 11.9 MiB 1 m1 = np.ones((720, 720, 3))
8 984.8 MiB 913.7 MiB 1 m2 = np.random.randint(128, size=(720, 720, 77, 3))
9 1061.0 MiB 76.1 MiB 1 a = m2[:, :, :, 0].astype(np.uint16)
10 1137.1 MiB 76.1 MiB 1 b = m2[:, :, :, 1].astype(np.uint16)
11 1160.9 MiB 23.8 MiB 1 ans = np.array(m1[b, a].sum(axis=2))
12 247.3 MiB -913.6 MiB 1 m2 = None
13 247.3 MiB 0.0 MiB 1 a = None
14 247.3 MiB 0.0 MiB 1 b = None
15 247.3 MiB 0.0 MiB 1 m1 = None
16 247.3 MiB 0.0 MiB 1 return ans
Line # Mem usage Increment Occurrences Line Contents
=============================================================
19 59.3 MiB 59.3 MiB 1 @profile
20 def f():
21 247.3 MiB 188.0 MiB 1 ans = memory_benchmark()
22 247.3 MiB 0.0 MiB 1 print(ans.shape)
23 247.3 MiB 0.0 MiB 1 print("finished")
Printing print(type(m1[0, 0, 0])) yields <class 'numpy.float64'>, print(type(m2[0, 0, 0, 0])) yields <class 'numpy.int64'>, and print(type(ans[0, 0, 0])) yields <class 'numpy.float64'>.
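For reference, the per-line increments in the profile match the raw sizes of the intermediates (a back-of-the-envelope check; note also that the fancy index m1[b, a] materializes a (720, 720, 77, 3) float64 temporary before sum(axis=2) reduces it):

```python
MiB = 1024 ** 2

# int64 output of np.random.randint: 8 bytes per element
m2_bytes = 720 * 720 * 77 * 3 * 8
# uint16 copies produced by .astype: 2 bytes per element
a_bytes = 720 * 720 * 77 * 2
# float64 result of the final sum over axis=2
ans_bytes = 720 * 720 * 3 * 8

print(f"m2  ~ {m2_bytes / MiB:.1f} MiB")   # ~913.6 MiB, matching the +913.7 MiB line
print(f"a   ~ {a_bytes / MiB:.1f} MiB")    # ~76.1 MiB, matching each .astype line
print(f"ans ~ {ans_bytes / MiB:.1f} MiB")  # ~11.9 MiB
```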
However, in my Ubuntu VM, I don't have the above problem.
2 Answers
I can't reproduce the results you're getting. With python 3.7.3, numpy 1.21.4, and memory_profiler 0.61.0, I'm getting the following results:
Printing type(m1[0,0,0,0]) yields <class 'numpy.int32'>, so the 457.8 MiB makes sense. On the other hand, your output seems weird, given that assigning m1 to None reports no difference in memory. Which python & library versions are you using?
Update: On a different machine, with python 3.10.6, numpy 1.23.5, and memory_profiler 0.61.0, I still cannot reproduce the OP's output.
Those numbers look fine to me:
Evidently, once usage drops to 247.3 MiB, the interpreter/NumPy decides to "hang on" to that memory rather than return it to the OS. When tracking memory you are dealing with the "choices" of several layers: the OS, the Python interpreter, and NumPy's own memory management. One or more of those layers can maintain a free pool from which it allocates new objects or arrays.