
I’m troubleshooting a nasty Python / Pandas / Dependency problem and want to go back in time, that is, recreate an earlier Python environment from when the code is thought to have worked.

In the interest of keeping this as simple as possible, I’m starting from a clean, throwaway Windows machine that had never seen Python before and has no custom environments.

Starting with this:

C:\dev>python --version
Python 3.9.9

C:\dev>pip3 list
Package Version
------- -------
pip     23.1.2

And now,

pip3 install pandas==0.25.3

which chugs along and eventually produces an error message from the MS C compiler, the key phrase being "NUMPY_IMPORT_ARRAY_RETVAL". Attempts on a Linux system produce a similar error message.

  objToJSON.c
  pandas/_libs/src/ujson/python/objToJSON.c(181): error C2065: 'NUMPY_IMPORT_ARRAY_RETVAL': undeclared identifier
  pandas/_libs/src/ujson/python/objToJSON.c(181): warning C4047: 'return': 'void *' differs in levels of indirection from 'int'
  pandas/_libs/src/ujson/python/objToJSON.c(479): warning C4267: 'function': conversion from 'size_t' to 'int', possible loss of data
  pandas/_libs/src/ujson/python/objToJSON.c(680): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
  pandas/_libs/src/ujson/python/objToJSON.c(959): warning C4244: '=': conversion from 'Py_ssize_t' to 'int', possible loss of data
  pandas/_libs/src/ujson/python/objToJSON.c(1050): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
  pandas/_libs/src/ujson/python/objToJSON.c(1844): warning C4244: '=': conversion from 'npy_float64' to 'npy_int64', possible loss of data
  error: command 'C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pandas
Failed to build pandas
ERROR: Could not build wheels for pandas, which is required to install pyproject.toml-based projects
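
One possible reading of the log above: the C compiler only runs when pip builds pandas from its source distribution, which usually means no prebuilt wheel matched the running interpreter. The sketch below derives the CPython wheel tag for an interpreter and compares it with a set of tags. That set is an assumption about which wheels were published for pandas 0.25.3 and should be verified against the release's file list on PyPI. The helper names `interpreter_tag` and `needs_source_build` are hypothetical.

```python
import sys

# Assumption: CPython wheel tags published for pandas 0.25.3.
# Verify against the release's file list on PyPI before relying on this.
ASSUMED_WHEEL_TAGS = {"cp35", "cp36", "cp37", "cp38"}


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython wheel tag for a version tuple, e.g. (3, 9, 9) -> 'cp39'."""
    return f"cp{version_info[0]}{version_info[1]}"


def needs_source_build(tag, available=ASSUMED_WHEEL_TAGS):
    """True when no wheel tag matches, so pip would fall back to compiling."""
    return tag not in available


# Report the situation for the interpreter running this script.
tag = interpreter_tag()
print(tag, "source build likely" if needs_source_build(tag) else "wheel likely available")
```

If the tag is missing from the set, switching to an interpreter version that is in the set is usually easier than fixing the compile error.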

Similar question here: use of undeclared identifier 'NUMPY_IMPORT_ARRAY_RETVAL' return NUMPY_IMPORT_ARRAY_RETVAL;

GitHub: https://github.com/pandas-dev/pandas/issues/34969, https://github.com/pysat/pysat/issues/588

The GitHub discussions mention an earlier version of NumPy, so let’s go ahead and try 1.17.4.

Package Version
------- -------
numpy   1.17.4
pip     23.1.2

Now let’s try

C:\dev>pip3 install pandas==0.25.3

I get the same error as above, NUMPY_IMPORT_ARRAY_RETVAL etc.

Suggestions?

2 Answers


  1. Chosen as BEST ANSWER

    My approach, in the end, was to go to Python 3.8 as implicitly suggested by 9769953 (pity, 3.9 would have solved other problems elsewhere) and install numpy==1.17.2 and pandas==0.25.3.

    As for getting my application working, an intervening install of matplotlib forced the installation of an unwanted newer numpy, but once I settled on an older matplotlib, the application "worked". That proves once again that nightmares really can come true: this entire downgrade exercise did expose an error in the application.

    I would have guessed that the error was caused by a mismatch in a .c vs a .h file but was surprised (but not shocked) that a different Python version influenced external C code (my dumb).
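
Because a later install (matplotlib, in this case) can silently replace a pinned package, it may help to check the environment against the pins after every install. A minimal sketch, assuming the two pins named in this answer; `find_drift` and `installed_versions` are hypothetical helper names, and `importlib.metadata` needs Python 3.8 or later.

```python
from importlib import metadata

# Pins taken from the answer above.
PINS = {"numpy": "1.17.2", "pandas": "0.25.3"}


def find_drift(pins, installed):
    """Return {name: (wanted, found)} for each pin that is missing or different."""
    drift = {}
    for name, wanted in pins.items():
        found = installed.get(name)
        if found != wanted:
            drift[name] = (wanted, found)
    return drift


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per pin that the current environment does not satisfy.
for name, (wanted, found) in find_drift(PINS, installed_versions(PINS)).items():
    print(f"{name}: wanted {wanted}, found {found}")
```

Running this right after installing matplotlib would show whether numpy had been replaced before the application is started.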


  2. Welcome to dependency hell.

    Usually, your package manager, such as pip or conda, manages the dependencies, but sometimes its dependency graph database is flawed.

    Here is my process for getting pandas 0.25.3 working with conda (mambaforge). By default it installs numpy 1.24, which unfortunately fails on import with an error message that explains why.
    Follow the message and use numpy 1.19 instead; pandas can then be imported.

    (base) $ mamba create -n test123 pandas=0.25.3
    ......
    (base) $ mamba activate test123
    (test123) $ python
    >>> import pandas as pd
    Traceback (most recent call last):
    ......
    AttributeError: module 'numpy' has no attribute 'bool'.
    `np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
    The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
        https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    
    (test123) $ mamba install pandas=0.25.3 numpy=1.19
    ......
    (test123) $ python
    >>> import pandas as pd
    >>> # Imported successfully
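
The error message above gives the rule of thumb: aliases such as `np.bool` were deprecated in NumPy 1.20, and the traceback shows them gone in 1.24. A small sketch of that rule, covering NumPy 1.x only; `numpy_alias_status` is a hypothetical helper, not part of NumPy.

```python
def numpy_alias_status(version):
    """Classify a NumPy 1.x version string by how it treats aliases like np.bool."""
    major, minor = (int(part) for part in version.split(".")[:2])
    if major != 1:
        # Other major versions follow different rules; this sketch does not cover them.
        raise ValueError("only NumPy 1.x is covered by this sketch")
    if minor >= 24:
        return "removed"      # old code using np.bool raises AttributeError
    if minor >= 20:
        return "deprecated"   # still works, but emits a DeprecationWarning
    return "available"


# The two versions tried in this answer.
for candidate in ("1.24.0", "1.19.0"):
    print(candidate, numpy_alias_status(candidate))
```

Any version classified as "available" or "deprecated" should get past this particular import error; 1.19 is simply the one that was tried here.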
    