I’m troubleshooting a nasty Python / Pandas / dependency problem and want to go back in time, that is, recreate an earlier Python environment from when the code is thought to have worked.
To keep this as simple as possible, I’m starting from a Windows machine that had never seen Python before and has no custom environments: a clean, throwaway setup.
Starting with this:
C:dev>python --version
Python 3.9.9
C:dev>pip3 list
Package Version
------- -------
pip 23.1.2
And now,
pip3 install pandas==0.25.3
which chugs along and eventually produces an error message from the MS C compiler, the key phrase being "NUMPY_IMPORT_ARRAY_RETVAL". Attempts on a Linux system produce a similar error message.
objToJSON.c
pandas/_libs/src/ujson/python/objToJSON.c(181): error C2065: 'NUMPY_IMPORT_ARRAY_RETVAL': undeclared identifier
pandas/_libs/src/ujson/python/objToJSON.c(181): warning C4047: 'return': 'void *' differs in levels of indirection from 'int'
pandas/_libs/src/ujson/python/objToJSON.c(479): warning C4267: 'function': conversion from 'size_t' to 'int', possible loss of data
pandas/_libs/src/ujson/python/objToJSON.c(680): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pandas/_libs/src/ujson/python/objToJSON.c(959): warning C4244: '=': conversion from 'Py_ssize_t' to 'int', possible loss of data
pandas/_libs/src/ujson/python/objToJSON.c(1050): warning C4244: '=': conversion from 'npy_intp' to 'int', possible loss of data
pandas/_libs/src/ujson/python/objToJSON.c(1844): warning C4244: '=': conversion from 'npy_float64' to 'npy_int64', possible loss of data
error: command 'C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pandas
Failed to build pandas
ERROR: Could not build wheels for pandas, which is required to install pyproject.toml-based projects
Similar question here: use of undeclared identifier 'NUMPY_IMPORT_ARRAY_RETVAL' return NUMPY_IMPORT_ARRAY_RETVAL;
GitHub: https://github.com/pandas-dev/pandas/issues/34969, https://github.com/pysat/pysat/issues/588
The GitHub discussions mention an earlier version of NumPy, so let’s go ahead and try 1.17.4.
Package Version
------- -------
numpy 1.17.4
pip 23.1.2
Now let’s try
C:dev>pip3 install pandas==0.25.3
I get the same error as above, NUMPY_IMPORT_ARRAY_RETVAL etc.
Suggestions?
2 Answers
My approach, in the end, was to go to Python 3.8 as implicitly suggested by 9769953 (pity, 3.9 would have solved other problems elsewhere) and install numpy==1.17.2 and pandas==0.25.3.
As for getting my application working, an intervening install of matplotlib forced the installation of an unwanted newer numpy, but once I settled on an older matplotlib, the application "worked", proving once again that nightmares really can come true, as this entire downgrade exercise did expose an error in the application.
I would have guessed that the error was caused by a mismatch between a .c and a .h file, but I was surprised (though not shocked) that a different Python version influenced external C code (my bad).
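One plausible explanation for the Python-version surprise, offered here as an assumption rather than something established in this thread: pip uses a prebuilt wheel only when the wheel's tags match the running interpreter, and otherwise falls back to compiling the source distribution, which is where the C compiler gets involved. The sketch below shows that matching rule in a deliberately simplified form. The wheel filename is illustrative and the helper names are hypothetical; pip's real tag matching is more thorough (compound tags, ABI and platform checks).

```python
import sys


def wheel_python_tag(filename):
    """Return the Python tag (e.g. 'cp38') from a wheel filename.

    Wheel names follow the pattern
    name-version[-build]-pythontag-abitag-platformtag.whl,
    so the Python tag is the third field from the end.
    """
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return stem.split("-")[-3]


def matches_interpreter(filename, version_info=None):
    """Simplified check: does the wheel target this CPython major.minor?"""
    major, minor = (version_info or sys.version_info)[:2]
    return wheel_python_tag(filename) == f"cp{major}{minor}"


# Illustrative filename only; check the package index for the files
# actually published for a given release.
example = "pandas-0.25.3-cp38-cp38-win_amd64.whl"

print(matches_interpreter(example, (3, 8)))  # True
print(matches_interpreter(example, (3, 9)))  # False
```

If no published wheel matches, pip builds from source, and the result then depends on whichever NumPy headers the build picks up.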
Welcome to dependency hell.
Usually, your package manager (pip, conda, etc.) manages the dependencies for you, but sometimes its dependency metadata is flawed.
Here is my process for getting pandas 0.25.3 working with conda (mambaforge). By default it installs numpy 1.24, which unfortunately errors out with a message explaining why.
Follow the message and use numpy 1.19 instead; pandas can then be imported.
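Since a later install (matplotlib, in the other answer) can silently replace numpy, it may help to verify the environment after each step. This is a minimal sketch using the standard-library importlib.metadata (Python 3.8+); the function name check_pins and the PINS table are hypothetical, and the version prefixes are just the ones mentioned in this answer.

```python
from importlib import metadata  # standard library since Python 3.8


def check_pins(pins, get_version=metadata.version):
    """Compare installed versions against wanted version prefixes.

    Returns {package: (wanted, installed_or_None, ok)}. A pin of "1.19"
    accepts "1.19" and "1.19.x" but not "1.190" or "1.24.1".
    """
    report = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        ok = installed is not None and (
            installed == wanted or installed.startswith(wanted + ".")
        )
        report[name] = (wanted, installed, ok)
    return report


# Hypothetical pin table based on the versions named in this answer.
PINS = {"pandas": "0.25.3", "numpy": "1.19"}

if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, installed {installed} -> {status}")
```

Running it again after installing any other package shows immediately whether numpy or pandas was swapped out.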