I keep getting
AttributeError: module 'tabula' has no attribute 'read_pdf'
in Visual Studio Code when I try to run the code below:
import tabula
from tabula.io import read_pdf
tables = tabula.read_pdf('C:/Users/mothe/OneDrive/Desktop/Sample.pdf')
print(tables)
I installed tabula before installing tabula-py, and I have since uninstalled tabula using pip uninstall tabula, but I still get the same problem. How can I resolve this issue?
I expected not to get the AttributeError and for the tables to be printed.
2 Answers
Okay, so a bit of an update: I was trying all of this through my Windows terminal, Spyder, Visual Studio 2019, and then VS Code, and it didn't work in any of them. I recently ran it in PyCharm and it worked, so I can now generate tables using tabula!
The only issue I'm having now is missing columns, but I'm sure that shouldn't be too hard to work out.
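Since the same code worked in PyCharm but not in the other tools, one plausible explanation is that they use different Python interpreters, and some of them still pick up the wrong tabula (a leftover of the other package, or a local file named tabula.py). This is a minimal sketch for checking that, not a confirmed diagnosis; describe_module is a hypothetical helper name made up for this example.

```python
import importlib.util
import sys


def describe_module(name):
    """Return (file_path, has_read_pdf) for a module, or (None, False) if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None, False
    module = importlib.import_module(name)
    # __file__ shows where the module was actually loaded from
    return getattr(module, "__file__", None), hasattr(module, "read_pdf")


if __name__ == "__main__":
    # Run this in each IDE or terminal and compare the output
    print("Interpreter:", sys.executable)
    origin, has_read_pdf = describe_module("tabula")
    print("tabula loaded from:", origin)
    print("has read_pdf:", has_read_pdf)
```

If the interpreter path differs between PyCharm and VS Code, or "has read_pdf" prints False, installing tabula-py into that specific interpreter (for example with python -m pip install tabula-py) is the likely next step.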
Before installing tabula-py, ensure you have a Java runtime in your environment.
If you don't have it already, install Java.
Try to run some example code (replace the PDF file name with your own).
If there's a FileNotFoundError when it calls read_pdf(), and typing java on the command line gives "'java' is not recognized as an internal or external command, operable program or batch file", you should set the PATH environment variable to point to the Java directory.
Find the main Java folder like jre… or jdk…. On Windows 10 it was under C:\Program Files\Java
On Windows 10: Control Panel -> System and Security -> System -> Advanced System Settings -> Environment Variables -> Select PATH -> Edit
Add the bin folder like C:\Program Files\Java\jre1.8.0_144\bin, hit OK a bunch of times.
On the command line, java should now print a list of options, and tabula.read_pdf() should run.
Here is the full guide for installation:
https://tabula-py.readthedocs.io/en/latest/getting_started.html#get-tabula-py-working-windows-10
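The PATH check described above can also be done from Python, in the same interpreter that will run tabula-py. This is a minimal sketch using only the standard library; find_java and java_version are hypothetical helper names for illustration.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the java executable found on PATH, or None."""
    return shutil.which("java")


def java_version(java_path):
    """Run `java -version` and return its output (Java prints it to stderr)."""
    result = subprocess.run([java_path, "-version"], capture_output=True, text=True)
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("java was not found on PATH; add the Java bin folder to PATH.")
    else:
        print("java found at:", java_path)
        print(java_version(java_path))
```

If this reports that java was not found even after editing PATH, restarting the terminal or IDE is usually needed so that the new environment variable is picked up.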