
My original question was this:

Read exel file from url in python 3.6

but after trying the suggestions in the comments I got this error:
io.UnsupportedOperation: seek

Here’s the code:

import pandas as pd
from urllib.request import Request, urlopen
url = "https://<myOrg>.sharepoint.com/:x:/s/x-taulukot/Ec0R1y3l7sdGsP92csSO-mgBI8WCN153LfEMvzKMSg1Zzg?e=6NS5Qh"
req = Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0')

content = urlopen(req)
pd.read_excel(content)
print(df)

and the result:

(venv) miettinj@ramen:~/beta/python> python test.py
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    pd.read_excel(content)
  File "/srv/work/miettinj/beta/python/venv/lib/python3.6/site-packages/pandas/util/_decorators.py", line 296, in wrapper
    return func(*args, **kwargs)
  File "/srv/work/miettinj/beta/python/venv/lib/python3.6/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/srv/work/miettinj/beta/python/venv/lib/python3.6/site-packages/pandas/io/excel/_base.py", line 851, in __init__
    if _is_ods_stream(path_or_buffer):
  File "/srv/work/miettinj/beta/python/venv/lib/python3.6/site-packages/pandas/io/excel/_base.py", line 800, in _is_ods_stream
    stream.seek(0)
io.UnsupportedOperation: seek
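
For context, the traceback shows pandas calling stream.seek(0) on the object returned by urlopen(), which is a forward-only HTTP response and does not support seeking. Below is a minimal sketch of that behaviour using only the standard library. ForwardOnlyStream and to_seekable are hypothetical names made up for illustration; the class only stands in for the HTTP response, and no network access or pandas is involved.

```python
import io


class ForwardOnlyStream(io.RawIOBase):
    """Hypothetical stand-in for an HTTP response body: readable, not seekable."""

    def __init__(self, payload: bytes):
        super().__init__()
        self._inner = io.BytesIO(payload)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, buffer) -> int:
        # Copy as many bytes as fit into the caller's buffer.
        data = self._inner.read(len(buffer))
        buffer[:len(data)] = data
        return len(data)


def to_seekable(stream) -> io.BytesIO:
    """Read the whole stream into memory and return a seekable buffer."""
    return io.BytesIO(stream.read())


stream = ForwardOnlyStream(b"example bytes")
try:
    stream.seek(0)  # same kind of failure as in the traceback
except io.UnsupportedOperation as exc:
    print("seek failed:", exc)

buffer = to_seekable(stream)
buffer.seek(0)  # works, because BytesIO supports seeking
```

The same idea applies to the real response: reading it fully and wrapping the bytes in io.BytesIO gives pandas an object it can rewind.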

2 Answers


  1. It can work if you read the whole response into memory and pass it as a seekable buffer. An Excel file is binary, so wrap the bytes in io.BytesIO rather than decoding them to a string:

    import io
    import pandas as pd
    from urllib.request import Request, urlopen
    url = "https://<myOrg>.sharepoint.com/:x:/s/x-taulukot/Ec0R1y3l7sdGsP92csSO-mgBI8WCN153LfEMvzKMSg1Zzg?e=6NS5Qh"
    req = Request(url)
    req.add_header('User-Agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0')
    
    # urlopen() returns a stream that cannot seek, so read it fully first
    content = urlopen(req).read()
    df = pd.read_excel(io.BytesIO(content))
    print(df)
    
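  2. With pandas >= 1.2 you can pass the URL directly to pd.read_excel and send the headers through the storage_options parameter; for HTTP(S) URLs its key-value pairs are forwarded as request headers. Note that pandas 1.2 requires Python 3.7.1 or newer, so this needs an upgrade from the Python 3.6 in the question:

    import pandas as pd

    url = 'https://<myOrg>.sharepoint.com/:x:/s/x-taulukot/Ec0R1y3l7sdGsP92csSO-mgBI8WCN153LfEMvzKMSg1Zzg?e=6NS5Qh'
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
    
    df = pd.read_excel(url, storage_options=headers)
    print(df)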
    

    More information: Reading/writing remote files
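
Both approaches assume the URL returns the workbook itself. That is not confirmed anywhere in this thread, and a sharing link may well return an HTML page instead. As a rough sketch, one could check the first bytes of the download before handing it to pandas. The helper names looks_like_excel, build_request and download_to_buffer are hypothetical; the two signatures are the standard ZIP header (.xlsx) and the OLE2 compound file header (.xls).

```python
import io
from urllib.request import Request, urlopen

# Leading bytes of the two common Excel containers.
XLSX_MAGIC = b"PK\x03\x04"                       # .xlsx is a ZIP archive
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # .xls is an OLE2 compound file


def looks_like_excel(payload: bytes) -> bool:
    """Return True if the payload starts with a known Excel file signature."""
    return payload.startswith((XLSX_MAGIC, XLS_MAGIC))


def build_request(url: str, user_agent: str) -> Request:
    """Build a request carrying a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def download_to_buffer(url: str, user_agent: str) -> io.BytesIO:
    """Download the whole response and return it as a seekable buffer.

    Raises ValueError when the server returned something other than a
    workbook, for example an HTML sign-in page.
    """
    with urlopen(build_request(url, user_agent)) as response:
        payload = response.read()
    if not looks_like_excel(payload):
        raise ValueError("response does not look like an Excel file")
    return io.BytesIO(payload)
```

The returned buffer can then be passed straight to pandas, e.g. df = pd.read_excel(download_to_buffer(url, user_agent)). If the ValueError is raised, inspecting the first few hundred bytes of the response usually shows what the server actually sent.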
