Let me first start by saying I have gone through and done my due diligence trying to find a solution based on questions previously asked on the web.
I’ve run into an odd bug in my code that I really cannot explain…
So far my code executes the following:
- take stock symbols and write OHLC data to a CSV file
- loop through the directory that contains the CSV files and use that data to calculate technical indicators
- add the technical indicator data to the same CSV file
So the bug is that it executes everything perfectly (99 stocks) EXCEPT for ZM.csv (Zoom). The error that it prints is:
pandas.errors.EmptyDataError: No columns to parse from file.
So to troubleshoot, I copied and pasted the data from ZM.csv into a CSV that I know ran fine (I used AAPL), and it executed fine. Next, I took the working data from AAPL.csv, pasted it into ZM.csv and ran it again. It threw the same error. I also tried renaming the file to ZMI (randomly) and it worked.
This led me to believe that, for some unknown reason, the FILENAME is the root issue. In the part where I first create the CSV files, I changed the name of the file to {symbol}1.csv, {symbol}_.csv, and {symbol}I.csv, to no avail. Lastly, I combined the two files together and did not mess with anything else. It worked. Does anyone know why?
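One quick way to test the filename theory is to look at what is actually on disk at the moment of the read, since pandas raises EmptyDataError when the file it is handed has no content at all. The sketch below is only an illustration: the helper name `find_empty_csvs` is hypothetical, and the directory is assumed to be the `data/ohlc/` folder described in the question.

```python
from pathlib import Path


def find_empty_csvs(directory):
    """Return the names of .csv files in `directory` that are zero bytes on disk."""
    return sorted(
        path.name
        for path in Path(directory).glob("*.csv")
        if path.stat().st_size == 0
    )


if __name__ == "__main__":
    # Assumed location of the OHLC files, taken from the question.
    print(find_empty_csvs("data/ohlc"))
```

Calling this right before the `pd.read_csv` loop would show whether ZM.csv is genuinely empty at that moment, which points at how the file was written and not at its name.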
The flow is to first run bars.py, check the CSV files in the data/ohlc/ directory (they should only have the OHLC data), run technical_analysis.py, and then check the CSV files again (now with technical indicators).
[bars.py]
from config import *
from datetime import datetime
import requests, json

holdings = open('data/qqq.csv').readlines()
symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
symbols = ','.join(symbols_list)

minute_bars_url = '{}/1Min?symbols={}&limit=100'.format(BARS_URL, symbols)
r = requests.get(minute_bars_url, headers=HEADERS)
ohlc_data = r.json()

for symbol in ohlc_data:
    filename = 'data/ohlc/{}.csv'.format(symbol)
    f = open(filename, 'w+')
    f.write('Timestamp,Open,High,Low,Close,Volume\n')
    for bar in ohlc_data[symbol]:
        t = datetime.fromtimestamp(bar['t'])
        timestamp = t.strftime('%I:%M:%S%p-%Z%Y-%m-%d')
        line = '{},{},{},{},{},{}\n'.format(timestamp, bar['o'], bar['h'],
                                            bar['l'], bar['c'], bar['v'])
        f.write(line)
The variables symbols_list and symbols print as follows:
symbols_list = ['AAPL', 'MSFT', 'AMZN', 'FB', 'GOOGL', 'GOOG', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'INTC', 'NFLX', 'CMCSA', 'PEP', 'COST', 'CSCO', 'AVGO', 'QCOM', 'TMUS', 'AMGN', 'TXN', 'CHTR', 'SBUX', 'ZM', 'AMD', 'INTU', 'ISRG', 'MDLZ', 'JD', 'GILD', 'BKNG', 'FISV', 'MELI', 'ATVI', 'ADP', 'CSX', 'REGN', 'MU', 'AMAT', 'ADSK', 'VRTX', 'LRCX', 'ILMN', 'ADI', 'BIIB', 'MNST', 'EXC', 'KDP', 'LULU', 'DOCU', 'WDAY', 'CTSH', 'KHC', 'NXPI', 'BIDU', 'XEL', 'DXCM', 'EBAY', 'EA', 'IDXX', 'CTAS', 'SNPS', 'ORLY', 'SGEN', 'SPLK', 'ROST', 'WBA', 'KLAC', 'NTES', 'PCAR', 'CDNS', 'MAR', 'VRSK', 'PAYX', 'ASML', 'ANSS', 'MCHP', 'XLNX', 'MRNA', 'CPRT', 'ALGN', 'PDD', 'ALXN', 'SIRI', 'FAST', 'SWKS', 'VRSN', 'DLTR', 'CERN', 'MXIM', 'INCY', 'TTWO', 'CDW', 'CHKP', 'CTXS', 'TCOM', 'BMRN', 'ULTA', 'EXPE', 'FOXA', 'LBTYK', 'FOX', 'LBTYA']
symbols = AAPL,MSFT,AMZN,FB,GOOGL,GOOG,TSLA,NVDA,PYPL,ADBE,INTC,NFLX,CMCSA,PEP,COST,CSCO,AVGO,QCOM,TMUS,AMGN,TXN,CHTR,SBUX,ZM,AMD,INTU,ISRG,MDLZ,JD,GILD,BKNG,FISV,MELI,ATVI,ADP,CSX,REGN,MU,AMAT,ADSK,VRTX,LRCX,ILMN,ADI,BIIB,MNST,EXC,KDP,LULU,DOCU,WDAY,CTSH,KHC,NXPI,BIDU,XEL,DXCM,EBAY,EA,IDXX,CTAS,SNPS,ORLY,SGEN,SPLK,ROST,WBA,KLAC,NTES,PCAR,CDNS,MAR,VRSK,PAYX,ASML,ANSS,MCHP,XLNX,MRNA,CPRT,ALGN,PDD,ALXN,SIRI,FAST,SWKS,VRSN,DLTR,CERN,MXIM,INCY,TTWO,CDW,CHKP,CTXS,TCOM,BMRN,ULTA,EXPE,FOXA,LBTYK,FOX,LBTYA
So ZM is not listed last.
[technical_analysis.py]
import btalib
import pandas as pd
from datetime import datetime
from bars import ohlc_data
from bars import symbols_list as symbols

for symbol in symbols:
    try:
        file_path = f'data/ohlc/{symbol}.csv'
        dataframe = pd.read_csv(file_path,
                                parse_dates=True,
                                index_col='Timestamp')
        sma6 = btalib.sma(dataframe, period=6)
        sma10 = btalib.sma(dataframe, period=10)
        rsi = btalib.rsi(dataframe)
        macd = btalib.macd(dataframe)
        dataframe['SMA-6'] = sma6.df
        dataframe['SMA-10'] = sma10.df
        dataframe['RSI'] = rsi.df
        dataframe['MACD'] = macd.df['macd']
        dataframe['Signal'] = macd.df['signal']
        dataframe['Histogram'] = macd.df['histogram']
        f = open(file_path, 'w+')
        dataframe.to_csv(file_path, sep=',', index=True)
    except:
        print(f'{symbol} is not writing the technical data.')
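One thing that may be worth ruling out, offered as a guess and not a confirmed diagnosis: `from bars import ...` runs bars.py in full, and the loop there never closes `f`. In CPython, rebinding `f` on each iteration usually closes the previous file, but the last file opened can stay open with its contents still buffered, so it may look empty on disk when this script reads it. The sketch below uses a temporary file and a made-up header row to show the effect, along with the `with` form that avoids it.

```python
import os
import tempfile

# Temporary stand-in for data/ohlc/ZM.csv (hypothetical path for illustration).
path = os.path.join(tempfile.mkdtemp(), "ZM.csv")

# Pattern from the question: open, write, never close.
f = open(path, "w+")
f.write("Timestamp,Open,High,Low,Close,Volume\n")
size_while_open = os.path.getsize(path)  # small writes typically sit in the buffer

f.close()  # closing flushes the buffer to disk
size_after_close = os.path.getsize(path)

# A with-block closes (and flushes) the file as soon as the block ends.
with open(path, "w") as handle:
    handle.write("Timestamp,Open,High,Low,Close,Volume\n")
size_after_with = os.path.getsize(path)

print(size_while_open, size_after_close, size_after_with)
```

If this is what is happening, it would only affect whichever symbol happens to be the last key in `ohlc_data`, regardless of where that symbol sits in `symbols_list`.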
2 Answers
You can probably reduce the code more to get a minimal, reproducible example. I suspect there is something funny in the qqq.csv file and the split/strip code that makes the last entry not quite what you want. Hopefully that will become clear from printing the variable values, for example with a small sample data/qqq.csv and a short Python script.
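A minimal sketch of that kind of check, assuming a made-up two-row `qqq.csv` (the column names and rows are illustrative, not the real holdings file). It applies the same split/strip expression as the question and prints each symbol with `repr`, so stray whitespace or empty entries become visible.

```python
import io

# Hypothetical stand-in for data/qqq.csv; the real file's columns may differ.
SAMPLE_QQQ = (
    "Name,Weight,Symbol\n"
    "Apple Inc,12.5,AAPL\n"
    "Zoom Video Communications,1.1,ZM \n"
)


def parse_symbols(lines):
    """Same expression as the question: third column, stripped, header dropped."""
    return [line.split(",")[2].strip() for line in lines][1:]


holdings = io.StringIO(SAMPLE_QQQ).readlines()
symbols_list = parse_symbols(holdings)

for symbol in symbols_list:
    # repr() shows the quotes, so 'ZM' and 'ZM ' are easy to tell apart.
    print(repr(symbol), len(symbol))
```

If every symbol prints without surrounding whitespace, the parsing step is probably not the culprit and the problem lies further along.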
I think the error might be that, since ‘ZM’ is the last symbol in holdings, it contains some whitespace, because in [bars.py] you created holdings with open('data/qqq.csv').readlines() (instead of just using the normal pd.read_csv).
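For comparison, here is a sketch of reading the symbol column with a CSV parser instead of splitting lines by hand. It uses the standard-library `csv` module as a stand-in for `pd.read_csv`, and it assumes the column is named `Symbol`, which the question does not confirm. The helper name `read_symbols` is hypothetical.

```python
import csv
import io


def read_symbols(handle, column="Symbol"):
    """Collect non-empty, stripped values from one named column of a CSV."""
    reader = csv.DictReader(handle)
    symbols = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            symbols.append(value)
    return symbols


# Hypothetical sample; with the real file this would be
# open('data/qqq.csv', newline='') instead of a StringIO.
sample = io.StringIO("Name,Weight,Symbol\nApple Inc,12.5,AAPL\nZoom,1.1,ZM \n\n")
print(read_symbols(sample))
```

A parser-based read skips blank lines and tolerates trailing whitespace, which removes one possible source of a malformed last entry.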