
Let me start by saying that I have done my due diligence and looked for a solution in questions previously asked on the web.

I’ve run into an odd bug in my code that I really cannot explain…
So far my code executes the following:

  1. take stock symbols and write OHLC data to a CSV file

  2. loop through the directory that contains the CSV files and use that data to calculate technical indicators

  3. add the technical indicator data to the same CSV file

    The bug is that everything executes perfectly (99 stocks) EXCEPT for ZM.csv (Zoom). The error that it prints is:

    pandas.errors.EmptyDataError: No columns to parse from file.

So to troubleshoot I copied and pasted the data from ZM.csv into a CSV that I know ran fine (I used AAPL) and it actually executed fine. Next, I took the working data from AAPL.csv, pasted it into ZM.csv and ran it again. It throws the same error. I also tried renaming the file to ZMI (randomly) and it worked.

This led me to believe that, for some unknown reason, the FILENAME is the root issue. In the part where I first create the CSV files, I changed the file name to {symbol}1.csv, {symbol}_.csv, and {symbol}I.csv, to no avail. Lastly, I combined the two files together and did not mess with anything else. It worked. Does anyone know why?

The flow is to first run bars.py, check the data/ohlc/ directory CSV files (should only have the OHLC data), run technical_analysis.py, and then check the CSV files again (now with technical indicators).

[bars.py]

    from config import *
    from datetime import datetime
    import requests, json

    holdings = open('data/qqq.csv').readlines()

    symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
    symbols = ','.join(symbols_list)

    minute_bars_url = '{}/1Min?symbols={}&limit=100'.format(BARS_URL, symbols)
    r = requests.get(minute_bars_url, headers=HEADERS)

    ohlc_data = r.json()

    for symbol in ohlc_data:
        filename = 'data/ohlc/{}.csv'.format(symbol)
        f = open(filename, 'w+')
        f.write('Timestamp,Open,High,Low,Close,Volume\n')
        for bar in ohlc_data[symbol]:
            t = datetime.fromtimestamp(bar['t'])
            timestamp = t.strftime('%I:%M:%S%p-%Z%Y-%m-%d')
            line = '{},{},{},{},{},{}\n'.format(timestamp, bar['o'], bar['h'],
                                                bar['l'], bar['c'], bar['v'])
            f.write(line)

The variables symbols_list and symbols print as follows:

symbols_list = ['AAPL', 'MSFT', 'AMZN', 'FB', 'GOOGL', 'GOOG', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'INTC', 'NFLX', 'CMCSA', 'PEP', 'COST', 'CSCO', 'AVGO', 'QCOM', 'TMUS', 'AMGN', 'TXN', 'CHTR', 'SBUX', 'ZM', 'AMD', 'INTU', 'ISRG', 'MDLZ', 'JD', 'GILD', 'BKNG', 'FISV', 'MELI', 'ATVI', 'ADP', 'CSX', 'REGN', 'MU', 'AMAT', 'ADSK', 'VRTX', 'LRCX', 'ILMN', 'ADI', 'BIIB', 'MNST', 'EXC', 'KDP', 'LULU', 'DOCU', 'WDAY', 'CTSH', 'KHC', 'NXPI', 'BIDU', 'XEL', 'DXCM', 'EBAY', 'EA', 'IDXX', 'CTAS', 'SNPS', 'ORLY', 'SGEN', 'SPLK', 'ROST', 'WBA', 'KLAC', 'NTES', 'PCAR', 'CDNS', 'MAR', 'VRSK', 'PAYX', 'ASML', 'ANSS', 'MCHP', 'XLNX', 'MRNA', 'CPRT', 'ALGN', 'PDD', 'ALXN', 'SIRI', 'FAST', 'SWKS', 'VRSN', 'DLTR', 'CERN', 'MXIM', 'INCY', 'TTWO', 'CDW', 'CHKP', 'CTXS', 'TCOM', 'BMRN', 'ULTA', 'EXPE', 'FOXA', 'LBTYK', 'FOX', 'LBTYA']
symbols = AAPL,MSFT,AMZN,FB,GOOGL,GOOG,TSLA,NVDA,PYPL,ADBE,INTC,NFLX,CMCSA,PEP,COST,CSCO,AVGO,QCOM,TMUS,AMGN,TXN,CHTR,SBUX,ZM,AMD,INTU,ISRG,MDLZ,JD,GILD,BKNG,FISV,MELI,ATVI,ADP,CSX,REGN,MU,AMAT,ADSK,VRTX,LRCX,ILMN,ADI,BIIB,MNST,EXC,KDP,LULU,DOCU,WDAY,CTSH,KHC,NXPI,BIDU,XEL,DXCM,EBAY,EA,IDXX,CTAS,SNPS,ORLY,SGEN,SPLK,ROST,WBA,KLAC,NTES,PCAR,CDNS,MAR,VRSK,PAYX,ASML,ANSS,MCHP,XLNX,MRNA,CPRT,ALGN,PDD,ALXN,SIRI,FAST,SWKS,VRSN,DLTR,CERN,MXIM,INCY,TTWO,CDW,CHKP,CTXS,TCOM,BMRN,ULTA,EXPE,FOXA,LBTYK,FOX,LBTYA

So ZM is not listed last.

[technical_analysis.py]

    import btalib
    import pandas as pd
    from datetime import datetime
    from bars import ohlc_data
    from bars import symbols_list as symbols

    for symbol in symbols:
        try:
            file_path = f'data/ohlc/{symbol}.csv'
            dataframe = pd.read_csv(file_path,
                                parse_dates=True,
                                index_col='Timestamp')

            sma6 = btalib.sma(dataframe, period=6)
            sma10 = btalib.sma(dataframe, period=10)
            rsi = btalib.rsi(dataframe)
            macd = btalib.macd(dataframe)

            dataframe['SMA-6'] = sma6.df
            dataframe['SMA-10'] = sma10.df
            dataframe['RSI'] = rsi.df
            dataframe['MACD'] = macd.df['macd']
            dataframe['Signal'] = macd.df['signal']
            dataframe['Histogram'] = macd.df['histogram']

            f = open(file_path, 'w+')
            dataframe.to_csv(file_path, sep=',', index=True)
        except:
            print(f'{symbol} is not writing the technical data.')

2 Answers


  1. You can probably reduce the code further to get a minimal reproducible example. I suspect there is something funny in the qqq.csv file, or in the split/strip code, that makes the last entry not quite what you want.

    Hopefully that becomes clear when you print the variable values, as below.

    With a data/qqq.csv like:

    xname,yname,symbol
    xxx,yyy,ZM
    

    and a Python example:

    import pandas as pd


    def write_OHLC(fname):
        "write example data to a file"
        # 'with' closes (and flushes) the file before it is read back
        with open(fname, 'w') as f:
            f.write('Timestamp,Open,High,Low,Close,Volume\n')
            # IRL, would parse json and spit out meaningful values
            f.write('2020-10-13 16:30,1,10,5,100\n')


    def all_symbols():
        "get list of all symbols from qqq.csv"
        with open('data/qqq.csv') as f:
            holdings = f.readlines()
        # third column of every row, skipping the header row
        symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
        return symbols_list


    # issue saving/reading last(?) symbol
    symbols = all_symbols()
    print(symbols)
    # check just zoom
    zm_sym = symbols[-1]
    fname = f'data/ohlc/{zm_sym}.csv'
    # inspect
    print(zm_sym)
    print(fname)
    # write and read back
    write_OHLC(fname)
    ZM = pd.read_csv(fname,
                     parse_dates=True,
                     index_col='Timestamp')
    print(ZM)
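
Along the same lines, here is a minimal sketch of two further checks: printing each symbol with `repr()` so stray whitespace becomes visible, and checking the file's size on disk just before it is read. The helper names `show_hidden_chars` and `is_empty_file` are made up for illustration, and the temporary directory stands in for the real data/ohlc/ folder. The size check may be worth running because a file that is opened for writing and never closed can still have its contents sitting in a buffer, in which case it looks empty to anything that reads it.

```python
import os
import tempfile


def show_hidden_chars(symbols):
    """Return repr() of each symbol so whitespace (e.g. a trailing newline) is visible."""
    return [repr(s) for s in symbols]


def is_empty_file(path):
    """True if the file exists but has zero bytes on disk."""
    return os.path.getsize(path) == 0


# Demonstration in a throwaway directory (not the real data/ohlc/ folder).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'ZM.csv')

# Write without closing: the short header stays in Python's write buffer.
f = open(path, 'w')
f.write('Timestamp,Open,High,Low,Close,Volume\n')
empty_before_close = is_empty_file(path)

# Closing the handle flushes the buffer to disk.
f.close()
empty_after_close = is_empty_file(path)

print(show_hidden_chars(['AAPL', 'ZM\n', ' ZM']))
print(empty_before_close, empty_after_close)
```

If the size check reports an empty file right before `pd.read_csv` is called, the file name is probably not the culprit, and the place where the file is written deserves a closer look.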
    
  2. I think the error might occur because ‘ZM’ is the last symbol in holdings and so contains some whitespace, given that in [bars.py] you created holdings the following way (instead of just using the normal pd.read_csv):

    holdings = open('data/qqq.csv').readlines()
    
    symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
    symbols = ','.join(symbols_list)
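
For comparison, here is a minimal sketch of reading the symbol column with a CSV parser instead of splitting each line by hand. It assumes the column header is `symbol`, as in the sample qqq.csv from the other answer (the real file's header may differ), and `read_symbols` is a hypothetical helper. The standard-library `csv` module is used only to keep the sketch free of dependencies; `pd.read_csv` would serve the same purpose.

```python
import csv
import io


def read_symbols(csv_file, column='symbol'):
    """Return the stripped, non-empty values of one column from an open CSV file."""
    reader = csv.DictReader(csv_file)
    symbols = []
    for row in reader:
        value = (row.get(column) or '').strip()
        if value:  # skip rows where the column is missing or blank
            symbols.append(value)
    return symbols


# In-memory stand-in for data/qqq.csv, with trailing whitespace and a blank line.
sample = io.StringIO('xname,yname,symbol\nxxx,yyy,AAPL\nxxx,yyy,ZM \n\n')
print(read_symbols(sample))
```

With the real file, this could be called inside `with open('data/qqq.csv', newline='') as f:` so that the handle is closed as soon as the symbols have been read.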
    