I have a data sourcing project I thought would be good for a first foray into Python. I have managed to work out the downloading and file writing (largely with info from this site), but after months of on and off tinkering, I have been unable to figure out the most important part – parsing the schedule data into the daily list of times format that I want.

The end goal is a batch of txt files with the data in the format I need to copy & paste into graphics software, eliminating the tedious manual processing step I have been doing every week on data received by email from multiple sources and in multiple formats.

CODE (problem section, printing instead of writing to file):

# get showtimes data from the local JSON files & write to txt file (eventually)
def ShowtimesParseJSON ():
    
        import os
        import json 
        import datetime
        import time
        from datetime import timezone
        from dateutil import tz

        with open('showtimes.json') as file:
            data = json.load(file)
            print ('\n')

            # loop to print each title, rating
            for title in data['titles']:
                print ('\n')
                print (title.get('title'),"(",title.get('rating'),")")
                print ('-------------------------------------')

                latest_day = ""
             
                # TO DO: Limit enumerate & write to Friday->Thursday
                for idx, perf in enumerate(title['perf'], start=1):
                    this_day = datetime.datetime.fromtimestamp(perf.get('start'), tz=timezone.utc)
                    # exact same result with or without the if below ??
                    if(this_day != latest_day):
                        latest_day = this_day
                        #day of week
                        print (f'{this_day:%a (%b %d)}', end="\t")
                    #daily times
                    print (f'{this_day:%#I:%M}')
                print (f'Total shows: ', idx)
                print ('-------------------------------------')    

ShowtimesParseJSON ()

JSON (showtimes.json edited to remove extra fields):

{
    "titles":
    [
        {
            "id": 822,
            "title": "80 FOR BRADY",
            "longTitle": "80 FOR BRADY",
            "rating": "PG",
            "perf":
            [
                {
                    "id": 97632,
                    "start": 1677848400
                },
                {
                    "id": 97633,
                    "start": 1677855600
                },
                {
                    "id": 97504,
                    "start": 1677942300
                },
                {
                    "id": 97510,
                    "start": 1677949200
                },
                {
                    "id": 97683,
                    "start": 1678044000
                },
                {
                    "id": 97710,
                    "start": 1678051200
                },
                {
                    "id": 97732,
                    "start": 1678107600
                },
                {
                    "id": 97720,
                    "start": 1678114800
                },
                {
                    "id": 97748,
                    "start": 1678201200
                },
                {
                    "id": 97761,
                    "start": 1678224900
                },
                {
                    "id": 97788,
                    "start": 1678280400
                },
                {
                    "id": 97776,
                    "start": 1678287600
                },
                {
                    "id": 97816,
                    "start": 1678366800
                },
                {
                    "id": 97804,
                    "start": 1678374000
                }
            ]
        }
    ]
}

I admit, I have forgotten a number of attempts at solving the day/times looping issue since I took a break for a while after finding a workaround which is no longer available. There was an attempt at using a while loop that upset my computer greatly. And I do have some PHP code that parses the JSON and tried to ‘translate’ that into Python with no luck.

I think it has something to do with the nesting/looping the timestamp twice but I haven’t worked it out yet and I’m getting very tired of the manual data cleanup every week. Would really appreciate a nudge in the right direction.

What I am trying for format wise (from Fri to Thu, with a list of times for each day):

80 FOR BRADY ( PG )
-------------------------------------
Fri (Mar 03)    1:00 - 3:00
Sat (Mar 04)    3:05 - 5:00
...
Thu (Mar 09)    1:00 - 3:00
Total shows:  14
-------------------------------------

What I am actually getting is each time with the day:

80 FOR BRADY ( PG )
-------------------------------------
Fri (Mar 03)    1:00
Fri (Mar 03)    3:00
Sat (Mar 04)    3:05
...
Thu (Mar 09)    3:00
Total shows:  14
-------------------------------------

2 Answers


  1. If I understand you correctly, you want to group the performances by day and then join each day's times:

    import json
    import datetime
    from datetime import timezone
    from itertools import groupby
    
    def ShowtimesParseJSON():
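        with open("your_data.json") as file:
            data = json.load(file)
    
            # loop to print each title, rating
            for title in data["titles"]:
                # start a fresh list per title so showtimes from earlier
                # titles are not printed again under later ones
                all_data = []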
                for perf in title["perf"]:
                    this_day = datetime.datetime.fromtimestamp(
                        perf.get("start"), tz=timezone.utc
                    )
                    day = f"{this_day:%a (%b %d)}"
                    # note: %#I drops the leading zero on Windows only;
                    # glibc (Linux) and macOS use %-I instead
                    t = f"{this_day:%#I:%M}"
    
                    all_data.append((day, t))
    
                print(title.get('title'),"(",title.get('rating'),")")
                print('-------------------------------------')
                for _, g in groupby(all_data, lambda t: t[0]):
                    g = list(g)
                    print(g[0][0] + '\t' + ' - '.join(t for _, t in g))
                print(f'Total shows: ', len(title["perf"]))
                print('-------------------------------------')
    
    
    ShowtimesParseJSON()
    

    Prints:

    80 FOR BRADY ( PG )
    -------------------------------------
    Fri (Mar 03)    01:00 - 03:00
    Sat (Mar 04)    03:05 - 05:00
    Sun (Mar 05)    07:20 - 09:20
    Mon (Mar 06)    01:00 - 03:00
    Tue (Mar 07)    03:00 - 09:35
    Wed (Mar 08)    01:00 - 03:00
    Thu (Mar 09)    01:00 - 03:00
    Total shows:  14
    -------------------------------------
    
  2. A less polished approach that keeps most of your code and follows your idea of comparing each performance's day with the previous and next one to decide the formatting:

    import json 
    import datetime
    from datetime import timezone
    
    def ShowtimesParseJSON ():
        with open('showtimes.json') as file:
            data = json.load(file)
            print ('\n')
    
            for title in data['titles']:
                print ('\n')
                print (title.get('title'),"(",title.get('rating'),")")
                print ('-------------------------------------')
    
                for idx, (perf, prev_perf,next_perf) in enumerate(zip([None]+title['perf']+[None],[None,None]+title['perf'],title['perf']+[None,None])):
                    # the first and last tuples are padding only, so skip them
                    if perf is None:
                        continue
                    perf_time = datetime.datetime.fromtimestamp(perf.get('start'), tz=timezone.utc)
                    # with no previous/next show, fall back to the current show's time
                    if prev_perf is not None:
                        prev_perf_time = datetime.datetime.fromtimestamp(prev_perf.get('start'), tz=timezone.utc) 
                    else:
                        prev_perf_time = datetime.datetime.fromtimestamp(perf.get('start'), tz=timezone.utc)
                    if next_perf is not None:
                        next_perf_time = datetime.datetime.fromtimestamp(next_perf.get('start'), tz=timezone.utc)
                    else:
                        next_perf_time = datetime.datetime.fromtimestamp(perf.get('start'), tz=timezone.utc)
                    if perf_time.date() != prev_perf_time.date() or idx == 1:
                        print (f'{perf_time:%a (%b %d)}', end="\t")
                    if perf_time.date() == next_perf_time.date() and idx != len(title['perf']):   
                        print (f'{perf_time:%#I:%M}', end=" - ")
                    else:
                        print (f'{perf_time:%#I:%M}')
                print (f'Total shows: ', idx-1)
                print ('-------------------------------------')    
    
    ShowtimesParseJSON ()
    

    Prints:

    80 FOR BRADY ( PG )
    -------------------------------------
    Fri (Mar 03)    1:00 - 3:00
    Sat (Mar 04)    3:05 - 5:00
    Sun (Mar 05)    7:20 - 9:20
    Mon (Mar 06)    1:00 - 3:00
    Tue (Mar 07)    3:00 - 9:35
    Wed (Mar 08)    1:00 - 3:00
    Thu (Mar 09)    1:00 - 3:00
    Total shows:  14
    -------------------------------------
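
Both answers come down to the same point: compare calendar dates, not full datetimes. In the question's code, `this_day != latest_day` compares two complete datetimes, and since every show starts at a different time the test is always true, which is why the `if` made no difference. Below is a minimal sketch of the grouping idea using a plain dict keyed on `.date()`. It assumes the same JSON layout as in the question; the helper names `group_showtimes`, `format_time` and `format_title` are made up for illustration, and stripping the leading zero by hand is just one possible way to avoid the platform-specific `%#I` / `%-I` codes.

```python
import datetime
from datetime import timezone


def group_showtimes(perfs):
    """Return {date: [datetime, ...]} for a list of {'start': epoch_seconds} dicts."""
    by_day = {}
    for perf in sorted(perfs, key=lambda p: p["start"]):
        start = datetime.datetime.fromtimestamp(perf["start"], tz=timezone.utc)
        # Key on the calendar date only. Two shows on the same day have
        # different datetimes but the same .date().
        by_day.setdefault(start.date(), []).append(start)
    return by_day


def format_time(dt):
    """12-hour time without a leading zero, built from the portable %I code."""
    return f"{dt:%I:%M}".lstrip("0")


def format_title(title):
    """Build the text block for one title as a single string."""
    rule = "-------------------------------------"
    lines = [f"{title['title']} ( {title['rating']} )", rule]
    for day, starts in group_showtimes(title["perf"]).items():
        times = " - ".join(format_time(s) for s in starts)
        lines.append(f"{day:%a (%b %d)}\t{times}")
    lines.append(f"Total shows:  {len(title['perf'])}")
    lines.append(rule)
    return "\n".join(lines)


if __name__ == "__main__":
    # First four performances from the question's sample data.
    sample = {
        "title": "80 FOR BRADY",
        "rating": "PG",
        "perf": [
            {"id": 97632, "start": 1677848400},
            {"id": 97633, "start": 1677855600},
            {"id": 97504, "start": 1677942300},
            {"id": 97510, "start": 1677949200},
        ],
    }
    print(format_title(sample))
```

Because `format_title` returns a string instead of printing, the same text can later be passed to `file.write()` once the output looks right. Sorting by `start` first means the grouping does not depend on the order of the entries in the JSON.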
    