
I’m working on a side project as an opportunity to learn Python and have cobbled together a web scraper that works most of the time. Essentially, I have a list of ~2,000 items that I iterate over, making a separate POST request for each; each request returns JSON with stock and price data.

The issue I am having is that on some runs a random item in my list raises "TypeError: 'NoneType' object is not subscriptable". This knocks me out of my for loop and I have to restart the entire process, with no guarantee that it works the next time through. Is there a way to skip over the items in the list that raise this error and keep the script running?

The most recent error was raised by this line, on roughly the 200th item in the list:

price = resp['data']['product']['pricing']['value']

Section of code:

for store_code in stores_codes:
 for prod in prods:

    query = {
        "operationName":"productClientOnlyProduct","variables":{
            "skipSpecificationGroup":False,"skipSubscribeAndSave":False,"skipKPF":False,"itemId":str(prod),"storeId":str(store_code),"zipCode":"75209" #not sure we need to change zip?
            },
        "query":"query productClientOnlyProduct($storeId: String, $zipCode: String, $itemId: String!, $dataSource: String, $loyaltyMembershipInput: LoyaltyMembershipInput, $skipSpecificationGroup: Boolean = false, $skipSubscribeAndSave: Boolean = false, $skipKPF: Boolean = false) {n  product(itemId: $itemId, dataSource: $dataSource, loyaltyMembershipInput: $loyaltyMembershipInput) {n    fulfillment(storeId: $storeId, zipCode: $zipCode) {n      backorderedn      fulfillmentOptions {n        typen        services {n          typen          locations {n            isAnchorn            inventory {n              isLimitedQuantityn              isOutOfStockn              isInStockn              quantityn              isUnavailablen              maxAllowedBopisQtyn              minAllowedBopisQtyn              __typenamen            }n            typen            storeNamen            locationIdn            curbsidePickupFlagn            isBuyInStoreCheckNearByn            distancen            staten            storePhonen            __typenamen          }n          deliveryTimelinen          deliveryDates {n            startDaten            endDaten            __typenamen          }n          deliveryChargen          dynamicEta {n            hoursn            minutesn            __typenamen          }n          hasFreeShippingn          freeDeliveryThresholdn          totalChargen          __typenamen        }n        fulfillablen        __typenamen      }n      anchorStoreStatusn      anchorStoreStatusTypen      backorderedShipDaten      bossExcludedShipStatesn      sthExcludedShipStaten      bossExcludedShipStaten      excludedShipStatesn      seasonStatusEligiblen      onlineStoreStatusn      onlineStoreStatusTypen      inStoreAssemblyEligiblen      __typenamen    }n    info {n      dotComColorEligiblen      hidePricen      ecoRebaten      quantityLimitn      sskMinn      sskMaxn      unitOfMeasureCoveragen      wasMaxPriceRangen      wasMinPriceRangen      
fiscalYearn      productDepartmentn      classNumbern      forProfessionalUseOnlyn      globalCustomConfigurator {n        customButtonTextn        customDescriptionn        customExperiencen        customExperienceUrln        customTitlen        __typenamen      }n      paintBrandn      movingCalculatorEligiblen      labeln      prop65Warningn      returnablen      recommendationFlags {n        visualNavigationn        reqItemsn        batItemsn        __typenamen      }n      replacementOMSIDn      hasSubscriptionn      minimumOrderQuantityn      projectCalculatorEligiblen      subClassNumbern      calculatorTypen      isLiveGoodsProductn      protectionPlanSkun      hasServiceAddOnsn      consultationTypen      __typenamen    }n    itemIdn    dataSourcesn    identifiers {n      canonicalUrln      brandNamen      itemIdn      modelNumbern      productLabeln      storeSkuNumbern      upcGtin13n      specialOrderSkun      toolRentalSkuNumbern      rentalCategoryn      rentalSubCategoryn      upcn      productTypen      isSuperSkun      parentIdn      roomVOEnabledn      sampleIdn      __typenamen    }n    availabilityType {n      discontinuedn      statusn      typen      buyablen      __typenamen    }n    details {n      descriptionn      collection {n        urln        collectionIdn        __typenamen      }n      highlightsn      descriptiveAttributes {n        namen        valuen        bulletedn        sequencen        __typenamen      }n      infoAndGuides {n        namen        urln        __typenamen      }n      installation {n        leadGenUrln        __typenamen      }n      __typenamen    }n    media {n      images {n        urln        typen        subTypen        sizesn        __typenamen      }n      video {n        shortDescriptionn        thumbnailn        urln        videoStilln        link {n          textn          urln          __typenamen        }n        titlen        typen        videoIdn        longDescriptionn        __typenamen      }n  
    threeSixty {n        idn        urln        __typenamen      }n      augmentedRealityLink {n        usdzn        imagen        __typenamen      }n      richContent {n        contentn        __typenamen      }n      __typenamen    }n    pricing(storeId: $storeId) {n      promotion {n        dates {n          endn          startn          __typenamen        }n        typen        description {n          shortDescn          longDescn          __typenamen        }n        dollarOffn        percentageOffn        savingsCentern        savingsCenterPromosn        specialBuySavingsn        specialBuyDollarOffn        specialBuyPercentageOffn        experienceTagn        subExperienceTagn        anchorItemListn        itemListn        reward {n          tiers {n            minPurchaseAmountn            minPurchaseQuantityn            rewardPercentn            rewardAmountPerOrdern            rewardAmountPerItemn            rewardFixedPricen            __typenamen          }n          __typenamen        }n        __typenamen      }n      valuen      alternatePriceDisplayn      alternate {n        bulk {n          pricePerUnitn          thresholdQuantityn          valuen          __typenamen        }n        unit {n          caseUnitOfMeasuren          unitsOriginalPricen          unitsPerCasen          valuen          __typenamen        }n        __typenamen      }n      originaln      mapAboveOriginalPricen      messagen      preferredPriceFlagn      specialBuyn      unitOfMeasuren      __typenamen    }n    reviews {n      ratingsReviews {n        averageRatingn        totalReviewsn        __typenamen      }n      __typenamen    }n    seo {n      seoKeywordsn      seoDescriptionn      __typenamen    }n    specificationGroup @skip(if: $skipSpecificationGroup) {n      specifications {n        specNamen        specValuen        __typenamen      }n      specTitlen      __typenamen    }n    taxonomy {n      breadCrumbs {n        labeln        urln        browseUrln        
creativeIconUrln        deselectUrln        dimensionNamen        refinementKeyn        __typenamen      }n      brandLinkUrln      __typenamen    }n    favoriteDetail {n      countn      __typenamen    }n    sizeAndFitDetail {n      attributeGroups {n        attributes {n          attributeNamen          dimensionsn          __typenamen        }n        dimensionLabeln        productTypen        __typenamen      }n      __typenamen    }n    subscription @skip(if: $skipSubscribeAndSave) {n      defaultfrequencyn      discountPercentagen      subscriptionEnabledn      __typenamen    }n    badges(storeId: $storeId) {n      labeln      colorn      creativeImageUrln      endDaten      messagen      namen      timerDurationn      timer {n        timeBombThresholdn        daysLeftThresholdn        dateDisplayThresholdn        messagen        __typenamen      }n      __typenamen    }n    keyProductFeatures @skip(if: $skipKPF) {n      keyProductFeaturesItems {n        features {n          namen          refinementIdn          refinementUrln          valuen          __typenamen        }n        __typenamen      }n      __typenamen    }n    seoDescriptionn    installServices {n      scheduleAMeasuren      __typenamen    }n    dataSourcen    __typenamen  }n}n"}

    url = 'https://www.homedepot.com/federation-gateway/graphql?opname=productClientOnlyProduct'

    resp = s.post(url,headers=headers,json=query).json()

    name = resp['data']['product']['identifiers']['productLabel']
    price = resp['data']['product']['pricing']['value']
    stock = resp['data']['product']['fulfillment']['fulfillmentOptions'][0]['services'][0]['locations'][0]['inventory']['quantity']


Answers


  1. You can surround the last three lines where you access fields of the response in a try/except block.

    try:
        name = resp['data']['product']['identifiers']['productLabel']
        price = resp['data']['product']['pricing']['value']
        stock = resp['data']['product']['fulfillment']['fulfillmentOptions'][0]['services'][0]['locations'][0]['inventory']['quantity']
    except TypeError:
        print("error occurred")
        continue
    

    The continue keyword skips to the next iteration of the loop, regardless of what follows later in the loop body.

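  2. Add a check that skips the item when the product data in the response is None:

    data = resp.get('data')
    if data is None or data.get('product') is None:
        continue
    

    You could also use try/except:

    try:
        price = resp['data']['product']['pricing']['value']
    except TypeError:
        continue  # what you want to do when a value is None
    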
  3. You have 2 options:

    Use try-except to catch errors:

    try:
        name = resp['data']['product']['identifiers']['productLabel']
        price = resp['data']['product']['pricing']['value']
        stock = resp['data']['product']['fulfillment']['fulfillmentOptions'][0]['services'][0]['locations'][0]['inventory']['quantity']
    except TypeError:
        print("Data not found")
        continue  # skip to the next item so stale values are not reused
    

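    Use get instead of brackets [] to access the data, so that a missing key yields a default value instead of an exception. Note that get only falls back to its default when the key is absent; if the key is present with a value of None (which is what triggers this TypeError), you get None back, so follow each lookup with "or {}" (or "or []" for lists). For stock, next(iter(...), None) returns the first element of a list, or None if the list is empty:

    product = (resp.get('data') or {}).get('product') or {}
    name = (product.get('identifiers') or {}).get('productLabel', '')
    price = (product.get('pricing') or {}).get('value', 0)
    options = (product.get('fulfillment') or {}).get('fulfillmentOptions') or []
    services = (next(iter(options), None) or {}).get('services') or []
    locations = (next(iter(services), None) or {}).get('locations') or []
    inventory = (next(iter(locations), None) or {}).get('inventory') or {}
    stock = inventory.get('quantity') or 0  # 0 if any level is missing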
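
In case it helps, here is a rough sketch that combines the suggestions above: a small helper walks the nested response and returns a default when any level is missing, None, or out of range, and the loop skips and records the items it could not parse. The names dig, parse, STOCK_PATH, rows and failed are hypothetical (not part of requests or any other library), and the responses dict is made-up sample data standing in for what s.post(...).json() returns in the question.

```python
def dig(obj, *path, default=None):
    """Follow the keys/indexes in `path` through nested dicts and lists.

    Returns `default` as soon as a level is missing, None, or out of range.
    """
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
        if obj is None:
            return default
    return obj


# Path to the stock quantity, taken from the question's code.
STOCK_PATH = ('data', 'product', 'fulfillment', 'fulfillmentOptions', 0,
              'services', 0, 'locations', 0, 'inventory', 'quantity')


def parse(resp):
    """Return (name, price, stock); any value that is missing comes back as None."""
    name = dig(resp, 'data', 'product', 'identifiers', 'productLabel')
    price = dig(resp, 'data', 'product', 'pricing', 'value')
    stock = dig(resp, *STOCK_PATH)
    return name, price, stock


# Made-up responses (assumption) in place of real calls to the endpoint.
responses = {
    '1001': {'data': {'product': {
        'identifiers': {'productLabel': 'Hammer'},
        'pricing': {'value': 9.97},
        'fulfillment': {'fulfillmentOptions': [
            {'services': [{'locations': [{'inventory': {'quantity': 12}}]}]}
        ]},
    }}},
    '1002': {'data': {'product': None}},  # the kind of response that raised TypeError
    '1003': {'data': {'product': {
        'identifiers': {'productLabel': 'Saw'},
        'pricing': None,
        'fulfillment': {'fulfillmentOptions': []},
    }}},
}

rows, failed = [], []
for prod, resp in responses.items():
    name, price, stock = parse(resp)
    if name is None or price is None:
        failed.append(prod)  # remember the item so it can be retried later
        continue             # move on to the next item instead of crashing
    rows.append((prod, name, price, stock))

print(rows)
print(failed)
```

Keeping a failed list means a second pass can re-request only those items rather than restarting all ~2,000. Whether a missing price should count as a failure or just be stored as empty is a judgment call; adjust the condition to taste.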
    