I have a dataframe with a list of URLs, and I want to extract a couple of values from each one. The returned key/value pairs should then be added to the original dataframe, with the keys as new columns holding the respective values.
I thought that this would magically happen with result_type='expand', which it obviously doesn't. When I try
df5["data"] = df5.apply(lambda x: request_function(x['url']),axis=1, result_type='expand')
I end up with my results all in one data column:
[{'title': ['Python Notebooks: Connect to Google Search Console API and Extract Data - Adapt'], 'description': []}]
The result I am aiming for is a Dataframe with the following 3 columns:
| URL | Title | Description |
Here is my code:
import requests
from requests_html import HTMLSession
import pandas as pd
from urllib import parse
ex_dic = {'url': ['https://www.searchenginejournal.com/reorganizing-xml-sitemaps-python/295539/', 'https://searchengineland.com/check-urls-indexed-google-using-python-259773', 'https://adaptpartners.com/technical-seo/python-notebooks-connect-to-google-search-console-api-and-extract-data/']}
df5 = pd.DataFrame(ex_dic)
df5
session = HTMLSession()

def request_function(url):
    try:
        found_results = []
        r = session.get(url)
        # Both XPath queries return lists of matches
        title = r.html.xpath('//title/text()')
        description = r.html.xpath("//meta[@name='description']/@content")
        found_results.append({'title': title, 'description': description})
        return found_results
    except requests.RequestException:
        print("Connectivity error")
    except KeyError:
        print("another error")

df5.apply(lambda x: request_function(x['url']),axis=1, result_type='expand')
2 Answers
ex_dic should be a list of dicts, so that you can update the applied attribute.

It actually works as you expect if your function returns just a dictionary, not a list of dictionaries. Further, the value stored under each key should be a string, not a list. Then it works as you expect. See my example code:
Gives you this:
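The answer's own code and output are not reproduced here, so what follows is only a minimal sketch of the fix it describes, not the answerer's snippet. It assumes a hypothetical offline stand-in for request_function: the FAKE_PAGES lookup and the example.com URLs are invented so the snippet runs without network access. In the real function the dict would be built from the XPath results instead.

```python
import pandas as pd

# Hypothetical stand-in data so the sketch runs offline (not from the question).
FAKE_PAGES = {
    "https://example.com/a": ("Page A", "About A"),
    "https://example.com/b": ("Page B", None),
}

def request_function(url):
    """Return ONE flat dict of scalars: no wrapping list, no lists as values."""
    title, description = FAKE_PAGES.get(url, (None, None))
    return {"title": title, "description": description}

df5 = pd.DataFrame({"url": list(FAKE_PAGES)})

# With a dict per row, result_type="expand" turns the keys into columns.
expanded = df5.apply(
    lambda row: request_function(row["url"]), axis=1, result_type="expand"
)

# The expanded result has several columns, so join it back to the original
# frame rather than assigning it to a single "data" column.
result = df5.join(expanded)
print(result)
```

The result should then contain the url, title and description columns side by side, which is the layout the question asks for.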
You can also just adjust your lambda function:
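The adjusted lambda itself is not shown, so this is one plausible reading rather than the answerer's exact code: leave request_function as it is (returning a one-element list) and unwrap that list inside the lambda. The stand-in function and the example.com URLs below are hypothetical and only mimic the return shape from the question.

```python
import pandas as pd

def request_function(url):
    # Hypothetical offline stand-in that mimics the question's return shape:
    # a one-element list wrapping the dict.
    return [{"title": f"Title for {url}", "description": "n/a"}]

df5 = pd.DataFrame({"url": ["https://example.com/a", "https://example.com/b"]})

# Index [0] unwraps the list, so apply() receives a plain dict per row
# and result_type="expand" can turn its keys into columns.
expanded = df5.apply(
    lambda row: request_function(row["url"])[0], axis=1, result_type="expand"
)
result = df5.join(expanded)
print(result)
```

One caveat with this variant: the function in the question returns None when an exception is caught, and indexing None with [0] would raise a TypeError, so the error branches would need to return a fallback such as a one-element list with empty values.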
But you still need to ensure that the key values are strings, not lists.
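As a rough illustration of that last point, the XPath calls in the question return lists, so each value can be reduced to a single string before the dict is returned. The helper name first_or_none is an assumption made for this sketch, and the sample dict copies the shape of the output shown in the question.

```python
def first_or_none(values):
    """Return the first item of an XPath result list, or None if it is empty."""
    return values[0] if values else None

# Same shape as the result printed in the question: lists as values.
raw = {
    "title": ["Python Notebooks: Connect to Google Search Console API and Extract Data - Adapt"],
    "description": [],
}

# Reduce every list to a single scalar so each cell holds a string (or None).
flat = {key: first_or_none(value) for key, value in raw.items()}
print(flat)
```

Inside request_function this would mean wrapping both xpath results in first_or_none before building the dict.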