I’m trying to generate every possible ordering (permutation) of the words in a single string column, and then explode the DataFrame so that each ordering gets its own row.
Example Dataframe
   URL                    Keyword
0  http://www.amazon.com  Amazon Lightning Sale
1  https://www.ebay.com   Shop eBay Today
Desired Output
    URL                    Keyword
0   http://www.amazon.com  Amazon Lightning Sale
1   http://www.amazon.com  Amazon Sale Lightning
2   http://www.amazon.com  Lightning Amazon Sale
3   http://www.amazon.com  Sale Amazon Lightning
4   http://www.amazon.com  Sale Lightning Amazon
5   http://www.amazon.com  Lightning Sale Amazon
6   https://www.ebay.com   Shop eBay Today
7   https://www.ebay.com   Shop Today eBay
8   https://www.ebay.com   eBay Shop Today
9   https://www.ebay.com   eBay Today Shop
10  https://www.ebay.com   Today eBay Shop
11  https://www.ebay.com   Today Shop eBay
Minimal Reproducible Example
import pandas as pd
# Initialize the data as a dict of lists.
data = {'URL': ['http://www.amazon.com', 'https://www.ebay.com'],
        'Keyword': ["Amazon Lightning Sale", "Shop eBay Today"]}
# Create the DataFrame.
df = pd.DataFrame(data)
# Print the output.
print(df)
I’ve tried the solution here: Pandas DataFrame Combinations and expand but it’s not quite what I need.
3 Answers
Create an ID for each row
Create the combinations for each row as a new DataFrame, or dict or …
Join the combinations back to the original rows on that ID
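The three steps above can be sketched roughly as follows. This is only one possible reading of the outline, not the answerer's own code: the `row_id` column name and the use of `merge` for the join are assumptions made for illustration.

```python
from itertools import permutations

import pandas as pd

df = pd.DataFrame({
    "URL": ["http://www.amazon.com", "https://www.ebay.com"],
    "Keyword": ["Amazon Lightning Sale", "Shop eBay Today"],
})

# Step 1: give each row an explicit ID ("row_id" is a hypothetical name).
df_ids = df.assign(row_id=range(len(df)))

# Step 2: build every word ordering per row as a separate DataFrame.
perms = pd.DataFrame([
    {"row_id": row_id, "Keyword": " ".join(p)}
    for row_id, keyword in zip(df_ids["row_id"], df_ids["Keyword"])
    for p in permutations(keyword.split())
])

# Step 3: join the orderings back to the URLs on the ID, then drop the ID.
result = (
    df_ids[["row_id", "URL"]]
    .merge(perms, on="row_id")
    .drop(columns="row_id")
)
print(result)
```

The ID matters when URLs are not unique: joining on the URL itself would cross-multiply rows that share a URL.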
Joost Döbken’s answer is a bit more elegant
First, the itertools.permutations function applied to the Keyword column creates all possible orderings of the keywords as a list. Next, you can use the pandas.DataFrame.explode function to turn each item of the created lists into its own row. If you really want a full string instead of a tuple of keywords, you can replace the list(...) part with a string join: [" ".join(t) for t in permutations(x.split())]
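A minimal sketch of the approach described above, using the string-join variant so that each row ends up holding a plain string. The exact code from the answer is not shown here, so treat the variable names and the final `reset_index` call as assumptions.

```python
from itertools import permutations

import pandas as pd

df = pd.DataFrame({
    "URL": ["http://www.amazon.com", "https://www.ebay.com"],
    "Keyword": ["Amazon Lightning Sale", "Shop eBay Today"],
})

# Replace each keyword string with a list of all of its word orderings.
expanded = df.assign(
    Keyword=df["Keyword"].apply(
        lambda x: [" ".join(t) for t in permutations(x.split())]
    )
)

# explode() turns each list element into its own row and repeats the URL;
# reset_index gives the 0..11 numbering from the desired output.
result = expanded.explode("Keyword").reset_index(drop=True)
print(result)
```

Note that the number of rows grows factorially with the number of words (3 words give 6 rows, 6 words give 720), so long keyword strings can make the result very large.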
Here is an alternative way without using itertools:
Output:
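One way an itertools-free version could look is sketched below. It is not the answerer's original code: the recursive helper `word_orderings` is a hypothetical name introduced for illustration, and it simply generates every ordering of a word list by picking each word in turn as the first element.

```python
import pandas as pd


def word_orderings(words):
    """Yield every ordering of `words` as a list (hypothetical helper)."""
    if len(words) <= 1:
        yield list(words)
        return
    for i, first in enumerate(words):
        # Fix one word in front, then order the remaining words recursively.
        rest = words[:i] + words[i + 1:]
        for tail in word_orderings(rest):
            yield [first] + tail


df = pd.DataFrame({
    "URL": ["http://www.amazon.com", "https://www.ebay.com"],
    "Keyword": ["Amazon Lightning Sale", "Shop eBay Today"],
})

# Build one output row per (URL, ordering) pair.
rows = [
    {"URL": url, "Keyword": " ".join(order)}
    for url, keyword in zip(df["URL"], df["Keyword"])
    for order in word_orderings(keyword.split())
]
result = pd.DataFrame(rows)
print(result)
```

For the example data this produces 12 rows, six per URL, with the original word order appearing first for each URL.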