I’m trying to sort a list by frequency and then by name (pandas 1.3.2, Python 3.10).
First I count the occurrences of each item in the list; then, when counts are equal, the names must be ordered alphabetically.
I found that everything works when len(list) < 19. Magic…
Code:
import pandas
df_data = pandas.DataFrame({
    'data': [
        '14209adobepremiere', 'adobe-flash-player', 'adobe-flash-player-cis',
        'adobe-photoshop-cc-cis', 'discord', 'discord', 'driverpack',
        'freeoffice', 'freeoffice2018', 'generals',
        'tiktok-for-pc-cis', 'tlauncher', 'utorrent', 'viber',
        'winrar', 'zoom', 'zoom', 'zoom-client-for-conferences',
        'zoom-client-for-conferences-cis',
    ]
})
with pandas.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df_data['data'].value_counts().sort_index(
        ascending=True,
    ).sort_values(ascending=False))
Expected output (by count desc, then alphabetically asc):
discord 2
zoom 2
14209adobepremiere 1
adobe-flash-player 1
adobe-flash-player-cis 1
adobe-photoshop-cc-cis 1
driverpack 1
freeoffice 1
freeoffice2018 1
generals 1
tiktok-for-pc-cis 1
tlauncher 1
utorrent 1
viber 1
winrar 1
zoom-client-for-conferences 1
zoom-client-for-conferences-cis 1
Name: data, dtype: int64
Real output (by count desc, but not alphabetically asc):
zoom 2
discord 2
14209adobepremiere 1
tiktok-for-pc-cis 1
zoom-client-for-conferences 1
winrar 1
viber 1
utorrent 1
tlauncher 1
generals 1
adobe-flash-player 1
freeoffice2018 1
freeoffice 1
driverpack 1
adobe-photoshop-cc-cis 1
adobe-flash-player-cis 1
zoom-client-for-conferences-cis 1
Name: data, dtype: int64
Thanks in advance for any help.
2 Answers
I don’t think you can chain the .sort_values operations on the index and then on the data. One method could be to reset the index, sort, and reapply the index. For counting frequencies only, you could use a collections.Counter object on the list directly and, if needed, convert the result to a pandas.DataFrame.
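The two approaches described above (reset the index, sort, reapply the index; or count with collections.Counter) might look roughly like the sketch below. It is only an illustration: the shortened sample list and the column names 'name' and 'count' are assumptions made here, not something from the question.

```python
from collections import Counter

import pandas

# Shortened sample list (an assumption for illustration, not the full data).
names = ['zoom', 'discord', 'viber', 'zoom', 'discord', 'adobe-flash-player']

# pandas route: count, move the index into a column, sort on both keys at
# once, then put the names back as the index.
counts = pandas.Series(names, name='data').value_counts()
# 'name' and 'count' are hypothetical column names chosen for this sketch.
table = counts.rename_axis('name').reset_index(name='count')
table = table.sort_values(['count', 'name'], ascending=[False, True])
by_frequency = table.set_index('name')['count']
print(by_frequency)

# Counter route: sort the (name, count) pairs by count descending,
# then by name ascending, and convert to a DataFrame only if needed.
pairs = sorted(Counter(names).items(), key=lambda item: (-item[1], item[0]))
counter_frame = pandas.DataFrame(pairs, columns=['name', 'count'])
print(counter_frame)
```

Sorting on both keys in a single call avoids relying on the second sort to keep tied rows in their earlier order. A likely reason the chained version in the question misbehaves is that sort_values defaults to kind='quicksort', which does not guarantee that ties stay in place.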