Pandas is a very effective tool for handling tabular data like this. You could create a pandas DataFrame from the data:
import pandas as pd
df = pd.DataFrame(d).T
df.columns = ('group', 'place', 'value')
and then select the rows that hold the maximum value of each group:
df[df['value'] == df.groupby('group')['value'].transform('max')]
which gives
Out[41]:
    group                                    place     value
1       1                                  Toronto       0.8
10      2                                 Arkansas  0.887141
14      3                                  Toronto       0.8
23      4          Nashville International Airport       0.8
27      4                                   Turkey       0.8
30      5                         Hebron, Kentucky       0.8
38      5                   County (United States)       0.8
42      6                                  Ontario       0.8
48      7                                   London  0.954585
58      8                 UNITED STATES OF AMERICA  0.845463
61      9      RaleighDurham International Airport       0.8
65      9                                    Piauí       0.8
72     10                               New Jersey       0.8
81     10  Morrisville, Bucks County, Pennsylvania       0.8
84     10                            United States       0.8
88     10                New Brunswick, New Jersey       0.8
If you want to get the output in the original format, you can transpose the result and use df.to_dict:
In [47]: df[df['value'] == df.groupby('group')['value'].transform('max')].T.to_dict(orient='list')
Out[47]:
{1: ['1', 'Toronto', 0.8],
 10: ['2', 'Arkansas', 0.8871413469314575],
 14: ['3', 'Toronto', 0.8],
 23: ['4', 'Nashville International Airport', 0.8],
 27: ['4', 'Turkey', 0.8],
 30: ['5', 'Hebron, Kentucky', 0.8],
 38: ['5', 'County (United States)', 0.8],
 42: ['6', 'Ontario', 0.8],
 48: ['7', 'London', 0.9545853137969971],
 58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
 61: ['9', 'RaleighDurham International Airport', 0.8],
 65: ['9', 'Piauí', 0.8],
 72: ['10', 'New Jersey', 0.8],
 81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
 84: ['10', 'United States', 0.8],
 88: ['10', 'New Brunswick, New Jersey', 0.8]}
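To see why the transpose comes before to_dict, here is a minimal sketch on a two-row frame. The data and the name as_dict are made up for illustration and only mimic the shape of the result above.

```python
import pandas as pd

# Hypothetical two-row result; the index labels 1 and 10 stand in for
# the original dictionary keys.
df = pd.DataFrame(
    {'group': ['1', '2'], 'place': ['Toronto', 'Arkansas'], 'value': [0.8, 0.9]},
    index=[1, 10],
)

# to_dict(orient='list') maps each column label to the list of that
# column's values. Transposing first turns every row into a column,
# so each original key ends up mapped to its [group, place, value] list.
as_dict = df.T.to_dict(orient='list')
print(as_dict)
```

Without the .T, the keys would be the column names 'group', 'place' and 'value' instead of the original row keys.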
Short explanation
- Pandas DataFrames can be created from dictionaries whose values are lists. Each dictionary value becomes a column, so .T transposes the table to turn those columns into rows.
- df.groupby('group')['value'] returns a SeriesGroupBy object, which behaves very much like a regular pandas.Series object. Calling its transform method with 'max' calculates the maximum value for each group and returns it aligned with the original rows.
- df['value'] == df.groupby('group')['value'].transform('max') creates a boolean mask that selects the maximum rows via df[mask].
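The steps above can be followed one at a time on a small example. This is only a sketch: the sample dictionary is made up, and it assumes d has the shape {key: [group, place, value]}, as the to_dict output suggests. The names group_max, mask and result are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample in the assumed shape of d: {key: [group, place, value]}
d = {
    0: ['1', 'Toronto', 0.8],
    1: ['1', 'Ottawa', 0.5],
    2: ['2', 'Arkansas', 0.9],
    3: ['2', 'Texas', 0.9],
    4: ['2', 'Ohio', 0.7],
}

# Each dict value becomes a column, so transpose to get one row per entry.
df = pd.DataFrame(d).T
df.columns = ['group', 'place', 'value']
# After the transpose every column has object dtype; converting 'value'
# to float is an extra step added here to keep the comparison numeric.
df['value'] = df['value'].astype(float)

# transform('max') returns a Series with the same index as df, where each
# row holds the maximum 'value' of that row's group.
group_max = df.groupby('group')['value'].transform('max')

# Rows whose value equals their group maximum; ties are all kept.
mask = df['value'] == group_max
result = df[mask]
print(result)
```

Because the mask keeps every row that ties for the maximum, group '2' contributes two rows here, just as groups 4, 5, 9 and 10 do in the output above.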