Python Pandas

Take a look at this documentation about Group By

Group series using mapper (dict or key function, apply given function
to group, return result as series) or by a series of columns

The description above is quoted from that documentation.

Here's a quick example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 3, 3, 3, 3], 'b': np.random.randn(10)})

df
   a         b
0  1  1.048099
1  1 -0.830804
2  1  1.007282
3  2 -0.470914
4  2  1.948448
5  2 -0.144317
6  3 -0.645503
7  3 -1.694219
8  3  0.375280
9  3 -0.065624

groups = df.groupby('a')

groups  # shows what df.groupby('a') returns (a GroupBy object); this is not an error
<pandas.core.groupby.DataFrameGroupBy object at 0x00000000097EEB38>
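The GroupBy object does not print its contents, so it can help to look inside it. The following is a minimal sketch of two ways to do that, assuming a small frame with fixed values (not the random ones above) so the result is reproducible. The names sizes and twos are only illustrative.

```python
import pandas as pd

# Fixed values so the outcome does not depend on a random draw.
df = pd.DataFrame({'a': [1, 1, 2, 2, 2],
                   'b': [10.0, 20.0, 30.0, 40.0, 50.0]})
groups = df.groupby('a')

# Iterating yields (key, sub-DataFrame) pairs, one per distinct value of 'a'.
sizes = {int(key): len(sub) for key, sub in groups}

# get_group returns only the rows that belong to a single key.
twos = groups.get_group(2)
```

Here sizes maps each value of a to its row count, and twos holds the three rows where a is 2.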

groups.count()  # number of non-null b values in each group of a
   b
a   
1  3
2  3
3  4

groups.sum()  # sum of the b values within each group of a
          b
a          
1  1.224577
2  1.333217
3 -2.030066
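If you want several aggregations at once, one option is agg with a list of function names. This is a rough sketch under the same assumption of a small frame with fixed values. The variable name summary is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2, 2],
                   'b': [10.0, 20.0, 30.0, 40.0, 50.0]})

# Select column 'b' within each group, then compute three aggregations.
# The result has one row per value of 'a' and one column per aggregation.
summary = df.groupby('a')['b'].agg(['count', 'sum', 'mean'])
```

The result is an ordinary DataFrame indexed by a, so you can look up a single value with something like summary.loc[2, 'mean'].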

You get the idea. You can build from here using the documentation I pointed to at the top.

df_count = groups.count()

df_count
   b
a   
1  3
2  3
3  4

type(df_count)  # the output of .count() is a new DataFrame, stored here in df_count
pandas.core.frame.DataFrame
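The documentation excerpt quoted near the top also mentions grouping a Series with a mapper, either a dict or a key function. As a hedged illustration, the sketch below groups by index label both ways. The Series s, the dict kind, and the labels in them are made up for this example.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0],
              index=['apple', 'avocado', 'beet', 'banana'])

# Hypothetical dict mapper: index label -> group name.
kind = {'apple': 'fruit', 'avocado': 'fruit',
        'banana': 'fruit', 'beet': 'vegetable'}
by_dict = s.groupby(kind).sum()

# Key function: called once per index label; here it groups by first letter.
by_func = s.groupby(lambda label: label[0]).sum()
```

Both calls return a Series whose index holds the group names produced by the mapper.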
