How to return NaN if all values are NaN using the agg() function specifying aggregation output columns
I have a dataframe like so:
import numpy as np
import pandas as pd

data = {'Integers': [1, 2, np.nan, 4, 5],
        'AllNaN': [np.nan, np.nan, np.nan, np.nan, np.nan]}
df = pd.DataFrame(data)
I want the sum aggregation to return NaN for a column whose values are all NaN. There are solutions on here that advise using agg(pd.Series.sum, min_count=1). However, my aggregations use the alternate agg syntax that names the output columns, like so:
agg_df = df.agg(SummedInt=('Integers', 'sum'), sumofallNaN=('AllNaN', 'sum')).reset_index()
How do I use the min_count=1 argument with this method?
Answer
IIUC, you can use a lambda function like this:
df.agg(SummedInt=('Integers', lambda x: x.sum(min_count=1)))
Output:
           Integers
SummedInt      12.0
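To see what min_count=1 changes, here is a minimal sketch that compares the default sum with the min_count=1 sum on the question's data. The helper name nan_sum is hypothetical (not part of pandas); it only wraps Series.sum(min_count=1) so it can be passed wherever a lambda would go.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Integers': [1, 2, np.nan, 4, 5],
                   'AllNaN': [np.nan] * 5})

def nan_sum(s):
    # Hypothetical helper: sum that yields NaN unless at least one
    # non-NaN value is present (min_count=1).
    return s.sum(min_count=1)

# Default sum skips NaN and treats an all-NaN column as an empty sum.
default_sum = df['AllNaN'].sum()      # 0.0

# With min_count=1, an all-NaN column sums to NaN instead.
strict_sum = nan_sum(df['AllNaN'])    # nan

# Columns with at least one valid value are unaffected.
int_sum = nan_sum(df['Integers'])     # 12.0

# If custom output names are not needed, DataFrame.sum accepts
# min_count directly and returns one value per column.
col_sums = df.sum(min_count=1)
```

A named function such as nan_sum can stand in for the lambda in the agg call above, which may be easier to read when the same rule applies to several columns.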