How can I calculate averages and min/max values in a climate DataFrame after creating an index on date-time in pandas?

I have a climate data DataFrame in CSV format with no headers, so I read the data as follows:
df_MERRA_data = pd.read_csv(fname,delimiter=',',header=None)
df_MERRA_data.columns = \
['Date-time','Temperature','Wind speed','Sun shine','Precipitation','Relative Humidity']
Then I create an index:
def extract_YYYYMMDD(Date_val):
    year_val = np.rint(Date_val/1000000).astype(int)
    month_val = np.rint(Date_val/10000).astype(int) - year_val*100
    day_val = np.rint(Date_val/100).astype(int) - year_val*10000 - month_val*100
    hour_val = np.rint(Date_val).astype(int) - year_val*1000000 - month_val*10000 - day_val*100
    return year_val, month_val, day_val, hour_val
year, month, day, hour = extract_YYYYMMDD(df_MERRA_data['Date-time'])
df_MERRA_data['Year'] = year
df_MERRA_data['Month'] = month
df_MERRA_data['Day'] = day
df_MERRA_data['Hour'] = hour
Now make an index based on date and time
df_MERRA_data['Date-time'] = pd.to_datetime(df_MERRA_data[['Year','Month','Day','Hour']], errors='coerce')\
.dt.strftime('%Y-%m-%d %H')
df_MERRA_data.set_index(['Date-time'], inplace=True)
*Then later when I want to extract data on the index I get the error:*
df_MERRA_year = df_MERRA_data['Date-time'].dt.year
KeyError: 'Date-time'
These are the top three rows in the dataframe:
df_MERRA_data.head(3)
Out[3]:
               Temperature  Wind speed  Sun shine  ...  Month  Day  Hour
Date-time                                          ...
1985-01-01 00         17.5        14.4         87  ...      1    1     0
1985-01-01 01         17.3        14.4         88  ...      1    1     1
1985-01-01 02         17.1        14.4         88  ...      1    1     2
[3 rows x 9 columns]
Can someone please help me access the records via the index so that I can determine yearly, monthly, weekly and daily averages, maximums, minimums, etc.?
Thank you in advance, I am a Pandas Novice.
I have tried renaming the Date-time column to Date_time and datetime with no success. I have also tried the code without
df_MERRA_data.set_index(['Date-time'], inplace=True)
Thank you
Answer
While searching for a solution to your problem, I found an article on this. It says that you can do the aggregation via resampling, for example:
df.resample('Y').mean()
You can change the resample frequency, for example:
- D (daily)
- W (weekly)
- M (monthly)
- Q (quarterly)
- A (yearly)
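One thing to watch with the code in the question: resample needs a real DatetimeIndex, and .dt.strftime turns the timestamps back into plain strings. Also, once set_index has moved a column into the index, it is no longer reachable as df['Date-time'], which is what raises the KeyError. Below is a minimal sketch, assuming the raw values are integers of the form YYYYMMDDHH (as extract_YYYYMMDD suggests). The sample rows and the names raw and df are made up for illustration. I use 'MS' and 'YS' (month start, year start) because, as far as I know, 'M', 'Y' and 'A' were deprecated in pandas 2.2 in favour of 'ME' and 'YE'.

```python
import pandas as pd

# Hypothetical sample standing in for the MERRA file: hourly rows whose
# timestamp is stored as an integer of the form YYYYMMDDHH (an assumption).
raw = pd.DataFrame({
    "Date-time": [1985010100, 1985010101, 1985010102, 1985010200],
    "Temperature": [17.5, 17.3, 17.1, 16.0],
})

# Parse the integers directly. Do not call .dt.strftime afterwards,
# because that converts the timestamps back into strings.
raw["Date-time"] = pd.to_datetime(raw["Date-time"].astype(str), format="%Y%m%d%H")
df = raw.set_index("Date-time")

# After set_index the values live in df.index, not df['Date-time'].
# A DatetimeIndex exposes .year directly, without the .dt accessor.
years = df.index.year

# Averages at different frequencies
daily_mean = df.resample("D").mean()
weekly_mean = df.resample("W").mean()
monthly_mean = df.resample("MS").mean()  # labelled by month start
yearly_mean = df.resample("YS").mean()   # labelled by year start

print(daily_mean)
```

If you would rather keep Date-time as a regular column, df.resample('D', on='Date-time') should also work, as long as the column holds datetimes and not strings.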
Here you can read more about aggregation. The function for the average is mean, for the maximum is max and for the minimum is min.
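If you want all three statistics at once, agg accepts a list of function names. This is only a sketch on made-up hourly data. The Temperature values and the names idx, df and daily_stats are hypothetical, and it assumes the frame already has a DatetimeIndex.

```python
import pandas as pd

# Hypothetical hourly data: two full days, temperature equal to the hour of day
idx = pd.date_range("1985-01-01", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"Temperature": [float(h % 24) for h in range(48)]}, index=idx)

# One row per day, one column per statistic
daily_stats = df["Temperature"].resample("D").agg(["mean", "max", "min"])

print(daily_stats)
```

Swapping "D" for another frequency gives the weekly, monthly or yearly equivalents.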