How can I calculate averages and min/max values in a climate DataFrame after creating an index on date-time in pandas?

I have a climate DataFrame stored in CSV format without headers, so I read the data as follows:

    df_MERRA_data = pd.read_csv(fname,delimiter=',',header=None)
    df_MERRA_data.columns = \
        ['Date-time','Temperature','Wind speed','Sun shine','Precipitation','Relative Humidity']

Then I create an index:

    def extract_YYYYMMDD(Date_val):
        year_val = np.rint(Date_val/1000000).astype(int)
        month_val = np.rint(Date_val/10000).astype(int) - year_val*100
        day_val = np.rint(Date_val/100).astype(int) - year_val*10000 - month_val*100
        hour_val = np.rint(Date_val).astype(int) - year_val*1000000 - month_val*10000 - day_val*100
        return year_val, month_val, day_val, hour_val

    year, month, day, hour = extract_YYYYMMDD(df_MERRA_data['Date-time'])
    df_MERRA_data['Year'] = year
    df_MERRA_data['Month'] = month
    df_MERRA_data['Day'] = day
    df_MERRA_data['Hour'] = hour

Now I make an index based on date and time:

    df_MERRA_data['Date-time'] = pd.to_datetime(
        df_MERRA_data[['Year','Month','Day','Hour']], errors='coerce'
    ).dt.strftime('%Y-%m-%d %H')
    df_MERRA_data.set_index(['Date-time'], inplace=True)

*Then later when I want to extract data on the index I get the error:*

    df_MERRA_year = df_MERRA_data['Date-time'].dt.year

KeyError: 'Date-time'

These are the top three rows in the dataframe:

    df_MERRA_data.head(3)

    Out[3]:

                   Temperature  Wind speed  Sun shine  ...  Month  Day  Hour
    Date-time                                          ...
    1985-01-01 00         17.5        14.4         87  ...      1    1     0
    1985-01-01 01         17.3        14.4         88  ...      1    1     1
    1985-01-01 02         17.1        14.4         88  ...      1    1     2

    [3 rows x 9 columns]

Can someone please help me access the records through the index so that I can determine yearly, monthly, weekly and daily averages, maximum, minimum, etc.?

Thank you in advance, I am a Pandas Novice.

I have changed the Date-time column name to Date_time and to datetime, with no success. I have also tried the code without

    df_MERRA_data.set_index(['Date-time'], inplace=True)

Thank you

Answer

Searching for a solution to your problem, I found an article which says that you can do the aggregation via resampling, for example:

    df.resample('Y').mean()
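One caveat: resample needs a real datetime index. In the question, the index was turned into strings by .dt.strftime, and after set_index 'Date-time' is no longer a column, which is why df_MERRA_data['Date-time'] raises a KeyError. Below is a minimal sketch of one way around both points. It assumes the raw column holds integers of the form YYYYMMDDHH (as the extraction function in the question suggests), and the sample values and the names raw and df are made up for illustration:

```python
import pandas as pd

# Hypothetical sample standing in for the headerless CSV from the question.
raw = pd.DataFrame({
    "Date-time": [1985010100, 1985010101, 1985010200, 1986010100],
    "Temperature": [17.5, 17.3, 16.0, 18.0],
})

# Parse the YYYYMMDDHH integers directly and keep them as datetimes
# (no strftime, which would convert them back to plain strings).
raw["Date-time"] = pd.to_datetime(raw["Date-time"].astype(str), format="%Y%m%d%H")
df = raw.set_index("Date-time")

# The dates now live in the index, so use df.index instead of df['Date-time'].
years = df.index.year

# Daily means via resample, which relies on the DatetimeIndex built above.
daily_mean = df["Temperature"].resample("D").mean()
```

With the index kept as datetimes, df.index.year, df.index.month and so on are available directly, and the separate Year, Month, Day and Hour columns are not needed for the aggregation.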

You can change the resample frequency, for example:

  • D (daily)
  • W (weekly)
  • M (monthly)
  • Q (quarterly)
  • A or Y (yearly)

The aggregation function for the average is mean, for the maximum it is max, and for the minimum it is min.
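To get several statistics at once, the resampled object can be passed a list of function names through agg. The following is only a sketch with made-up hourly temperatures; the names idx and df are placeholders. As far as I know, pandas 2.2 and later prefer the aliases 'ME', 'QE' and 'YE' over 'M', 'Q' and 'A'/'Y', so the yearly figures here are computed with groupby on the index year to avoid depending on a particular alias:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 48 hourly temperature values starting 1985-01-01 00:00.
idx = pd.Timestamp("1985-01-01") + pd.to_timedelta(np.arange(48), unit="h")
df = pd.DataFrame({"Temperature": np.arange(48, dtype=float)}, index=idx)

# Daily and weekly mean, minimum and maximum in one call each.
daily = df["Temperature"].resample("D").agg(["mean", "min", "max"])
weekly = df["Temperature"].resample("W").agg(["mean", "min", "max"])

# Yearly statistics, grouped by the year taken from the datetime index.
yearly = df["Temperature"].groupby(df.index.year).agg(["mean", "min", "max"])

print(daily)
```

Each result is a DataFrame with one row per period and one column per statistic, so daily["max"] or yearly.loc[1985, "mean"] pick out individual values.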
