
Temperatures taken from this website:

The dataset is updated monthly; the August temperatures will be added in early September.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
```

Import and inspect the data


```python
url = ''

df = pd.read_csv('./data/export-lyon.csv', header=3, parse_dates=['DATE'], index_col='DATE')
df.info()
```

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3499 entries, 2009-01-01 to 2018-07-31
Data columns (total 17 columns):
MAX_TEMPERATURE_C         3499 non-null int64
MIN_TEMPERATURE_C         3499 non-null int64
WINDSPEED_MAX_KMH         3499 non-null int64
TEMPERATURE_MORNING_C     3499 non-null int64
TEMPERATURE_NOON_C        3499 non-null int64
TEMPERATURE_EVENING_C     3499 non-null int64
PRECIP_TOTAL_DAY_MM       3499 non-null float64
HUMIDITY_MAX_PERCENT      3499 non-null int64
VISIBILITY_AVG_KM         3499 non-null float64
PRESSURE_MAX_MB           3499 non-null int64
CLOUDCOVER_AVG_PERCENT    3499 non-null float64
HEATINDEX_MAX_C           3499 non-null int64
DEWPOINT_MAX_C            3499 non-null int64
WINDTEMP_MAX_C            3499 non-null int64
WEATHER_CODE_MORNING      3499 non-null int64
WEATHER_CODE_NOON         3499 non-null int64
WEATHER_CODE_EVENING      3499 non-null int64
dtypes: float64(3), int64(14)
memory usage: 492.0 KB

What we are most interested in are min/max temperatures:

```python
n_days = 15
win_type = None
df[['MAX_TEMPERATURE_C', 'MIN_TEMPERATURE_C']].rolling(window=n_days, win_type=win_type).mean().plot(figsize=(14, 7))
```



Heat wave and warm day definitions

Meteorological services warn of a heat wave risk when, for at least three days, minimum temperatures (especially at night) stay above 20 °C and maximum temperatures exceed 33 °C.

From here (in French).

So let's define a warm day as a day with `MAX_TEMPERATURE_C >= 33` and `MIN_TEMPERATURE_C >= 20`. A heat wave consists of at least 3 consecutive warm days.

Number of warm days per year

```python
df['WARM_DAY'] = False
df.loc[(df.MAX_TEMPERATURE_C >= 33) & (df.MIN_TEMPERATURE_C >= 20), 'WARM_DAY'] = True
```

```python
print(len(df[df.WARM_DAY]), 'warm days between', df.index[0].date(), 'and', df.index[-1].date())
```

97 warm days between 2009-01-01 and 2018-07-31

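```python
yearly_count = df['WARM_DAY'].resample('Y').sum().to_frame()
yearly_count['year'] = yearly_count.index.year
yearly_count.set_index('year', inplace=True)
# The start of the plot call was lost in extraction; a bar chart of width 14 is assumed
yearly_count.plot.bar(figsize=(14, 7), legend=False)
plt.xlabel("Year")
plt.ylabel("Number of warm days per year")
```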



Number of high heat index days per year ("apparent temperature")

Effects of the heat index, from Wikipedia:

| Heat Index | Effect |
|--- | --- |
| 27–32 °C | Caution: fatigue is possible with prolonged exposure and activity. Continuing activity could result in heat cramps. |
| 32–41 °C | Extreme caution: heat cramps and heat exhaustion are possible. Continuing activity could result in heat stroke. |
| 41–54 °C | Danger: heat cramps and heat exhaustion are likely; heat stroke is probable with continued activity. |
| over 54 °C | Extreme danger: heat stroke is imminent. |
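The bands in this table map naturally onto `pd.cut`. The following is a minimal sketch on made-up heat index values (not taken from the dataset); the bin edges follow the table and are treated as left-closed intervals, and the label strings are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Made-up daily maximum heat index values in °C (not taken from the dataset)
heat_index = pd.Series([25, 27, 31, 32, 40, 41, 45])

# Left-closed bins [27, 32), [32, 41), [41, 54), [54, inf), following the table
bins = [27, 32, 41, 54, np.inf]
labels = ['caution', 'extreme caution', 'danger', 'extreme danger']

# Values below 27 fall outside every bin and are left as NaN
category = pd.cut(heat_index, bins=bins, labels=labels, right=False)

# Number of days per band, in the order of the table
print(category.value_counts(sort=False))
```

Compared with one boolean column per band, a single categorical column keeps the bands mutually exclusive by construction.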

```python
df['WARM_DAY_HI_27-31'] = False
df['WARM_DAY_HI_32-40'] = False
df['WARM_DAY_HI_over_41'] = False
df.loc[(df.HEATINDEX_MAX_C >= 27) & (df.HEATINDEX_MAX_C < 32), 'WARM_DAY_HI_27-31'] = True
df.loc[(df.HEATINDEX_MAX_C >= 32) & (df.HEATINDEX_MAX_C < 41), 'WARM_DAY_HI_32-40'] = True
df.loc[df.HEATINDEX_MAX_C >= 41, 'WARM_DAY_HI_over_41'] = True
```

```python
yearly_count_HI = df[['WARM_DAY_HI_27-31', 'WARM_DAY_HI_32-40', 'WARM_DAY_HI_over_41']].resample('Y').sum()
yearly_count_HI['year'] = yearly_count_HI.index.year
yearly_count_HI.set_index('year', inplace=True)
```

```python
renaming = {'WARM_DAY': 'warm day count',
            'WARM_DAY_HI_27-31': 'days with H.I. in [27, 32)',
            'WARM_DAY_HI_32-40': 'days with H.I. in [32, 41)',
            'WARM_DAY_HI_over_41': 'days with H.I. over 41'}
# The start of the plot call was lost in extraction; a bar chart of width 14 is assumed
pd.merge(yearly_count, yearly_count_HI, on='year').rename(renaming, axis=1).plot.bar(figsize=(14, 7))
```



It is rather difficult to use the heat index because it may differ a lot from the air temperature. Let us stick to the min/max temperatures.
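To illustrate why the heat index can drift far from the air temperature, here is a sketch of the Rothfusz regression used by the US National Weather Service, which combines temperature and relative humidity. Whether the data provider computes `HEATINDEX_MAX_C` this way is not known, so treat this as an illustration only. The function names are hypothetical, and the regression is only meant for warm, humid conditions.

```python
def heat_index_f(temp_f, rel_humidity):
    """Rothfusz regression: heat index in °F from air temperature (°F)
    and relative humidity (%)."""
    t, r = temp_f, rel_humidity
    return (-42.379 + 2.04901523 * t + 10.14333127 * r
            - 0.22475541 * t * r - 6.83783e-3 * t**2 - 5.481717e-2 * r**2
            + 1.22874e-3 * t**2 * r + 8.5282e-4 * t * r**2
            - 1.99e-6 * t**2 * r**2)


def heat_index_c(temp_c, rel_humidity):
    """Same regression with Celsius input and output."""
    temp_f = temp_c * 9 / 5 + 32
    return (heat_index_f(temp_f, rel_humidity) - 32) * 5 / 9


# Same air temperature (made-up input), two humidity levels
for humidity in (40, 80):
    print(humidity, round(heat_index_c(33, humidity), 1))
```

At a fixed air temperature the result rises with humidity, which is why a day can land in a high heat index band without meeting the min/max temperature thresholds used here.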

xkcd cartoon

Longest heat wave per year

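```python
def count_contiguous_warm_days(vec):
    """Label each warm day with the length of the warm spell it belongs to."""
    n = vec.shape[0]
    CONT_WARM_DAYS = np.zeros(n, dtype=int)
    # Forward pass: running count of consecutive warm days
    count = 0
    for i in range(n):
        if vec[i] == 1:
            count += 1
            CONT_WARM_DAYS[i] = count
        else:
            count = 0
    # Backward pass: propagate the final count of each spell to all of its days
    for i in range(n - 2, -1, -1):
        if CONT_WARM_DAYS[i] > 0 and CONT_WARM_DAYS[i + 1] > 0:
            CONT_WARM_DAYS[i] = CONT_WARM_DAYS[i + 1]
    return CONT_WARM_DAYS
```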



```python
df['CONT_WARM_DAYS'] = count_contiguous_warm_days(df.WARM_DAY.values)
```
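The labelling done by `count_contiguous_warm_days` can also be written without explicit loops. The sketch below is an alternative, not the approach used in this post: it numbers the runs of identical values and broadcasts each run's size back to its days. The helper name `label_run_lengths` and the toy flags are assumptions made for illustration.

```python
import pandas as pd


def label_run_lengths(flags):
    """Label each True entry with the length of the run of Trues it belongs to."""
    flags = pd.Series(flags).astype(bool)
    # A new run starts wherever the value differs from the previous entry
    run_id = flags.astype(int).diff().ne(0).cumsum()
    # Size of each run, broadcast back to every element of that run
    run_len = flags.groupby(run_id).transform('size')
    # Keep the run length on warm days only; other days get 0
    return run_len.where(flags, 0).astype(int)


# Made-up flags: one 2-day warm spell and one 3-day warm spell
toy = [False, True, True, False, True, True, True, False]
print(label_run_lengths(toy).tolist())  # → [0, 2, 2, 0, 3, 3, 3, 0]
```

Only the second spell reaches the 3-day threshold, so only its days would count as heat wave days.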

```python
longest_heatwave = df['CONT_WARM_DAYS'].resample('Y').max().fillna(0).to_frame()
longest_heatwave['year'] = longest_heatwave.index.year
longest_heatwave.set_index('year', inplace=True)
```

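```python
# The start of the plot call was lost in extraction; a bar chart of width 14 is assumed
longest_heatwave.sort_values('CONT_WARM_DAYS', ascending=False)[:5].plot.bar(figsize=(14, 5), legend=False)
plt.xlabel("Top 5 years")
plt.ylabel("Longest heat wave per year (days)")
```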



Cumulative number of heat wave days per year

Here we only consider days in heat waves (with at least 3 consecutive warm days).

```python
title = 'Cumulative number of heat wave days'
heatwave_days = df[df.CONT_WARM_DAYS >= 3].CONT_WARM_DAYS.resample('Y').count()
heatwave_days.name = title
heatwave_days = heatwave_days.to_frame()
heatwave_days['year'] = heatwave_days.index.year
heatwave_days.set_index('year', inplace=True)
```

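```python
# The start of the plot call was lost in extraction; a bar chart of width 14 is assumed
heatwave_days.sort_values(title, ascending=False)[:5].plot.bar(figsize=(14, 5), legend=False)
plt.xlabel("Top 5 years")
plt.ylabel(title)
```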



Peak temperature

```python
yearly_peak = df['MAX_TEMPERATURE_C'].resample('Y').max().to_frame()
yearly_peak['year'] = yearly_peak.index.year
yearly_peak.set_index('year', inplace=True)
```

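```python
# The start of the plot call was lost in extraction; a bar chart of width 14 is assumed
yearly_peak.sort_values("MAX_TEMPERATURE_C", ascending=False)[:5].plot.bar(figsize=(14, 5), legend=False)
plt.xlabel("Top 5 years")
plt.ylabel("Peak temperature")
```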



Average temperature in June, July and August

The formula used to compute the monthly temperature is not very sophisticated: we average the daily minimum and the daily maximum temperatures over the month, then take the mean of these two monthly averages.
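As a quick worked example of this formula, here is a sketch on a few made-up daily values (not taken from the dataset). It groups by calendar year and month rather than resampling, which is a choice made for this illustration only.

```python
import pandas as pd

# Made-up daily values for two short "months" (not taken from the dataset)
toy = pd.DataFrame(
    {'MAX_TEMPERATURE_C': [30, 32, 28, 26],
     'MIN_TEMPERATURE_C': [18, 20, 16, 14]},
    index=pd.to_datetime(['2018-06-01', '2018-06-02', '2018-07-01', '2018-07-02']),
)

# Step 1: monthly average of the daily maxima and of the daily minima
monthly = toy.groupby([toy.index.year, toy.index.month]).mean()

# Step 2: mean of the two monthly averages
monthly['MEAN_TEMP'] = monthly[['MAX_TEMPERATURE_C', 'MIN_TEMPERATURE_C']].mean(axis=1)

print(monthly['MEAN_TEMP'].tolist())  # → [25.0, 21.0]
```

For June the monthly averages are 31 and 19, giving 25; for July they are 27 and 15, giving 21.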

```python
df_month = df[['MAX_TEMPERATURE_C', 'MIN_TEMPERATURE_C']].resample('M').mean()
df_month['MEAN_TEMP'] = df_month.mean(axis=1)
df_month['year'] = df_month.index.year
df_month['month'] = df_month.index.month
```

```python
df_june = df_month[df_month.month==6][['year', 'MEAN_TEMP']]
df_july = df_month[df_month.month==7][['year', 'MEAN_TEMP']]
df_august = df_month[df_month.month==8][['year', 'MEAN_TEMP']]
```

```python
df_summer_months = pd.DataFrame({'june': df_june.MEAN_TEMP.values}, index=df_june.year.values)
df_summer_months = df_summer_months.join(df_july.rename({'MEAN_TEMP': 'july'}, axis=1).set_index('year'))
df_summer_months = df_summer_months.join(df_august.rename({'MEAN_TEMP': 'august'}, axis=1).set_index('year'))
# The start of the plot call was lost in extraction; a bar chart of width 14 is assumed
df_summer_months.plot.bar(figsize=(14, 7))
plt.xlabel("Year")
plt.ylabel("Mean temperature of the summer months")
```




This month of July was hot...