Data Manipulation Tutorial

This notebook provides examples of how to carry out data manipulation and aggregation using the post_processing Python library. Be sure to go through the Quick Start section of the documentation for instructions on how to access and import the library and its packages.

If you would like to open an editable, runnable version of the tutorial, click here to be directed to a Binder platform.

The library is still under active development, and empty sections will be completed in due time.

All files are available in the GitHub repository here.

Requirements

The conda environment contains all libraries associated with the post processing library. After setting up the conda environment, you only have to import the data manipulation module from postprocessinglib.evaluation.

In this example, though, I will also import other modules to help generate the data that I will be analysing.

[2]:

import pandas as pd
from postprocessinglib.evaluation import data

GENERATE DATAFRAMES

This is the main overarching function that returns the required files in their respective formats for use by the other modules and functions in the library. In its simplest form, it requires a csv file that contains the predicted and measured data, referred to as the merged data, formatted as shown below:

merged data

Some datetime	station1_obs	station1_sim	station2_obs	station2_sim

or two csv files - one for observed and one for the simulated data, similarly formatted as shown below:

obs data

Some datetime	station1_obs	station2_obs

sim data

Some datetime	station1_sim	station2_sim

We then pass these into the generate_dataframes function as shown below:

[3]:

# passing a small controlled csv file with only two stations for testing
path = "MESH_output_streamflow_1.csv"

# csv_fpaths points to the merged csv file
DATAFRAMES = data.generate_dataframes(csv_fpaths=path)

The start date for the Data is 1980-01-01

By default, the function returns a dictionary that contains 3 dataframes - the Merged dataframe, the Observed dataframe and the Simulated dataframe, stored under the keys DF, DF_OBSERVED and DF_SIMULATED respectively, as demonstrated below:

[4]:

print("The Merged dataframe:")
print(DATAFRAMES["DF"].head(5))
print("\nThe Observed dataframe:")
print(DATAFRAMES["DF_OBSERVED"].head(5))
print("\nThe Simulated dataframe:")
print(DATAFRAMES["DF_SIMULATED"].head(5))

The Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
1980-01-01            15.0     940.934800             NaN     695.375700
1980-01-02            15.0    1000.471000             NaN     233.646000
1980-01-03            13.3     303.156100             NaN       5.282405
1980-01-04            11.8      13.658420             NaN       0.481292
1980-01-05            12.9       2.095571             NaN       0.298271

The Observed dataframe:
            QOMEAS_05BB001  QOMEAS_05BA001
1980-01-01            15.0             NaN
1980-01-02            15.0             NaN
1980-01-03            13.3             NaN
1980-01-04            11.8             NaN
1980-01-05            12.9             NaN

The Simulated dataframe:
            QOSIM_05BB001  QOSIM_05BA001
1980-01-01     940.934800     695.375700
1980-01-02    1000.471000     233.646000
1980-01-03     303.156100       5.282405
1980-01-04      13.658420       0.481292
1980-01-05       2.095571       0.298271
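The split between observed and simulated columns can be reproduced with plain pandas. The sketch below is not the library's implementation: it assumes that observed and simulated columns alternate as in the merged layout described earlier, and the station names and values are made up for illustration.

```python
import io
import pandas as pd

# Tiny merged file in the layout described above (hypothetical values).
csv_text = """date,QOMEAS_A,QOSIM_A,QOMEAS_B,QOSIM_B
1980-01-01,15.0,14.2,,6.9
1980-01-02,15.0,13.8,,6.5
1980-01-03,13.3,12.9,,6.1
"""

merged = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Observed and simulated columns alternate, so every other column can be sliced out.
observed = merged.iloc[:, 0::2]
simulated = merged.iloc[:, 1::2]

print(observed.columns.tolist())
print(simulated.columns.tolist())
```

Empty cells in the csv file are read as NaN, which is how a station without measurements appears in the Observed dataframe.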

You can also tell the function to skip the first few values by passing a value to the warm_up parameter. You can also specify a start and end date using the start_date and end_date parameters. These are useful when you want only a fixed time span, such as a particular year, or everything after or before a particular date. A few examples are shown below:

[5]:

# assuming the simulation model needs 366 days (the first year) to warm up and account for errors during the learning phase.
DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=366)

The start date for the Data is 1981-01-01

Observe that the data now skips the entire first year and starts from 1981-01-01 instead of 1980-01-01.
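As a rough illustration of what a warm-up of 366 daily records means, the following sketch drops the first 366 rows of a daily series with plain pandas. It assumes the warm-up is counted in rows; the column name and values are hypothetical, and the library may implement this differently.

```python
import pandas as pd

# Two years of daily data starting in 1980 (a leap year).
index = pd.date_range("1980-01-01", "1981-12-31", freq="D")
df = pd.DataFrame({"QOSIM_A": range(len(index))}, index=index)

warm_up = 366  # 1980 has 366 days, so this covers the whole first year
trimmed = df.iloc[warm_up:]

print(trimmed.index[0].date())
```

With a warm-up of 365 instead, the first remaining record would be 1980-12-31, because 1980 is a leap year.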

[6]:

DATAFRAMES_till2009 = data.generate_dataframes(csv_fpaths=path, warm_up=366, end_date='2009-12-31')
print("\nThe End of the Merged dataframe:")
print(DATAFRAMES_till2009["DF"].tail(5))

The start date for the Data is 1981-01-01

The End of the Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
2009-12-27             NaN       4.114114             NaN       0.815359
2009-12-28             NaN       4.091105             NaN       0.810912
2009-12-29             NaN       4.068261             NaN       0.806497
2009-12-30             NaN       4.045577             NaN       0.802113
2009-12-31             NaN       4.023057             NaN       0.797758

Notice how it ends on 2009-12-31 as specified.

[7]:

DATAFRAMES_from1995 = data.generate_dataframes(csv_fpaths=path, warm_up=366, start_date='1995-01-01')
print("\nThe Start of the Observed dataframe:")
print(DATAFRAMES_from1995["DF_OBSERVED"].head(5))

The start date for the Data is 1995-01-01

The Start of the Observed dataframe:
            QOMEAS_05BB001  QOMEAS_05BA001
1995-01-01            8.37             NaN
1995-01-02           10.10             NaN
1995-01-03           12.20             NaN
1995-01-04           13.00             NaN
1995-01-05           13.20             NaN

Observe that the data now starts in 1995.

[8]:

DATAFRAMES_January2010 = data.generate_dataframes(csv_fpaths=path, warm_up=366, start_date='2010-01-01' , end_date='2010-1-31')
print("\nThe Start of the Merged dataframe:")
print(DATAFRAMES_January2010["DF"].head(5))
print("\nThe End of the Merged dataframe:")
print(DATAFRAMES_January2010["DF"].tail(5))

The start date for the Data is 2010-01-01

The Start of the Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
2010-01-01             NaN       4.000698             NaN       0.793435
2010-01-02             NaN       3.978494             NaN       0.789141
2010-01-03             NaN       3.956450             NaN       0.784877
2010-01-04             NaN       3.934558             NaN       0.780643
2010-01-05             NaN       3.912824             NaN       0.776437

The End of the Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
2010-01-27             NaN       3.471175             NaN       0.690854
2010-01-28             NaN       3.452648             NaN       0.687258
2010-01-29             NaN       3.434252             NaN       0.683687
2010-01-30             NaN       3.415977             NaN       0.680139
2010-01-31             NaN       3.397829             NaN       0.676615

Observe that the data now starts on January 1st 2010 and ends on January 31st 2010, as specified.
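The same kind of date window can be selected on any datetime-indexed dataframe with label slicing. This is a minimal pandas sketch with made-up data, shown only to make the inclusive start and end behaviour concrete; it is not how generate_dataframes is necessarily implemented.

```python
import pandas as pd

index = pd.date_range("2009-12-01", "2010-03-31", freq="D")
df = pd.DataFrame({"QOSIM_A": range(len(index))}, index=index)

# Label slicing on a DatetimeIndex includes both end points.
january = df.loc["2010-01-01":"2010-01-31"]

print(january.index[0].date(), january.index[-1].date(), len(january))
```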

One thing to note: if you specify both a warm-up period and a start date, and the start date falls within the warm-up period, the start date is pushed forward to the end of the warm-up period, because the warm_up parameter takes precedence. For instance:

[9]:

# Here we set our start date to be somewhere within the 366-day warm-up time. It will get overridden and start from the end of
# the warm-up time
DATAFRAMES_from_June_1980 = data.generate_dataframes(csv_fpaths=path, warm_up=366, start_date='1980-06-01')
print("\nThe Start of the Predicted Data: ")
print(DATAFRAMES_from_June_1980["DF_SIMULATED"].head(5))

The start date for the Data is 1981-01-01

The Start of the Predicted Data:
            QOSIM_05BB001  QOSIM_05BA001
1981-01-01       2.518999       1.001954
1981-01-02       2.507289       0.997078
1981-01-03       2.495637       0.992233
1981-01-04       2.484073       0.987417
1981-01-05       2.472571       0.982631

As you can observe it starts from 1981 despite specifying a start date of June 1st 1980!
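The precedence rule can be summarised as "use whichever is later: the end of the warm-up or the requested start date". The helper below, `effective_start`, is hypothetical and exists only to illustrate that rule on made-up data.

```python
import pandas as pd

index = pd.date_range("1980-01-01", "1982-12-31", freq="D")
df = pd.DataFrame({"QOSIM_A": range(len(index))}, index=index)


def effective_start(df, warm_up, start_date):
    """Hypothetical helper: return the later of the warm-up end and start_date."""
    warm_up_end = df.index[warm_up]  # first record after the warm-up rows
    return max(warm_up_end, pd.Timestamp(start_date))


# A start date inside the warm-up is pushed forward to the warm-up end.
print(effective_start(df, 366, "1980-06-01").date())
# A start date after the warm-up is kept as requested.
print(effective_start(df, 366, "1981-03-01").date())
```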

The three dataframes - merged, observed and simulated - form the backbone of the library. Every other function in the library uses at least one of these three dataframes to perform analysis, whether visual, descriptive or diagnostic.

DAILY AGGREGATION

This function returns the daily aggregate of the data passed into it. It aggregates using the method passed in or, if none is given, defaults to the mean. Most data already comes with daily timestamps, so this is one of the less frequently used functions. Its functionality is shown below:

[10]:

data.daily_aggregate(df=DATAFRAMES["DF_MERGED"])

[10]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1981/001	9.85	2.518999	NaN	1.001954
1981/002	10.20	2.507289	NaN	0.997078
1981/003	10.00	2.495637	NaN	0.992233
1981/004	10.10	2.484073	NaN	0.987417
1981/005	9.99	2.472571	NaN	0.982631
...	...	...	...	...
2017/361	NaN	4.418050	NaN	1.380227
2017/362	NaN	4.393084	NaN	1.372171
2017/363	NaN	4.368303	NaN	1.364174
2017/364	NaN	4.343699	NaN	1.356237
2017/365	NaN	4.319275	NaN	1.348359

13514 rows × 4 columns

It returns a dataframe indexed by year and day of year (for example 1981/001), where the day runs from 1 to 365 or 366.
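Daily aggregation only changes the data when there is more than one record per day. The sketch below uses made-up six-hourly readings and plain pandas to show the idea, labelling each day in the same year/day-of-year style as the output above. It is an illustration under those assumptions, not the library's code.

```python
import pandas as pd

# Eight hypothetical readings, one every six hours over two days.
index = pd.date_range("1981-01-01", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({"QOSIM_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}, index=index)

# Label each reading with its year and day of year, e.g. "1981/001".
labels = df.index.strftime("%Y/%j")
daily = df.groupby(labels).mean()

print(daily)
```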

WEEKLY AGGREGATION

This function returns the weekly aggregate of the data passed into it. It aggregates using the method passed in or, if none is given, defaults to the mean. Its functionality is shown below:

[11]:

data.weekly_aggregate(df=DATAFRAMES["DF_MERGED"]) # default method of aggregation is mean

[11]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1980-12-29	10.037500	2.501499	NaN	0.994671
1981-01-05	9.244286	2.438589	NaN	0.968506
1981-01-12	8.461429	2.361289	NaN	0.936405
1981-01-19	8.345714	2.287077	NaN	0.905643
1981-01-26	8.461429	2.215803	NaN	0.876150
...	...	...	...	...
2017-11-27	NaN	5.165260	NaN	1.621172
2017-12-04	NaN	4.958076	NaN	1.554878
2017-12-11	NaN	4.760181	NaN	1.490809
2017-12-18	NaN	4.572103	NaN	1.429983
2017-12-25	NaN	4.393448	NaN	1.372291

1931 rows × 4 columns

It returns a dataframe with one row per week. In the output above, each row is labelled with the date of the Monday that starts the week.

[12]:

data.weekly_aggregate(df=DATAFRAMES["DF_MERGED"], method='sum') # here we aggregate by summing up all the values of the week.

[12]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1980-12-29	40.15	10.005998	0.0	3.978683
1981-01-05	64.71	17.070122	0.0	6.779542
1981-01-12	59.23	16.529020	0.0	6.554838
1981-01-19	58.42	16.009540	0.0	6.339498
1981-01-26	59.23	15.510619	0.0	6.133052
...	...	...	...	...
2017-11-27	0.00	36.156822	0.0	11.348205
2017-12-04	0.00	34.706534	0.0	10.884148
2017-12-11	0.00	33.321266	0.0	10.435665
2017-12-18	0.00	32.004719	0.0	10.009884
2017-12-25	0.00	30.754137	0.0	9.606034

1931 rows × 4 columns
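In the summed output, the Station2 QOMEAS column shows 0.0 where the mean output shows NaN. This is consistent with how pandas treats missing values, assuming the library delegates to pandas reductions: a sum over only missing values is 0.0 by default, while a mean is NaN. A short demonstration:

```python
import numpy as np
import pandas as pd

# A week in which no observations were recorded.
week = pd.Series([np.nan, np.nan, np.nan])

print(week.mean())            # mean of nothing is NaN
print(week.sum())             # sum of nothing is 0.0 by default
print(week.sum(min_count=1))  # require at least one value, otherwise NaN
```

When reading summed aggregates, a 0.0 may therefore mean "no data" rather than "no flow".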

YEARLY AGGREGATION

This function returns the yearly aggregate of the data passed into it. It aggregates using the method passed in or, if none is given, defaults to the mean. Its functionality is shown below:

[13]:

data.yearly_aggregate(df=DATAFRAMES["DF_MERGED"]) # default method of aggregation is mean

[13]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1981-01	8.815806	2.352880	NaN	0.932961
1981-02	7.780000	2.060198	NaN	0.811975
1981-03	6.896129	1.812920	NaN	0.710498
1981-04	7.981333	3.132911	NaN	0.821663
1981-05	43.538710	50.736276	10.111333	15.202072
...	...	...	...	...
2017-08	NaN	32.222317	NaN	17.704763
2017-09	NaN	28.141430	NaN	14.315134
2017-10	NaN	7.698483	NaN	2.615914
2017-11	NaN	5.625516	NaN	1.770196
2017-12	NaN	4.712923	NaN	1.475527

444 rows × 4 columns

It returns a dataframe indexed yearly from the first year in your data till the last year.

[14]:

data.yearly_aggregate(df=DATAFRAMES["DF_MERGED"], method='median') # here we aggregate by finding the median of the values each year.

[14]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1981-01	8.590	2.350375	NaN	0.931876
1981-02	7.825	2.058540	NaN	0.811263
1981-03	6.860	1.811243	NaN	0.709789
1981-04	7.400	1.720303	NaN	0.634707
1981-05	15.800	5.115736	5.165	1.589602
...	...	...	...	...
2017-08	NaN	31.761380	NaN	17.606670
2017-09	NaN	25.176370	NaN	12.297040
2017-10	NaN	6.744768	NaN	2.161909
2017-11	NaN	5.622982	NaN	1.767848
2017-12	NaN	4.705044	NaN	1.472965

444 rows × 4 columns

MONTHLY AGGREGATION

This function returns the monthly aggregate of the data passed into it. It aggregates using the method passed in or, if none is given, defaults to the mean. Its functionality is shown below:

[15]:

data.monthly_aggregate(df=DATAFRAMES["DF_MERGED"]) # default method of aggregation is mean

[15]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1981-01	8.815806	2.352880	NaN	0.932961
1981-02	7.780000	2.060198	NaN	0.811975
1981-03	6.896129	1.812920	NaN	0.710498
1981-04	7.981333	3.132911	NaN	0.821663
1981-05	43.538710	50.736276	10.111333	15.202072
...	...	...	...	...
2017-08	NaN	32.222317	NaN	17.704763
2017-09	NaN	28.141430	NaN	14.315134
2017-10	NaN	7.698483	NaN	2.615914
2017-11	NaN	5.625516	NaN	1.770196
2017-12	NaN	4.712923	NaN	1.475527

444 rows × 4 columns

It returns a dataframe indexed by year and month (for example 1981-01), with the month running from 01 to 12.

[16]:

data.monthly_aggregate(df=DATAFRAMES["DF_MERGED"], method='max') # here we aggregate by returning the maximum value that month

[16]:

	Station1		Station2
	QOMEAS	QOSIM1	QOMEAS	QOSIM1
1981-01	10.20	2.518999	NaN	1.001954
1981-02	8.51	2.186006	NaN	0.863836
1981-03	7.50	1.931955	NaN	0.759228
1981-04	15.30	17.241560	NaN	3.692734
1981-05	165.00	220.485800	36.6	97.054240
...	...	...	...	...
2017-08	NaN	41.657390	NaN	21.751910
2017-09	NaN	116.866500	NaN	53.925540
2017-10	NaN	19.746100	NaN	8.157081
2017-11	NaN	6.091304	NaN	1.927400
2017-12	NaN	5.134585	NaN	1.611400

444 rows × 4 columns
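A comparable monthly maximum can be computed directly in pandas by grouping on the year-month period of each timestamp. The data here is made up, and the approach is a sketch rather than the library's implementation.

```python
import pandas as pd

index = pd.date_range("1981-01-01", "1981-03-31", freq="D")
df = pd.DataFrame({"QOSIM_A": range(1, len(index) + 1)}, index=index)

# Group daily records by their year-month period and keep the largest value.
monthly_max = df.groupby(df.index.to_period("M")).max()

print(monthly_max)
```

Swapping `.max()` for `.mean()`, `.min()`, `.sum()` or `.median()` gives the other common aggregation methods.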

STATISTICS AGGREGATION

This allows us to calculate the aggregate of all the simulations across the dataframe for every datetime index. It aggregates using the method passed in or, if none is given, defaults to the mean. Unlike the other methods of aggregation, this one also supports quantile calculations. Its functionality is shown below:

[17]:

data.stat_aggregate(df=DATAFRAMES["DF_MERGED"], method='max')

[17]:

	Station1	Station2
	MAX	MAX
1981-01-01	2.518999	1.001954
1981-01-02	2.507289	0.997078
1981-01-03	2.495637	0.992233
1981-01-04	2.484073	0.987417
1981-01-05	2.472571	0.982631
...	...	...
2017-12-27	4.418050	1.380227
2017-12-28	4.393084	1.372171
2017-12-29	4.368303	1.364174
2017-12-30	4.343699	1.356237
2017-12-31	4.319275	1.348359

13514 rows × 2 columns

[18]:

data.stat_aggregate(df=DATAFRAMES["DF_MERGED"], method='q75')

[18]:

	Station1	Station2
	Q75	Q75
1981-01-01	2.518999	1.001954
1981-01-02	2.507289	0.997078
1981-01-03	2.495637	0.992233
1981-01-04	2.484073	0.987417
1981-01-05	2.472571	0.982631
...	...	...
2017-12-27	4.418050	1.380227
2017-12-28	4.393084	1.372171
2017-12-29	4.368303	1.364174
2017-12-30	4.343699	1.356237
2017-12-31	4.319275	1.348359

13514 rows × 2 columns
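The MAX and Q75 outputs above are identical, which appears to be because the tutorial data has only one simulation (QOSIM1) per station, so every statistic across simulations collapses to that single value. The sketch below uses three hypothetical simulations of one station to show how row-wise statistics differ once there is more than one; it uses plain pandas and is not the library's implementation.

```python
import pandas as pd

index = pd.date_range("1981-01-01", periods=3, freq="D")
sims = pd.DataFrame(
    {
        "QOSIM1": [2.0, 4.0, 6.0],
        "QOSIM2": [4.0, 8.0, 2.0],
        "QOSIM3": [6.0, 0.0, 4.0],
    },
    index=index,
)

# Aggregate across the simulations (columns) for every date (row).
row_max = sims.max(axis=1)
row_q75 = sims.quantile(0.75, axis=1)

print(pd.DataFrame({"MAX": row_max, "Q75": row_q75}))
```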

PERIODIC/SEASONAL AGGREGATION

This allows us to return a specific period of time for every year, or for a select few years, within a dataset. It lets you analyse a season or period every year without having to look through every day. For example, to isolate the streamflow on the first 2 days of January every year, we would write:

[19]:

data.seasonal_period(df=DATAFRAMES["DF"], daily_period=('01-01', '01-02'))

[19]:

	QOMEAS_05BB001	QOSIM_05BB001	QOMEAS_05BA001	QOSIM_05BA001
1981-01-01	9.85	2.518999	NaN	1.001954
1981-01-02	10.20	2.507289	NaN	0.997078
1982-01-01	7.17	5.465301	NaN	2.429704
1982-01-02	7.02	5.433753	NaN	2.414755
1983-01-01	8.98	5.371416	NaN	2.441398
...	...	...	...	...
2015-01-02	NaN	6.944578	NaN	1.615503
2016-01-01	NaN	3.686424	NaN	1.005240
2016-01-02	NaN	3.666536	NaN	0.999701
2017-01-01	NaN	2.700768	NaN	0.872387
2017-01-02	NaN	2.687161	NaN	0.867826

74 rows × 4 columns

Observe that for every year, we only get the first two days of the year. We can also restrict the result to specific years instead of returning every year. For example, we can get the predicted values for the first week of summer for the years 1999, 2001 and 2005 as shown below:

[20]:

data.seasonal_period(df=DATAFRAMES["DF_SIMULATED"], daily_period=('06-21', '06-28'), years=[1999, 2001, 2005])

[20]:

	QOSIM_05BB001	QOSIM_05BA001
1999-06-21	155.64250	127.26980
1999-06-22	147.19820	74.31406
1999-06-23	80.06104	34.51432
1999-06-24	45.07025	38.66659
1999-06-25	129.59160	104.75280
1999-06-26	156.33070	51.86972
1999-06-27	166.94590	62.61042
1999-06-28	85.17047	27.95974
2001-06-21	42.08820	32.47492
2001-06-22	44.51981	29.40924
2001-06-23	41.09985	29.90357
2001-06-24	42.13543	39.85459
2001-06-25	94.21543	135.35250
2001-06-26	127.62240	35.24305
2001-06-27	39.27969	39.95149
2001-06-28	77.72211	67.53999
2005-06-21	51.65529	24.16283
2005-06-22	51.92735	28.55426
2005-06-23	94.62647	35.51184
2005-06-24	40.85018	16.88546
2005-06-25	90.53637	109.68720
2005-06-26	189.87700	46.47357
2005-06-27	67.25498	43.57297
2005-06-28	157.66590	80.75896

As you can see, we are able to get the first week of summer for those 3 years.
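A similar selection can be written in plain pandas by comparing zero-padded month-day strings and filtering on the year. This sketch uses made-up data and assumes the period does not wrap around the end of the year; it is not the library's implementation.

```python
import pandas as pd

index = pd.date_range("1999-01-01", "2005-12-31", freq="D")
df = pd.DataFrame({"QOSIM_A": range(len(index))}, index=index)

# Zero-padded "MM-DD" strings sort in calendar order, so string comparison works.
month_day = df.index.strftime("%m-%d")
in_period = (month_day >= "06-21") & (month_day <= "06-28")
in_years = df.index.year.isin([1999, 2001, 2005])

summer_week = df[in_period & in_years]

print(len(summer_week))
```

A period such as December to February would need the two comparisons joined with "or" instead of "and".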

LONG TERM AGGREGATION

This allows us to compute the long-term seasonal aggregate values of a given DataFrame by applying the specified aggregation method to each day across all years in the provided time period. The resulting data is aggregated into a single year (1 to 365/366 days). This way we are able to see how the models perform year in, year out compared to the actual recorded data - both aggregated as necessary. An example is shown below:

[21]:

data.long_term_seasonal(df=DATAFRAMES["DF"]) # As usual the default aggregation method is mean/average

[21]:

	QOMEAS_05BB001	QOSIM_05BB001	QOMEAS_05BA001	QOSIM_05BA001
jday
1	9.446471	4.037666	NaN	1.130686
2	9.428125	4.014474	NaN	1.123915
3	9.660625	3.991451	NaN	1.117196
4	9.804375	3.968602	NaN	1.110529
5	9.787500	3.945921	NaN	1.103913
...	...	...	...	...
362	9.942500	4.188140	NaN	1.169614
363	9.695000	4.163847	NaN	1.162533
364	9.633125	4.139735	NaN	1.155507
365	9.516875	4.115805	NaN	1.148535
366	9.870000	4.433329	NaN	1.191653

366 rows × 4 columns

We are also able to calculate quantiles. For example, the 75th percentile (Q75) value for all years aggregated into a single year looks like:

[22]:

data.long_term_seasonal(df=DATAFRAMES["DF_SIMULATED"], method = 'Q75')

[22]:

	QOSIM_05BB001	QOSIM_05BA001
jday
1	4.830453	1.315370
2	4.801530	1.306986
3	4.772831	1.298670
4	4.744344	1.290422
5	4.716085	1.282241
...	...	...
362	4.978491	1.372171
363	4.948421	1.364174
364	4.918590	1.356237
365	4.888982	1.348359
366	4.859608	1.323824

366 rows × 2 columns
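Conceptually, the long-term seasonal aggregate groups every record by its day of the year and reduces each group. The sketch below does this with plain pandas on made-up data in which each value is simply its year, so the results are easy to check by hand. It is an illustration, not the library's implementation.

```python
import pandas as pd

index = pd.date_range("1981-01-01", "1984-12-31", freq="D")
# Each value is its own year, which makes the per-day statistics easy to verify.
df = pd.DataFrame({"QOSIM_A": index.year.to_numpy(dtype=float)}, index=index)

# Day of year, 1 to 366, used as the grouping key.
jday = df.index.dayofyear.rename("jday")

long_term_mean = df.groupby(jday).mean()
long_term_q75 = df.groupby(jday).quantile(0.75)

print(long_term_mean.loc[[1, 366]])
print(long_term_q75.loc[[1]])
```

Day 366 is built only from leap years (here only 1984), so it rests on far fewer values than the other days. That may explain why row 366 in the outputs above breaks the otherwise smooth trend.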

Naturally, when dealing with multi-model evaluations, we are able to perform statistics on the output of the long term seasonal aggregations. This way we are able to extract the mean, median, max, etc. of the long term seasonal aggregations, leaving us with just the statistics. An example of this is shown below:

[23]:

data.stat_aggregate(df=data.long_term_seasonal(df=DATAFRAMES["DF_MERGED"], method = 'median'), method='median')

[23]:

	Station1	Station2
	MEDIAN	MEDIAN
jday
1	3.636044	1.001954
2	3.616069	0.997078
3	3.596241	0.992233
4	3.576548	0.987417
5	3.556993	0.982631
...	...	...
362	3.767376	1.027792
363	3.746928	1.022093
364	3.726621	1.016435
365	3.706450	1.010818
366	4.164878	1.192406

366 rows × 2 columns

Note

All of these functions, with their various means of aggregation, are available as individual functions, but they can also be generated right from the generate_dataframes() function if you know exactly what you'll need from the beginning. It just requires specifying a few more parameters. These parameters are shown below:

[24]:

# Let's use a time period of 1981 to 1990 to demonstrate this
DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=365, start_date = "1981-01-01", end_date = "1990-12-31",
                                      # optional arguments
                                      # you specify that you want an aggregated dataframe by passing 'True' into
                                      # the respective parameter and then you pass in your preferred method of aggregation

                                      # If you want daily aggregation
                                      daily_agg = True, da_method = 'min',
                                      # lets see a weekly aggregation
                                      weekly_agg = True, wa_method = 'min', # we want the minimum value each week
                                      # lets also see monthly aggregation
                                      monthly_agg = True, ma_method = 'inst', # 'inst' is not the maximum; in the DF_MONTHLY output it matches the last value of each month
                                      # lets also see yearly aggregation
                                      yearly_agg = True, ya_method = 'sum', # we want the sum of all values each year
                                      # lets see the stats aggregation
                                      stat_agg = True, stat_method = 'q75'
                                      # note that without passing the respective methods,
                                      # the functions will still default to mean as the method of aggregation
                                     )

The start date for the Data is 1981-01-01

[25]:

DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=365, start_date = "1981-01-01", end_date = "1990-12-31",
                                      # seasonal aggregation
                                      # obtaining the months of May till August from every year from 1981 to 1985
                                      seasonal_p = True, sp_dperiod = ('05-01', '08-30'),
                                      sp_subset = ('1981-01-01', '1985-12-31'),
                                      # instead of sp_subset, we can also use years = [1981, 1982, 1983, 1984, 1985].

                                      # long term seasonal aggregation
                                      long_term = True, lt_method = ["q33.33", "median" ,'q75' ,'Q25' ,'q33' ],
                                      # when using long term in the generate_dataframes function, we are able to pass
                                      # in a list of the aggregation methods we want generated. By default, though, it will
                                      # always generate maximum, minimum and median value dataframes
                                     )

The start date for the Data is 1981-01-01

Putting it all together, we have:

[26]:

DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=365, start_date = "1981-01-01", end_date = "1990-12-31",
                                      daily_agg = True, da_method = 'min',
                                      weekly_agg = True, wa_method = 'min',
                                      monthly_agg = True, ma_method = 'inst',
                                      yearly_agg = True, ya_method = 'sum',
                                      stat_agg = True, stat_method = 'q75',
                                      seasonal_p = True, sp_dperiod = ('05-01', '08-30'), sp_subset = ('1981-01-01', '1985-12-31'),
                                      long_term = True, lt_method = ["q33.33", "median" ,'q75' ,'Q25' ,'q33' ],
                                     )


for key, value in DATAFRAMES.items():
    print(f"{key}:\n{value}")

The start date for the Data is 1981-01-01
DF:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
1981-01-01            9.85       2.518999             NaN       1.001954
1981-01-02           10.20       2.507289             NaN       0.997078
1981-01-03           10.00       2.495637             NaN       0.992233
1981-01-04           10.10       2.484073             NaN       0.987417
1981-01-05            9.99       2.472571             NaN       0.982631
...                    ...            ...             ...            ...
1990-12-27           10.10       6.615961             NaN       1.737144
1990-12-28            9.50       6.573054             NaN       1.725025
1990-12-29            8.60       6.530500             NaN       1.713013
1990-12-30            8.20       6.488300             NaN       1.701107
1990-12-31            8.25       6.446449             NaN       1.689308

[3652 rows x 4 columns]
DF_OBSERVED:
            QOMEAS_05BB001  QOMEAS_05BA001
1981-01-01            9.85             NaN
1981-01-02           10.20             NaN
1981-01-03           10.00             NaN
1981-01-04           10.10             NaN
1981-01-05            9.99             NaN
...                    ...             ...
1990-12-27           10.10             NaN
1990-12-28            9.50             NaN
1990-12-29            8.60             NaN
1990-12-30            8.20             NaN
1990-12-31            8.25             NaN

[3652 rows x 2 columns]
DF_SIMULATED:
            QOSIM_05BB001  QOSIM_05BA001
1981-01-01       2.518999       1.001954
1981-01-02       2.507289       0.997078
1981-01-03       2.495637       0.992233
1981-01-04       2.484073       0.987417
1981-01-05       2.472571       0.982631
...                   ...            ...
1990-12-27       6.615961       1.737144
1990-12-28       6.573054       1.725025
1990-12-29       6.530500       1.713013
1990-12-30       6.488300       1.701107
1990-12-31       6.446449       1.689308

[3652 rows x 2 columns]
DF_MERGED:
           Station1           Station2
             QOMEAS    QOSIM1   QOMEAS    QOSIM1
1981-01-01     9.85  2.518999      NaN  1.001954
1981-01-02    10.20  2.507289      NaN  0.997078
1981-01-03    10.00  2.495637      NaN  0.992233
1981-01-04    10.10  2.484073      NaN  0.987417
1981-01-05     9.99  2.472571      NaN  0.982631
...             ...       ...      ...       ...
1990-12-27    10.10  6.615961      NaN  1.737144
1990-12-28     9.50  6.573054      NaN  1.725025
1990-12-29     8.60  6.530500      NaN  1.713013
1990-12-30     8.20  6.488300      NaN  1.701107
1990-12-31     8.25  6.446449      NaN  1.689308

[3652 rows x 4 columns]
DF_DAILY:
         Station1           Station2
           QOMEAS    QOSIM1   QOMEAS    QOSIM1
1981/001     9.85  2.518999      NaN  1.001954
1981/002    10.20  2.507289      NaN  0.997078
1981/003    10.00  2.495637      NaN  0.992233
1981/004    10.10  2.484073      NaN  0.987417
1981/005     9.99  2.472571      NaN  0.982631
...           ...       ...      ...       ...
1990/361    10.10  6.615961      NaN  1.737144
1990/362     9.50  6.573054      NaN  1.725025
1990/363     8.60  6.530500      NaN  1.713013
1990/364     8.20  6.488300      NaN  1.701107
1990/365     8.25  6.446449      NaN  1.689308

[3652 rows x 4 columns]
DF_WEEKLY:
           Station1           Station2
             QOMEAS    QOSIM1   QOMEAS    QOSIM1
1980-12-29     9.85  2.484073      NaN  0.987417
1981-01-05     8.70  2.404939      NaN  0.954524
1981-01-12     8.24  2.328990      NaN  0.923008
1981-01-19     7.86  2.256059      NaN  0.892801
1981-01-26     8.10  2.186006      NaN  0.863836
...             ...       ...      ...       ...
1990-12-03    10.80  7.453629      NaN  1.975199
1990-12-10     8.70  7.112537      NaN  1.877937
1990-12-17     8.10  6.791226      NaN  1.786726
1990-12-24     8.20  6.488300      NaN  1.701107
1990-12-31     8.25  6.446449      NaN  1.689308

[523 rows x 4 columns]
DF_MONTHLY:
        Station1             Station2
          QOMEAS      QOSIM1   QOMEAS     QOSIM1
1981-01     8.62    2.195846      NaN   0.867900
1981-02     7.20    1.940355      NaN   0.762678
1981-03     7.25    1.699932      NaN   0.664341
1981-04    15.30    3.859564      NaN   0.584523
1981-05   113.00  220.485800    28.20  96.363520
...          ...         ...      ...        ...
1990-08    33.60   40.431200    10.10  23.856810
1990-09    90.90   19.438340    30.50   6.175078
1990-10    21.10    9.648046     4.01   2.642092
1990-11    12.00    7.920140     3.90   2.109938
1990-12     8.25    6.446449      NaN   1.689308

[120 rows x 4 columns]
DF_YEARLY:
        Station1              Station2
          QOMEAS       QOSIM1   QOMEAS      QOSIM1
1981-01   273.29    72.939293     0.00   28.921777
1981-02   217.84    57.685555     0.00   22.735300
1981-03   213.78    56.200513     0.00   22.025424
1981-04   239.44    93.987320     0.00   24.649901
1981-05  1349.70  1572.824547   303.34  471.264224
...          ...          ...      ...         ...
1990-08  1541.80  1520.330940   550.40  758.817000
1990-09  1007.10   855.418340   269.57  408.246703
1990-10  1035.10   361.582897   277.26   95.083195
1990-11   460.12   261.980893     3.90   70.673194
1990-12   324.60   220.982680     0.00   58.369321

[120 rows x 4 columns]
DF_CUSTOM:
           Station1            Station2
             QOMEAS     QOSIM1   QOMEAS     QOSIM1
1981-05-01     13.8   4.924126      NaN   0.587910
1981-05-02     12.0   4.199440     3.01   0.592167
1981-05-03     10.9   2.605448     2.85   0.580767
1981-05-04     10.3   2.898053     2.72   0.582108
1981-05-05     10.4   3.357134     2.80   0.578170
...             ...        ...      ...        ...
1985-08-26     45.9  22.102120    13.00  13.373290
1985-08-27     43.6  21.293030    12.50  12.981310
1985-08-28     42.5  21.262790    12.40  13.329020
1985-08-29     41.6  21.753170    12.80  13.696490
1985-08-30     42.0  22.593580    13.20  16.211630

[610 rows x 4 columns]
LONG_TERM_MIN:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1        7.17  1.802230      NaN  0.485006
2        7.02  1.794152      NaN  0.482805
3        7.10  1.786115      NaN  0.480617
4        7.31  1.778133      NaN  0.478442
5        7.64  1.770190      NaN  0.476279
...       ...       ...      ...       ...
362      7.25  1.835014      NaN  0.493944
363      7.29  1.826749      NaN  0.491689
364      7.27  1.818528      NaN  0.489450
365      7.21  1.810352      NaN  0.487221
366     10.30  2.939887      NaN  0.782759

[366 rows x 4 columns]
LONG_TERM_MAX:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1        12.7  5.465301      NaN  2.441398
2        12.7  5.433753      NaN  2.426411
3        12.7  5.402424      NaN  2.411540
4        12.7  5.371336      NaN  2.396784
5        12.8  5.340471      NaN  2.382142
...       ...       ...      ...       ...
362      13.3  6.573054      NaN  2.502534
363      13.0  6.530500      NaN  2.487069
364      12.8  6.488300      NaN  2.471726
365      12.7  6.446449      NaN  2.456503
366      11.3  3.132862      NaN  0.925092

[366 rows x 4 columns]
LONG_TERM_MEDIAN:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1       9.795  3.397196      NaN  0.961191
2       9.815  3.379185      NaN  0.956437
3       9.855  3.361293      NaN  0.951713
4       9.875  3.343524      NaN  0.947018
5       9.695  3.325880      NaN  0.942354
...       ...       ...      ...       ...
362    10.350  3.698825      NaN  1.020910
363     9.490  3.678319      NaN  1.015117
364     9.070  3.657953      NaN  1.009368
365     9.350  3.637736      NaN  1.003660
366    10.800  3.036374      NaN  0.853926

[366 rows x 4 columns]
LONG_TERM_Q33.33:
       Station1           Station2
         QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1      9.519838  3.116782      NaN  0.883651
2      9.419919  3.100879      NaN  0.878571
3      9.599898  3.085059      NaN  0.873528
4      9.709991  3.069359      NaN  0.868522
5      9.400000  3.053748      NaN  0.863553
...         ...       ...      ...       ...
362    9.499817  3.373855      NaN  0.904356
363    8.849961  3.356529      NaN  0.899122
364    8.949970  3.339308      NaN  0.893926
365    8.919976  3.322208      NaN  0.888770
366   10.633300  3.004206      NaN  0.830199

[366 rows x 4 columns]
LONG_TERM_Q75:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1     10.0375  4.156184      NaN  1.248870
2     10.2000  4.132273      NaN  1.242024
3     10.6500  4.108537      NaN  1.235226
4     10.3250  4.084971      NaN  1.228475
5     10.5225  4.061583      NaN  1.221772
...       ...       ...      ...       ...
362   11.2500  5.222542      NaN  1.627869
363   11.0000  5.192005      NaN  1.617079
364   10.8000  5.161703      NaN  1.606380
365   10.4950  5.131630      NaN  1.595775
366   11.0500  3.084618      NaN  0.889509

[366 rows x 4 columns]
LONG_TERM_Q25:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1      9.1150  2.972452      NaN  0.881223
2      9.2175  2.956880      NaN  0.876179
3      9.3450  2.941408      NaN  0.871171
4      9.6875  2.926045      NaN  0.866200
5      9.4000  2.910776      NaN  0.861267
...       ...       ...      ...       ...
362    9.0425  3.241984      NaN  0.901784
363    8.7525  3.225315      NaN  0.896586
364    8.8750  3.208758      NaN  0.891427
365    8.8600  3.192306      NaN  0.886306
366   10.5500  2.988131      NaN  0.818343

[366 rows x 4 columns]
LONG_TERM_Q33:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1      9.5038  3.111064      NaN  0.883555
2      9.4119  3.095175      NaN  0.878476
3      9.5898  3.079368      NaN  0.873434
4      9.7091  3.063681      NaN  0.868430
5      9.4000  3.048084      NaN  0.863463
...       ...       ...      ...       ...
362    9.4817  3.368631      NaN  0.904254
363    8.8461  3.351331      NaN  0.899021
364    8.9470  3.334136      NaN  0.893827
365    8.9176  3.317062      NaN  0.888672
366   10.6300  3.003569      NaN  0.829729

[366 rows x 4 columns]
DF_STATS:
            Station1                                Station2            \
                 MIN       MAX    MEDIAN       Q75       MIN       MAX
1981-01-01  2.518999  2.518999  2.518999  2.518999  1.001954  1.001954
1981-01-02  2.507289  2.507289  2.507289  2.507289  0.997078  0.997078
1981-01-03  2.495637  2.495637  2.495637  2.495637  0.992233  0.992233
1981-01-04  2.484073  2.484073  2.484073  2.484073  0.987417  0.987417
1981-01-05  2.472571  2.472571  2.472571  2.472571  0.982631  0.982631
...              ...       ...       ...       ...       ...       ...
1990-12-27  6.615961  6.615961  6.615961  6.615961  1.737144  1.737144
1990-12-28  6.573054  6.573054  6.573054  6.573054  1.725025  1.725025
1990-12-29  6.530500  6.530500  6.530500  6.530500  1.713013  1.713013
1990-12-30  6.488300  6.488300  6.488300  6.488300  1.701107  1.701107
1990-12-31  6.446449  6.446449  6.446449  6.446449  1.689308  1.689308


              MEDIAN       Q75
1981-01-01  1.001954  1.001954
1981-01-02  0.997078  0.997078
1981-01-03  0.992233  0.992233
1981-01-04  0.987417  0.987417
1981-01-05  0.982631  0.982631
...              ...       ...
1990-12-27  1.737144  1.737144
1990-12-28  1.725025  1.725025
1990-12-29  1.713013  1.713013
1990-12-30  1.701107  1.701107
1990-12-31  1.689308  1.689308

[3652 rows x 8 columns]
