Data Manipulation Tutorial

This notebook provides examples of how to carry out data manipulation and aggregation using the post_processing Python library. Be sure to go through the Quick Start section of the documentation for instructions on how to access and import the library and its packages.

If you would like to open an editable, runnable version of the tutorial, click here to be directed to a Binder platform.

The library is still under active development, and empty sections will be completed in due time.

All files are available in the GitHub repository here.

Requirements

The conda environment contains all the libraries required by the post processing library. After setting up the conda environment, you only have to import the data manipulation module from postprocessinglib.evaluation.

In this example, though, I will also import other modules to help generate the data that I will be analysing.

[2]:
import pandas as pd
from postprocessinglib.evaluation import data

GENERATE DATAFRAMES

This is the main overarching function that returns the required dataframes in their respective formats for use by the other modules and functions in the library. In its simplest form, it requires a csv file containing both the predicted and measured data, referred to as the merged data, formatted as shown below:

merged data

Some datetime    station1_obs    station1_sim    station2_obs    station2_sim

or two csv files - one for observed and one for the simulated data, similarly formatted as shown below:

obs data

Some datetime    station1_obs    station2_obs

sim data

Some datetime    station1_sim    station2_sim

We then pass these into our generate dataframes function as shown below:

[3]:
# passing a small controlled csv file with only two stations for testing
path = "MESH_output_streamflow_1.csv"

# csv_fpaths is used here to pass the merged csv file
DATAFRAMES = data.generate_dataframes(csv_fpaths=path)
The start date for the Data is 1980-01-01

By default, the function returns a dictionary that contains 3 dataframes: the merged dataframe, the observed dataframe and the simulated dataframe, stored under the keys DF, DF_OBSERVED and DF_SIMULATED respectively, as demonstrated below:

[4]:
print("The Merged dataframe:")
print(DATAFRAMES["DF"].head(5))
print("\nThe Observed dataframe:")
print(DATAFRAMES["DF_OBSERVED"].head(5))
print("\nThe Simulated dataframe:")
print(DATAFRAMES["DF_SIMULATED"].head(5))
The Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
1980-01-01            15.0     940.934800             NaN     695.375700
1980-01-02            15.0    1000.471000             NaN     233.646000
1980-01-03            13.3     303.156100             NaN       5.282405
1980-01-04            11.8      13.658420             NaN       0.481292
1980-01-05            12.9       2.095571             NaN       0.298271

The Observed dataframe:
            QOMEAS_05BB001  QOMEAS_05BA001
1980-01-01            15.0             NaN
1980-01-02            15.0             NaN
1980-01-03            13.3             NaN
1980-01-04            11.8             NaN
1980-01-05            12.9             NaN

The Simulated dataframe:
            QOSIM_05BB001  QOSIM_05BA001
1980-01-01     940.934800     695.375700
1980-01-02    1000.471000     233.646000
1980-01-03     303.156100       5.282405
1980-01-04      13.658420       0.481292
1980-01-05       2.095571       0.298271
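
How the merged table relates to the other two can be sketched in plain pandas. The snippet below is only an illustration, not the library's implementation: it assumes that measured and simulated columns are told apart by the QOMEAS and QOSIM prefixes seen in the output above, and the three rows are copied from that output.

```python
import numpy as np
import pandas as pd

# Three rows copied from the merged output above (stations 05BB001 and 05BA001).
merged = pd.DataFrame(
    {
        "QOMEAS_05BB001": [15.0, 15.0, 13.3],
        "QOSIM_05BB001": [940.9348, 1000.471, 303.1561],
        "QOMEAS_05BA001": [np.nan, np.nan, np.nan],
        "QOSIM_05BA001": [695.3757, 233.646, 5.282405],
    },
    index=pd.date_range("1980-01-01", periods=3, freq="D"),
)

# Assumption: the column prefix distinguishes measured from simulated series.
observed = merged.filter(regex=r"^QOMEAS_")
simulated = merged.filter(regex=r"^QOSIM_")

print(observed.columns.tolist())
print(simulated.columns.tolist())
```

Both pieces keep the datetime index of the merged table, so they stay aligned row by row.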

You can also tell the function to skip the first few values by passing a value to its warm_up parameter, and you can specify a start and end date using the start_date and end_date parameters. These are useful when you want only a fixed period, such as a particular year, or everything after or before a particular date. A few examples are shown below:

[5]:
# assuming the simulation model needs 366 days (the first year) to warm up and account for errors during the learning phase.
DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=366)
The start date for the Data is 1981-01-01

Observe that the data now skips the entire first year and starts from 1981-01-01 as opposed to 1980-01-01.

[6]:
DATAFRAMES_till2009 = data.generate_dataframes(csv_fpaths=path, warm_up=366, end_date='2009-12-31')
print("\nThe End of the Merged dataframe:")
print(DATAFRAMES_till2009["DF"].tail(5))
The start date for the Data is 1981-01-01

The End of the Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
2009-12-27             NaN       4.114114             NaN       0.815359
2009-12-28             NaN       4.091105             NaN       0.810912
2009-12-29             NaN       4.068261             NaN       0.806497
2009-12-30             NaN       4.045577             NaN       0.802113
2009-12-31             NaN       4.023057             NaN       0.797758

Notice how it ends on 2009-12-31 as specified.

[7]:
DATAFRAMES_from1995 = data.generate_dataframes(csv_fpaths=path, warm_up=366, start_date='1995-01-01')
print("\nThe Start of the Observed dataframe:")
print(DATAFRAMES_from1995["DF_OBSERVED"].head(5))
The start date for the Data is 1995-01-01

The Start of the Observed dataframe:
            QOMEAS_05BB001  QOMEAS_05BA001
1995-01-01            8.37             NaN
1995-01-02           10.10             NaN
1995-01-03           12.20             NaN
1995-01-04           13.00             NaN
1995-01-05           13.20             NaN

Observe that the data now starts on 1995-01-01.

[8]:
DATAFRAMES_January2010 = data.generate_dataframes(csv_fpaths=path, warm_up=366, start_date='2010-01-01', end_date='2010-01-31')
print("\nThe Start of the Merged dataframe:")
print(DATAFRAMES_January2010["DF"].head(5))
print("\nThe End of the Merged dataframe:")
print(DATAFRAMES_January2010["DF"].tail(5))
The start date for the Data is 2010-01-01

The Start of the Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
2010-01-01             NaN       4.000698             NaN       0.793435
2010-01-02             NaN       3.978494             NaN       0.789141
2010-01-03             NaN       3.956450             NaN       0.784877
2010-01-04             NaN       3.934558             NaN       0.780643
2010-01-05             NaN       3.912824             NaN       0.776437

The End of the Merged dataframe:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
2010-01-27             NaN       3.471175             NaN       0.690854
2010-01-28             NaN       3.452648             NaN       0.687258
2010-01-29             NaN       3.434252             NaN       0.683687
2010-01-30             NaN       3.415977             NaN       0.680139
2010-01-31             NaN       3.397829             NaN       0.676615

Observe that the data now starts on January 1st 2010 and ends on January 31st 2010, as specified.

Note that when you specify both a warm up period and a start date, and the start date falls within the warm up period, the start date is pushed forward to the end of the warm up period, because the warm_up parameter takes precedence. For instance:

[9]:
# Here we set our start date to be somewhere within the 366-day warm up period. It will get overridden and
# the data will start from the end of the warm up period
DATAFRAMES_from_June_1980 = data.generate_dataframes(csv_fpaths=path, warm_up=366, start_date='1980-06-01')
print("\nThe Start of the Predicted Data: ")
print(DATAFRAMES_from_June_1980["DF_SIMULATED"].head(5))
The start date for the Data is 1981-01-01

The Start of the Predicted Data:
            QOSIM_05BB001  QOSIM_05BA001
1981-01-01       2.518999       1.001954
1981-01-02       2.507289       0.997078
1981-01-03       2.495637       0.992233
1981-01-04       2.484073       0.987417
1981-01-05       2.472571       0.982631

As you can observe, it starts from 1981 despite specifying a start date of June 1st 1980!
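
The precedence rule can be pictured with a small pandas sketch. This is a hypothetical helper (`trim` is not part of the library) that assumes the warm up is applied as a row count before the dates are applied; it only reproduces the behaviour seen in the outputs above on synthetic data.

```python
import pandas as pd

def trim(df, warm_up=0, start_date=None, end_date=None):
    """Drop the first `warm_up` rows, then keep only [start_date, end_date]."""
    trimmed = df.iloc[warm_up:]              # the warm up is applied first
    return trimmed.loc[start_date:end_date]  # None leaves that side open

# Synthetic daily series covering 1980 and 1981 (1980 is a leap year: 366 days).
index = pd.date_range("1980-01-01", "1981-12-31", freq="D")
df = pd.DataFrame({"QOSIM_station": range(len(index))}, index=index)

print(trim(df, warm_up=366).index[0])
print(trim(df, warm_up=366, start_date="1980-06-01").index[0])
print(trim(df, warm_up=366, start_date="1981-03-01").index[0])
```

Because the rows inside the warm up period are already gone when the dates are applied, a start date that falls inside it cannot bring them back.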

The three dataframes - merged, observed and simulated - form the backbone of the library. Every other function in the library uses one or more of these three dataframes to perform analysis, whether visual, descriptive or diagnostic.

DAILY AGGREGATION

This function returns the daily aggregate of the data passed into it. It aggregates using the method passed or, if one isn't given, defaults to mean. Most of the data already comes with daily time stamps, though, so this is one of the less frequently used functions. Its functionality is shown below:

[10]:
data.daily_aggregate(df=DATAFRAMES["DF_MERGED"])
[10]:
            Station1                Station2
              QOMEAS      QOSIM1      QOMEAS      QOSIM1
1981/001        9.85    2.518999         NaN    1.001954
1981/002       10.20    2.507289         NaN    0.997078
1981/003       10.00    2.495637         NaN    0.992233
1981/004       10.10    2.484073         NaN    0.987417
1981/005        9.99    2.472571         NaN    0.982631
...              ...         ...         ...         ...
2017/361         NaN    4.418050         NaN    1.380227
2017/362         NaN    4.393084         NaN    1.372171
2017/363         NaN    4.368303         NaN    1.364174
2017/364         NaN    4.343699         NaN    1.356237
2017/365         NaN    4.319275         NaN    1.348359

13514 rows × 4 columns

It returns a dataframe with one row per day, labelled by the year and the day of the year (1 to 365/366), for example 1981/001.
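
For data recorded more often than once a day, the effect can be sketched with pandas alone. The snippet below is an illustration on made-up 6-hourly values, not the library's code; the "%Y/%j" label format is an assumption chosen to match the index shown above.

```python
import pandas as pd

# Synthetic 6-hourly readings over two days (made-up values).
index = pd.date_range("1981-01-01", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({"flow": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}, index=index)

# "%Y/%j" gives the year and the zero-padded day of the year, e.g. "1981/001".
day_label = df.index.strftime("%Y/%j")
daily_mean = df.groupby(day_label).mean()

print(daily_mean)
```

With data that is already daily, each group holds a single value, which is why the aggregated table above repeats the original values.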

WEEKLY AGGREGATION

This function returns the weekly aggregate of the data passed into it. It aggregates using the method passed or, if one isn't given, defaults to mean. Its functionality is shown below:

[11]:
data.weekly_aggregate(df=DATAFRAMES["DF_MERGED"]) # default method of aggregation is mean
[11]:
              Station1                Station2
                QOMEAS      QOSIM1      QOMEAS      QOSIM1
1980-12-29   10.037500    2.501499         NaN    0.994671
1981-01-05    9.244286    2.438589         NaN    0.968506
1981-01-12    8.461429    2.361289         NaN    0.936405
1981-01-19    8.345714    2.287077         NaN    0.905643
1981-01-26    8.461429    2.215803         NaN    0.876150
...                ...         ...         ...         ...
2017-11-27         NaN    5.165260         NaN    1.621172
2017-12-04         NaN    4.958076         NaN    1.554878
2017-12-11         NaN    4.760181         NaN    1.490809
2017-12-18         NaN    4.572103         NaN    1.429983
2017-12-25         NaN    4.393448         NaN    1.372291

1931 rows × 4 columns

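It returns a dataframe with one row per week. In the output above, each row is labelled with the date of the Monday that starts its week, so the first row (1980-12-29) covers only the four days of data from 1981-01-01 to 1981-01-04.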

[12]:
data.weekly_aggregate(df=DATAFRAMES["DF_MERGED"], method='sum') # here we aggregate by summing up all the values of the week.
[12]:
              Station1                Station2
                QOMEAS      QOSIM1      QOMEAS      QOSIM1
1980-12-29       40.15   10.005998         0.0    3.978683
1981-01-05       64.71   17.070122         0.0    6.779542
1981-01-12       59.23   16.529020         0.0    6.554838
1981-01-19       58.42   16.009540         0.0    6.339498
1981-01-26       59.23   15.510619         0.0    6.133052
...                ...         ...         ...         ...
2017-11-27        0.00   36.156822         0.0   11.348205
2017-12-04        0.00   34.706534         0.0   10.884148
2017-12-11        0.00   33.321266         0.0   10.435665
2017-12-18        0.00   32.004719         0.0   10.009884
2017-12-25        0.00   30.754137         0.0    9.606034

1931 rows × 4 columns
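
Note that the weekly sums above show 0.0 where the weekly means showed NaN. That is how a pandas sum treats groups containing only missing values, as the sketch below shows on synthetic data. The Monday-based grouping is an assumption chosen to mirror the index above, and the snippet is not the library's own code.

```python
import numpy as np
import pandas as pd

index = pd.date_range("1981-01-01", periods=11, freq="D")
df = pd.DataFrame(
    {"with_data": np.arange(1.0, 12.0), "all_missing": np.full(11, np.nan)},
    index=index,
)

# Label every row with the Monday that starts its Monday-to-Sunday week.
week_start = df.index.to_period("W").start_time

weekly_sum = df.groupby(week_start).sum()                    # NaN-only weeks become 0.0
weekly_sum_strict = df.groupby(week_start).sum(min_count=1)  # NaN-only weeks stay NaN

print(weekly_sum)
print(weekly_sum_strict)
```

A zero in a summed column can therefore mean "no observations" rather than "no flow", so it is worth checking the mean or the raw data before interpreting it.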

YEARLY AGGREGATION

This function returns the yearly aggregate of the data passed into it. It aggregates using the method passed or, if one isn't given, defaults to mean. Its functionality is shown below:

[13]:
data.yearly_aggregate(df=DATAFRAMES["DF_MERGED"]) # default method of aggregation is mean
[13]:
           Station1                Station2
             QOMEAS      QOSIM1      QOMEAS      QOSIM1
1981-01    8.815806    2.352880         NaN    0.932961
1981-02    7.780000    2.060198         NaN    0.811975
1981-03    6.896129    1.812920         NaN    0.710498
1981-04    7.981333    3.132911         NaN    0.821663
1981-05   43.538710   50.736276   10.111333   15.202072
...             ...         ...         ...         ...
2017-08         NaN   32.222317         NaN   17.704763
2017-09         NaN   28.141430         NaN   14.315134
2017-10         NaN    7.698483         NaN    2.615914
2017-11         NaN    5.625516         NaN    1.770196
2017-12         NaN    4.712923         NaN    1.475527

444 rows × 4 columns

It returns a dataframe indexed yearly from the first year in your data till the last year.

[14]:
data.yearly_aggregate(df=DATAFRAMES["DF_MERGED"], method='median') # here we aggregate by finding the median of the values each year.
[14]:
           Station1                Station2
             QOMEAS      QOSIM1      QOMEAS      QOSIM1
1981-01       8.590    2.350375         NaN    0.931876
1981-02       7.825    2.058540         NaN    0.811263
1981-03       6.860    1.811243         NaN    0.709789
1981-04       7.400    1.720303         NaN    0.634707
1981-05      15.800    5.115736       5.165    1.589602
...             ...         ...         ...         ...
2017-08         NaN   31.761380         NaN   17.606670
2017-09         NaN   25.176370         NaN   12.297040
2017-10         NaN    6.744768         NaN    2.161909
2017-11         NaN    5.622982         NaN    1.767848
2017-12         NaN    4.705044         NaN    1.472965

444 rows × 4 columns

MONTHLY AGGREGATION

This function returns the monthly aggregate of the data passed into it. It aggregates using the method passed or, if one isn't given, defaults to mean. Its functionality is shown below:

[15]:
data.monthly_aggregate(df=DATAFRAMES["DF_MERGED"]) # default method of aggregation is mean
[15]:
           Station1                Station2
             QOMEAS      QOSIM1      QOMEAS      QOSIM1
1981-01    8.815806    2.352880         NaN    0.932961
1981-02    7.780000    2.060198         NaN    0.811975
1981-03    6.896129    1.812920         NaN    0.710498
1981-04    7.981333    3.132911         NaN    0.821663
1981-05   43.538710   50.736276   10.111333   15.202072
...             ...         ...         ...         ...
2017-08         NaN   32.222317         NaN   17.704763
2017-09         NaN   28.141430         NaN   14.315134
2017-10         NaN    7.698483         NaN    2.615914
2017-11         NaN    5.625516         NaN    1.770196
2017-12         NaN    4.712923         NaN    1.475527

444 rows × 4 columns

It returns a dataframe with one row per month, labelled by year and month (for example 1981-01).

[16]:
data.monthly_aggregate(df=DATAFRAMES["DF_MERGED"], method='max') # here we aggregate by returning the maximum value that month
[16]:
           Station1                Station2
             QOMEAS      QOSIM1      QOMEAS      QOSIM1
1981-01       10.20    2.518999         NaN    1.001954
1981-02        8.51    2.186006         NaN    0.863836
1981-03        7.50    1.931955         NaN    0.759228
1981-04       15.30   17.241560         NaN    3.692734
1981-05      165.00  220.485800        36.6   97.054240
...             ...         ...         ...         ...
2017-08         NaN   41.657390         NaN   21.751910
2017-09         NaN  116.866500         NaN   53.925540
2017-10         NaN   19.746100         NaN    8.157081
2017-11         NaN    6.091304         NaN    1.927400
2017-12         NaN    5.134585         NaN    1.611400

444 rows × 4 columns
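
The role of the method argument can be sketched with a small pandas helper. `aggregate_by_month` is a hypothetical name used only for this illustration, the data is synthetic, and nothing here is taken from the library's source.

```python
import numpy as np
import pandas as pd

def aggregate_by_month(df, method="mean"):
    """Group rows by calendar month and reduce each group with `method`."""
    months = df.index.to_period("M")  # e.g. Period('1981-01', 'M')
    return df.groupby(months).agg(method)

# Synthetic daily values for January and February 1981 (made-up numbers).
index = pd.date_range("1981-01-01", "1981-02-28", freq="D")
df = pd.DataFrame({"flow": np.arange(1.0, len(index) + 1)}, index=index)

monthly_mean = aggregate_by_month(df)
monthly_max = aggregate_by_month(df, method="max")

print(monthly_mean)
print(monthly_max)
```

Grouping by a monthly period is what produces labels of the form 1981-01; swapping the method string changes only how the values inside each month are reduced.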

STATISTICS AGGREGATION

This allows us to calculate the aggregate of all the simulations across the dataframe for every datetime index. It aggregates using the method passed or, if one isn't given, defaults to mean. Unlike the daily, weekly, monthly and yearly aggregations above, this one can also perform quantile calculations. Its functionality is shown below:

[17]:
data.stat_aggregate(df=DATAFRAMES["DF_MERGED"], method='max')
[17]:
              Station1    Station2
                   MAX         MAX
1981-01-01    2.518999    1.001954
1981-01-02    2.507289    0.997078
1981-01-03    2.495637    0.992233
1981-01-04    2.484073    0.987417
1981-01-05    2.472571    0.982631
...                ...         ...
2017-12-27    4.418050    1.380227
2017-12-28    4.393084    1.372171
2017-12-29    4.368303    1.364174
2017-12-30    4.343699    1.356237
2017-12-31    4.319275    1.348359

13514 rows × 2 columns

[18]:
data.stat_aggregate(df=DATAFRAMES["DF_MERGED"], method='q75')
[18]:
              Station1    Station2
                   Q75         Q75
1981-01-01    2.518999    1.001954
1981-01-02    2.507289    0.997078
1981-01-03    2.495637    0.992233
1981-01-04    2.484073    0.987417
1981-01-05    2.472571    0.982631
...                ...         ...
2017-12-27    4.418050    1.380227
2017-12-28    4.393084    1.372171
2017-12-29    4.368303    1.364174
2017-12-30    4.343699    1.356237
2017-12-31    4.319275    1.348359

13514 rows × 2 columns
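
The MAX and Q75 tables above are identical, which is what one would expect when each station has a single simulation: any statistic of one value is that value. The sketch below illustrates the idea on synthetic data with two simulations for one station. `stat_across_sims` is a hypothetical helper, and the "q75"-style parsing is an assumption modelled on the method strings used in this tutorial.

```python
import pandas as pd

index = pd.date_range("1981-01-01", periods=3, freq="D")
sims = pd.DataFrame(
    {
        ("Station1", "QOSIM1"): [2.0, 4.0, 6.0],
        ("Station1", "QOSIM2"): [4.0, 8.0, 10.0],
        ("Station2", "QOSIM1"): [1.0, 1.0, 1.0],  # only one simulation here
    },
    index=index,
)

def stat_across_sims(df, method):
    """Aggregate across the simulation columns of each station, row by row."""
    pieces = {}
    for station in df.columns.get_level_values(0).unique():
        block = df[station]  # all simulations for this station
        if method.lower().startswith("q"):
            # "q75" is read as the 0.75 quantile
            pieces[station] = block.quantile(float(method[1:]) / 100, axis=1)
        else:
            pieces[station] = block.agg(method, axis=1)
    return pd.DataFrame(pieces)

print(stat_across_sims(sims, "max"))
print(stat_across_sims(sims, "q75"))
```

For Station1 the two statistics differ because there are two simulations to compare; for Station2 they coincide, as in the outputs above.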

PERIODIC/SEASONAL AGGREGATION

This allows us to return a specific period of time for every year, or for a select few years, within a data set. It essentially allows you to analyse a season or period every year without having to look through every day. For example, let's say we want to isolate what the streamflow was like on the first 2 days of January every year. We would write:

[19]:
data.seasonal_period(df=DATAFRAMES["DF"], daily_period=('01-01', '01-02'))
[19]:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
1981-01-01            9.85       2.518999             NaN       1.001954
1981-01-02           10.20       2.507289             NaN       0.997078
1982-01-01            7.17       5.465301             NaN       2.429704
1982-01-02            7.02       5.433753             NaN       2.414755
1983-01-01            8.98       5.371416             NaN       2.441398
...                    ...            ...             ...            ...
2015-01-02             NaN       6.944578             NaN       1.615503
2016-01-01             NaN       3.686424             NaN       1.005240
2016-01-02             NaN       3.666536             NaN       0.999701
2017-01-01             NaN       2.700768             NaN       0.872387
2017-01-02             NaN       2.687161             NaN       0.867826

74 rows × 4 columns

Observe that for every year, we only get the first two days of the year. We can also restrict the result to specific years instead of returning every year. For example, we can get the predicted values for the first week of summer for the years 1999, 2001 and 2005 as shown below:

[20]:
data.seasonal_period(df=DATAFRAMES["DF_SIMULATED"], daily_period=('06-21', '06-28'), years=[1999, 2001, 2005])
[20]:
            QOSIM_05BB001  QOSIM_05BA001
1999-06-21      155.64250      127.26980
1999-06-22      147.19820       74.31406
1999-06-23       80.06104       34.51432
1999-06-24       45.07025       38.66659
1999-06-25      129.59160      104.75280
1999-06-26      156.33070       51.86972
1999-06-27      166.94590       62.61042
1999-06-28       85.17047       27.95974
2001-06-21       42.08820       32.47492
2001-06-22       44.51981       29.40924
2001-06-23       41.09985       29.90357
2001-06-24       42.13543       39.85459
2001-06-25       94.21543      135.35250
2001-06-26      127.62240       35.24305
2001-06-27       39.27969       39.95149
2001-06-28       77.72211       67.53999
2005-06-21       51.65529       24.16283
2005-06-22       51.92735       28.55426
2005-06-23       94.62647       35.51184
2005-06-24       40.85018       16.88546
2005-06-25       90.53637      109.68720
2005-06-26      189.87700       46.47357
2005-06-27       67.25498       43.57297
2005-06-28      157.66590       80.75896

As you can see, we are able to get the first week of summer for those 3 years.
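
The selection itself can be sketched in plain pandas by comparing month-day strings. `select_period` is a hypothetical helper for illustration only; it assumes the period does not wrap around the end of the year and is not the library's implementation.

```python
import pandas as pd

def select_period(df, daily_period, years=None):
    """Keep rows whose month-day lies in daily_period, optionally for given years."""
    start, end = daily_period
    # Zero-padded "MM-DD" strings sort in calendar order, so string comparison works.
    month_day = df.index.strftime("%m-%d")
    mask = (month_day >= start) & (month_day <= end)
    if years is not None:
        mask &= df.index.year.isin(years)
    return df[mask]

# Synthetic daily series for 1999 to 2001 (made-up values).
index = pd.date_range("1999-01-01", "2001-12-31", freq="D")
df = pd.DataFrame({"QOSIM_station": range(len(index))}, index=index)

first_days = select_period(df, ("01-01", "01-02"))
summer_week = select_period(df, ("06-21", "06-28"), years=[1999, 2001])

print(first_days)
print(summer_week.index.year.unique().tolist())
```

Both ends of the period are inclusive in this sketch, so ('06-21', '06-28') yields eight rows per year, matching the output above.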

LONG TERM AGGREGATION

This allows us to compute the long-term seasonal aggregate values of a given DataFrame by applying the specified aggregation method to each day across all years in the provided time period. The resulting data is aggregated into a single year (1 to 365/366 days). This way we are able to see how the models perform year in, year out compared to the actual recorded data - both aggregated as necessary. An example is shown below:

[21]:
data.long_term_seasonal(df=DATAFRAMES["DF"]) # As usual the default aggregation method is mean/average
[21]:
      QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
jday
1           9.446471       4.037666             NaN       1.130686
2           9.428125       4.014474             NaN       1.123915
3           9.660625       3.991451             NaN       1.117196
4           9.804375       3.968602             NaN       1.110529
5           9.787500       3.945921             NaN       1.103913
...              ...            ...             ...            ...
362         9.942500       4.188140             NaN       1.169614
363         9.695000       4.163847             NaN       1.162533
364         9.633125       4.139735             NaN       1.155507
365         9.516875       4.115805             NaN       1.148535
366         9.870000       4.433329             NaN       1.191653

366 rows × 4 columns

We are also able to calculate quantiles. For example, the 75th percentile value for all years aggregated into a single year looks like:

[22]:
data.long_term_seasonal(df=DATAFRAMES["DF_SIMULATED"], method = 'Q75')
[22]:
      QOSIM_05BB001  QOSIM_05BA001
jday
1          4.830453       1.315370
2          4.801530       1.306986
3          4.772831       1.298670
4          4.744344       1.290422
5          4.716085       1.282241
...             ...            ...
362        4.978491       1.372171
363        4.948421       1.364174
364        4.918590       1.356237
365        4.888982       1.348359
366        4.859608       1.323824

366 rows × 2 columns
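
The idea of folding all years onto a single year can be sketched with a pandas group-by on the day of the year. The snippet is an illustration on synthetic data (each row simply stores its own year), not the library's implementation.

```python
import pandas as pd

# Four years of daily data; 1984 is a leap year, so it alone has a day 366.
index = pd.date_range("1981-01-01", "1984-12-31", freq="D")
df = pd.DataFrame({"QOSIM_station": index.year.to_numpy(dtype=float)}, index=index)

# Group every row by its day of the year (1 to 365/366) across all years.
jday = df.index.dayofyear.rename("jday")
long_term_mean = df.groupby(jday).mean()
long_term_q75 = df.groupby(jday).quantile(0.75)

print(long_term_mean.head(2))
print(long_term_mean.tail(2))
```

Day 366 is built from leap years only, so it rests on far fewer values than the other days. That is a plausible reason why the last row in the outputs above breaks the trend of the rows before it.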

Naturally, when dealing with multi-model evaluations, we are able to perform statistics on the output of the long term seasonal aggregations. This way we are able to extract the mean, median, max, etc. of the long term seasonal aggregations, leaving us with just the statistics. An example of this is shown below:

[23]:
data.stat_aggregate(df=data.long_term_seasonal(df=DATAFRAMES["DF_MERGED"], method = 'median'), method='median')
[23]:
        Station1    Station2
          MEDIAN      MEDIAN
jday
1       3.636044    1.001954
2       3.616069    0.997078
3       3.596241    0.992233
4       3.576548    0.987417
5       3.556993    0.982631
...          ...         ...
362     3.767376    1.027792
363     3.746928    1.022093
364     3.726621    1.016435
365     3.706450    1.010818
366     4.164878    1.192406

366 rows × 2 columns

Note

All of these functions, with their various means of aggregation, are available as individual functions, but they can also be generated right from the generate_dataframes() function if you know exactly what you'll need from the beginning. It just requires specifying a few more parameters. These parameters are shown below:

[24]:
## Let's use a time period of 1981 to 1990 to demonstrate this
DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=365, start_date = "1981-01-01", end_date = "1990-12-31",
                                      # optional arguments
                                      # you specify that you want an aggregated dataframe by passing 'True' into
                                      # the respective parameter and then you pass in your preferred method of aggregation

                                      # If you want daily aggregation
                                      daily_agg = True, da_method = 'min',
                                      # lets see a weekly aggregation
                                      weekly_agg = True, wa_method = 'min', # we want the minimum value each week
                                      # lets also see monthly aggregation
                                      monthly_agg = True, ma_method = 'inst', # 'inst' keeps an instantaneous value each month (the last one, judging by the DF_MONTHLY output), not the maximum
                                      # lets also see yearly aggregation
                                      yearly_agg = True, ya_method = 'sum', # we want the sum of all values each year
                                      # lets see the stats aggregation
                                      stat_agg = True, stat_method = 'q75'
                                      # note that without inputting the respective methods,
                                      # the functions will still default to mean as the method of aggregation
                                     )
The start date for the Data is 1981-01-01
[25]:
DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=365, start_date = "1981-01-01", end_date = "1990-12-31",
                                      # seasonal aggregation
                                      # obtaining May 1st till August 30th of every year from 1981 to 1985
                                      seasonal_p = True, sp_dperiod = ('05-01', '08-30'),
                                      sp_subset = ('1981-01-01', '1985-12-31'),
                                      # instead of sp_subset, we can also use years = [1981, 1982, 1983, 1984, 1985].

                                      # long term seasonal aggregation
                                      long_term = True, lt_method = ["q33.33", "median" ,'q75' ,'Q25' ,'q33' ],
                                      # when using long term in the generate_dataframes function, we are able to pass
                                      # in a list of methods of aggregation we want generated. By default though it will
                                      # always generate maximum, minimum and median value dataframes
                                     )
The start date for the Data is 1981-01-01

Putting it all together, we have:

[26]:
DATAFRAMES = data.generate_dataframes(csv_fpaths=path, warm_up=365, start_date = "1981-01-01", end_date = "1990-12-31",
                                      daily_agg = True, da_method = 'min',
                                      weekly_agg = True, wa_method = 'min',
                                      monthly_agg = True, ma_method = 'inst',
                                      yearly_agg = True, ya_method = 'sum',
                                      stat_agg = True, stat_method = 'q75',
                                      seasonal_p = True, sp_dperiod = ('05-01', '08-30'), sp_subset = ('1981-01-01', '1985-12-31'),
                                      long_term = True, lt_method = ["q33.33", "median" ,'q75' ,'Q25' ,'q33' ],
                                     )


for key, value in DATAFRAMES.items():
    print(f"{key}:\n{value}")
The start date for the Data is 1981-01-01
DF:
            QOMEAS_05BB001  QOSIM_05BB001  QOMEAS_05BA001  QOSIM_05BA001
1981-01-01            9.85       2.518999             NaN       1.001954
1981-01-02           10.20       2.507289             NaN       0.997078
1981-01-03           10.00       2.495637             NaN       0.992233
1981-01-04           10.10       2.484073             NaN       0.987417
1981-01-05            9.99       2.472571             NaN       0.982631
...                    ...            ...             ...            ...
1990-12-27           10.10       6.615961             NaN       1.737144
1990-12-28            9.50       6.573054             NaN       1.725025
1990-12-29            8.60       6.530500             NaN       1.713013
1990-12-30            8.20       6.488300             NaN       1.701107
1990-12-31            8.25       6.446449             NaN       1.689308

[3652 rows x 4 columns]
DF_OBSERVED:
            QOMEAS_05BB001  QOMEAS_05BA001
1981-01-01            9.85             NaN
1981-01-02           10.20             NaN
1981-01-03           10.00             NaN
1981-01-04           10.10             NaN
1981-01-05            9.99             NaN
...                    ...             ...
1990-12-27           10.10             NaN
1990-12-28            9.50             NaN
1990-12-29            8.60             NaN
1990-12-30            8.20             NaN
1990-12-31            8.25             NaN

[3652 rows x 2 columns]
DF_SIMULATED:
            QOSIM_05BB001  QOSIM_05BA001
1981-01-01       2.518999       1.001954
1981-01-02       2.507289       0.997078
1981-01-03       2.495637       0.992233
1981-01-04       2.484073       0.987417
1981-01-05       2.472571       0.982631
...                   ...            ...
1990-12-27       6.615961       1.737144
1990-12-28       6.573054       1.725025
1990-12-29       6.530500       1.713013
1990-12-30       6.488300       1.701107
1990-12-31       6.446449       1.689308

[3652 rows x 2 columns]
DF_MERGED:
           Station1           Station2
             QOMEAS    QOSIM1   QOMEAS    QOSIM1
1981-01-01     9.85  2.518999      NaN  1.001954
1981-01-02    10.20  2.507289      NaN  0.997078
1981-01-03    10.00  2.495637      NaN  0.992233
1981-01-04    10.10  2.484073      NaN  0.987417
1981-01-05     9.99  2.472571      NaN  0.982631
...             ...       ...      ...       ...
1990-12-27    10.10  6.615961      NaN  1.737144
1990-12-28     9.50  6.573054      NaN  1.725025
1990-12-29     8.60  6.530500      NaN  1.713013
1990-12-30     8.20  6.488300      NaN  1.701107
1990-12-31     8.25  6.446449      NaN  1.689308

[3652 rows x 4 columns]
DF_DAILY:
         Station1           Station2
           QOMEAS    QOSIM1   QOMEAS    QOSIM1
1981/001     9.85  2.518999      NaN  1.001954
1981/002    10.20  2.507289      NaN  0.997078
1981/003    10.00  2.495637      NaN  0.992233
1981/004    10.10  2.484073      NaN  0.987417
1981/005     9.99  2.472571      NaN  0.982631
...           ...       ...      ...       ...
1990/361    10.10  6.615961      NaN  1.737144
1990/362     9.50  6.573054      NaN  1.725025
1990/363     8.60  6.530500      NaN  1.713013
1990/364     8.20  6.488300      NaN  1.701107
1990/365     8.25  6.446449      NaN  1.689308

[3652 rows x 4 columns]
DF_WEEKLY:
           Station1           Station2
             QOMEAS    QOSIM1   QOMEAS    QOSIM1
1980-12-29     9.85  2.484073      NaN  0.987417
1981-01-05     8.70  2.404939      NaN  0.954524
1981-01-12     8.24  2.328990      NaN  0.923008
1981-01-19     7.86  2.256059      NaN  0.892801
1981-01-26     8.10  2.186006      NaN  0.863836
...             ...       ...      ...       ...
1990-12-03    10.80  7.453629      NaN  1.975199
1990-12-10     8.70  7.112537      NaN  1.877937
1990-12-17     8.10  6.791226      NaN  1.786726
1990-12-24     8.20  6.488300      NaN  1.701107
1990-12-31     8.25  6.446449      NaN  1.689308

[523 rows x 4 columns]
DF_MONTHLY:
        Station1             Station2
          QOMEAS      QOSIM1   QOMEAS     QOSIM1
1981-01     8.62    2.195846      NaN   0.867900
1981-02     7.20    1.940355      NaN   0.762678
1981-03     7.25    1.699932      NaN   0.664341
1981-04    15.30    3.859564      NaN   0.584523
1981-05   113.00  220.485800    28.20  96.363520
...          ...         ...      ...        ...
1990-08    33.60   40.431200    10.10  23.856810
1990-09    90.90   19.438340    30.50   6.175078
1990-10    21.10    9.648046     4.01   2.642092
1990-11    12.00    7.920140     3.90   2.109938
1990-12     8.25    6.446449      NaN   1.689308

[120 rows x 4 columns]
DF_YEARLY:
        Station1              Station2
          QOMEAS       QOSIM1   QOMEAS      QOSIM1
1981-01   273.29    72.939293     0.00   28.921777
1981-02   217.84    57.685555     0.00   22.735300
1981-03   213.78    56.200513     0.00   22.025424
1981-04   239.44    93.987320     0.00   24.649901
1981-05  1349.70  1572.824547   303.34  471.264224
...          ...          ...      ...         ...
1990-08  1541.80  1520.330940   550.40  758.817000
1990-09  1007.10   855.418340   269.57  408.246703
1990-10  1035.10   361.582897   277.26   95.083195
1990-11   460.12   261.980893     3.90   70.673194
1990-12   324.60   220.982680     0.00   58.369321

[120 rows x 4 columns]
DF_CUSTOM:
           Station1            Station2
             QOMEAS     QOSIM1   QOMEAS     QOSIM1
1981-05-01     13.8   4.924126      NaN   0.587910
1981-05-02     12.0   4.199440     3.01   0.592167
1981-05-03     10.9   2.605448     2.85   0.580767
1981-05-04     10.3   2.898053     2.72   0.582108
1981-05-05     10.4   3.357134     2.80   0.578170
...             ...        ...      ...        ...
1985-08-26     45.9  22.102120    13.00  13.373290
1985-08-27     43.6  21.293030    12.50  12.981310
1985-08-28     42.5  21.262790    12.40  13.329020
1985-08-29     41.6  21.753170    12.80  13.696490
1985-08-30     42.0  22.593580    13.20  16.211630

[610 rows x 4 columns]
LONG_TERM_MIN:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1        7.17  1.802230      NaN  0.485006
2        7.02  1.794152      NaN  0.482805
3        7.10  1.786115      NaN  0.480617
4        7.31  1.778133      NaN  0.478442
5        7.64  1.770190      NaN  0.476279
...       ...       ...      ...       ...
362      7.25  1.835014      NaN  0.493944
363      7.29  1.826749      NaN  0.491689
364      7.27  1.818528      NaN  0.489450
365      7.21  1.810352      NaN  0.487221
366     10.30  2.939887      NaN  0.782759

[366 rows x 4 columns]
LONG_TERM_MAX:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1        12.7  5.465301      NaN  2.441398
2        12.7  5.433753      NaN  2.426411
3        12.7  5.402424      NaN  2.411540
4        12.7  5.371336      NaN  2.396784
5        12.8  5.340471      NaN  2.382142
...       ...       ...      ...       ...
362      13.3  6.573054      NaN  2.502534
363      13.0  6.530500      NaN  2.487069
364      12.8  6.488300      NaN  2.471726
365      12.7  6.446449      NaN  2.456503
366      11.3  3.132862      NaN  0.925092

[366 rows x 4 columns]
LONG_TERM_MEDIAN:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1       9.795  3.397196      NaN  0.961191
2       9.815  3.379185      NaN  0.956437
3       9.855  3.361293      NaN  0.951713
4       9.875  3.343524      NaN  0.947018
5       9.695  3.325880      NaN  0.942354
...       ...       ...      ...       ...
362    10.350  3.698825      NaN  1.020910
363     9.490  3.678319      NaN  1.015117
364     9.070  3.657953      NaN  1.009368
365     9.350  3.637736      NaN  1.003660
366    10.800  3.036374      NaN  0.853926

[366 rows x 4 columns]
LONG_TERM_Q33.33:
       Station1           Station2
         QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1      9.519838  3.116782      NaN  0.883651
2      9.419919  3.100879      NaN  0.878571
3      9.599898  3.085059      NaN  0.873528
4      9.709991  3.069359      NaN  0.868522
5      9.400000  3.053748      NaN  0.863553
...         ...       ...      ...       ...
362    9.499817  3.373855      NaN  0.904356
363    8.849961  3.356529      NaN  0.899122
364    8.949970  3.339308      NaN  0.893926
365    8.919976  3.322208      NaN  0.888770
366   10.633300  3.004206      NaN  0.830199

[366 rows x 4 columns]
LONG_TERM_Q75:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1     10.0375  4.156184      NaN  1.248870
2     10.2000  4.132273      NaN  1.242024
3     10.6500  4.108537      NaN  1.235226
4     10.3250  4.084971      NaN  1.228475
5     10.5225  4.061583      NaN  1.221772
...       ...       ...      ...       ...
362   11.2500  5.222542      NaN  1.627869
363   11.0000  5.192005      NaN  1.617079
364   10.8000  5.161703      NaN  1.606380
365   10.4950  5.131630      NaN  1.595775
366   11.0500  3.084618      NaN  0.889509

[366 rows x 4 columns]
LONG_TERM_Q25:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1      9.1150  2.972452      NaN  0.881223
2      9.2175  2.956880      NaN  0.876179
3      9.3450  2.941408      NaN  0.871171
4      9.6875  2.926045      NaN  0.866200
5      9.4000  2.910776      NaN  0.861267
...       ...       ...      ...       ...
362    9.0425  3.241984      NaN  0.901784
363    8.7525  3.225315      NaN  0.896586
364    8.8750  3.208758      NaN  0.891427
365    8.8600  3.192306      NaN  0.886306
366   10.5500  2.988131      NaN  0.818343

[366 rows x 4 columns]
LONG_TERM_Q33:
     Station1           Station2
       QOMEAS    QOSIM1   QOMEAS    QOSIM1
jday
1      9.5038  3.111064      NaN  0.883555
2      9.4119  3.095175      NaN  0.878476
3      9.5898  3.079368      NaN  0.873434
4      9.7091  3.063681      NaN  0.868430
5      9.4000  3.048084      NaN  0.863463
...       ...       ...      ...       ...
362    9.4817  3.368631      NaN  0.904254
363    8.8461  3.351331      NaN  0.899021
364    8.9470  3.334136      NaN  0.893827
365    8.9176  3.317062      NaN  0.888672
366   10.6300  3.003569      NaN  0.829729

[366 rows x 4 columns]
DF_STATS:
            Station1                                Station2            \
                 MIN       MAX    MEDIAN       Q75       MIN       MAX
1981-01-01  2.518999  2.518999  2.518999  2.518999  1.001954  1.001954
1981-01-02  2.507289  2.507289  2.507289  2.507289  0.997078  0.997078
1981-01-03  2.495637  2.495637  2.495637  2.495637  0.992233  0.992233
1981-01-04  2.484073  2.484073  2.484073  2.484073  0.987417  0.987417
1981-01-05  2.472571  2.472571  2.472571  2.472571  0.982631  0.982631
...              ...       ...       ...       ...       ...       ...
1990-12-27  6.615961  6.615961  6.615961  6.615961  1.737144  1.737144
1990-12-28  6.573054  6.573054  6.573054  6.573054  1.725025  1.725025
1990-12-29  6.530500  6.530500  6.530500  6.530500  1.713013  1.713013
1990-12-30  6.488300  6.488300  6.488300  6.488300  1.701107  1.701107
1990-12-31  6.446449  6.446449  6.446449  6.446449  1.689308  1.689308


              MEDIAN       Q75
1981-01-01  1.001954  1.001954
1981-01-02  0.997078  0.997078
1981-01-03  0.992233  0.992233
1981-01-04  0.987417  0.987417
1981-01-05  0.982631  0.982631
...              ...       ...
1990-12-27  1.737144  1.737144
1990-12-28  1.725025  1.725025
1990-12-29  1.713013  1.713013
1990-12-30  1.701107  1.701107
1990-12-31  1.689308  1.689308

[3652 rows x 8 columns]