filter_valid_data

postprocessinglib.utilities._helper_functions.filter_valid_data(df: DataFrame, station_num: int = 0, station: str = '') → DataFrame

Removes invalid values from a dataframe.

Invalid here means NaN, negative, or infinite. The function checks the individual column identified by station_num or station and removes every row in which that column holds a NaN, negative, or infinite value.
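
The filtering rule described above can be approximated in plain pandas. This is only a minimal sketch, not the library's implementation: the name drop_invalid_rows is hypothetical, and it assumes that a non-empty station selects the column by name, that station_num is otherwise a zero-based column position, and that zero counts as a valid value.

```python
import numpy as np
import pandas as pd


def drop_invalid_rows(df: pd.DataFrame, station_num: int = 0, station: str = "") -> pd.DataFrame:
    """Hypothetical stand-in for filter_valid_data; not the library's code."""
    # Assumption: a non-empty `station` names the column directly; otherwise
    # `station_num` is treated as a zero-based column position.
    column = station if station else df.columns[station_num]
    values = df[column]
    # np.isfinite is False for NaN, +inf and -inf; `>= 0` rejects negatives.
    # Assumption: zero is kept as a valid value.
    keep = np.isfinite(values) & (values >= 0)
    return df[keep]


# Small made-up dataframe covering each kind of invalid value
index = np.arange(1981, 1987)
demo = pd.DataFrame({"obs1": [np.nan, 0.33, -0.25, np.inf, 0.0, 0.62]}, index=index)
result = drop_invalid_rows(demo, station="obs1")
print(result)
```

Only the rows for 1982, 1985 and 1986 remain in result, because those are the only rows whose obs1 value is finite and non-negative.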

Parameters:
  • df (pd.DataFrame) – the dataframe from which to remove invalid values

  • station_num (int) – the number of the station whose values are to be filtered

  • station (str = "") – the name of the column representing the station from which to remove invalid values

Returns:

the modified input dataframe with rows containing NaN, negative and inf values removed.

Return type:

pd.DataFrame

Example

>>> import numpy as np
>>> import pandas as pd
>>> from postprocessinglib.utilities import _helper_functions
>>> # Create your index as an array
>>> index = np.array([1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990])
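>>> # Create the example data (rounded to the six decimals shown in the output)
>>> data = np.array([np.nan, 0.332201, 0.251259, 0.620732, -np.inf, 0.013643, np.nan, 0.115222, 0.434341, np.nan])
>>> # Create a test dataframe
>>> test_df = pd.DataFrame(data=data, columns=["obs1"], index=index)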
>>> print(test_df)
          obs1
1981       NaN
1982  0.332201
1983  0.251259
1984  0.620732
1985      -inf
1986  0.013643
1987       NaN
1988  0.115222
1989  0.434341
1990       NaN
>>> # Filter out the rows with invalid values
>>> valid_data = _helper_functions.filter_valid_data(df=test_df, station_num=0)
>>> print(valid_data)
          obs1
1982  0.332201
1983  0.251259
1984  0.620732
1986  0.013643
1988  0.115222
1989  0.434341
>>> # Observe how all the 'invalid' values in obs1 have been removed.