filter_valid_data
- postprocessinglib.utilities._helper_functions.filter_valid_data(df: DataFrame, station_num: int = 0, station: str = '') → DataFrame
Removes the invalid values from a dataframe.
Invalid here means NaN, negative, or infinite. The function checks the column identified by station_num or station for NaN, negative, or infinite values and removes the rows that contain them.
- Parameters:
df (pd.DataFrame) – the dataframe from which to remove invalid values
station_num (int) – the number of the station whose values are to be filtered
station (str = "") – the name of the column representing the station from which to remove invalid values
- Returns:
the modified input dataframe with rows containing NaN, negative and inf values removed.
- Return type:
pd.DataFrame
Example
>>> import numpy as np
>>> import pandas as pd
>>> from postprocessinglib.utilities import _helper_functions
>>> # Create your index as an array
>>> index = np.array([1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990])
>>> # Create the data (values taken from the printed dataframe below)
>>> data = np.array([np.nan, 0.332201, 0.251259, 0.620732, -np.inf,
...                  0.013643, np.nan, 0.115222, 0.434341, np.nan])
>>> # Create a test dataframe
>>> test_df = pd.DataFrame(data=data, columns=["obs1"], index=index)
>>> print(test_df)
          obs1
1981       NaN
1982  0.332201
1983  0.251259
1984  0.620732
1985      -inf
1986  0.013643
1987       NaN
1988  0.115222
1989  0.434341
1990       NaN
>>> valid_data = _helper_functions.filter_valid_data(df=test_df, station_num=0)
>>> print(valid_data)
          obs1
1982  0.332201
1983  0.251259
1984  0.620732
1986  0.013643
1988  0.115222
1989  0.434341
>>> # Observe how all the 'invalid' values in obs1 have been removed.
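The row filtering described for filter_valid_data can be approximated in plain pandas. The sketch below is not the library's implementation: filter_valid_rows is a hypothetical stand-in, and it assumes that an integer argument selects a column by position while a string selects it by name. How the library actually maps station_num to columns is not stated here.

```python
import numpy as np
import pandas as pd


def filter_valid_rows(df: pd.DataFrame, column) -> pd.DataFrame:
    """Hypothetical stand-in: drop rows whose value in `column` is
    NaN, negative, or infinite. Not the library's own code."""
    # Assumption: an int is a column position, anything else is a column label.
    label = df.columns[column] if isinstance(column, int) else column
    values = df[label]
    # np.isfinite rejects NaN, +inf and -inf; the comparison rejects negatives.
    mask = np.isfinite(values) & (values >= 0)
    return df[mask]


frame = pd.DataFrame(
    {"obs1": [np.nan, 0.332201, -np.inf, 0.013643, -0.5, np.inf]},
    index=[1981, 1982, 1983, 1984, 1985, 1986],
)
cleaned = filter_valid_rows(frame, 0)
print(cleaned)
```

Only the rows for 1982 and 1984 survive in this sketch. Zero is kept, since the description lists only negative values as invalid.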