lenapy.lenapy_time
The lenapy_time module implements common functions for working with time series.
- class lenapy.lenapy_time.TimeSet(xarray_obj)
Bases: object
This class extends any dataset with useful methods commonly applied to time series in earth science data handling.
- climato(**kwargs)
Perform a climatological analysis on all the variables in a dataset. Input data are decomposed into:
* annual cycle
* semi-annual cycle
* trend
* mean
* residual signal
The returned data are a combination of these elements, depending on the arguments passed (signal, mean, trend, cycle). If return_coeffs=True, the coefficients of the decomposition are also returned.
- Parameters:
signal (Bool (default=True)) – returns residual signal
mean (Bool (default=True)) – returns mean signal
trend (Bool (default=True)) – returns trend (unit=day**-1)
cycle (Bool (default=False)) – returns annual and semi-annual cycles (cos and sin)
return_coeffs (Bool (default=False)) – returns the cycle coefficients, mean and trend
time_period (slice (default=slice(None,None), i.e. the whole time period of the data)) – Reference time period over which the climatology is computed
fillna (Bool (default=False)) – if fillna=True and signal=True, NaN values in the signal are replaced by the other selected components. This applies only to 1D signals; for higher dimensions, any NaN in the signal produces a NaN in the output
- Returns:
climato (dataset) – a dataset with the same structure as the input, with modified data according to the chosen options
if return_coeffs=True, an extra dataset is provided with the coefficients of the decomposition
Example
>>> data = lntime.open_geodata('/home/user/lenapy/data/gohc_2020.nc')
>>> output, coeffs = data.lntime.climato(mean=True, trend=True, signal=True, return_coeffs=True)
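The decomposition described for climato can be illustrated with an ordinary least-squares fit. This is a minimal sketch in plain NumPy, not lenapy's implementation: the function name decompose_climato, the 365.25-day year and the centring of the time axis are assumptions made for the illustration.

```python
import numpy as np

def decompose_climato(t_days, y):
    """Least-squares split of y(t) into a constant, a trend, and annual and
    semi-annual cycles. t_days is time in days. Hypothetical helper."""
    t = np.asarray(t_days, dtype=float)
    y = np.asarray(y, dtype=float)
    tc = t - t.mean()                      # centred time: the constant is the value at mid-series
    w = 2.0 * np.pi / 365.25               # annual angular frequency in rad/day (assumed year length)
    design = np.column_stack([
        np.ones_like(tc),                  # constant (mean) term
        tc,                                # trend, in units per day
        np.cos(w * tc), np.sin(w * tc),    # annual cycle
        np.cos(2 * w * tc), np.sin(2 * w * tc),  # semi-annual cycle
    ])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coeffs         # what the model does not explain
    return coeffs, residual

# Synthetic 10-year daily series: constant 2, trend 0.001/day, annual amplitude 0.5
t = np.arange(3653.0)
tc = t - t.mean()
y = 2.0 + 0.001 * tc + 0.5 * np.cos(2 * np.pi * tc / 365.25)
coeffs, residual = decompose_climato(t, y)
```

The trend coefficient comes out per day because the time axis is expressed in days, which matches the unit=day**-1 noted for the trend parameter.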
- generate_climato(coeffs, **kwargs)
Returns a signal based on a given climatology (mean, trend, cycles)
- Parameters:
coeffs (xr.DataArray) – returned by the climato method with return_coeffs=True
mean (Bool (default=True)) – returns mean signal
trend (Bool (default=True)) – returns trend
cycle (Bool (default=False)) – returns annual and semi-annual cycles
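Rebuilding a signal from a climatology amounts to summing the selected components. The sketch below shows the idea with a plain dictionary of coefficients; the dictionary keys, the function name rebuild_signal and the 365.25-day year are hypothetical and do not describe the layout of the coefficients returned by lenapy.

```python
import numpy as np

def rebuild_signal(t_days, coeffs, mean=True, trend=True, cycle=False):
    """Sum the selected climatology components at times t_days (in days).
    coeffs is a hypothetical dict, not lenapy's coefficient format."""
    t = np.asarray(t_days, dtype=float)
    w = 2.0 * np.pi / 365.25               # annual angular frequency (assumed year length)
    out = np.zeros_like(t)
    if mean:
        out += coeffs["mean"]
    if trend:
        out += coeffs["trend"] * t         # trend expressed per day
    if cycle:
        out += coeffs["cos_annual"] * np.cos(w * t) + coeffs["sin_annual"] * np.sin(w * t)
        out += coeffs["cos_semi"] * np.cos(2 * w * t) + coeffs["sin_semi"] * np.sin(2 * w * t)
    return out
```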
- filter(filter_name='lanczos', q=3, **kwargs)
Apply the specified filter to all the time-dependent data in the dataset. Boundaries are handled by mirroring the residual data after removing a q-order polynomial fit from the data. Available filters are in the .utils python file.
- Parameters:
filter_name (function or str) – if a string, the name of a filter function from the .filters file; if a function, a user-defined function returning a kernel
q (int) – order of the polyfit to handle boundary effects
**kwargs – Keyword arguments for the chosen filter
- Returns:
filtered – Filtered dataset
- Return type:
xr.Dataset
Example
>>> data = lntime.open_geodata('/home/user/lenapy/data/isas.nc')
>>> data.lntime.filter('lanczos', q=3, coupure=12, order=2)
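The boundary handling described for filter (remove a q-order polynomial fit, mirror the residual, filter, add the fit back) can be sketched as follows. This is an illustration under stated assumptions, not lenapy's code: the kernel formula, the reading of the cutoff as a period in samples, the function names and the use of a plain reflection for the mirror are all choices made here.

```python
import numpy as np

def lanczos_kernel(cutoff, order=2):
    """Low-pass Lanczos weights. cutoff is taken as a period in samples and
    order sets the half-width in multiples of it (assumed conventions)."""
    half = int(order * cutoff)
    k = np.arange(-half, half + 1)
    weights = np.sinc(2.0 * k / cutoff) * np.sinc(k / half)
    return weights / weights.sum()         # normalise so a constant is preserved

def filter_with_mirror(y, kernel, q=3):
    """Filter y with kernel, handling the edges by mirroring the residual
    left after subtracting a q-order polynomial fit."""
    y = np.asarray(y, dtype=float)
    x = np.arange(y.size, dtype=float)
    fit = np.polyval(np.polyfit(x, y, q), x)
    resid = y - fit
    pad = kernel.size // 2
    padded = np.pad(resid, pad, mode="reflect")   # mirror the residual at both ends
    smooth = np.convolve(padded, kernel, mode="valid")
    return fit + smooth                     # same length as the input
```

Removing the polynomial first keeps the mirrored ends from introducing a jump in level or slope, which is why a series that is exactly a polynomial of order q or lower passes through unchanged.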
- interp_time(other, **kwargs)
Interpolate the DataArray at the same dates as other
- Parameters:
other (xr.DataArray) – must have a time dimension
- Returns:
interpolated – new DataArray interpolated
- Return type:
xr.DataArray
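Interpolating one series at the dates of another can be illustrated with NumPy alone. This is a minimal sketch, assuming linear interpolation and datetime64 time axes; the function name interp_at_dates is hypothetical. Note that np.interp holds the end values outside the covered period, which may differ from what interp_time does.

```python
import numpy as np

def interp_at_dates(time, values, other_time):
    """Linearly interpolate values(time) at other_time.
    Both time axes are numpy datetime64 arrays. Hypothetical helper."""
    # Convert dates to seconds since the epoch so np.interp can work on floats
    t = np.asarray(time, dtype="datetime64[s]").astype("int64").astype(float)
    t_new = np.asarray(other_time, dtype="datetime64[s]").astype("int64").astype(float)
    return np.interp(t_new, t, np.asarray(values, dtype=float))

time = np.array(["2020-01-01", "2020-01-03"], dtype="datetime64[D]")
other = np.array(["2020-01-02"], dtype="datetime64[D]")
result = interp_at_dates(time, [0.0, 2.0], other)
```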
- class lenapy.lenapy_time.TimeArray(xarray_obj)
Bases: object
This class extends any DataArray with useful methods commonly applied to time series in earth science data handling.
- climato(**kwargs)
Perform a climatological analysis on a DataArray. Input data are decomposed into:
* annual cycle
* semi-annual cycle
* trend
* mean
* residual signal
The returned data are a combination of these elements, depending on the arguments passed (signal, mean, trend, cycle). If return_coeffs=True, the coefficients of the decomposition are also returned.
- Parameters:
signal (Bool (default=True)) – returns residual signal
mean (Bool (default=True)) – returns mean signal
trend (Bool (default=True)) – returns trend (unit=day**-1)
cycle (Bool (default=False)) – returns annual and semi-annual cycles (cos and sin)
return_coeffs (Bool (default=False)) – returns the cycle coefficients, mean and trend
t_min (datetime format or string (default=None, i.e. the start of the data)) – Start of the reference time period over which the climatology is computed
t_max (datetime format or string (default=None, i.e. the end of the data)) – End of the reference time period over which the climatology is computed
fillna (Bool (default=False)) – if fillna=True and signal=True, NaN values in the signal are replaced by the other selected components. This applies only to 1D signals; for higher dimensions, any NaN in the signal produces a NaN in the output
- Returns:
climato (dataset) – a dataset with the same structure as the input, with modified data according to the chosen options
if return_coeffs=True, an extra dataset is provided with the coefficients of the decomposition
Example
>>> data = lntime.open_geodata('/home/user/lenapy/data/gohc_2020.nc').ohc
>>> output, coeffs = data.lntime.climato(mean=True, trend=True, signal=True, return_coeffs=True)
- generate_climato(coeffs, **kwargs)
Returns a signal based on a given climatology (mean, trend, cycles)
- Parameters:
coeffs (DataArray) – returned by the climato method with return_coeffs=True
mean (Bool (default=True)) – returns mean signal
trend (Bool (default=True)) – returns trend
cycle (Bool (default=False)) – returns annual and semi-annual cycles
- filter(filter_name='lanczos', q=3, **kwargs)
Apply the specified filter to a time-dependent DataArray. Boundaries are handled by mirroring the residual data after removing a q-order polynomial fit from the data. Available filters are in the .utils python file.
- Parameters:
filter_name (function or string) – if a string, the name of a filter function from the .filters file; if a function, a user-defined function returning a kernel
q (int) – order of the polyfit to handle boundary effects
**kwargs – keyword arguments for the chosen filter
- Returns:
filtered – Filtered DataArray
- Return type:
xr.DataArray
Example
>>> data = lntime.open_geodata('/home/user/lenapy/data/isas.nc').temp
>>> data.lntime.filter('lanczos', q=3, coupure=12, order=2)
- interp_time(other, **kwargs)
Interpolate the DataArray at the same dates as other
- Parameters:
other (xr.DataArray) – must have a time dimension
- Returns:
interpolated – new DataArray interpolated
- Return type:
xr.DataArray
- plot(**kwargs)
Plots the timeseries of the data in the TimeArray, including an uncertainty. Computes the uncertainty on all dimensions that are not time.
- Parameters:
thick_line (String (default='median')) – How to aggregate the data to plot the main thick line. Can be:
* median: computes the median
* mean: computes the mean
* None: does not plot a main thick line
shaded_area (String (default='auto')) – How to aggregate the data to plot the uncertainty around the thick line. Can be:
* auto: plots 1.645 standard deviation if thick_line is mean and quantiles 5-95 if thick_line is median.
* auto-multiple: plots 1, 2 and 3 standard deviations if thick_line is mean and quantiles 5-95, 17-83 and 25-75 if thick_line is median.
* std: plots a multiple of the standard deviation based on kwarg standard_deviation_multiple
* quantiles: plots quantiles based on the kwargs quantile_min and quantile_max
* None: does not plot uncertainty
hue (String (default=None)) – Similar to hue in xarray.DataArray.plot(hue=…), group data by the dimension before aggregating and computing uncertainties. Has to be a dimension other than time in the dataarray.
standard_deviation_multiple (Float > 0 (default=1.65)) – The multiple of standard deviations to use for the uncertainty with shaded_area=std
quantile_min (Float between 0 and 1 (default=0.05)) – lower quantile to compute uncertainty with shaded_area=quantiles
quantile_max (Float between 0 and 1 (default=0.95)) – upper quantile to compute uncertainty with shaded_area=quantiles
color (String or List (default=None)) – color of the main thick line and the shaded area. Must be a string
thick_line_color (String or List (default=None)) – color of the main thick line. Must be a string. If hue and one color are provided, the single color is used for all line plots. If hue and a list of colors are provided, the colors are cycled.
shaded_area_color (String or List (default=None)) – color of the shaded area. Must be a string. If not provided, defaults to the thick_line_color value. If hue and one color are provided, the single color is used for all area plots. If hue and a list of colors are provided, the colors are cycled.
shaded_area_alpha (Float between 0 and 1 (default=0.2)) – Transparency of the uncertainty plots
ax (matplotlib.pyplot.Axes instance (default=None)) – If not provided, plots on the current axes.
label (String (default=None)) – If provided, label that is provided to ax.plot. Does not work if hue is provided.
line_kwargs (kwargs) – Additional arguments provided to the plot function for the main thick line
area_kwargs (kwargs) – Additional arguments provided to the plot function for the uncertainty
add_legend (Bool (default=True)) – if True, adds matplotlib legend to the current ax after plotting the data.
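The aggregation behind the thick line and the shaded area can be sketched without any plotting. The example below assumes a 2D array with time on the first axis and computes the two 'auto' combinations listed above; the function name uncertainty_band and the use of the population standard deviation are assumptions of this sketch, not a description of what plot does internally.

```python
import numpy as np

def uncertainty_band(values, thick_line="median", quantile_min=0.05,
                     quantile_max=0.95, standard_deviation_multiple=1.645):
    """values has shape (time, members). Returns (line, lower, upper),
    each with one value per time step. Hypothetical helper."""
    values = np.asarray(values, dtype=float)
    if thick_line == "median":
        line = np.median(values, axis=1)
        lower = np.quantile(values, quantile_min, axis=1)
        upper = np.quantile(values, quantile_max, axis=1)
    elif thick_line == "mean":
        line = values.mean(axis=1)
        spread = standard_deviation_multiple * values.std(axis=1)
        lower, upper = line - spread, line + spread
    else:
        raise ValueError("thick_line must be 'median' or 'mean'")
    return line, lower, upper
```

The three arrays can then be passed to matplotlib, for instance with ax.plot for the line and ax.fill_between for the band.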
- to_datetime(time_type)
Convert DataArray time format to standard pandas time format
- Parameters:
time_type (string) – Can be 'frac_year' or '360_day'
- Returns:
converted – new DataArray with the time dimension in a standard pandas format
- Return type:
xr.DataArray
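As an illustration of what a 'frac_year' conversion involves, the sketch below turns a fractional year into a datetime using only the standard library. The convention used here (the fractional part is the elapsed share of that calendar year) and the function name are assumptions; lenapy may use a different convention.

```python
from datetime import datetime

def frac_year_to_datetime(frac_year):
    """Convert a fractional year such as 2020.5 to a datetime.
    Assumed convention: the fraction is the elapsed share of that calendar year."""
    year = int(frac_year)
    start = datetime(year, 1, 1)
    end = datetime(year + 1, 1, 1)
    # (end - start) is the length of the year, so leap years are handled
    return start + (frac_year - year) * (end - start)
```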
- diff_3pts(dim, **kw)
Derivative formula along the selected dimension, returning on each point the linear regression on the three points defined by the selected point and its two neighbours
- diff_2pts(dim, **kw)
Derivative formula along the selected dimension, returning for each pair of points the slope, set at the middle coordinates of these two points
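The two derivative formulas can be written out for a 1D series as follows. This is a sketch that reuses the method names for readability but works on plain NumPy arrays; leaving the end points as NaN in the three-point version is a choice made here, not necessarily what lenapy does.

```python
import numpy as np

def diff_2pts(x, y):
    """Slope between each pair of consecutive points, located at their midpoint."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mid = 0.5 * (x[:-1] + x[1:])
    slope = np.diff(y) / np.diff(x)
    return mid, slope

def diff_3pts(x, y):
    """Slope of the least-squares line through each point and its two neighbours.
    End points have a single neighbour and are left as NaN in this sketch."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.full(y.shape, np.nan)
    for i in range(1, y.size - 1):
        xs, ys = x[i - 1:i + 2], y[i - 1:i + 2]
        dx = xs - xs.mean()
        out[i] = np.sum(dx * (ys - ys.mean())) / np.sum(dx * dx)
    return out
```

The two-point form returns one value fewer than the input and shifts the coordinate to the midpoints, while the three-point form keeps the original coordinates.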
- trend(time_unit='1s')
Perform a linear regression on the data, and returns the slope coefficient
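The role of time_unit is easiest to see on a small example: the slope is expressed per unit of time, so the same series gives different numbers per second and per day. The sketch below uses np.polyfit on a datetime64 axis; the function name linear_trend and the way the unit is passed (a numpy timedelta64 rather than a string such as '1s') are assumptions.

```python
import numpy as np

def linear_trend(time, values, time_unit=np.timedelta64(1, "s")):
    """Slope of a degree-1 least-squares fit, expressed per time_unit.
    time is a numpy datetime64 array. Hypothetical helper."""
    time = np.asarray(time)
    t = (time - time[0]) / time_unit        # elapsed time as floats, in units of time_unit
    slope, _intercept = np.polyfit(t, np.asarray(values, dtype=float), 1)
    return slope

time = np.arange("2020-01-01", "2020-01-11", dtype="datetime64[D]")
values = 3.0 * np.arange(10)                # rises by 3 every day
per_day = linear_trend(time, values, np.timedelta64(1, "D"))
per_second = linear_trend(time, values)
```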
- fill_time()
Fill gaps in a time series by adding new points that respect the time sampling. The missing values are not NaN but points that are absent from the time series. A linear interpolation is performed at the missing points.
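Filling absent points can be sketched as rebuilding a regular time axis and interpolating onto it. In the example below the sampling step is taken as the smallest gap found in the series, which is an assumption of this sketch; the function works on plain float arrays and reuses the method name only for readability.

```python
import numpy as np

def fill_time(t, y):
    """Insert the missing samples of a regularly sampled series and fill them
    by linear interpolation. The step is assumed to be the smallest gap in t."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    step = np.diff(t).min()
    n = int(round((t[-1] - t[0]) / step)) + 1
    t_full = t[0] + step * np.arange(n)     # regular axis covering the same period
    return t_full, np.interp(t_full, t, y)
```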
- OLS(degree, tref=None, sigma=None, datetime_unit='s')
Returns the OLS estimator for a polynomial regression of the given degree
- GLS(degree, tref=None, sigma=None, datetime_unit='s')
Returns the GLS estimator for a polynomial regression of the given degree, with covariance matrix sigma
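For reference, the textbook GLS estimator of the regression coefficients is (XᵀΣ⁻¹X)⁻¹XᵀΣ⁻¹y, where X is the polynomial design matrix and Σ the covariance matrix of the observations; with Σ equal to the identity it reduces to OLS. The sketch below implements that formula in NumPy. The function name and the ordering of the coefficients are choices made here, and the objects returned by lenapy's OLS and GLS may carry more than the coefficients.

```python
import numpy as np

def gls_coefficients(t, y, degree, sigma=None):
    """Polynomial regression coefficients, constant term first.
    sigma is the covariance matrix of y; sigma=None gives the OLS solution."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.vander(t, degree + 1, increasing=True)    # columns: 1, t, t**2, ...
    if sigma is None:
        sigma = np.eye(t.size)
    # Solve sigma @ w = [X | y] instead of inverting sigma explicitly
    w = np.linalg.solve(sigma, np.column_stack([X, y]))
    return np.linalg.solve(X.T @ w[:, :-1], X.T @ w[:, -1])
```

Observations with a large variance in sigma receive a small weight, so a noisy point pulls the fit less than it would under OLS.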
- corr(other, remove_trend=False, **kwargs)
Returns the Pearson correlation coefficient between the timeseries and another one. The other one is interpolated at the dates of the calling timeseries. If remove_trend=True, the two timeseries are detrended before correlation.
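The three steps described for corr (interpolate the other series at the calling series' dates, optionally detrend both, then correlate) can be sketched in NumPy. The function name pearson_corr, the linear interpolation and the degree-1 detrending are assumptions of this illustration.

```python
import numpy as np

def pearson_corr(t, y, t_other, y_other, remove_trend=False):
    """Pearson correlation between y(t) and y_other(t_other), after bringing
    the second series onto the time axis of the first. Hypothetical helper."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    other = np.interp(t, t_other, y_other)  # other series at the calling series' dates
    if remove_trend:
        # Subtract a straight-line fit from each series before correlating
        y = y - np.polyval(np.polyfit(t, y, 1), t)
        other = other - np.polyval(np.polyfit(t, other, 1), t)
    return np.corrcoef(y, other)[0, 1]
```

Detrending matters when both series drift: two signals with opposite trends but the same oscillation correlate strongly once the trends are removed.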
- fillna_climato()
Returns a DataArray with all NaN values replaced by the climatology and trend. The climatology is computed over the optional time_period slice.
- EOF(dim, k)
Return an instance of the eof class based on the data array and the dimension names of the eof
- SavitzkyGolay(dim='time', window=5, order=1, step=1, sigma=None)
Apply a Savitzky-Golay filter to a DataArray and return the filtered derivatives up to the maximal order
- Parameters:
dim (string) – name of the dimension along which to apply the filter
window (int) – length of the filtering window (must be odd)
order (int) – order of the polynomial fitted to the function across the window
step (float or time type) – distance between two consecutive abscissa points
sigma (same type as step (optional)) – standard deviation of the weights function to be applied on the window
- Returns:
filtered – new DataArray filtered with an extra dimension ‘order’, giving the successive filtered derivatives of the signal
- Return type:
xr.DataArray
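A Savitzky-Golay filter fits a polynomial to each sliding window by least squares and reads the value and derivatives of that polynomial at the window centre. The sketch below shows an unweighted version in NumPy, returning an array whose first axis plays the role of the extra 'order' dimension. The function names are hypothetical, the sigma weighting is not implemented, and leaving the edges as NaN is a choice made here.

```python
import numpy as np
from math import factorial

def savgol_coeffs(window=5, order=1, step=1.0):
    """Row k holds the window weights giving the k-th derivative at the centre."""
    half = window // 2
    x = step * np.arange(-half, half + 1)
    A = np.vander(x, order + 1, increasing=True)   # columns: 1, x, x**2, ...
    pinv = np.linalg.pinv(A)                       # maps window samples to polynomial coefficients
    scale = np.array([factorial(k) for k in range(order + 1)])
    return pinv * scale[:, None]                   # k-th derivative at 0 is k! * a_k

def savgol(y, window=5, order=1, step=1.0):
    """Filtered derivatives of y, shape (order + 1, len(y)).
    Points without a full window are left as NaN in this sketch."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    out = np.full((order + 1, y.size), np.nan)
    for k, weights in enumerate(savgol_coeffs(window, order, step)):
        # np.correlate applies weights[j] to y[i + j], i.e. the window centred on i + half
        out[k, half:y.size - half] = np.correlate(y, weights, mode="valid")
    return out
```

Row 0 of the result is the smoothed signal and row 1 its first derivative, scaled by step so that the derivative is expressed per unit of the abscissa.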