lenapy.lenapy_time

The lenapy_time module implements common functions for working with time series.

class lenapy.lenapy_time.TimeSet(xarray_obj)[source]

Bases: object

This class implements an extension of any dataset, adding useful methods commonly applied to time series in Earth science data handling.

Coeffs_climato(**kwargs)[source]
climato(**kwargs)[source]

Perform a climatological (climato) analysis on all the variables in a dataset. Input data are decomposed into:

  • annual cycle

  • semi-annual cycle

  • trend

  • mean

  • residual signal

The returned data are a combination of these elements, depending on the passed arguments (signal, mean, trend, cycle). If return_coeffs=True, the coefficients of the decomposition are also returned.

Parameters:
  • signal (Bool (default=True)) – returns residual signal

  • mean (Bool (default=True)) – returns mean signal

  • trend (Bool (default=True)) – returns trend (unit=day**-1)

  • cycle (Bool (default=False)) – return annual and semi-annual cycles (cos and sin)

  • return_coeffs (Bool (default=False)) – returns cycle coefficient, mean and trend

  • time_period (slice (default=slice(None,None), ie the whole time period of the data)) – Reference time period when climatology has to be computed

  • fillna (Bool (default=False)) – if fillna=True and signal=True, NaN values in the signal are replaced by the other selected components. This applies only to 1D signals; for higher dimensions, any NaN in the signal produces a NaN in the output.

Returns:

  • climato (dataset) – a dataset with the same structure as the input, with modified data according to the chosen options

  • if return_coeffs=True, an extra dataset is provided with the coefficients of the decomposition
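The decomposition described above can be pictured as a least-squares fit of a mean, a linear trend and two harmonics. The following is a minimal NumPy sketch under stated assumptions, not lenapy's implementation: the function name decompose_climato, the 365.25-day year length and the centring of the time axis are all hypothetical choices made for illustration.

```python
import numpy as np

def decompose_climato(t_days, y, year_length=365.25):
    """Hypothetical helper: fit mean, trend, annual and semi-annual cycles."""
    omega = 2.0 * np.pi / year_length      # annual angular frequency (rad/day), assumed
    t = t_days - t_days.mean()             # centre time so the intercept is close to the mean
    design = np.column_stack([
        np.ones_like(t),                           # mean
        t,                                         # trend (per day)
        np.cos(omega * t), np.sin(omega * t),      # annual cycle
        np.cos(2 * omega * t), np.sin(2 * omega * t),  # semi-annual cycle
    ])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    mean = np.full_like(y, coeffs[0])
    trend = coeffs[1] * t
    cycle = design[:, 2:] @ coeffs[2:]
    residual = y - (mean + trend + cycle)
    return coeffs, mean, trend, cycle, residual

# Synthetic series: constant + trend of 0.01 per day + annual cycle of amplitude 3
t_days = np.arange(0.0, 3653.0, 10.0)
y = 2.0 + 0.01 * t_days + 3.0 * np.cos(2.0 * np.pi * t_days / 365.25)
coeffs, mean, trend, cycle, residual = decompose_climato(t_days, y)
```

Summing any subset of mean, trend, cycle and residual mirrors the way the signal, mean, trend and cycle flags select what is returned.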

Example

data = lntime.open_geodata('/home/user/lenapy/data/gohc_2020.nc')
output, coeffs = data.lntime.climato(mean=True, trend=True, signal=True, return_coeffs=True)

generate_climato(coeffs, **kwargs)[source]

Returns a signal based on a given climatology (mean, trend, cycles)

Parameters:
  • coeffs (xr.DataArray) – coefficients returned by the climato method with return_coeffs=True

  • mean (Bool (default=True)) – returns mean signal

  • trend (Bool (default=True)) – returns trend

  • cycle (Bool (default=False)) – return annual and semi-annual cycles

filter(filter_name='lanczos', q=3, **kwargs)[source]

Apply a specified filter to all the time-dependent data in the dataset. Boundaries are handled by mirroring the residual data after removing a polynomial fit of order q from the data. Available filters are in the .utils python file.

Parameters:
  • filter_name (function or str) – if a string, the name of a filter function from the .filters file; if a function, a user-defined external function returning a kernel

  • q (int) – order of the polyfit to handle boundary effects

  • **kwargs – Keyword arguments for the chosen filter

Returns:

filtered – Filtered dataset

Return type:

xr.Dataset

Example

data = lntime.open_geodata('/home/user/lenapy/data/isas.nc')
data.lntime.filter('lanczos', q=3, coupure=12, order=2)
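The boundary handling described for filter (remove a q-order polynomial fit, mirror the residual, filter, then restore the fit) can be sketched with NumPy as follows. This is an illustrative assumption rather than lenapy's code: the helper name filter_with_mirror, the boxcar kernel and the use of np.pad with mode="reflect" as the mirror operation are all hypothetical.

```python
import numpy as np

def filter_with_mirror(y, kernel, q=3):
    """Hypothetical helper: convolve y with an odd-length kernel, mirroring the ends."""
    half = len(kernel) // 2
    x = np.arange(len(y), dtype=float)
    # Fit and remove a q-order polynomial so that the residual is roughly stationary
    fit = np.polynomial.Polynomial.fit(x, y, q)(x)
    residual = y - fit
    # Mirror the residual on both sides (assumed form of the mirror operation)
    padded = np.pad(residual, half, mode="reflect")
    smoothed = np.convolve(padded, kernel, mode="valid")
    # Add the polynomial back to recover a filtered signal of the original length
    return fit + smoothed

x = np.arange(50, dtype=float)
y_cubic = 0.01 * x**3 - x + 2.0        # exactly representable by a cubic fit
boxcar = np.ones(5) / 5.0              # stand-in kernel, not the Lanczos kernel
filtered = filter_with_mirror(y_cubic, boxcar, q=3)
```

Because the cubic is captured entirely by the polynomial fit, the residual is close to zero and the filtered output reproduces the input, including at the boundaries.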

interp_time(other, **kwargs)[source]

Interpolate the DataArray at the same dates as other

Parameters:

other (xr.DataArray) – must have a time dimension

Returns:

interpolated – new DataArray interpolated

Return type:

xr.DataArray

to_datetime(time_type)[source]

Convert dataset time format to standard pandas time format

Parameters:

time_type (string) – Can be ‘frac_year’ or ‘360_day’

Returns:

converted – new dataset with the time dimension in a standard pandas format

Return type:

dataset
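As an illustration of what a 'frac_year' conversion involves, the sketch below maps a fractional year onto a calendar date using only the standard library. The helper name frac_year_to_datetime and the convention that the fraction is measured against the actual length of the year (365 or 366 days) are assumptions; the convention used by lenapy may differ.

```python
from datetime import datetime

def frac_year_to_datetime(frac_year):
    """Hypothetical helper: convert e.g. 2020.5 into a datetime."""
    year = int(frac_year)
    start = datetime(year, 1, 1)
    end = datetime(year + 1, 1, 1)
    # Fraction of the year elapsed, scaled by the true length of that year
    return start + (frac_year - year) * (end - start)

mid_2020 = frac_year_to_datetime(2020.5)   # 2020 is a leap year: 183 days after 1 January
```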

fill_time()[source]

Fill gaps in a time series by adding new points that respect the time sampling. Missing values here are not NaN but points that are genuinely absent from the time series. A linear interpolation is performed at the missing points.

class lenapy.lenapy_time.TimeArray(xarray_obj)[source]

Bases: object

This class implements an extension of any DataArray, adding useful methods commonly applied to time series in Earth science data handling.

Coeffs_climato(**kwargs)[source]
climato(**kwargs)[source]

Perform a climatological (climato) analysis on a DataArray. Input data are decomposed into:

  • annual cycle

  • semi-annual cycle

  • trend

  • mean

  • residual signal

The returned data are a combination of these elements, depending on the passed arguments (signal, mean, trend, cycle). If return_coeffs=True, the coefficients of the decomposition are also returned.

Parameters:
  • signal (Bool (default=True)) – returns residual signal

  • mean (Bool (default=True)) – returns mean signal

  • trend (Bool (default=True)) – returns trend (unit=day**-1)

  • cycle (Bool (default=False)) – return annual and semi-annual cycles (cos and sin)

  • return_coeffs (Bool (default=False)) – returns cycle coefficient, mean and trend

  • t_min (datetime format or string (default=None, i.e. from the start of the data)) – Start of the reference time period over which the climatology is computed

  • t_max (datetime format or string (default=None, i.e. up to the end of the data)) – End of the reference time period over which the climatology is computed

  • fillna (Bool (default=False)) – if fillna=True and signal=True, NaN values in the signal are replaced by the other selected components. This applies only to 1D signals; for higher dimensions, any NaN in the signal produces a NaN in the output.

Returns:

  • climato (dataset) – a dataset with the same structure as the input, with modified data according to the chosen options

  • if return_coeffs=True, an extra dataset is provided with the coefficients of the decomposition

Example

data = lntime.open_geodata('/home/user/lenapy/data/gohc_2020.nc').ohc
output, coeffs = data.lntime.climato(mean=True, trend=True, signal=True, return_coeffs=True)

generate_climato(coeffs, **kwargs)[source]

Returns a signal based on a given climatology (mean, trend, cycles)

Parameters:
  • coeffs (DataArray) – coefficients returned by the climato method with return_coeffs=True

  • mean (Bool (default=True)) – returns mean signal

  • trend (Bool (default=True)) – returns trend

  • cycle (Bool (default=False)) – return annual and semi-annual cycles

filter(filter_name='lanczos', q=3, **kwargs)[source]

Apply a specified filter to a time-dependent DataArray. Boundaries are handled by mirroring the residual data after removing a polynomial fit of order q from the data. Available filters are in the .utils python file.

Parameters:
  • filter_name (function or string) – if a string, the name of a filter function from the .filters file; if a function, a user-defined external function returning a kernel

  • q (int) – order of the polyfit to handle boundary effects

  • **kwargs – keyword arguments for the chosen filter

Returns:

filtered

Return type:

filtered dataset

Example

data = lntime.open_geodata('/home/user/lenapy/data/isas.nc').temp
data.lntime.filter('lanczos', q=3, coupure=12, order=2)

interp_time(other, **kwargs)[source]

Interpolate the DataArray at the same dates as other

Parameters:

other (xr.DataArray) – must have a time dimension

Returns:

interpolated – new DataArray interpolated

Return type:

xr.DataArray

plot(**kwargs)[source]

Plots the timeseries of the data in the TimeArray, including an uncertainty. Computes the uncertainty on all dimensions that are not time.

Parameters:
  • thick_line (String (default='median')) – How to aggregate the data to plot the main thick line. Can be:

      • median: computes the median

      • mean: computes the mean

      • None: does not plot a main thick line

  • shaded_area (String (default='auto')) – How to aggregate the data to plot the uncertainty around the thick line. Can be:

      • auto: plots 1.645 standard deviation if thick_line is mean and quantiles 5-95 if thick_line is median.

      • auto-multiple: plots 1, 2 and 3 standard deviations if thick_line is mean and quantiles 5-95, 17-83 and 25-75 if thick_line is median.

      • std: plots a multiple of the standard deviation based on kwarg standard_deviation_multiple

      • quantiles: plots quantiles based on the kwargs quantile_min and quantile_max

      • None: does not plot uncertainty

  • hue (String (default=None)) – Similar to hue in xarray.DataArray.plot(hue=…), group data by the dimension before aggregating and computing uncertainties. Has to be a dimension other than time in the dataarray.

  • standard_deviation_multiple (Float > 0 (default=1.65)) – The multiple of standard deviations to use for the uncertainty with shaded_area=std

  • quantile_min (Float between 0 and 1 (default=0.05)) – lower quantile to compute uncertainty with shaded_area=quantiles

  • quantile_max (Float between 0 and 1 (default=0.95)) – upper quantile to compute uncertainty with shaded_area=quantiles

  • color (String or List (default=None)) – color of the main thick line and the shaded area. Must be a string

  • thick_line_color (String or List (default=None)) – color of the main thick line. Must be a string. If hue and one color are provided, the single color is used for all line plots. If hue and a list of colors are provided, the colors are cycled.

  • shaded_area_color (String or List (default=None)) – color of the shaded area. Must be a string. If not provided, defaults to the thick_line_color value. If hue and one color are provided, the single color is used for all area plots. If hue and a list of colors are provided, the colors are cycled.

  • shaded_area_alpha (Float between 0 and 1 (default=0.2)) – Transparency of the uncertainty plots

  • ax (matplotlib.pyplot.Axes instance (default=None)) – If not provided, plots on the current axes.

  • label (String (default=None)) – If provided, label that is provided to ax.plot. Does not work if hue is provided.

  • line_kwargs (kwargs) – Additional arguments provided to the plot function for the main thick line

  • area_kwargs (kwargs) – Additional arguments provided to the plot function for the uncertainty

  • add_legend (Bool (default=True)) – if True, adds matplotlib legend to the current ax after plotting the data.
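To make the aggregation options concrete, the sketch below computes the two default combinations (median with 5-95 quantiles, mean with 1.645 standard deviations) over a non-time dimension using NumPy. It is only an illustration of the statistics involved: the array layout (time, member), the random data and the variable names are assumptions, and no plotting is performed.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 20))      # assumed layout: (time, member)

# thick_line='median' with shaded_area='quantiles' (quantile_min=0.05, quantile_max=0.95)
thick_median = np.median(data, axis=1)
q_low, q_high = np.quantile(data, [0.05, 0.95], axis=1)

# thick_line='mean' with shaded_area='std' and a multiple of 1.645
thick_mean = data.mean(axis=1)
spread = 1.645 * data.std(axis=1)
std_low, std_high = thick_mean - spread, thick_mean + spread
```

The resulting bounds are the kind of arrays that could then be passed to a matplotlib call such as ax.fill_between to draw the shaded area around the thick line.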

to_datetime(time_type)[source]

Convert DataArray time format to standard pandas time format

Parameters:

time_type (string) – Can be ‘frac_year’ or ‘360_day’

Returns:

converted – new DataArray with the time dimension in a standard pandas format

Return type:

xr.DataArray

diff_3pts(dim, **kw)[source]

Derivative formula along the selected dimension, returning on each point the linear regression on the three points defined by the selected point and its two neighbours

diff_2pts(dim, **kw)[source]

Derivative formula along the selected dimension, returning for each pair of points the slope, set at the middle coordinates of these two points
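The two derivative formulas can be illustrated on plain NumPy arrays. This is a sketch under assumptions, not the lenapy code: the standalone functions below take explicit coordinate and value arrays, and leaving the two end points of the three-point derivative as NaN is a hypothetical choice.

```python
import numpy as np

def diff_2pts(x, y):
    """Slope between consecutive points, located at the midpoint of each pair."""
    midpoints = 0.5 * (x[1:] + x[:-1])
    slopes = np.diff(y) / np.diff(x)
    return midpoints, slopes

def diff_3pts(x, y):
    """Slope of the least-squares line through each point and its two neighbours."""
    slopes = np.full(len(y), np.nan)   # end points left undefined (assumption)
    for i in range(1, len(y) - 1):
        dx = x[i - 1:i + 2] - x[i - 1:i + 2].mean()
        dy = y[i - 1:i + 2] - y[i - 1:i + 2].mean()
        slopes[i] = np.sum(dx * dy) / np.sum(dx**2)
    return slopes

x = np.array([0.0, 1.0, 3.0, 4.0, 7.0])    # irregular sampling
y = 2.0 * x + 1.0                          # straight line of slope 2
mid, slope2 = diff_2pts(x, y)
slope3 = diff_3pts(x, y)
```

For a straight line both formulas return the same slope everywhere they are defined; they differ on noisy data, where the three-point regression smooths more.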

trend(time_unit='1s')[source]

Perform a linear regression on the data and return the slope coefficient

detrend()[source]

Remove the trend from a DataArray
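A minimal NumPy sketch of the trend and detrend operations is given below. It assumes a numeric time axis already expressed in the desired unit (the role played by time_unit) and assumes that detrending keeps the mean of the series; both points are illustrative assumptions, and the function names are hypothetical standalone equivalents.

```python
import numpy as np

def linear_trend(t, y):
    """Slope of the least-squares straight line through (t, y)."""
    slope, _intercept = np.polyfit(t, y, 1)
    return slope

def remove_trend(t, y):
    """Subtract the fitted slope while keeping the mean of the series (assumption)."""
    return y - linear_trend(t, y) * (t - t.mean())

t = np.arange(0.0, 20.0)
y = 3.0 + 0.5 * t
slope = linear_trend(t, y)
detrended = remove_trend(t, y)
```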

fill_time()[source]

Fill gaps in a time series by adding new points that respect the time sampling. Missing values here are not NaN but points that are genuinely absent from the time series. A linear interpolation is performed at the missing points.
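The idea behind fill_time can be sketched with NumPy on a numeric time axis. The helper below is hypothetical and assumes that the time sampling is the smallest spacing found between consecutive points; how lenapy infers the sampling is not stated here.

```python
import numpy as np

def fill_missing_times(t, y):
    """Hypothetical helper: rebuild a regular time axis and interpolate linearly."""
    step = np.min(np.diff(t))                       # assumed sampling interval
    n_points = int(round((t[-1] - t[0]) / step)) + 1
    full_t = t[0] + step * np.arange(n_points)      # regular axis including absent dates
    return full_t, np.interp(full_t, t, y)          # linear interpolation at new points

t = np.array([0.0, 1.0, 2.0, 4.0, 5.0])             # the point at t=3 is absent
y = np.array([0.0, 10.0, 20.0, 40.0, 50.0])
full_t, full_y = fill_missing_times(t, y)
```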

covariance_analysis()[source]

Returns an instance of the covariance class based on the dataArray

OLS(degree, tref=None, sigma=None, datetime_unit='s')[source]

Returns the OLS estimator performed with a degree “degree” regression

GLS(degree, tref=None, sigma=None, datetime_unit='s')[source]

Returns the GLS estimator performed with a degree “degree” regression and a covariance matrix “sigma”
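For reference, the textbook estimators behind these two methods are the ordinary and generalized least-squares solutions of a polynomial regression. The NumPy sketch below is an assumption-laden illustration, not the lenapy estimator classes: the function name, the numeric time axis and the handling of tref as a simple time offset are hypothetical.

```python
import numpy as np

def polynomial_gls(t, y, degree, tref=0.0, sigma=None):
    """Polynomial regression coefficients (lowest degree first).

    With sigma=None this reduces to OLS; otherwise sigma is the covariance
    matrix of the observations and the GLS estimator is returned.
    """
    design = np.vander(t - tref, degree + 1, increasing=True)
    if sigma is None:
        sigma = np.eye(len(t))
    weighted = np.linalg.solve(sigma, design)        # Sigma^-1 X
    # beta = (X^T Sigma^-1 X)^-1 X^T Sigma^-1 y, for a symmetric sigma
    return np.linalg.solve(design.T @ weighted, weighted.T @ y)

t = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * t + 3.0 * t**2
beta_ols = polynomial_gls(t, y, degree=2)
beta_gls = polynomial_gls(t, y, degree=2, sigma=np.diag(np.linspace(0.5, 2.0, 20)))
```

On noise-free data both estimators recover the same coefficients; the covariance matrix only changes the result when the observations carry errors.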

corr(other, remove_trend=False, **kwargs)[source]

Returns the Pearson correlation coefficient between the timeseries and another one. The other one is interpolated at the dates of the calling timeseries. If remove_trend=True, the two timeseries are detrended before correlation.
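The steps described for corr (interpolate the other series onto the calling dates, optionally detrend both, then correlate) can be sketched with NumPy as follows. The standalone function and its argument names are hypothetical, and a numeric time axis is assumed.

```python
import numpy as np

def pearson_corr(t, y, t_other, y_other, remove_trend=False):
    """Pearson correlation after interpolating the other series at the dates t."""
    y_interp = np.interp(t, t_other, y_other)
    if remove_trend:
        # Subtract a least-squares straight line from each series
        y = y - np.polyval(np.polyfit(t, y, 1), t)
        y_interp = y_interp - np.polyval(np.polyfit(t, y_interp, 1), t)
    return np.corrcoef(y, y_interp)[0, 1]

t = np.arange(10.0)
y = t**2
t_other = np.arange(0.0, 9.5, 0.5)       # finer sampling that covers the same period
y_other = -3.0 * t_other**2
r_raw = pearson_corr(t, y, t_other, y_other)
r_detrended = pearson_corr(t, y, t_other, y_other, remove_trend=True)
```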

fillna_climato()[source]

Returns a DataArray with all NaN values replaced by climatology and trend. Climatology is computed over the optional time_period slice.

EOF(dim, k)[source]

Return an instance of the eof class based on the data array and the dimension names of the eof

SavitzkyGolay(dim='time', window=5, order=1, step=1, sigma=None)[source]

Perform a Savitzky-Golay filter on a dataArray and return filtered derivatives up to maximal order

Parameters:
  • dim (string) – name of the dimension along which to apply the filter

  • window (int) – length of the filtering window (must be odd)

  • order (int) – order of the polynomial fitted to the function across the window

  • step (float or time type) – distance between two consecutive points of the abscissa

  • sigma (same type as step (optional)) – standard deviation of the weights function to be applied on the window

Returns:

filtered – new DataArray filtered with an extra dimension ‘order’, giving the successive filtered derivatives of the signal

Return type:

xr.DataArray
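The principle of a Savitzky-Golay filter returning successive derivatives can be sketched with NumPy: a polynomial is fitted by least squares in each window, and the k-th derivative at the window centre is k! times the k-th fitted coefficient. The sketch below is an unweighted illustration only (the sigma weighting is omitted), and leaving the edges as NaN is an assumption rather than lenapy's behaviour.

```python
import numpy as np
from math import factorial

def savitzky_golay(y, window=5, order=1, step=1.0):
    """Return an array of shape (order + 1, len(y)): smoothed signal and derivatives."""
    half = window // 2
    offsets = np.arange(-half, half + 1) * step
    design = np.vander(offsets, order + 1, increasing=True)
    projector = np.linalg.pinv(design)          # maps window values to polynomial coefficients
    out = np.full((order + 1, len(y)), np.nan)  # edges left undefined (assumption)
    for i in range(half, len(y) - half):
        coeffs = projector @ y[i - half:i + half + 1]
        for k in range(order + 1):
            out[k, i] = factorial(k) * coeffs[k]    # k-th derivative at the window centre
    return out

x = np.arange(20) * 0.5
y = 1.0 + 2.0 * x + 3.0 * x**2
derivs = savitzky_golay(y, window=5, order=2, step=0.5)
```

The leading axis of the result plays the role of the extra 'order' dimension mentioned above: index 0 is the filtered signal, index 1 its first derivative, and so on.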