A subclass of MaskedArray designed to manipulate time series.
A TimeSeries object is the combination of three ndarrays: the data, the mask, and the dates.
These three arrays can be accessed as attributes of a TimeSeries object. Another useful attribute is series, which gives direct access to data and mask as a single masked array.
Because TimeSeries subclasses MaskedArray, TimeSeries objects inherit all the attributes and methods of MaskedArray, as well as those of regular ndarrays.
- data
- Returns a view of a TimeSeries as an ndarray. This attribute is read-only and cannot be directly set.
- mask
- Returns the mask of the object, as an ndarray with the same shape as data, or as the special value nomask (equivalent to False). This attribute is writable and can be modified.
If data has a standard dtype (no named fields), the dtype of the mask is boolean. If data is a structured array with named fields, the mask has the same structure as the data, but each field is atomically boolean.
In any case, a value of True in the mask indicates that the corresponding value of the series is invalid.
- series
- Returns a view of a TimeSeries as a MaskedArray. This attribute is read-only and cannot be directly set.
- dates
- Returns the DateArray object of the dates of the series. This attribute is writable and can be modified. However, the size of the array must be zero or match either the size of the series or its length.
- varshape
- Returns the number of equivalent variables for each date. If varshape == (), the series has only one variable and is called a 1V-series.
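The mask rules described above are inherited from MaskedArray, so they can be tried out with numpy.ma alone. The following is a minimal sketch that uses plain masked arrays as stand-ins for a TimeSeries; the variable names are illustrative only.

```python
import numpy as np
import numpy.ma as ma

# Standard dtype: the mask is a plain boolean array of the same shape.
plain = ma.array([1.0, 2.0, 3.0], mask=[False, True, False])
plain_mask_dtype = plain.mask.dtype

# Structured dtype: the mask mirrors the named fields, each one boolean.
rec_dtype = [("norm", float), ("unif", float)]
rec = ma.array([(0.1, 0.5), (0.2, 0.6)], dtype=rec_dtype,
               mask=[(False, True), (False, False)])
rec_mask_fields = rec.mask.dtype.names

# When nothing is masked, the mask may be the special value nomask.
unmasked = ma.array([1, 2, 3])
has_no_mask = unmasked.mask is ma.nomask
```

In the structured case, a True entry marks only the corresponding field of that record as invalid, not the whole record.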
freq | freqstr |
year | years |
qyear | |
quarter | quarters |
month | months |
week | weeks |
day | days |
day_of_week | weekdays |
day_of_year | yeardays |
hour | hours |
minute | minutes |
second | seconds |
start_date | end_date |
To construct a TimeSeries object, the simplest method is to directly call the class constructor with the proper parameters.
However, the recommended way is to use the time_series factory function.
Creates a TimeSeries object.
The data parameter can be a valid TimeSeries object. In that case, the dates, start_date or freq parameters are optional: if none of them is given, the dates of the result are the dates of data.
If data is not a TimeSeries, then dates must be either None or an object recognized by the date_array function (used internally):
- an existing DateArray object;
- a sequence of Date objects with the same frequency;
- a sequence of datetime.datetime objects;
- a sequence of dates in string format;
- a sequence of integers corresponding to the representation of Date objects.
In any of the last four possibilities, the freq parameter is mandatory.
If dates is None, a continuous DateArray is automatically constructed as an array of size len(data) starting at start_date and with a frequency freq.
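To illustrate what a continuous set of dates means when dates is None, here is a minimal sketch that uses only the standard library instead of DateArray. The helper continuous_daily_dates is hypothetical and handles only a daily frequency.

```python
from datetime import date, timedelta


def continuous_daily_dates(start, n):
    """Return n consecutive daily dates beginning at start (hypothetical helper)."""
    return [start + timedelta(days=i) for i in range(n)]


data = [1, 2, 3, 4]
# One date per element of data, starting at the requested start date.
dates = continuous_daily_dates(date(2009, 1, 1), len(data))
```

The resulting list has one entry per data element, which mirrors the size-matching behavior described above.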
Parameters:
- data : array_like
- dates : {None, var}, optional
- start_date : {Date}, optional
- length : {integer}, optional
- freq : {freq_spec}, optional
Note
By default, the series is automatically sorted in chronological order. This behavior can be overridden by setting the keyword autosort=False.
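The effect of chronological sorting can be pictured with plain NumPy arrays. This is only a sketch of the idea, assuming dates are represented as integers; it is not the code used by time_series.

```python
import numpy as np

# Integer stand-ins for dates, deliberately out of order.
date_values = np.array([3, 1, 4, 2])
data = np.array([30.0, 10.0, 40.0, 20.0])

# Reorder dates and data together so each value stays with its date.
order = np.argsort(date_values, kind="stable")
sorted_dates = date_values[order]
sorted_data = data[order]
```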
The simplest example of a TimeSeries is a series of one variable, where a date is associated with each element of the array. In that case, the dates attribute is a DateArray with the same size as the underlying array.
For example, we can create a 4-element series:
>>> import scikits.timeseries as ts
>>> first_date = ts.Date('D', '2009-01-01')
>>> series = ts.time_series([1, 2, 3, 4], start_date=first_date)
>>> series
timeseries([1 2 3 4],
dates = [01-Jan-2009 ... 04-Jan-2009],
freq = D)
Note that with the use of the start_date keyword, the size of the dates attribute is automatically adjusted by time_series to match the size of the input data.
The dates can now be modified in place. For example, they can be shifted by one week with the following command:
>>> series.dates += 7
>>> series
timeseries([1 2 3 4],
dates = [08-Jan-2009 ... 11-Jan-2009],
freq = D)
The dates can also be changed by setting the dates attribute to another DateArray object. In that case, the size of the new dates must match the size of the series, or a TimeSeriesCompatibilityError is raised. Setting the dates attribute to an object of a different type raises a TypeError exception.
It is often convenient to manipulate a series of several variables at once. One possibility is to use a structured array as input, as illustrated by the following example:
>>> import numpy as np
>>> series = ts.time_series(list(zip(np.random.normal(0, 1, 10),
...                                  np.random.uniform(0, 1, 10))),
...                         dtype=[('norm', float), ('unif', float)],
...                         start_date=ts.Date('D', '2001-01-01'))
In this example, series consists of two fields (‘norm’ and ‘unif’). Here the two fields have the same type (float), but this is not a requirement. Each field can be accessed as an independent TimeSeries using series['norm'] and series['unif'].
In practice, each individual entry of series is a numpy.void object. The series as a whole behaves as a 1D masked array, as shown by its shape: series.shape == (10,). Because series is a 1D array, the size of series.dates must match series.size.
Despite the convenience of this approach to manipulate multi-variable series, it presents a serious disadvantage: structured arrays are usually not recognized by standard numpy functions.
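The limitation can be seen with a plain NumPy structured array, used here as a stand-in for the underlying data of such a series. This is a minimal sketch; the field names simply reuse those of the example above.

```python
import numpy as np

rec = np.zeros(10, dtype=[("norm", float), ("unif", float)])
rec["norm"] = np.arange(10)

# A reduction over the whole structured array is rejected by NumPy...
try:
    rec.mean()
    reduction_failed = False
except TypeError:
    reduction_failed = True

# ...whereas the same reduction works on a single field.
norm_mean = rec["norm"].mean()
```

Working field by field is therefore always possible, but functions that expect a homogeneous numeric array cannot be applied to the series as a whole.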
An alternative is then to represent a series as a two-dimensional array, using columns as variables and rows as actual observations. In that case, all the variables must have the same type, and the size of the dates attribute must match the length of the series.
More generally, it is possible to create a multi-variable series as an nD array. The corresponding dates must then satisfy the condition series.dates.size == series.shape[0] or a TimeSeriesCompatibilityError is raised. The specific attribute varshape is then set to keep track of the number of variables.
For example, a series of 50 years of monthly data can be represented as a (600,)-array of observations at a monthly frequency, or as a (50,12)-array of observations at an annual frequency.
>>> start = ts.Date('M', '2001-01')
>>> data = np.random.uniform(-1, +1, 50*12).reshape(50, 12)
>>> mseries = ts.time_series(data, start_date=start, length=50*12)
>>> aseries = ts.time_series(data, start_date=start.asfreq('Y'), length=50)
Both series have the same shape, (50, 12), but mseries is a series of one variable, with mseries.varshape == (), while aseries is a series of 12 variables, aseries.varshape == (12,), each variable corresponding to a month.
>>> (mseries.shape, mseries.varshape)
((50, 12), ())
>>> (aseries.shape, aseries.varshape)
((50, 12), (12,))
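The relation between the shape of the data, the number of dates, and varshape shown above can be summarized in a few lines of plain Python. The function infer_varshape below is a hypothetical helper written for illustration; it is not part of the library, and it raises ValueError where the library raises TimeSeriesCompatibilityError.

```python
def infer_varshape(data_shape, n_dates):
    """Deduce the per-date variable shape (hypothetical helper).

    data_shape : non-empty tuple of ints, the shape of the data
    n_dates    : int, the number of dates
    """
    size = 1
    for n in data_shape:
        size *= n
    if n_dates == data_shape[0]:
        # One date per row: the remaining axes hold the variables.
        return tuple(data_shape[1:])
    if n_dates == size:
        # One date per element: a single variable.
        return ()
    raise ValueError("incompatible sizes: %r data, %d dates"
                     % (data_shape, n_dates))


mseries_varshape = infer_varshape((50, 12), 600)
aseries_varshape = infer_varshape((50, 12), 50)
```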
Because aseries is basically a 2D array, we can easily compute annual and monthly means. Monthly means over the whole 50 years can be calculated at once with the mean method, using axis=0 as parameter. We can also compute the equivalent of 50 years of annual data using the mean method, this time with axis=1.
>>> amean = aseries.mean(axis=1)
>>> amean.shape
(50,)
>>> mmean = aseries.mean(axis=0)
>>> mmean.shape
(12,)
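The same axis logic can be checked on a plain NumPy array, without any dates attached. This sketch assumes the data are laid out as one row per year and one column per month, as in aseries.

```python
import numpy as np

rng = np.random.default_rng(0)
monthly = rng.uniform(-1, 1, 50 * 12)   # 600 monthly observations
by_year = monthly.reshape(50, 12)       # rows are years, columns are months

annual_means = by_year.mean(axis=1)     # one value per year
monthly_means = by_year.mean(axis=0)    # one value per calendar month
```

Averaging along axis=1 collapses the twelve months of each year, while averaging along axis=0 collapses the fifty years for each month.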
Another example of a multi-variable series would be one year of daily (256x256) raster maps. This dataset can easily be represented as a (365,256,256)-array, and a corresponding series created with the following code:
>>> data = np.random.uniform(-1, +1, 365*256*256).reshape(365, 256, 256)
>>> newseries = ts.time_series(data, start_date=ts.now('D'))
The following methods access information about the dates attribute:
|
Returns the time steps between consecutive dates, in the same unit as the instance frequency. |
|
Returns whether the instance has missing dates. |
|
Returns whether the instance has duplicated dates. |
|
Returns whether the instance has no missing dates. |
|
Returns whether the instance is valid (that there are no missing nor duplicated dates). |
|
Returns whether the instance is sorted in chronological order. |
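The checks described above can be approximated on integer date values with NumPy. The function describe_dates below is a hypothetical sketch, not the library implementation; it assumes that consecutive dates differ by exactly one unit of the frequency and that steps are computed on the sorted dates.

```python
import numpy as np


def describe_dates(date_values):
    """Summarize a sequence of integer dates (hypothetical helper)."""
    values = np.asarray(date_values)
    # Chronological order is judged on the dates as given.
    chronological = bool(np.all(np.diff(values) >= 0))
    # Steps between consecutive dates, after sorting (an assumption).
    steps = np.diff(np.sort(values))
    missing = bool(np.any(steps > 1))
    duplicated = bool(np.any(steps == 0))
    return {
        "steps": steps.tolist(),
        "has_missing": missing,
        "has_duplicated": duplicated,
        "is_full": not missing,
        "is_valid": not missing and not duplicated,
        "is_chronological": chronological,
    }


gap_report = describe_dates([1, 2, 4])
```

Here gap_report flags a missing date because the step between 2 and 4 is larger than one unit.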
TimeSeries.date_to_index | |
TimeSeries.sort_chronologically |
TimeSeries.adjust_endpoints | |
TimeSeries.compressed | |
TimeSeries.fill_missing_dates |
For reshape, resize, and transpose, the single tuple argument may be replaced with n integers which will be interpreted as an n-tuple.
TimeSeries.flatten | |
TimeSeries.ravel | |
TimeSeries.reshape | |
TimeSeries.resize | |
TimeSeries.split | |
TimeSeries.squeeze | |
TimeSeries.swapaxes | |
TimeSeries.transpose | |
TimeSeries.T |
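The convention of passing n integers instead of a single n-tuple is inherited from ndarray, so it can be verified on a plain NumPy array. This sketch covers reshape and transpose only and uses an ndarray as a stand-in for a TimeSeries.

```python
import numpy as np

a = np.arange(6)

# A single tuple and separate integers give the same result.
same_reshape = np.array_equal(a.reshape((2, 3)), a.reshape(2, 3))

b = a.reshape(2, 3)
same_transpose = np.array_equal(b.transpose((1, 0)), b.transpose(1, 0))
```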
TimeSeries.copy | |
TimeSeries.dump | |
TimeSeries.dumps |