The data series structure

A data series is an object of the DataSeries class in the data_series module or an inheriting class. It is basically a wrapper around a numpy array, which is its data attribute, together with a name and a unit. Most data series also contain some additional metadata and/or references to other data series. The most important function of these is to keep track of everything in time, as described below.

Data series and time

(Copied from text in design workshop 2, in December 2020):

Time is special! In some deeper way, time is just another dimension… but for hyphenated laboratory measurements, as well as multi-technique experimental projects in general (so long as samples and equipment are moving slowly compared to the speed of light), time is special because it is the one measurable quantity that is always shared between all detectors.

Absolute time (epoch timestamp) exists in two places:

  • Measurement.tstamp: This timestamp is a bit decorative. It tells the measurement’s plotter and data selection methods what to use as t=0.

  • TSeries.tstamp: This timestamp is truth. It defines the t=0 for the primary time data of any measurement.
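As a worked illustration of how the two timestamps relate, the sketch below converts time data stored relative to a TSeries.tstamp into time relative to a Measurement.tstamp. The numbers and variable names are made up for illustration and are not part of ixdat's API.

```python
import numpy as np

# Hypothetical values, for illustration only.
tseries_tstamp = 1_600_000_100.0           # unix time defining t=0 of the stored time data
tseries_data = np.array([0.0, 1.0, 2.0])   # seconds since tseries_tstamp
measurement_tstamp = 1_600_000_000.0       # unix time the measurement treats as t=0

# The absolute (unix) time of each point is fixed by the TSeries timestamp ...
t_absolute = tseries_tstamp + tseries_data
# ... while the measurement's timestamp only decides what is shown as t=0.
t_relative_to_measurement = t_absolute - measurement_tstamp
```

Changing measurement_tstamp here changes only the displayed time axis; t_absolute stays the same.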

Data carriers:

  • DataSeries: All data carried by ixdat will be stored as a numpy array in a Series. The TimeSeries is a special case of the DataSeries. All Series share a primary key (id in table series in the db diagram on the left) and, in addition to the data, have a name (think “column header”) and a unit. Series is a table in the ixdat database structure, with helper tables for special cases.

  • TimeSeries: The only additional column for TimeSeries (table tseries) is tstamp, as described above.

  • Field: Some series consist of values spanning a space defined by other series. Such a series is called a Field, and is defined by a list of references to the series which define its space. In the database, this is represented in a field_axis table, in which n rows (with axis_number from 0 to n-1) represent each n-D Field.

  • ValueSeries: Finally, a very common series type is a scalar value over time. This is called a ValueSeries, and must have a corresponding TimeSeries. A ValueSeries is actually a special case of a Field, spanning a 1-d space, and so doesn’t need a new table in the db.
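The relationship between a ValueSeries and its TimeSeries can be sketched with two small stand-in classes. These are hypothetical illustrations (the names SketchTimeSeries and SketchValueSeries are invented here), not ixdat's actual classes, and they leave out everything related to the database.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True, eq=False)
class SketchTimeSeries:
    """Hypothetical stand-in: time data plus the unix time of its t=0."""
    name: str
    unit_name: str
    data: np.ndarray
    tstamp: float


@dataclass(frozen=True, eq=False)
class SketchValueSeries:
    """Hypothetical stand-in: scalar values that refer to a time series."""
    name: str
    unit_name: str
    data: np.ndarray
    tseries: SketchTimeSeries

    @property
    def t(self):
        return self.tseries.data  # measurement times

    @property
    def v(self):
        return self.data  # measured values

    @property
    def tstamp(self):
        return self.tseries.tstamp  # the timestamp comes from the time series


# Two value series recorded by the same detector can share one time series.
time = SketchTimeSeries("time", "s", np.array([0.0, 1.0, 2.0]), tstamp=1_600_000_000.0)
potential = SketchValueSeries("potential", "V", np.array([0.1, 0.2, 0.3]), tseries=time)
current = SketchValueSeries("current", "mA", np.array([1.0, 0.9, 0.8]), tseries=time)
```

Because both value series hold a reference to the same time series rather than a copy of it, the time data and its timestamp exist in only one place.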

Immutability! All of the data carriers above will be immutable! This means that, even though truth is preserved by adding dt to a tseries.tstamp and subtracting dt from tseries.data, we will never do this to an existing series! This is a cheap calculation that ixdat can do on demand. The same goes for appending corresponding series from consecutive measurements. Performing these operations on every series in a measurement set is referred to as building a combined measurement, and is only done when explicitly asked for (e.g. to export or save the combined measurement). Building makes new Series rather than mutating existing ones. A possible exception to immutability may be appending data to use ixdat on an ongoing measurement.
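The time-shift arithmetic can be sketched as follows. The function shifted_time is a hypothetical helper written for this illustration, not ixdat's implementation; it returns new values instead of modifying its inputs, in the spirit of the immutability rule.

```python
import numpy as np


def shifted_time(data, tstamp, new_tstamp):
    """Return (new_data, new_tstamp) describing the same absolute times."""
    dt = new_tstamp - tstamp
    # Subtraction creates a new array; the input array is left untouched.
    return data - dt, new_tstamp


old_data = np.array([0.0, 5.0, 10.0])
old_data.setflags(write=False)  # guard against accidental in-place edits
old_tstamp = 1_600_000_000.0

# Move t=0 five seconds later.
new_data, new_tstamp = shifted_time(old_data, old_tstamp, old_tstamp + 5.0)
```

The sum of timestamp and data, which is the absolute time of each point, is identical before and after the shift.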

The data_series module

This module defines the DataSeries class, the elementary data structure of ixdat.

An ixdat DataSeries is a wrapper around a numpy array containing the metadata needed to combine it with other DataSeries. Typically this means a reference to the time variable corresponding to the rows of the array. The time variable itself is a special case, TimeSeries, which must know its absolute (unix) timestamp.

class ixdat.data_series.ConstantValue(*args, **kwargs)[source]

This is a stand-in for a ValueSeries for when we know the value is constant.

property data

When loading data, Field checks that its dimensions match its # of axes

class ixdat.data_series.DataSeries(name, unit_name, data)[source]

The base class for all numerical data representation in ixdat.

Objects of this class are saved and loaded as rows in the data_series table.

property data

The data as a np.array, loaded the first time it is needed.

classmethod from_dict(obj_as_dict)[source]

Return the right type of DataSeries based on the info in its serialization

property unit_name

The name of the data series’ unit

class ixdat.data_series.Field(name, unit_name, data, a_ids=None, axes_series=None)[source]

Class for storing multi-dimensional data spanning ‘axes’

Characterized by a list of references to these axes, which are themselves also DataSeries. This is represented in the extra linkers.

property a_ids

List of the id’s of the axes spanned by the field

property axes_series

List of the DataSeries defining the axes spanned by the field

property data

When loading data, Field checks that its dimensions match its # of axes

get_axis_id(axis_number)[source]

Return the id of the axis_number’th axis of the data

get_axis_series(axis_number)[source]

Return the DataSeries of the axis_number’th axis of the data

property tstamp

The unix time corresponding to t=0 for the time-resolved axis of the Field

The timestamp of a Field is the timestamp of its TimeSeries or ValueSeries
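The dimension check mentioned for Field.data could look roughly like the sketch below. The function check_field_dimensions is hypothetical and works on plain numpy arrays; the additional per-axis length check is an assumption of this sketch and is not stated in the documentation above.

```python
import numpy as np


def check_field_dimensions(data, axes_data):
    """Raise ValueError unless data spans exactly the given axes."""
    if data.ndim != len(axes_data):
        raise ValueError(
            f"data has {data.ndim} dimensions but {len(axes_data)} axes were given"
        )
    # Assumption of this sketch: each axis must also match the data's length
    # along the corresponding dimension.
    for n, axis in enumerate(axes_data):
        if data.shape[n] != len(axis):
            raise ValueError(
                f"axis {n} has length {len(axis)} but data has {data.shape[n]} points"
            )
    return True


# Example: spectra recorded over time span a 2-D space (time x wavelength).
t = np.array([0.0, 1.0, 2.0])
wavelength = np.array([400.0, 500.0, 600.0, 700.0])
spectra = np.zeros((3, 4))

check_field_dimensions(spectra, [t, wavelength])
```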

class ixdat.data_series.TimeSeries(name, unit_name, data, tstamp)[source]

Class to store time data. These are characterized by having a tstamp

property tseries

Trivially, a TimeSeries is its own TimeSeries

class ixdat.data_series.ValueSeries(name, unit_name, data, t_id=None, tseries=None, a_ids=None, axes_series=None)[source]

Class to store scalar values that are measured over time.

Characterized by a reference to the corresponding time series. This reference is represented in relational databases as a row in an auxiliary linker table

property t

The measurement times as a 1-d np array

property t_id

The id of the TimeSeries

Type

int

property tstamp

The timestamp, from the TimeSeries of the ValueSeries

property v

The value as a 1-d np array

ixdat.data_series.append_series(series_list, sorted=True, name=None, tstamp=None)[source]

Return a series made by appending the series in series_list, with time relative to series_list[0].tseries.tstamp

Parameters
  • series_list (list of Series) – The series to append (must all be of same type)

  • sorted (bool) – Whether to sort the data so that time only goes forward

  • name (str) – Name to give the appended series. Defaults to series_list[0].name

  • tstamp (unix tstamp) – The t=0 of the returned series or its TimeSeries.

ixdat.data_series.append_tseries(series_list, sorted=True, return_sort_indeces=False, name=None, tstamp=None)[source]

Return new TimeSeries with the data appended.

Parameters
  • series_list (list of TimeSeries) – The time series to append

  • sorted (bool) – Whether to sort the data so that time only goes forward

  • return_sort_indeces (bool) – Whether to return the indices that sort the data

  • name (str) – Name to give the appended series. Defaults to series_list[0].name

  • tstamp (unix tstamp) – The t=0 of the returned TimeSeries.
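The idea behind appending time data can be sketched with plain numpy. The helper append_time_data below is hypothetical and is not ixdat's implementation: it takes (tstamp, data) pairs instead of TimeSeries objects, and its sort argument plays the role of the sorted parameter described above.

```python
import numpy as np


def append_time_data(series, tstamp=None, sort=True):
    """Append time data given as a list of (tstamp, data) pairs.

    Returns (tstamp, data, indices), where indices are the positions in the
    concatenated data that were used to order the result.
    """
    if tstamp is None:
        tstamp = series[0][0]  # default to the first series' timestamp
    # Express every series relative to the common timestamp.
    shifted = [data + (ts - tstamp) for ts, data in series]
    t = np.concatenate(shifted)
    indices = np.argsort(t, kind="stable") if sort else np.arange(len(t))
    return tstamp, t[indices], indices


first = (1000.0, np.array([0.0, 1.0, 2.0]))
second = (1010.0, np.array([0.0, 1.0, 2.0]))

# The series are deliberately passed out of order to show the sorting.
tstamp, t, indices = append_time_data([second, first], tstamp=1000.0)
```

Returning the sort indices lets a caller apply the same ordering to the value data that corresponds to each time series.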

ixdat.data_series.append_vseries_by_time(series_list, sorted=True, name=None, tstamp=None)[source]

Return new ValueSeries with the data in series_list appended

Parameters
  • series_list (list of ValueSeries) – The value series to append

  • sorted (bool) – Whether to sort the data so that time only goes forward

  • name (str) – Name to give the appended series. Defaults to series_list[0].name

  • tstamp (unix tstamp) – The t=0 of the returned ValueSeries’ TimeSeries.

ixdat.data_series.get_tspans_from_mask(t, mask)[source]

Return a list of tspans for time intervals remaining when mask is applied to t

FIXME: This is pure numpy manipulation and probably belongs somewhere else.
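One way such a function could work is sketched below. This is an assumption-laden illustration, not ixdat's code: the name tspans_from_mask_sketch is invented, and it treats each tspan as the first and last time of a contiguous run of True values in the mask. The actual implementation may define the interval boundaries differently.

```python
import numpy as np


def tspans_from_mask_sketch(t, mask):
    """Return [[t_start, t_end], ...] for each contiguous run of True in mask."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False so that every run of True has a detectable start and end.
    padded = np.concatenate(([False], mask, [False]))
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)      # index of the first True in each run
    ends = np.flatnonzero(changes == -1) - 1   # index of the last True in each run
    return [[float(t[i]), float(t[j])] for i, j in zip(starts, ends)]


t = np.arange(8, dtype=float)
mask = np.array([True, True, False, False, True, True, True, False])
tspans = tspans_from_mask_sketch(t, mask)
```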

ixdat.data_series.time_shifted(series, tstamp=None)[source]

Return a series with the time shifted to be relative to tstamp