Readers: getting data into ixdat

Source: https://github.com/ixdat/ixdat/tree/user_ready/src/ixdat/readers

A full list of the available readers and their names can be viewed by typing:

>>> from ixdat.readers import READER_CLASSES
>>> READER_CLASSES

Reading .csv files exported by ixdat: The IxdatCSVReader

ixdat can export measurement data in a .csv format with the necessary information in the header. See Exporters: getting data out of ixdat. It can naturally read the data that it exports itself. Exporting and reading, however, may result in loss of raw data (unlike save()).

The ixdat_csv module

Module defining the ixdat csv reader, so ixdat can read the files it exports.

class ixdat.readers.ixdat_csv.IxdatCSVReader[source]

A class that reads the csv’s made by ixdat.exporters.csv_exporter.CSVExporter

read() is the important method. It takes the path to the .csv file as argument and returns a Measurement object (measurement) representing that file. The measurement contains a reference to the IxdatCSVReader object, as measurement.reader. This makes the following attributes available, which are likely useful for debugging.

path_to_file

the location and name of the file read by the reader

Type:Path
n_line

the number of the last line read by the reader

Type:int
place_in_file

The last location in the file read by the reader. This is used internally to tell the reader how to parse each line. Options are: “header”, “column names”, and “data”.

Type:str
header_lines

a list of the header lines of the files. This includes the column name line. The header can be nicely viewed with the print_header() function.

Type:list of str
tstamp

The unix time corresponding to t=0

Type:float
technique

The name of the technique

Type:str
N_header_lines

The number of lines in the header of the file

Type:int
column_names

The names of the data columns in the file

Type:list of str
column_data

The data in the file as a dict. Note that the np arrays are the same ones as in the measurement’s DataSeries, so this does not waste memory.

Type:dict of str: np.array

file_has_been_read

This is used to make sure read() is only successfully called once by the Reader. False until read() is called, then True.

Type:bool
measurement

The measurement returned by read() when the file is read. self.measurement is None before read() is called.

Type:Measurement
print_header()[source]

Print the file header including column names. read() must be called first.

process_column_line(line)[source]

Split the line to get the names of the file’s data columns

process_data_line(line)[source]

Split the line and append the numbers to the corresponding data column arrays

process_header_line(line)[source]

Search line for important metadata and set the relevant attribute of self

process_line(line)[source]

Call the correct line processing method depending on self.place_in_file

read(path_to_file, name=None, cls=None, **kwargs)[source]

Return a Measurement with the data and metadata recorded in path_to_file

This loops through the lines of the file, processing one at a time. For header lines, this involves searching for metadata. For the column name line, this involves creating empty arrays for each data series. For the data lines, this involves appending to these arrays. After going through all the lines, it converts the arrays to DataSeries. The technique is specified in the header, and used to pick the TechniqueMeasurement class. Finally, the method returns a TechniqueMeasurement object measurement with these DataSeries. All attributes of this reader can be accessed from the measurement as measurement.reader.attribute_name.

Parameters:
  • path_to_file (Path) – The full abs or rel path including the “.csv” extension
  • name (str) – The name of the measurement to return (defaults to path_to_file)
  • cls (Measurement subclass) – The class of measurement to return. By default, cls will be determined from the technique specified in the header of path_to_file.
  • **kwargs (dict) – Key-word arguments are passed to ECMeasurement.__init__

Returns cls: a Measurement of type cls
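
The line-by-line approach described for read() can be illustrated with a minimal sketch. This is not ixdat's implementation: the class name TinyLineReader, the comma-separated layout, and the "key = value" header format with an N_header_lines entry are assumptions made for illustration. Only the attribute and method names (place_in_file, process_line(), and so on) mirror the ones documented above.

```python
class TinyLineReader:
    """Hypothetical stand-in showing the place_in_file state machine."""

    def __init__(self):
        self.n_line = 0  # index of the line currently being processed
        self.place_in_file = "header"
        self.header_lines = []
        self.N_header_lines = None
        self.technique = None
        self.column_names = []
        self.column_data = {}

    def process_line(self, line):
        """Dispatch to the right method depending on self.place_in_file."""
        if self.place_in_file == "header":
            self.process_header_line(line)
        elif self.place_in_file == "column names":
            self.process_column_line(line)
        else:  # "data"
            self.process_data_line(line)
        self.n_line += 1

    def process_header_line(self, line):
        """Look for metadata in an assumed 'key = value' header line."""
        self.header_lines.append(line)
        key, _, value = line.partition("=")
        if key.strip() == "N_header_lines":
            self.N_header_lines = int(value)
        elif key.strip() == "technique":
            self.technique = value.strip()
        # Assumed layout: the column-name line is the last header line,
        # so switch state when the next line has index N_header_lines - 1.
        if self.N_header_lines is not None:
            if self.n_line + 2 >= self.N_header_lines:
                self.place_in_file = "column names"

    def process_column_line(self, line):
        """Create one empty list per data column."""
        self.header_lines.append(line)
        self.column_names = [name.strip() for name in line.split(",")]
        self.column_data = {name: [] for name in self.column_names}
        self.place_in_file = "data"

    def process_data_line(self, line):
        """Append each number to its column."""
        for name, text in zip(self.column_names, line.split(",")):
            self.column_data[name].append(float(text))

    def read(self, lines):
        for line in lines:
            self.process_line(line.strip())
        return self.column_data


example = [
    "N_header_lines = 3",
    "technique = EC",
    "time/s,raw_potential/V",
    "0.0,0.5",
    "1.0,0.6",
]
reader = TinyLineReader()
data = reader.read(example)
```

A real reader would go on to convert the collected columns to DataSeries and pick the measurement class from the technique; the sketch stops at the column dictionary.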

read_aux_file(path_to_aux_file, name)[source]

Read an auxiliary file and include its series list in the measurement

class ixdat.readers.ixdat_csv.IxdatSpectrumReader[source]

A reader for ixdat spectra.

read(path_to_file, name=None, cls=None, **kwargs)[source]

Read an ixdat spectrum.

This reads the header with the process_line() function inherited from IxdatCSVReader. Then it uses pandas to read the data.

Parameters:
  • path_to_file (Path) – The full abs or rel path including the “.csv” extension
  • name (str) – The name of the measurement to return (defaults to path_to_file)
  • cls (Spectrum subclass) – The class of measurement to return. By default, cls will be determined from the technique specified in the header of path_to_file.
  • **kwargs (dict) – Key-word arguments are passed to ECMeasurement.__init__

Returns cls: a Spectrum of type cls

ixdat.readers.ixdat_csv.get_column_unit(column_name)[source]

Return the unit name of an ixdat column, i.e. the part of the name after the ‘/’
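
As a rough illustration of the "name/unit" column convention, the sketch below splits a column name at the first ‘/’. The function name unit_after_slash, the example column names, and the choice to return None when there is no ‘/’ are assumptions for illustration; the behaviour of the real get_column_unit() in such edge cases is not documented here.

```python
def unit_after_slash(column_name):
    """Return the text after the first '/' in column_name, or None."""
    name, slash, unit = column_name.partition("/")
    return unit if slash else None


# Illustrative column names, not taken from a real file.
units = {
    name: unit_after_slash(name)
    for name in ["time/s", "raw_current/mA", "cycle"]
}
```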

Importing from other experimental data platforms

cinfdata is a web-based database system for experimental data, developed and used at DTU SurfCat (formerly CINF) in concert with The PyExpLabSys suite of experimental data acquisition tools. Both are available at https://github.com/CINF.

As of yet, ixdat only has a text-file reader for data exported from cinfdata, but in the future it will also have a reader which downloads from the website given e.g. a setup and date.

The cinfdata module

Module defining the cinfdata text reader, so ixdat can read text files exported from cinfdata.

class ixdat.readers.cinfdata.CinfdataTXTReader[source]

A class that reads the text exported by cinfdata’s text export functionality

TODO: We should also have a reader class that downloads the data from cinfdata like
EC_MS’s download_cinfdata_set: https://github.com/ScottSoren/EC_MS/blob/master/src/EC_MS/Data_Importing.py#L711
path_to_file

the location and name of the file read by the reader

Type:Path
n_line

the number of the last line read by the reader

Type:int
place_in_file

The last location in the file read by the reader. This is used internally to tell the reader how to parse each line. Options are: “header”, “column names”, and “data”.

Type:str
header_lines

a list of the header lines of the files. This includes the column name line. The header can be nicely viewed with the print_header() function.

Type:list of str
tstamp

The unix time corresponding to t=0 for the measurement

Type:float
tstamp_list

list of epoch tstamps in the file’s timestamp line

Type:list of float
column_tstamps

The unix time corresponding to t=0 for each time column

Type:dict
technique

The name of the technique

Type:str
column_names

The names of the data columns in the file

Type:list of str
t_and_v_cols

{name: (tcol, vcol)} where name is the name of the ValueSeries (e.g. “M2”), tcol is the name of the corresponding time column in the file (e.g. “M2-x”), and vcol is the name of the value column in the file (e.g. “M2-y”).

Type:dict
column_data

The data in the file as a dict. Note that the np arrays are the same ones as in the measurement’s DataSeries, so this does not waste memory.

Type:dict of str: np.array

file_has_been_read

This is used to make sure read() is only successfully called once by the Reader. False until read() is called, then True.

Type:bool
measurement

The measurement returned by read() when the file is read. self.measurement is None before read() is called.

Type:Measurement
print_header()[source]

Print the file header including column names. read() must be called first.

process_column_line(line)[source]

Split the line to get the names of the file’s data columns

process_data_line(line)[source]

Split the line and append the numbers to the corresponding data column arrays

process_header_line(line)[source]

Search line for important metadata and set the relevant attribute of self

process_line(line)[source]

Call the correct line processing method depending on self.place_in_file

read(path_to_file, name=None, cls=None, **kwargs)[source]

Return an MSMeasurement with the data and metadata recorded in path_to_file

This loops through the lines of the file, processing one at a time. For header lines, this involves searching for metadata. For the column name line, this involves creating empty arrays for each data series. For the data lines, this involves appending to these arrays. After going through all the lines, it converts the arrays to DataSeries. For cinfdata text files, each value column has its own time column, and they are not necessarily all the same length. Finally, the method returns an MSMeasurement with these DataSeries. The MSMeasurement contains a reference to the reader. All attributes of this reader can be accessed from the measurement as measurement.reader.attribute_name.

Parameters:
  • path_to_file (Path) – The full abs or rel path including the file extension
  • **kwargs (dict) – Key-word arguments are passed to ECMeasurement.__init__
ixdat.readers.cinfdata.get_column_unit(column_name)[source]

Return the unit name of a cinfdata column, i.e. the part of the name after the ‘/’
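
The t_and_v_cols attribute of CinfdataTXTReader maps each series name to its time and value columns. A minimal sketch of how such a mapping could be built from the “-x” and “-y” column suffixes follows. The helper name pair_time_and_value_columns and the example column list are hypothetical; this is not the reader's actual code.

```python
def pair_time_and_value_columns(column_names):
    """Return {name: (tcol, vcol)} for columns named '<name>-x' and '<name>-y'."""
    t_and_v_cols = {}
    for vcol in column_names:
        if not vcol.endswith("-y"):
            continue
        name = vcol[: -len("-y")]
        tcol = name + "-x"
        # Only pair a value column whose time column is also present.
        if tcol in column_names:
            t_and_v_cols[name] = (tcol, vcol)
    return t_and_v_cols


columns = ["M2-x", "M2-y", "M32-x", "M32-y"]
pairs = pair_time_and_value_columns(columns)
```

Keeping one time column per value column matters here because, as noted above, the columns in a cinfdata text file are not necessarily all the same length.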

Electrochemistry and sub-techniques

These are readers which by default return an ECMeasurement. (See Electrochemistry)

The biologic module

This module implements the Reader for .mpt files made by BioLogic’s EC-Lab software

Demonstrated/tested at the bottom under if __name__ == “__main__”:

class ixdat.readers.biologic.BiologicMPTReader[source]

A class to read .mpt files written by Biologic’s EC-Lab.

read() is the important method. It takes the path to the .mpt file as argument and returns an ECMeasurement object (ec_measurement) representing that file. The ECMeasurement contains a reference to the BiologicMPTReader object, as ec_measurement.reader. This makes the following attributes available, which are likely useful for debugging.

path_to_file

the location and name of the file read by the reader

Type:Path
n_line

the number of the last line read by the reader

Type:int
place_in_file

The last location in the file read by the reader. This is used internally to tell the reader how to parse each line. Options are: “header”, “column names”, and “data”.

Type:str
header_lines

a list of the header lines of the files. This includes the column name line. The header can be nicely viewed with the print_header() function.

Type:list of str
timestamp_string

The string identified to represent the t=0 time of the measurement recorded in the file.

Type:str
tstamp

The unix time corresponding to t=0, parsed from timestamp_string

Type:float
ec_technique

The name of the electrochemical sub-technique, e.g. “Cyclic Voltammetry Advanced”.

Type:str
N_header_lines

The number of lines in the header of the file

Type:int
column_names

The names of the data columns in the file

Type:list of str
column_data

The data in the file as a dict. Note that the np arrays are the same ones as in the measurement’s DataSeries, so this does not waste memory.

Type:dict of str: np.array

file_has_been_read

This is used to make sure read() is only successfully called once by the Reader. False until read() is called, then True.

Type:bool
measurement

The measurement returned by read() when the file is read. self.measurement is None before read() is called.

Type:Measurement
print_header()[source]

Print the file header including column names. read() must be called first.

process_column_line(line)[source]

Split the line to get the names of the file’s data columns

process_data_line(line)[source]

Split the line and append the numbers to the corresponding data column arrays

process_header_line(line)[source]

Search line for important metadata and set the relevant attribute of self

process_line(line)[source]

Call the correct line processing method depending on self.place_in_file

read(path_to_file, name=None, cls=None, **kwargs)[source]

Return an ECMeasurement with the data and metadata recorded in path_to_file

This loops through the lines of the file, processing one at a time. For header lines, this involves searching for metadata. For the column name line, this involves creating empty arrays for each data series. For the data lines, this involves appending to these arrays. After going through all the lines, it converts the arrays to DataSeries. For .mpt files, there is one TimeSeries, with name “time/s”, and all other data series are ValueSeries sharing this TimeSeries. Finally, the method returns an ECMeasurement with these DataSeries. The ECMeasurement contains a reference to the reader.

Parameters:
  • path_to_file (Path) – The full abs or rel path including the “.mpt” extension
  • name (str) – The name to use if not the file name
  • cls (Measurement subclass) – The Measurement class to return an object of. Defaults to ECMeasurement and should probably be a subclass thereof in any case.
  • **kwargs (dict) – Key-word arguments are passed to cls.__init__
ixdat.readers.biologic.get_column_unit(column_name)[source]

Return the unit name of a .mpt column, i.e. the part of the name after the ‘/’
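
The description of BiologicMPTReader.read() says that a .mpt file yields one TimeSeries named “time/s” which all other series share. The sketch below shows that sharing with stand-in dataclasses. SketchTimeSeries, SketchValueSeries, and series_from_columns are hypothetical names, not ixdat's TimeSeries and ValueSeries classes, and the column names other than “time/s” are only illustrative.

```python
from dataclasses import dataclass


@dataclass
class SketchTimeSeries:
    name: str
    data: list


@dataclass
class SketchValueSeries:
    name: str
    data: list
    tseries: SketchTimeSeries  # reference to the shared time axis


def series_from_columns(column_data, time_name="time/s"):
    """Build one time series and value series that all point to it."""
    tseries = SketchTimeSeries(time_name, column_data[time_name])
    value_series = [
        SketchValueSeries(name, data, tseries)
        for name, data in column_data.items()
        if name != time_name
    ]
    return tseries, value_series


column_data = {"time/s": [0.0, 1.0], "Ewe/V": [0.5, 0.6], "I/mA": [1.2, 1.3]}
tseries, value_series = series_from_columns(column_data)
```

Each stand-in series holds the same list object that sits in column_data, which mirrors the note above that the data is not copied.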

The autolab module

This module implements the reader for ascii exports from autolab’s Nova software

class ixdat.readers.autolab.NovaASCIIReader[source]

A reader for ascii files exported by Autolab’s Nova software

read(path_to_file, cls=None, name=None, tstamp=None, timestring=None, timestring_form='%d/%m/%Y %H:%M:%S', **kwargs)[source]

read the ascii export from Autolab’s Nova software

Parameters:
  • path_to_file (Path) – The full abs or rel path including the suffix (.txt)
  • name (str) – The name to use if not the file name
  • cls (Measurement subclass) – The Measurement class to return an object of. Defaults to ECMeasurement and should probably be a subclass thereof in any case.
  • tstamp (float) – timestamp of the measurement, if known
  • timestring (str) – timestring describing the timestamp of the measurement
  • timestring_form (str) – form of the timestring. Default is “%d/%m/%Y %H:%M:%S”
  • **kwargs (dict) – Key-word arguments are passed to cls.__init__
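
To show how timestring and timestring_form relate to tstamp, here is a small sketch using only the standard library. The helper name tstamp_from_timestring is hypothetical, and interpreting the string in the local time zone is an assumption; how NovaASCIIReader treats time zones is not stated here.

```python
from datetime import datetime


def tstamp_from_timestring(timestring, timestring_form="%d/%m/%Y %H:%M:%S"):
    """Parse timestring with timestring_form and return a unix timestamp."""
    parsed = datetime.strptime(timestring, timestring_form)
    # A naive datetime is interpreted as local time by timestamp().
    return parsed.timestamp()


tstamp = tstamp_from_timestring("15/06/2021 12:00:00")
```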
ixdat.readers.autolab.get_column_unit(column_name)[source]

Return the unit name of an autolab column, i.e. the last part of the name in ()
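
For the parenthesis convention, a regular expression anchored at the end of the name is one way to pick out the last parenthesized part. This is a sketch under assumptions: unit_in_parentheses is a hypothetical name, the column names are illustrative examples in the style of Nova exports, and returning None for names without a trailing “(...)” is a choice made here.

```python
import re


def unit_in_parentheses(column_name):
    """Return the content of the final '(...)' in column_name, or None."""
    match = re.search(r"\(([^()]*)\)\s*$", column_name)
    return match.group(1) if match else None


examples = ["WE(1).Potential (V)", "Time (s)", "Index"]
units = [unit_in_parentheses(name) for name in examples]
```

Anchoring at the end of the string keeps an earlier parenthesis, such as the “(1)” in the first example, from being mistaken for the unit.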

The ivium module

This module implements the reader for the text export of Ivium’s software

class ixdat.readers.ivium.IviumDataReader[source]

Class for reading single ivium files

read(path_to_file, cls=None, name=None, **kwargs)[source]

read the ascii export from the Ivium software

Parameters:
  • path_to_file (Path) – The full abs or rel path including the suffix (.txt)
  • name (str) – The name to use if not the file name
  • cls (Measurement subclass) – The Measurement class to return an object of. Defaults to ECMeasurement.
  • **kwargs (dict) – Key-word arguments are passed to cls.__init__
Returns:

technique measurement object with the ivium data

Return type:

cls

class ixdat.readers.ivium.IviumDatasetReader[source]

Class for reading sets of ivium files exported together

read(path_to_file, cls=None, name=None, **kwargs)[source]

Return a Measurement containing the data of an ivium dataset,

An ivium dataset is a group of ivium files exported together. They share a folder and a base name, and are suffixed “_1”, “_2”, etc.

Parameters:
  • path_to_file (Path or str) – Path(path_to_file).parent is interpreted as the folder where the files of the ivium dataset are located. Path(path_to_file).name up to the first “_” is interpreted as the shared start of the files in the dataset. You can thus use the base name of the exported files or the full path of any one of them.
  • cls (Measurement class) – The measurement class. Defaults to ECMeasurement.
  • name (str) – The name of the dataset. Defaults to the base name of the dataset
  • kwargs – key-word arguments are included in the dictionary for cls.from_dict()

Returns cls or ECMeasurement: a measurement object with the ivium data
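
The way path_to_file is interpreted can be sketched with pathlib. The helper find_dataset_files and the file names are hypothetical; the sketch only demonstrates the "parent folder plus shared start up to the first ‘_’" rule described above and does not read any ivium data.

```python
import tempfile
from pathlib import Path


def find_dataset_files(path_to_file):
    """Return the files in the folder that share the dataset's base name."""
    path = Path(path_to_file)
    folder = path.parent
    base_name = path.name.split("_")[0]  # name up to the first "_"
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.name.startswith(base_name)
    )


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for file_name in ["scan_1", "scan_2", "other_1"]:
        (folder / file_name).write_text("")
    # Either one member file or the bare base name identifies the dataset.
    found_from_member = [p.name for p in find_dataset_files(folder / "scan_2")]
    found_from_base = [p.name for p in find_dataset_files(folder / "scan")]
```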

ixdat.readers.ivium.get_column_unit(column_name)[source]

Return the unit name of an ivium column, i.e. what follows the first ‘/’.

Mass Spectrometry and sub-techniques

These are readers which by default return an MSMeasurement. (See Mass Spectrometry)

The pfeiffer module

This module implements the reader for Pfeiffer Vacuum’s PV Mass Spec software

class ixdat.readers.pfeiffer.PVMassSpecReader[source]

A reader for (advanced) MID files exported from PVMassSpec (’… - Bin.dat’)

read(path_to_file, cls=None, name=None, **kwargs)[source]

Return a Measurement with the (advanced) MID data in the PVMassSpec file

Parameters:
  • path_to_file (Path or str) – a path to the file exported by PVMassSpec with (advanced) MID data. This file is typically exported with a name that ends in ‘- Bin.dat’, and with the timestamp in the file name. Note that the file can be renamed, as the original name is in the file, and the timestamp is read from there.
  • cls (Measurement subclass) – The technique class of which to return an object. Defaults to MSMeasurement.
  • name (str) – The name of the measurement. Defaults to Path(path_to_file).name
  • kwargs – key-word args are used to initialize the measurement via cls.from_dict()

Returns cls: The measurement object

class ixdat.readers.pfeiffer.PVMassSpecScanReader[source]

A reader for mass spectra files exported from PVMassSpec (’… - Scan.dat’)

ixdat.readers.pfeiffer.get_column_unit(column_name)[source]

Return the unit name of a PVMassSpec column, i.e. what follows the first ‘/’.

ixdat.readers.pfeiffer.mass_from_column_name(mass)[source]

Return the PVMassSpec mass ‘M<x>’ given the column name ‘<x>_amu’ as string
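
The ‘<x>_amu’ to ‘M<x>’ renaming can be sketched as follows. The function name mass_name_from_column is hypothetical, and raising ValueError for names without the “_amu” suffix is an assumption; the real function's handling of such names is not documented here.

```python
def mass_name_from_column(column_name):
    """Return 'M<x>' for a column named '<x>_amu'."""
    suffix = "_amu"
    if not column_name.endswith(suffix):
        raise ValueError(f"expected a name ending in {suffix!r}: {column_name!r}")
    return "M" + column_name[: -len(suffix)]


mass_names = [mass_name_from_column(col) for col in ["2_amu", "32_amu"]]
```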

EC-MS and sub-techniques

These are readers which by default return an ECMSMeasurement. (See Electrochemistry - Mass Spectrometry (EC-MS))

The zilien module

class ixdat.readers.zilien.ZilienSpectrumReader(path_to_spectrum=None)[source]

A reader for individual Zilien spectra

TODO: A Zilien reader which loads all spectra at once in a SpectrumSeries object

read(path_to_spectrum, cls=None, **kwargs)[source]

Read an individual Zilien spectrum from path_to_spectrum

FIXME: This reader was written hastily and could be designed better.

Parameters:
  • path_to_spectrum (Path or str) – the path to the spectrum file
  • cls (Spectrum class) – Defaults to MSSpectrum
  • kwargs – Key-word arguments are passed on ultimately to cls.__init__
class ixdat.readers.zilien.ZilienTMPReader(path_to_tmp_dir=None)[source]

A class for stitching the files in a Zilien tmp directory to an ECMSMeasurement

This is necessary because Zilien often crashes, leaving only the tmp directory. This is less advanced but more readable than the Spectro Inlets stitching solution.

read(path_to_tmp_dir, cls=None, **kwargs)[source]

Make a measurement from all the single-value .tsv files in a Zilien tmp dir

Parameters:
  • path_to_tmp_dir (Path or str) – the path to the tmp dir
  • cls (Measurement class) – Defaults to ECMSMeasurement
class ixdat.readers.zilien.ZilienTSVReader[source]

Class for reading files saved by Spectro Inlets’ Zilien software

read(path_to_file, cls=None, name=None, **kwargs)[source]

Read a zilien file

TODO: This is a hack using EC_MS to read the .tsv. Will be replaced.

ixdat.readers.zilien.series_list_from_tmp(path_to_file)[source]

Return [ValueSeries, TimeSeries] with the data in a zilien tmp .tsv file

The ec_ms_pkl module

class ixdat.readers.ec_ms_pkl.EC_MS_CONVERTER[source]

Imports old .pkl files obtained from the legacy EC-MS package

read(file_path, cls=None, **kwargs)[source]

Return an ECMSMeasurement with the data recorded in file_path. Most of the work is done by the module-level function measurement_from_ec_ms_dataset.

Parameters:
  • file_path (Path) – The full abs or rel path including the “.pkl” extension
ixdat.readers.ec_ms_pkl.measurement_from_ec_ms_dataset(ec_ms_dict, name=None, cls=<class 'ixdat.techniques.ec_ms.ECMSMeasurement'>, reader=None, technique=None, **kwargs)[source]

Return an ixdat Measurement with the data from an EC_MS data dictionary.

This loops through the keys of the EC-MS dict and searches for MS and EC data. It names the data series according to their names in the original dict and omits any other data as well as metadata.

Parameters:
  • ec_ms_dict (dict) – The EC_MS data dictionary
  • name (str) – Name of the measurement
  • cls (Measurement class) – The class to return a measurement of
  • reader (Reader object) – The reader object which read ec_ms_dict from file
  • technique (str) – The name of the technique

Spectro-Electrochemistry and sub-techniques

These are readers which by default return a SpectroECMeasurement. (See Spectro-Electrochemistry)

The msrh_sec module

class ixdat.readers.msrh_sec.MsrhSECReader[source]

A reader for SEC saved in three files: spectra vs v; wavelengths; current vs v

read(path_to_file, path_to_ref_spec_file, path_to_V_J_file, scan_rate, tstamp=None, cls=None)[source]

Read potential-dep. SEC data from 3 csv’s to return a SpectroECMeasurement

The function is well-commented so take a look at the source

Parameters:
  • path_to_file (Path or str) – The full path to the file containing the spectra data. This file has voltage in the first row, and a first column with an arbitrary counter which has to be replaced by wavelength.
  • path_to_ref_spec_file (Path or str) – The full path to the file containing the wavelength data, together usually with the adsorption-free spectrum. The length of the columns should be the same as in the spectrum data but in practice is a few points longer. The excess points at the starts of the columns are discarded.
  • path_to_V_J_file (Path or str) – The full path to the file containing the current data vs potential. The columns may be reversed in order. In the end the potential in the spectra file will be retained and the potential here used to interpolate the current onto the spectra file’s potential.
  • scan_rate (float) – Scan rate in [mV/s]. This is used to figure out the measurement’s time variable, as time is bizarrely not included in any of the data files.
  • tstamp (float) – Timestamp. If None, the user will be prompted for the measurement start time or whether to use the file creation time. This is necessary because tstamp is also not included in any of the files but is central to how ixdat organizes data. If you’re sure that tstamp doesn’t matter for you, put e.g. tstamp=1 to suppress the prompt.
  • cls (Measurement subclass) – The class of measurement to return. Defaults to SpectroECMeasurement.
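
Since the files contain no time column, the time variable has to be derived from the potential and scan_rate. One plausible way to do this is sketched below. It is not taken from the MsrhSECReader source: the helper name time_from_potential is hypothetical, and it assumes potentials in volts and a constant scan rate magnitude in mV/s, including across sweep reversals.

```python
def time_from_potential(potentials, scan_rate_mV_per_s):
    """Return elapsed times in s for potentials swept at a constant rate."""
    scan_rate = scan_rate_mV_per_s * 1e-3  # convert mV/s to V/s
    times = [0.0]
    for v_prev, v_next in zip(potentials, potentials[1:]):
        # Each step takes |dV| / scan_rate seconds, whatever its direction.
        times.append(times[-1] + abs(v_next - v_prev) / scan_rate)
    return times


times = time_from_potential([0.0, 0.1, 0.2, 0.1], scan_rate_mV_per_s=10)
```

Adding tstamp to these elapsed times would then give absolute times, which is why the reader asks for a timestamp even though the files do not contain one.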