Readers: getting data into ixdat
Source: https://github.com/ixdat/ixdat/tree/user_ready/src/ixdat/readers
On this page, you can find the documentation for the different readers for importing data obtained using different techniques.
Initiating a measurement
A typical workflow is to start by reading a file. For convenience, most readers are
accessible directly from Measurement. So, for example, to read a .mpt file exported
by Biologic’s EC-Lab, one can type:
>>> from ixdat import Measurement
>>> ec_meas = Measurement.read("my_file.mpt", reader="biologic")
See readers for a description of the available readers.
The biologic reader (ixdat.readers.biologic.BiologicReader) ensures that the
object returned, ec_meas, is of type ECMeasurement.
Another workflow starts with loading a measurement from the active ixdat backend.
This can also be done straight from Measurement, as follows:
>>> from ixdat import Measurement
>>> ec_meas = Measurement.get(3)
Here, the row with id=3 of the measurements table represents an electrochemistry
measurement. The column “technique” in the measurements table specifies which
TechniqueMeasurement class is returned. For row three of the measurements
table, the entry “technique” is “EC”, ensuring that ec_meas is an object of type
ECMeasurement.
A full list of the readers thus accessible and their names can be viewed by typing:
>>> from ixdat.readers import READER_CLASSES
>>> READER_CLASSES
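The name-based lookup behind reader="biologic" can be pictured as a dictionary from reader names to reader classes. The following is a minimal, self-contained sketch of that pattern, not ixdat's implementation: DemoReader, DemoCSVReader, DEMO_READER_CLASSES, and read_with are hypothetical names invented for illustration.

```python
class DemoReader:
    """Hypothetical stand-in for a reader class; it only records the path."""

    def read(self, path_to_file, **kwargs):
        return {"path": str(path_to_file), "reader": type(self).__name__, **kwargs}


class DemoCSVReader(DemoReader):
    """A second hypothetical reader, to show that the name picks the class."""


# Hypothetical registry: a plain mapping from reader name to reader class.
DEMO_READER_CLASSES = {"demo": DemoReader, "demo_csv": DemoCSVReader}


def read_with(path_to_file, reader, **kwargs):
    """Look up the reader class by name, instantiate it, and read the file."""
    try:
        reader_class = DEMO_READER_CLASSES[reader]
    except KeyError:
        known = ", ".join(sorted(DEMO_READER_CLASSES))
        raise ValueError(f"unknown reader {reader!r}; known readers: {known}")
    return reader_class().read(path_to_file, **kwargs)


result = read_with("my_file.csv", reader="demo_csv")
```

Raising an error that lists the known names is a design choice of this sketch; it makes a misspelled reader name easy to diagnose.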
Reading .csv files exported by ixdat: The IxdatCSVReader
ixdat can export measurement data in a .csv format with necessary information in the
header. See Exporters: getting data out of ixdat. It can naturally read the data that it exports itself. Exporting and reading,
however, may result in loss of raw data (unlike save()).
The ixdat_csv module
Module defining the ixdat csv reader, so ixdat can read the files it exports.
- class ixdat.readers.ixdat_csv.IxdatCSVReader[source]
A class that reads the .csv files made by ixdat.exporters.csv_exporter.CSVExporter
read() is the important method: it takes the path to the .csv file as argument and returns a Measurement object (measurement) representing that file. The measurement contains a reference to the reader object, as measurement.reader. This makes the following attributes available, which are likely useful for debugging.
- path_to_file
the location and name of the file read by the reader
- Type
Path
- n_line
the number of the last line read by the reader
- Type
int
- place_in_file
The last location in the file read by the reader. This is used internally to tell the reader how to parse each line. Options are: “header”, “column names”, and “data”.
- Type
str
- header_lines
a list of the header lines of the files. This includes the column name line. The header can be nicely viewed with the print_header() function.
- Type
list of str
- tstamp
The unix time corresponding to t=0
- Type
str
- technique
The name of the technique
- Type
str
- N_header_lines
The number of lines in the header of the file
- Type
int
- column_names
The names of the data columns in the file
- Type
list of str
- column_data
The data in the file as a dict. Note that the np arrays are the same ones as in the measurement’s DataSeries, so this does not waste memory.
- Type
dict of str: np.array
- file_has_been_read
This is used to make sure read() is only successfully called once by the Reader. False until read() is called, then True.
- Type
bool
- measurement
The measurement returned by read() when the file is read. self.measurement is None before read() is called.
- Type
Measurement
- process_data_line(line)[source]
Split the line and append the numbers to the corresponding data column arrays
- process_header_line(line)[source]
Search line for important metadata and set the relevant attribute of self
- read(path_to_file, name=None, cls=None, **kwargs)[source]
Return a Measurement with the data and metadata recorded in path_to_file
This loops through the lines of the file, processing one at a time. For header lines, this involves searching for metadata. For the column name line, this involves creating empty arrays for each data series. For the data lines, this involves appending to these arrays. After going through all the lines, it converts the arrays to DataSeries. The technique is specified in the header, and used to pick the TechniqueMeasurement class. Finally, the method returns a TechniqueMeasurement object measurement with these DataSeries. All attributes of this reader can be accessed from the measurement as measurement.reader.attribute_name.
- Parameters
path_to_file (Path) – The full absolute or relative path including the “.csv” extension
name (str) – The name of the measurement to return (defaults to path_to_file)
cls (Measurement subclass) – The class of measurement to return. By default, cls will be determined from the technique specified in the header of path_to_file.
**kwargs (dict) – Key-word arguments are passed to ECMeasurement.__init__
Returns cls: a Measurement of type cls
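The line-by-line loop described for read(), with place_in_file moving from “header” to “column names” to “data”, can be sketched as a small state machine. This is an illustration only: the header layout used here (“key = value” lines, with N_header_lines on the first line and comma-separated columns) is an assumption made for the example and is not claimed to be the format ixdat exports, and parse_lines is a hypothetical helper.

```python
def parse_lines(lines):
    """Parse an iterable of text lines into (metadata, columns).

    Assumed (hypothetical) layout: "key = value" header lines, the first of
    which is "N_header_lines = <int>"; the last header line holds the
    comma-separated column names; every following line is numeric data.
    """
    metadata = {}
    columns = {}
    column_names = []
    n_header_lines = None
    place_in_file = "header"

    for n_line, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Decide which part of the file this line belongs to.
        if n_header_lines is not None and n_line == n_header_lines:
            place_in_file = "column names"
        elif n_header_lines is not None and n_line > n_header_lines:
            place_in_file = "data"

        if place_in_file == "header":
            # Header line: search for metadata.
            key, _, value = line.partition("=")
            metadata[key.strip()] = value.strip()
            if key.strip() == "N_header_lines":
                n_header_lines = int(value)
        elif place_in_file == "column names":
            # Column name line: create one empty list per data series.
            column_names = [name.strip() for name in line.split(",")]
            columns = {name: [] for name in column_names}
        elif line:
            # Data line: append each number to its column (blank lines skipped).
            for name, cell in zip(column_names, line.split(",")):
                columns[name].append(float(cell))
    return metadata, columns


example = [
    "N_header_lines = 4",
    "technique = EC",
    "tstamp = 1609459200.0",
    "time/s, raw_potential/V",
    "0.0, 0.50",
    "1.0, 0.55",
]
metadata, columns = parse_lines(example)
```

In the sketch, metadata["technique"] plays the role of the header entry that the reader uses to pick the measurement class.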
- class ixdat.readers.ixdat_csv.IxdatSpectrumReader[source]
A reader for ixdat spectra.
- read(path_to_file, name=None, cls=<class 'ixdat.spectra.Spectrum'>, **kwargs)[source]
Read an ixdat spectrum.
This reads the header with the process_line() function inherited from IxdatCSVReader. Then it uses pandas to read the data.
- Parameters
path_to_file (Path) – The full absolute or relative path including extension
name (str) – The name of the measurement to return (defaults to path_to_file)
cls (Spectrum subclass) – The class of measurement to return. By default, cls will be determined from the technique specified in the header of path_to_file.
**kwargs (dict) – Key-word arguments are passed to ECMeasurement.__init__
Returns cls: a Spectrum of type cls
Importing from other experimental data platforms
cinfdata is a web-based database system for experimental data, developed and used at DTU SurfCat
(formerly CINF) in concert with the PyExpLabSys suite of experimental data acquisition tools.
Both are available at https://github.com/CINF.
As of yet, ixdat only has a text-file reader for data exported from cinfdata, but
in the future it will also have a reader which downloads from the website given e.g. a
setup and date.
The cinfdata module
Module defining readers for DTU Surfcat’s legendary cinfdata system
- class ixdat.readers.cinfdata.CinfdataTXTReader[source]
A class that reads the text exported by cinfdata’s text export functionality
- TODO: We should also have a reader class that downloads the data from cinfdata like EC_MS’s download_cinfdata_set: https://github.com/ScottSoren/EC_MS/blob/master/src/EC_MS/Data_Importing.py#L711
- path_to_file
the location and name of the file read by the reader
- Type
Path
- n_line
the number of the last line read by the reader
- Type
int
- place_in_file
The last location in the file read by the reader. This is used internally to tell the reader how to parse each line. Options are: “header”, “column names”, and “data”.
- Type
str
- header_lines
a list of the header lines of the files. This includes the column name line. The header can be nicely viewed with the print_header() function.
- Type
list of str
- tstamp
The unix time corresponding to t=0 for the measurement
- Type
str
- tstamp_list
list of epoch tstamps in the file’s timestamp line
- Type
list of float
- column_tstamps
The unix time corresponding to t=0 for each time column
- Type
dict
- technique
The name of the technique
- Type
str
- column_names
The names of the data columns in the file
- Type
list of str
- t_and_v_cols
{name: (tcol, vcol)} where name is the name of the ValueSeries (e.g. “M2”), tcol is the name of the corresponding time column in the file (e.g. “M2-x”), and vcol is the name of the value column in the file (e.g. “M2-y”).
- Type
dict
- column_data
The data in the file as a dict. Note that the np arrays are the same ones as in the measurement’s DataSeries, so this does not waste memory.
- Type
dict of str: np.array
- file_has_been_read
This is used to make sure read() is only successfully called once by the Reader. False until read() is called, then True.
- Type
bool
- measurement
The measurement returned by read() when the file is read. self.measurement is None before read() is called.
- Type
Measurement
- process_data_line(line)[source]
Split the line and append the numbers to the corresponding data column arrays
- process_header_line(line)[source]
Search line for important metadata and set the relevant attribute of self
- read(path_to_file, name=None, cls=None, **kwargs)[source]
Return an MSMeasurement with the data and metadata recorded in path_to_file
This loops through the lines of the file, processing one at a time. For header lines, this involves searching for metadata. For the column name line, this involves creating empty arrays for each data series. For the data lines, this involves appending to these arrays. After going through all the lines, it converts the arrays to DataSeries. For cinfdata text files, each value column has its own time column, and they are not necessarily all the same length. Finally, the method returns an MSMeasurement with these DataSeries. The MSMeasurement contains a reference to the reader. All attributes of this reader can be accessed from the measurement as measurement.reader.attribute_name.
- Parameters
path_to_file (Path) – The full abs or rel path including the “.txt” extension
**kwargs (dict) – Key-word arguments are passed to ECMeasurement.__init__
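The t_and_v_cols mapping and the unequal column lengths mentioned above can be illustrated with a short sketch. It pairs “<name>-x” and “<name>-y” columns, as in the “M2-x”/“M2-y” example, and collects each column separately. The tab separator and the use of empty cells for shorter columns are assumptions made for this example, not a statement about the cinfdata export format; both function names are hypothetical.

```python
def pair_time_and_value_columns(column_names):
    """Return {name: (tcol, vcol)} for columns named "<name>-x" and "<name>-y"."""
    pairs = {}
    for col in column_names:
        if col.endswith("-x"):
            name = col[:-2]
            vcol = name + "-y"
            if vcol in column_names:
                pairs[name] = (col, vcol)
    return pairs


def read_ragged_columns(column_names, data_lines, sep="\t"):
    """Collect each column's numbers, skipping empty cells.

    Empty cells are how this sketch represents columns of unequal length;
    that representation is an assumption of the example.
    """
    column_data = {name: [] for name in column_names}
    for line in data_lines:
        cells = line.rstrip("\n").split(sep)
        for name, cell in zip(column_names, cells):
            if cell.strip():
                column_data[name].append(float(cell))
    return column_data


names = ["M2-x", "M2-y", "M32-x", "M32-y"]
lines = ["0.0\t1e-9\t0.1\t2e-10", "1.0\t1.1e-9\t\t", "2.0\t1.2e-9\t\t"]
t_and_v_cols = pair_time_and_value_columns(names)
column_data = read_ragged_columns(names, lines)
```

Because every value column keeps its own time column, the “M2” and “M32” series in the sketch end up with different lengths without any padding.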
Electrochemistry and sub-techniques
These are readers which by default return an ECMeasurement.
(See Electrochemistry)
The biologic module
This module implements the Reader for .mpt files made by BioLogic’s EC-Lab software
Demonstrated/tested at the bottom under if __name__ == “__main__”:
- class ixdat.readers.biologic.BiologicReader[source]
A class to read .mpt files written by Biologic’s EC-Lab.
read() is the important method: it takes the path to the mpt file as argument and returns an ECMeasurement object (ec_measurement) representing that file. The ECMeasurement contains a reference to the BiologicReader object, as ec_measurement.reader. This makes the following attributes available, which are likely useful for debugging.
- file_has_been_read
This is used to make sure read() is only successfully called once by the Reader. False until read() is called, then True.
- Type
bool
- measurement
The measurement returned by read() when the file is read. self.measurement is None before read() is called.
- Type
Measurement
- measurement_name
The name of the measurement being read
- Type
str
- path_to_file
the location and name of the file read by the reader
- Type
Path
- tstamp
The unix time corresponding to t=0, parsed from timestamp_string
- Type
float
- measurement_class
Type of measurement to return
- Type
class
- data_series_list
Data series of the measurement being read
- Type
list of DataSeries
- aliases
Aliases for data series in the measurement being read
- Type
dict
- tseries
Time series for the returned measurement (biologic files have one shared time variable)
- Type
TimeSeries
- ec_technique
The name of the electrochemical sub-technique, e.g. “Cyclic Voltammetry Advanced”.
- Type
str
- n_line
the number of the last line read by the reader
- Type
int
- place_in_file
The last location in the file read by the reader. This is used internally to tell the reader how to parse each line. Options are: “header”, “column names”, and “data”.
- Type
str
- header_lines
a list of the header lines of the files. This includes the column name line. The header can be nicely viewed with the print_header() function.
- Type
list of str
- timestamp_string
The string identified to represent the t=0 time of the measurement recorded in the file.
- Type
str
- N_header_lines
The number of lines in the header of the file
- Type
int
- column_names
The names of the data columns in the file
- Type
list of str
- column_data
The data in the file as a dict. Note that the np arrays are the same ones as in the measurement’s DataSeries, so this does not waste memory.
- Type
dict of str: np.array
- df
The data from a .mpr file as read by an external package
- Type
Pandas DataFrame
- read(path_to_file, name=None, cls=<class 'ixdat.techniques.ec.ECMeasurement'>, **kwargs)[source]
Return an ECMeasurement with the data and metadata recorded in path_to_file
This loops through the lines of the file, processing one at a time. For header lines, this involves searching for metadata. For the column name line, this involves creating empty arrays for each data series. For the data lines, this involves appending to these arrays. After going through all the lines, it converts the arrays to DataSeries. For .mpt files, there is one TimeSeries, with name “time/s”, and all other data series are ValueSeries sharing this TimeSeries. Finally, the method returns an ECMeasurement with these DataSeries. The ECMeasurement contains a reference to the reader.
- Parameters
path_to_file (Path) – The full absolute or relative path including the suffix (“.mpt” or “.mpr”)
name (str) – The name to use if not the file name
cls (Measurement subclass) – The Measurement class to return an object of. Defaults to ECMeasurement and should probably be a subclass thereof in any case.
**kwargs (dict) – Key-word arguments are passed to cls.__init__
- series_list_from_mpr(path_to_file=None)[source]
Read a .mpr file to generate the reader’s data_series_list
- Raises: ReadError, if no external package succeeds in reading the file. The error message includes instructions to install each external package or the error that is raised when attempting to use it.
- series_list_from_mpr_eclabfiles(path_to_file=None)[source]
Read a biologic .mpr file
This makes use of the package eclabfiles. See: https://github.com/vetschn/eclabfiles. The dataframe read in by eclabfiles.to_df() is stored in the returned measurement meas as meas.reader.df.
- Parameters
path_to_file (Path) – The full abs or rel path including the “.mpr” extension
- series_list_from_mpr_galvani(path_to_file=None)[source]
Read a biologic .mpr file
This makes use of the package galvani. See: https://github.com/echemdata/galvani. The dataframe read in with galvani is stored in the returned measurement meas as meas.reader.df.
- Parameters
path_to_file (Path) – The full abs or rel path including the “.mpr” extension
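The fallback described for series_list_from_mpr, where several external packages are tried and a single error reports why each one failed, follows a common pattern. The sketch below shows that pattern with stand-in functions only; DemoReadError, read_with_first_working_backend, and the two backends are hypothetical, and no call into eclabfiles or galvani is made.

```python
class DemoReadError(Exception):
    """Hypothetical error type for this sketch."""


def read_with_first_working_backend(path_to_file, backends):
    """Try each (name, function) backend in turn; return the first result.

    If every backend fails, raise one error listing what went wrong with each.
    """
    failures = []
    for name, backend in backends:
        try:
            return name, backend(path_to_file)
        except Exception as error:  # broad on purpose: collect any failure
            failures.append(f"{name}: {type(error).__name__}: {error}")
    raise DemoReadError(
        f"could not read {path_to_file!r}. Attempts:\n" + "\n".join(failures)
    )


def missing_package_backend(path_to_file):
    # Simulates an external package that is not installed.
    raise ImportError("package 'not_installed_demo' is not installed")


def working_backend(path_to_file):
    # Simulates a successful read by returning a tiny table.
    return {"time/s": [0.0, 1.0], "Ewe/V": [0.5, 0.6]}


used, table = read_with_first_working_backend(
    "my_file.mpr",
    [("first", missing_package_backend), ("second", working_backend)],
)
```

Collecting the individual failure messages, rather than re-raising the last one, is what lets the final error tell the user which package to install or what went wrong with each attempt.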
- ixdat.readers.biologic.fix_WE_potential(measurement)[source]
Fix column of zeros in “<Ewe>/V” sometimes exported by EC-Lab for CP measurements.
Some Biologic potentiostats / EC-Lab versions sometimes export a column of zeros for “<Ewe>/V” in the .mpt files in chronopotentiometry measurements. This function replaces the series of zeros with the correct potential by adding the counter electrode potential (“<Ece>/V”) and cell potential (“Ewe-Ece/V”).
This function is not called automatically - it needs to be called manually on the measurements loaded from the afflicted files. It requires that the counter electrode potential was recorded.
- Parameters
measurement (ECMeasurement) – The measurement with the column to be replaced
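The arithmetic behind this fix is a point-by-point sum: the working electrode potential equals the counter electrode potential plus the cell potential, Ewe = Ece + (Ewe - Ece). A minimal sketch on plain lists follows; restore_we_potential is a hypothetical helper and does not operate on ixdat measurement objects.

```python
def restore_we_potential(ce_potential, cell_potential):
    """Return Ewe computed point by point as Ece + (Ewe - Ece)."""
    if len(ce_potential) != len(cell_potential):
        raise ValueError("both series must have the same length")
    return [ece + cell for ece, cell in zip(ce_potential, cell_potential)]


ece = [-0.5, -0.25, -0.75]  # stands in for the "<Ece>/V" column
cell = [1.5, 1.25, 1.75]    # stands in for the "Ewe-Ece/V" column
ewe = restore_we_potential(ece, cell)
```

The length check mirrors the requirement stated above: without a recorded counter electrode potential there is nothing to add.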
The autolab module
This module implements the reader for ascii exports from autolab’s Nova software
- class ixdat.readers.autolab.NovaASCIIReader[source]
A reader for ascii files exported by Autolab’s Nova software
- read(path_to_file, cls=None, name=None, tstamp=None, timestring=None, timestring_form='%d/%m/%Y %H:%M:%S', **kwargs)[source]
Read the ASCII export from Autolab’s Nova software
- Parameters
path_to_file (Path) – The full absolute or relative path including the suffix
name (str) – The name to use if not the file name
cls (Measurement subclass) – The Measurement class to return an object of. Defaults to ECMeasurement and should probably be a subclass thereof in any case.
tstamp (float) – timestamp of the measurement, if known
timestring (str) – timestring describing the timestamp of the measurement
timestring_form (str) – form of the timestring. Default is “%d/%m/%Y %H:%M:%S”
**kwargs (dict) – Key-word arguments are passed to cls.__init__
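The relation between timestring, timestring_form, and tstamp can be illustrated with the standard library. The sketch below converts a time string in the default form “%d/%m/%Y %H:%M:%S” to a unix timestamp. timestring_to_tstamp is a hypothetical helper, and interpreting the string as UTC is an assumption made only to keep the example deterministic; it says nothing about how the reader treats time zones.

```python
from datetime import datetime, timezone


def timestring_to_tstamp(timestring, form="%d/%m/%Y %H:%M:%S", tz=timezone.utc):
    """Convert a formatted time string to a unix timestamp in seconds.

    The parsed time carries no time zone, so one has to be chosen; UTC is
    used here only to keep the example deterministic.
    """
    naive = datetime.strptime(timestring, form)
    return naive.replace(tzinfo=tz).timestamp()


tstamp = timestring_to_tstamp("01/01/2021 00:00:00")
```

Note that the default form puts the day before the month, so a string such as “02/01/2021 …” is read as 2 January, not 1 February.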
The ivium module
This module implements the reader for the text export of Ivium’s software
- class ixdat.readers.ivium.IviumDataReader[source]
Class for reading single ivium files
- read(path_to_file, cls=None, name=None, cycle_number=0, **kwargs)[source]
Read the ASCII export from the Ivium software
- Parameters
path_to_file (Path) – The full abs or rel path including the suffix (.txt)
cls (Measurement subclass) – The Measurement class to return an object of. Defaults to ECMeasurement.
name (str) – The name to use if not the file name
cycle_number (int) – The cycle number of the data in the file (default is 0)
**kwargs (dict) – Key-word arguments are passed to cls.__init__
- Returns
technique measurement object with the ivium data
- Return type
cls
- class ixdat.readers.ivium.IviumDatasetReader[source]
Class for reading sets of ivium files exported together
- read(path_to_file, cls=None, name=None, **kwargs)[source]
Return a measurement containing the data of an ivium dataset.
An ivium dataset is a group of ivium files exported together. They share a folder and a base name, and are suffixed “_1”, “_2”, etc.
- Parameters
path_to_file (Path or str) – Path(path_to_file).parent is interpreted as the folder where the files of the ivium dataset are. Path(path_to_file).name up to the first “_” is interpreted as the shared start of the files in the dataset. You can thus use the base name of the exported files or the full path of any one of them.
cls (Measurement class) – The measurement class. Defaults to ECMeasurement.
name (str) – The name of the dataset. Defaults to the base name of the dataset
kwargs – key-word arguments are included in the dictionary for cls.from_dict()
Returns cls or ECMeasurement: A measurement object with the ivium data
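The way path_to_file is interpreted for a dataset can be sketched with pathlib: the parent is the folder, and the file name up to the first “_” is the shared base name. The helpers and the file names below are hypothetical and only illustrate the path handling; no Ivium file content is parsed.

```python
from pathlib import Path
import tempfile


def dataset_folder_and_base(path_to_file):
    """Split a path into (folder, base name up to the first underscore)."""
    path = Path(path_to_file)
    return path.parent, path.name.split("_")[0]


def find_dataset_files(path_to_file):
    """List files in the folder whose names start with the base name and "_"."""
    folder, base = dataset_folder_and_base(path_to_file)
    return sorted(p.name for p in folder.iterdir() if p.name.startswith(base + "_"))


with tempfile.TemporaryDirectory() as tmp:
    # Create placeholder files: one dataset "cv" and one unrelated file.
    for file_name in ["cv_1", "cv_2", "other_1"]:
        (Path(tmp) / file_name).write_text("placeholder")
    # Either the base name or the path of any one file identifies the dataset.
    from_base = find_dataset_files(Path(tmp) / "cv")
    from_member = find_dataset_files(Path(tmp) / "cv_2")
```

One consequence of splitting at the first “_” is that a base name which itself contains an underscore would be cut short, so such names need care.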
The chi module
A reader for text exports from the potentiostat software of CH Instruments
Mass Spectrometry and sub-techniques
These are readers which by default return an MSMeasurement.
(See Mass Spectrometry)
The pfeiffer module
This module implements the reader for Pfeiffer Vacuum’s PV Mass Spec software
- class ixdat.readers.pfeiffer.PVMassSpecReader[source]
A reader for (advanced) MID files exported from PVMassSpec (’… - Bin.dat’)
- read(path_to_file, cls=None, name=None, **kwargs)[source]
Return a Measurement with the (advanced) MID data in the PVMassSpec file
- Parameters
path_to_file (Path or str) – a path to the file exported by PVMassSpec with (advanced) MID data. This file is typically exported with a name that ends in ‘- Bin.dat’, and with the timestamp in the file name. Note that the file can be renamed, as the original name is in the file, and the timestamp is read from there.
cls (Measurement subclass) – The technique class of which to return an object. Defaults to MSMeasurement.
name (str) – The name of the measurement. Defaults to Path(path_to_file).name
kwargs – key-word args are used to initiate the measurement via cls.from_dict()
Return cls: The measurement object
- class ixdat.readers.pfeiffer.PVMassSpecScanReader[source]
A reader for mass spectra files exported from PVMassSpec (’… - Scan.dat’)
The rgasoft module
A reader for text exports from the RGA Software of Stanford Instruments
EC-MS and sub-techniques
These are readers which by default return an ECMSMeasurement.
(See Electrochemistry - Mass Spectrometry (EC-MS))
The zilien module
Readers for files produced by the Zilien software from Spectro Inlets.
Zilien tsv files have two data header lines to define each of the data columns. The first one is referred to as the “series header” and explains what the data describes, and the second one is called the “column header” and specifies the specific column. This is done to keep the column headers more readable. Typically, a series header specifies the measuring device (e.g. “iongauge value”) or MS channel (e.g. “C0M2”) and applies to two or more column headers, where the first is time (“Time [s]”, “time/s”) and the subsequent ones are the corresponding value(s) (“Pressure [mbar]” or “M2-H2 [A]” etc.). Zilien files version 2 and higher may also include all the data from an integrated Biologic dataset. These are grouped under the series header “EC-lab”.
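The two-level header can be pictured as a grouping of column headers under series headers. The sketch below builds that grouping from two tab-separated lines. It assumes that a series name is written only above the first column of its group, with empty cells above the remaining columns; that layout is an assumption of the example rather than a documented property of Zilien files, and group_columns_by_series is a hypothetical helper.

```python
def group_columns_by_series(series_header_line, column_header_line, sep="\t"):
    """Return {series name: [column names]} from the two header lines.

    Assumption for this sketch: a series name appears only above the first
    column of its group, and the cells above the group's other columns are
    empty.
    """
    series_cells = series_header_line.rstrip("\n").split(sep)
    column_cells = column_header_line.rstrip("\n").split(sep)
    groups = {}
    current = None
    for series_cell, column_cell in zip(series_cells, column_cells):
        if series_cell.strip():
            # A non-empty series cell starts a new group.
            current = series_cell.strip()
            groups[current] = []
        if current is not None and column_cell.strip():
            groups[current].append(column_cell.strip())
    return groups


series_header = "iongauge value\t\tC0M2\t"
column_header = "Time [s]\tPressure [mbar]\tTime [s]\tM2-H2 [A]"
groups = group_columns_by_series(series_header, column_header)
```

Grouping by series header is what allows the same column header, such as “Time [s]”, to appear once per series without ambiguity.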
- class ixdat.readers.zilien.ZilienSpectrumReader(path_to_spectrum=None)[source]
A reader for individual Zilien spectra
TODO: A Zilien reader which loads all spectra at once in a SpectrumSeries object
- read(path_to_spectrum, cls=None, t_zero=None, **kwargs)[source]
Read a Zilien spectrum.
FIXME: This reader was written hastily and could be designed better.
- Parameters
path_to_spectrum (Path or str) – the path to the spectrum file
cls (Spectrum class) – Defaults to MSSpectrum
t_zero (float) – The unix timestamp which the mass scan start time is referenced to. Should be the tstamp of the corresponding Zilien measurement. If the Spectrum is read individually, it needs to be input or defaults to time.time(), i.e., now.
kwargs – Key-word arguments are passed on ultimately to cls.__init__
- class ixdat.readers.zilien.ZilienTMPReader(path_to_tmp_dir=None)[source]
A class for stitching the files in a Zilien tmp directory to an ECMSMeasurement
This is necessary because Zilien often crashes, leaving only the tmp directory. This is less advanced but more readable than the Spectro Inlets stitching solution.
- class ixdat.readers.zilien.ZilienTSVReader[source]
Class for reading files saved by Spectro Inlets’ Zilien software
- read(path_to_file, cls=None, name=None, include_mass_scans=None, **kwargs)[source]
Read a Zilien file
- Parameters
path_to_file (Path or str) – The path of the file to read
cls (Measurement) – The measurement class to read the file as. Zilien tsv files can be read as an EC-MS measurement, as an MS measurement (which will exclude the EC series from the measurement), or as an EC measurement (which will exclude the MS series from the measurement). To avoid importing classes, this behavior can also be controlled by setting the technique argument to either ‘EC-MS’, ‘MS’ or ‘EC’. The default is determined according to what is parsed from the dataset.
name (str) – The name of the measurement. Will default to the part of the filename before the ‘.tsv’ extension
include_mass_scans (bool) – Whether to include mass scans (if available) and thereby return a SpectroMSMeasurement which can be indexed to give the spectrum objects. (Defaults to True if technique not specified.)
kwargs – All remaining keyword-arguments will be passed onto the __init__ of the Measurement
- ixdat.readers.zilien.determine_class(technique)[source]
Choose appropriate measurement class according to a given technique.
- ixdat.readers.zilien.module_demo()[source]
Module demo here.
To run this module in PyCharm, open Run Configuration and set Module name = ixdat.readers.zilien, and not Script path = …
- ixdat.readers.zilien.parse_metadata_line(line)[source]
Parse a single metadata line and return the name, value
- ixdat.readers.zilien.series_list_from_tmp(path_to_file)[source]
Return [ValueSeries, TimeSeries] with the data in a zilien tmp .tsv file
The ec_ms_pkl module
- class ixdat.readers.ec_ms_pkl.EC_MS_CONVERTER[source]
Imports old .pkl files obtained from the legacy EC-MS package
- ixdat.readers.ec_ms_pkl.measurement_from_ec_ms_dataset(ec_ms_dict, name=None, cls=<class 'ixdat.techniques.ec_ms.ECMSMeasurement'>, reader=None, technique=None, **kwargs)[source]
Return an ixdat Measurement with the data from an EC_MS data dictionary.
This loops through the keys of the EC-MS dict and searches for MS and EC data. Names the dataseries according to their names in the original dict. Omits any other data as well as metadata.
- Parameters
ec_ms_dict (dict) – The EC_MS data dictionary
name (str) – Name of the measurement
cls (Measurement class) – The class to return a measurement of
reader (Reader object) – The class which read ec_ms_dataset from file
technique (str) – The name of the technique
SEC and sub-techniques
These are readers which by default return a SpectroECMeasurement.
(See Spectro-Electrochemistry)
The msrh_sec module
- class ixdat.readers.msrh_sec.MsrhSECReader[source]
A reader for SEC saved in three files: spectra vs U; wavelengths; current vs U
- read(path_to_file, path_to_ref_spec_file, path_to_U_J_file, scan_rate, tstamp=None, cls=None)[source]
Read potential-dependent SEC data from 3 csv’s to return a SpectroECMeasurement
The function is well-commented so take a look at the source
- Parameters
path_to_file (Path or str) – The full path to the file containing the spectra data. This file has voltage in the first row, and a first column with an arbitrary counter which has to be replaced by wavelength.
path_to_ref_spec_file (Path or str) – The full path to the file containing the wavelength data, together usually with the adsorption-free spectrum. The length of the columns should be the same as in the spectrum data but in practice is a few points longer. The excess points at the starts of the columns are discarded.
path_to_U_J_file (Path or str) – The full path to the file containing the current data vs potential. The columns may be reversed in order. In the end the potential in the spectra file will be retained and the potential here used to interpolate the current onto the spectra file’s potential.
scan_rate (float) – Scan rate in [mV/s]. This is used to figure out the measurement’s time variable, as time is bizarrely not included in any of the data files.
tstamp (float) – Timestamp. If None, the user will be prompted for the measurement start time or whether to use the file creation time. This is necessary because tstamp is also not included in any of the files but is central to how ixdat organizes data. If you’re sure that tstamp doesn’t matter for you, put e.g. tstamp=1 to suppress the prompt.
cls (Measurement subclass) – The class of measurement to return. Defaults to SpectroECMeasurement.
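Two steps described in the parameters above lend themselves to a worked example: deriving a time variable from the potential and the scan rate, and interpolating the current onto the potential of the spectra file. The sketch below is a simplified illustration, not the reader's code. It assumes a single linear sweep starting at the first potential, and the function names and numbers are invented for the example.

```python
def time_from_potential(potentials, scan_rate_mV_per_s):
    """Return elapsed time in s for each potential of a single linear sweep."""
    scan_rate = scan_rate_mV_per_s * 1e-3  # convert mV/s to V/s
    start = potentials[0]
    return [abs(u - start) / scan_rate for u in potentials]


def interpolate(x_new, x_known, y_known):
    """Linearly interpolate y at each x_new; x_known must be increasing."""
    result = []
    for x in x_new:
        if not x_known[0] <= x <= x_known[-1]:
            raise ValueError(f"{x} is outside the known range")
        # Find the first segment [x_known[i], x_known[i + 1]] containing x.
        i = 0
        while i < len(x_known) - 2 and x > x_known[i + 1]:
            i += 1
        x0, x1 = x_known[i], x_known[i + 1]
        y0, y1 = y_known[i], y_known[i + 1]
        result.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return result


spectra_potential = [0.0, 0.25, 0.5]  # V, potential retained from the spectra file
current_potential = [0.5, 0.0]        # V, deliberately unsorted in this example
current = [2.0, 1.0]                  # arbitrary units

# Sort the current data by potential before interpolating.
pairs = sorted(zip(current_potential, current))
u_sorted = [u for u, _ in pairs]
j_sorted = [j for _, j in pairs]

t = time_from_potential(spectra_potential, scan_rate_mV_per_s=10.0)
j_on_spectra = interpolate(spectra_potential, u_sorted, j_sorted)
```

With a scan rate of 10 mV/s, a sweep of 0.5 V takes 50 s, which is the last value of t in the sketch.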
Other techniques
The avantage module
- class ixdat.readers.avantage.AvantageAVGReader(path_to_file=None)[source]
A class for importing a .avg file exported by DataSpace_BatchDump.exe
- read(path_to_file, cls=None, **kwargs)[source]
Load data stored as text by Avantage’s default exporting mode
Copied from pyThetaProbe, written by Anna Winiwarter and Soren Scott in 2019
TODO: Improve this code. See suggestions here:
Written for simple intensity-vs-energy, but with possible future expansion in mind. Returns the dataset as a python dictionary.
- Parameters
path_to_file (str or Path) – Path to the .avg data
cls (Spectrum subclass) – Class of spectrum to return an object of
kwargs – Additional keyword arguments are passed to cls.__init__
The xrdml module
This module defines the reader of .xrdml files from, for example, Empyrean XRD
The qexafs module
Readers for ‘qexafs’ and ‘TRXRF’ files exported by Diamond’s B18-Core