resistics.time.reader module

class resistics.time.reader.TimeReader(dataPath: str)[source]

Bases: resistics.common.base.ResisticsBase

Base class for data readers

TimeReader is the base class for the time data readers. It implements many of the format-independent methods.

The readers for the different formats are meant to give consistent output. Conventions used include:

  • The start time is the time of the first sample

  • The end time is the time of the last sample

Physical data is always returned in:

  • Electrical channels in mV/km

  • Magnetic channels in mV

  • To get magnetic fields in nT, calibration needs to be performed

Attributes
dataPath: str

Path to the data folder.

headers: Dict

Dictionary mapping header words to values

chans: List[str]

List of channels

chanHeaders: List

Headers specific to channels

chanMap: Dict

Map from channel name to index for chanHeaders

comments: List[str]

List of comments associated with data

headerF: List[str]

List of header files (with extension .hdr)

dataF: List[str]

List of data files (with extension .npy)

dataByteOffset: int

Number of bytes to offset before reading

dataByteSize: int

Byte size of a single data value

Methods

__init__(dataPath)

Initialise with path to the data directory

setParameters()

Set parameters specific to a data format

checkFiles()

Check that header and data files are found in the data directory

getComments()

Get a deepcopy of the comments

getHeaders()

Get copy of header dictionary

getChannels()

Get a list of the channels

getNumChannels()

Get number of channels

getSampleFreq(integer = False)

Get sampling frequency in Hz as float or optionally as integer

getSampleRate()

Get sample rate in s

getNumSamples()

Get the number of samples

getStartDatetime()

Get the data start time as a datetime object

getStopDatetime()

Get the data stop time as a datetime object

getGain1()

Get gain stage 1 for each channel in an array

getGain2()

Get gain stage 2 for each channel in an array

getChanMap()

Get the channel map

getChanHeader(chan, header)

Get a channel header

getChanType(chan)

Get channel type (electric, magnetic)

getChanGain1(chan)

Get gain stage 1 for chan

getChanGain2(chan)

Get gain stage 2 for chan

getChanSamples(chan)

Get number of samples in channel

getChanSampleFreq(chan)

Get sampling frequency for a channel

getChanLSB(chan)

Get the channel least significant bit

getChanScalingApplied(chan)

Get a bool flag designating whether LSB has been applied

getChanDataFile(chan)

Get the channel data file

getChanDx(chan)

Get the distance between x electrodes

getChanDy(chan)

Get the distance between y electrodes

getChanDz(chan)

Get the distance between z electrodes

getChanSensor(chan)

Get sensor value of a channel

getChanSerial(chan)

Get serial value of a channel

getChanChopper(chan)

Get chopper value for a channel

getSensors(chans)

Get sensor values for a list of channels

getSerials(chans)

Get serial values for a list of channels

getChoppers(chans)

Get chopper values for a list of channels

setHeader(headerName, headerVal)

Set a header value

setChanHeader(chan, headerName, headerVal)

Set a chan header value

getUnscaledSamples(**kwargs)

Get raw, unscaled data

getUnscaledData(startTime, endTime, **kwargs)

Get raw, unscaled data for a date range

getPhysicalSamples(**kwargs)

Get data in physical units

getPhysicalData(startTime, endTime, **kwargs)

Get data in physical units for a date range

parseGetDataKeywords(keywords)

Parse get data keywords

time2sample(timeStart, timeEnd)

Convert dates to samples

sample2time(sampleStart, sampleEnd)

Convert samples to datetimes

getDataTimes(timeStart, timeEnd)

Check data times and return checked and corrected ones

readHeader()

Read header data (implemented in child classes)

formatHeaderData()

Format the header data

intHeaders()

Return a list of headers to be formatted as int

floatHeaders()

Return a list of headers to be formatted as float

boolHeaders()

Return a list of headers to be formatted as bool

prepare()

Prepare class information after reading header files

checkChan(chan)

Check channel exists in data

printList()

Class information as list of strings

printCommentsList()

Comment information as list of strings

printComments()

Print comments to terminal

boolHeaders(self) → Tuple[List[str], List[str]][source]

List of headers which are expected to have boolean values

Returns
boolGlobal: List[str]

List of global boolean headers

boolChan: List[str]

List of chan boolean headers

checkChan(self, chan: str) → None[source]

Check channel exists in data

Parameters
chan: str

Channel to check

checkFiles(self) → bool[source]

Check to make sure the data files are found

floatHeaders(self) → Tuple[List[str], List[str]][source]

List of headers which are expected to have float values

Returns
floatGlobal: List[str]

List of global float headers

floatChan: List[str]

List of chan float headers

formatHeaderData(self) → None[source]

Format header data to the correct formats
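As a rough illustration of what formatting header data involves, the sketch below casts string header values to int, float, or bool according to lists of header names, in the spirit of intHeaders, floatHeaders, and boolHeaders. The function name `format_headers`, the header names `num_samples` and `sample_freq`, and the accepted boolean spellings are assumptions made for this example, not the resistics implementation.

```python
from typing import Any, Dict, List


def format_headers(
    headers: Dict[str, Any],
    int_names: List[str],
    float_names: List[str],
    bool_names: List[str],
) -> Dict[str, Any]:
    """Return a copy of headers with the named entries cast to int, float or bool."""
    formatted = dict(headers)
    for name in int_names:
        if name in formatted:
            # go through float first so that strings such as "128.0" are accepted
            formatted[name] = int(float(formatted[name]))
    for name in float_names:
        if name in formatted:
            formatted[name] = float(formatted[name])
    for name in bool_names:
        if name in formatted:
            # assumed truthy spellings; a real header file may use others
            formatted[name] = str(formatted[name]).strip().lower() in ("true", "1")
    return formatted


# hypothetical header values as they might be read from a text header file
raw = {"num_samples": "1000", "sample_freq": "128.0", "scaling_applied": "False"}
parsed = format_headers(raw, ["num_samples"], ["sample_freq"], ["scaling_applied"])
```

The input dictionary is left untouched, so the raw strings remain available if a cast needs to be checked.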

getChanChopper(self, chan) → bool[source]

Get channel chopper

The chopper is an amplifier present in some instruments. There might be different calibration files for chopper on or off.

Returns
bool

Flag designating whether chopper is on or off

getChanDataFile(self, chan) → str[source]

Get the data file for the channel

Returns
str

Data file for the channel

getChanDx(self, chan)[source]

Get the electric channel spacing in the x direction

Returns
float

Electric channel spacing in x direction in metres

getChanDy(self, chan)[source]

Get the electric channel spacing in the y direction

Returns
float

Electric channel spacing in y direction in metres

getChanDz(self, chan)[source]

Get the electric channel spacing in the z direction

Returns
float

Electric channel spacing in z direction in metres

getChanGain1(self, chan) → int[source]

Get channel gain 1

Returns
int

Channel gain 1

getChanGain2(self, chan) → int[source]

Get channel gain 2

Returns
int

Channel gain 2

getChanHeader(self, chan, header)[source]

Get a channel header

Todo

Write this documentation

getChanHeaders(self, chans: List[str] = []) → Tuple[List, Dict][source]

Get channel headers

Parameters
chans: List[str], optional

List of channels for which chan headers are wanted

Returns
chanHeaders: List

List of channel headers

chanMap: Dict

Map from channel to index for the chanHeaders
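The relationship between the two returned objects can be sketched as follows: the channel map gives, for each channel name, the position of that channel's dictionary in the list of channel headers. The header keys `channel_type` and `sensor`, their values, and the helper `get_chan_header` are hypothetical and only serve to show the lookup pattern.

```python
from typing import Dict, List

# hypothetical per-channel headers, one dictionary per channel
chan_headers: List[Dict[str, str]] = [
    {"channel_type": "Ex", "sensor": "electrode"},
    {"channel_type": "Hx", "sensor": "coil"},
]

# map from channel name to its position in chan_headers
chan_map: Dict[str, int] = {
    header["channel_type"]: index for index, header in enumerate(chan_headers)
}


def get_chan_header(chan: str, header: str) -> str:
    """Look up a single header value for a channel through the channel map."""
    return chan_headers[chan_map[chan]][header]


hx_sensor = get_chan_header("Hx", "sensor")
```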

getChanLSB(self, chan)[source]

Get channel least significant bit

Returns
float

Channel least significant bit

getChanMap(self)[source]

Get the channel map

Todo

Write this documentation

getChanSampleFreq(self, chan) → float[source]

Get channel sampling frequency

Returns
float

Sampling frequency in Hz

getChanSamples(self, chan) → int[source]

Get channel number of samples

Returns
int

Channel number of samples

getChanScalingApplied(self, chan) → bool[source]

A flag to mark whether a channel has the lsb applied

Returns
bool

Flag to designate whether channel lsb applied

getChanSensor(self, chan) → str[source]

Get channel sensor type

Returns
str

Channel sensor type

getChanSerial(self, chan) → str[source]

Get channel serial number

Returns
str

Channel serial number

getChanType(self, chan) → str[source]

Get the channel type (electric or magnetic)

Returns
str

String of channel type

getChannels(self) → List[str][source]

Get channels in data

Returns
List[str]

Data channels

getChoppers(self, chans: List[str]) → Dict[str, bool][source]

Get choppers for multiple chans

Returns
Dict[str, bool]

Dictionary with channels as keys and the chopper flags as values

getComments(self) → List[str][source]

Get a deepcopy of the comments

Returns
List[str]

Dataset comments as a list of strings

getDataTimes(self, timeStart, timeEnd)[source]

Checks and converts a date range to make sure it lies within the data start and end times

Parameters
timeStart: datetime, str

Start time of date range

timeEnd: datetime, str

End time of date range

Returns
timeStart: datetime, str

Checked and corrected start time of data range

timeEnd: datetime, str

Checked and corrected end time of data range
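One way to picture this check is as a clip of the requested range to the recorded range. The sketch below is written under that assumption; the function name `clamp_times`, the string format, and the clipping behaviour itself are illustrative choices and may differ from what getDataTimes actually does.

```python
from datetime import datetime
from typing import Tuple, Union

TimeLike = Union[str, datetime]


def clamp_times(
    time_start: TimeLike,
    time_end: TimeLike,
    data_start: datetime,
    data_stop: datetime,
    fmt: str = "%Y-%m-%d %H:%M:%S",
) -> Tuple[datetime, datetime]:
    """Parse string inputs and clip the range to [data_start, data_stop]."""
    if isinstance(time_start, str):
        time_start = datetime.strptime(time_start, fmt)
    if isinstance(time_end, str):
        time_end = datetime.strptime(time_end, fmt)
    # a start before the data begins is moved up, an end after it stops is moved down
    return max(time_start, data_start), min(time_end, data_stop)


data_start = datetime(2020, 1, 1, 0, 0, 0)
data_stop = datetime(2020, 1, 1, 1, 0, 0)
checked = clamp_times("2019-12-31 23:00:00", "2020-01-01 00:30:00", data_start, data_stop)
```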

getGain1(self) → numpy.ndarray[source]

Get value of gain 1

Returns
np.ndarray

Array of gains for channels

getGain2(self) → numpy.ndarray[source]

Get value of gain 2

Returns
np.ndarray

Array of gains for channels

getHeaders(self) → Dict[source]

Get the data headers

Returns
Dict

Data headers

getNumChannels(self) → int[source]

Get number of channels in the data

Returns
int

Number of channels

getNumSamples(self) → int[source]

Get number of samples

Returns
int

Number of samples

getPhysicalData(self, startTime, endTime, **kwargs)[source]

Get physical data from data file between a start and end date

Calculates the start and end sample given the date range and returns getPhysicalSamples for that sample range.

Parameters
startTime: datetime

Start time of data to read

endTime: datetime

End time of data to read

chans: List[str]

List of channels to return if not all are required

remaverage: bool

Remove average from the data

remzeros: bool

Remove zeroes from the data

remnans: bool

Remove NaNs from the data

Returns
TimeData

Time data object

getPhysicalSamples(self, **kwargs)[source]

Get data scaled to physical values

Depending on the data format, the scalings required to convert to physical field units differ. The method in the base TimeReader class covers the ATS and internal file formats.

resistics will always provide physical samples in field units. That means

  • Electrical channels in mV/km

  • Magnetic channels in mV

  • To get magnetic fields in nT, calibration needs to be performed

If the channel header scaling_applied is set to True, no scaling of the unscaled data is done. This is to cover the internal data format where all scalings may already have been applied.

Parameters
chans: List[str]

List of channels to return if not all are required

startSample: int

First sample to return

endSample: int

Last sample to return

remaverage: bool

Remove average from the data

remzeros: bool

Remove zeroes from the data

remnans: bool

Remove NaNs from the data

Returns
TimeData

Time data object

Notes

The raw data units for ATS data are in counts. To get data in field units, ATS data is first multiplied by the least significant bit (lsb) defined in the header files,

\[data = data * lsb,\]

giving data in mV. The lsb includes the gain removal, so no separate gain removal needs to be performed.

For electrical channels, there is an additional step of dividing by the electrode spacing, which is provided in metres. The extra factor of 1000 converts this to km, giving mV/km for electric channels

\[data = \frac{1000 * data}{electrodeSpacing}\]

Finally, to get magnetic channels in nT, the magnetic channels need to be calibrated.
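The scaling described in these notes can be sketched with NumPy as below. This is a minimal illustration under stated assumptions, not the resistics implementation: the function name `counts_to_field_units` and its parameters are hypothetical, and the numbers are made up.

```python
import numpy as np


def counts_to_field_units(
    counts: np.ndarray,
    lsb: float,
    is_electric: bool,
    electrode_spacing_m: float = 1.0,
    scaling_applied: bool = False,
) -> np.ndarray:
    """Scale raw counts to mV (magnetic) or mV/km (electric)."""
    data = np.asarray(counts, dtype=float)
    if scaling_applied:
        # data is assumed to be in field units already, so return an unscaled copy
        return data.copy()
    data = data * lsb  # counts -> mV
    if is_electric:
        data = 1000.0 * data / electrode_spacing_m  # mV -> mV/km
    return data


raw_counts = np.array([100, -200])
ex = counts_to_field_units(raw_counts, lsb=0.5, is_electric=True, electrode_spacing_m=50.0)
hx = counts_to_field_units(raw_counts, lsb=0.5, is_electric=False)
```

With an lsb of 0.5 and a 50 m electrode spacing, 100 counts on an electric channel become 50 mV and then 1000 mV/km, while the same counts on a magnetic channel stay at 50 mV until calibration is applied.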

getSampleFreq(self, integer: bool = False)[source]

Get data sampling frequency in Hz

Returns
float, int

Sampling frequency

getSampleRate(self) → float[source]

Get data sampling rate in s

Returns
float

Sampling rate in s

getSensors(self, chans: List[str]) → Dict[str, str][source]

Get sensors for multiple chans

Returns
Dict[str, str]

Dictionary with channels as keys and the sensor types as values

getSerials(self, chans: List[str]) → Dict[str, str][source]

Get serials for multiple chans

Returns
Dict[str, str]

Dictionary with channels as keys and the serial numbers as values

getStartDatetime(self) → datetime.datetime[source]

Get datetime of first sample

Returns
datetime

Date and time of first sample

getStopDatetime(self) → datetime.datetime[source]

Get datetime of last sample

Returns
datetime

Date and time of last sample

getUnscaledData(self, startTime, endTime, **kwargs) → resistics.time.data.TimeData[source]

Get raw data from data file between a start and end date

Calculates the start and end sample given the date range and returns getUnscaledSamples for that sample range.

Parameters
startTime: datetime

Start time of data to read

endTime: datetime

End time of data to read

chans: List[str], optional

List of channels to return if not all are required

Returns
TimeData

Time data object

getUnscaledSamples(self, **kwargs) → resistics.time.data.TimeData[source]

Get raw data from data file

Depending on the data format, this could be raw counts or in some physical unit. The method implemented in the base TimeReader can read from ATS and internal files. SPAM and Phoenix data readers have their own implementations.

The raw data units for ATS and internal data formats are as follows:

  • ATS data format has raw data in counts.

  • The raw data unit of the internal format is dependent on what happened to the data before writing it out in the internal format. If the channel header scaling_applied is set to True, no scaling happens in either getUnscaledSamples or getPhysicalSamples. However, if the channel header scaling_applied is set to False, the internal format data will be treated like ATS data, meaning raw data in counts.

Parameters
chans: List[str], optional

List of channels to return if not all are required

startSample: int, optional

First sample to return

endSample: int, optional

Last sample to return

Returns
TimeData

Time data object

intHeaders(self) → Tuple[List[str], List[str]][source]

List of headers which are expected to have integer values

Returns
intGlobal: List[str]

List of global integer headers

intChan: List[str]

List of chan integer headers

parseGetDataKeywords(self, keywords) → Dict[source]

Parse the get data keywords

Parameters
keywords: Dict

The keywords passed to get data methods

Returns
Dict

A dictionary of parsed keywords with defaults where nothing is provided by the user
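A common way to implement this kind of parsing is to start from a dictionary of defaults and overwrite it with whatever the caller supplied. The sketch below uses the keyword names documented for the get data methods, but the default values shown and the helper name `parse_keywords` are assumptions for illustration only.

```python
from typing import Any, Dict, List


def parse_keywords(
    keywords: Dict[str, Any], all_chans: List[str], num_samples: int
) -> Dict[str, Any]:
    """Fill in defaults for any option the caller did not provide."""
    options: Dict[str, Any] = {
        "chans": list(all_chans),      # assumed default: every channel
        "startSample": 0,              # assumed default: first sample
        "endSample": num_samples - 1,  # assumed default: last sample
        "remaverage": False,           # assumed defaults for the clean-up flags
        "remzeros": False,
        "remnans": False,
    }
    # user-supplied values take precedence over the defaults
    options.update(keywords)
    return options


opts = parse_keywords({"chans": ["Ex"], "remnans": True}, ["Ex", "Ey", "Hx"], 1000)
```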

prepare(self) → None[source]

Set some initial values

This method does some checks and prepares some of the storage for the channels.

Notes

The end time of the data will be checked. Some data formats record an end time that falls after the last sample. For example, the ATS end time appears to be one sample period after the last sample. The end time is checked by using the number of samples and the sampling frequency.

The internal convention is that the start and end times should reflect the times of the first and last sample
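Under this convention, the time of the last sample follows from the start time, the number of samples, and the sampling frequency. A minimal sketch, with a hypothetical function name and made-up values:

```python
from datetime import datetime, timedelta


def stop_time_from_samples(start: datetime, num_samples: int, sample_freq: float) -> datetime:
    """Time of the last sample, given that the first sample is at the start time."""
    # there are num_samples - 1 sample periods between the first and last sample
    return start + timedelta(seconds=(num_samples - 1) / sample_freq)


start = datetime(2020, 1, 1, 0, 0, 0)
stop = stop_time_from_samples(start, num_samples=1281, sample_freq=128.0)
```

Here 1281 samples at 128 Hz span exactly 10 seconds. A header that recorded the end time one sample period later would differ from this value by 1/128 s, which is the kind of discrepancy the check is meant to catch.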

printComments(self) → None[source]

Print out dataset comments

printCommentsList(self) → List[str][source]

Dataset comments as a list of strings

Returns
out: List[str]

List of strings with information

printList(self) → List[str][source]

Class information as a list of strings

Returns
out: List[str]

List of strings with information

readHeader(self) → None[source]

Function to read header data and populate reader information

This is implemented in child classes as all header formats are different

sample2time(self, sampleStart: int, sampleEnd: int) → Tuple[datetime.datetime, datetime.datetime][source]

Converts a start and end sample to start and end times

Note: The first sample is zero

Parameters
sampleStart: int

The starting sample for the sample range

sampleEnd: int

The ending sample for the sample range

Returns
timeStart: datetime

Corresponding start time of date range

timeEnd: datetime

Corresponding end time of date range
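Because the first sample is sample zero, a sample index converts to a time by adding index / sampling frequency seconds to the data start time. The sketch below assumes this relationship; the function name `samples_to_times` and the explicit `data_start` and `sample_freq` arguments are hypothetical stand-ins for values a reader would take from its headers.

```python
from datetime import datetime, timedelta
from typing import Tuple


def samples_to_times(
    sample_start: int, sample_end: int, data_start: datetime, sample_freq: float
) -> Tuple[datetime, datetime]:
    """Convert zero-based sample indices to datetimes."""
    return (
        data_start + timedelta(seconds=sample_start / sample_freq),
        data_start + timedelta(seconds=sample_end / sample_freq),
    )


data_start = datetime(2020, 1, 1, 0, 0, 0)
time_start, time_end = samples_to_times(0, 8, data_start, sample_freq=4.0)
```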

setChanHeader(self, chan: str, headerName: str, headerVal: Any) → None[source]

Set a channel header value

Parameters
chan: str

The channel

headerName: str

The name of the header to set

headerVal: Any

Header value

setHeader(self, headerName: str, headerVal: Any) → None[source]

Set a header value

Parameters
headerName: str

The name of the header to set

headerVal: Any

Header value

setParameters(self) → None[source]

Set data reader parameters

This will vary for the different data formats. By default, setup for the internal data format.

time2sample(self, timeStart: Union[str, datetime.datetime], timeEnd: Union[str, datetime.datetime]) → Tuple[int, int][source]

Converts a start and end time to start and end samples

Note: The first sample is zero

Parameters
timeStart: datetime, str

Start time of date range

timeEnd: datetime, str

End time of date range

Returns
sampleStart: int

The corresponding start sample for timeStart

sampleEnd: int

The corresponding end sample for timeEnd
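A time converts to a zero-based sample index by multiplying the elapsed seconds since the data start by the sampling frequency. The sketch below rounds the start up and the end down so that both indices fall inside the requested range; this rounding choice, the function name `times_to_samples`, and the explicit `data_start` and `sample_freq` arguments are assumptions, and the method itself may round differently.

```python
import math
from datetime import datetime
from typing import Tuple


def times_to_samples(
    time_start: datetime, time_end: datetime, data_start: datetime, sample_freq: float
) -> Tuple[int, int]:
    """Convert datetimes to zero-based sample indices."""
    first = (time_start - data_start).total_seconds() * sample_freq
    last = (time_end - data_start).total_seconds() * sample_freq
    # assumed rounding: first sample at or after time_start, last sample at or before time_end
    return math.ceil(first), math.floor(last)


data_start = datetime(2020, 1, 1, 0, 0, 0)
sample_start, sample_end = times_to_samples(
    datetime(2020, 1, 1, 0, 0, 1),
    datetime(2020, 1, 1, 0, 0, 2, 100000),
    data_start,
    sample_freq=4.0,
)
```

At 4 Hz, a start time one second into the data lands exactly on sample 4, while an end time 2.1 seconds in falls between samples 8 and 9 and is rounded down to 8.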