crowsetta.formats.seq.simple.SimpleSeq#

class crowsetta.formats.seq.simple.SimpleSeq(onsets_s: ndarray, offsets_s: ndarray, labels: ndarray, annot_path: Path, notated_path=None)[source]#

Bases: object

Class meant to represent any simple sequence-like annotation format.

The annotations can be a csv or txt file; the format should have 3 columns that represent the onset and offset times in seconds and the labels of the segments in the annotated sequences.

The default is to assume a comma-separated values file with a header ‘onset_s, offset_s, label’, but this can be modified with keyword arguments.

This format also assumes that each annotation file corresponds to one annotated source file, i.e. a single audio or spectrogram file.
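As an illustration of the file layout described above, here is a minimal sketch that writes and reads a three-column csv file with the default header, using only the Python standard library. It does not use crowsetta's own loader. The helper names `write_simple_seq` and `read_simple_seq` and the segment values are hypothetical and exist only for this example.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical segments: (onset in seconds, offset in seconds, label).
segments = [
    (0.0, 0.125, "a"),
    (0.25, 0.41, "b"),
    (0.5, 0.73, "a"),
]


def write_simple_seq(path, segments):
    """Write segments to a csv file with the default header (illustrative helper)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["onset_s", "offset_s", "label"])
        writer.writerows(segments)


def read_simple_seq(path):
    """Read the three columns back as parallel lists (illustrative helper)."""
    onsets_s, offsets_s, labels = [], [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            onsets_s.append(float(row["onset_s"]))
            offsets_s.append(float(row["offset_s"]))
            labels.append(row["label"])
    return onsets_s, offsets_s, labels


with tempfile.TemporaryDirectory() as tmp_dir:
    annot_path = Path(tmp_dir) / "example.csv"
    write_simple_seq(annot_path, segments)
    header = annot_path.read_text().splitlines()[0]
    onsets_s, offsets_s, labels = read_simple_seq(annot_path)
```

Each row describes one segment, and the three lists correspond to the `onsets_s`, `offsets_s`, and `labels` attributes documented for this class (which stores them as numpy arrays rather than lists).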

name#

Shorthand name for annotation format: 'simple-seq'.

Type: str

ext#

Extensions of files in annotation format: ('.csv', '.txt').

Type: tuple of str

onsets_s#

Vector of floats corresponding to the beginnings of segments, i.e. onsets, in seconds.

Type: numpy.ndarray

offsets_s#

Vector of floats corresponding to the ends of segments, i.e. offsets, in seconds.

Type: numpy.ndarray

labels#

Vector of string labels for segments.

Type: numpy.ndarray

annot_path#

Path to file from which annotations were loaded.

Type: str, pathlib.Path

notated_path#

Path to the file that annot_path annotates, e.g. an audio file, or an array file that contains a spectrogram generated from audio. Optional, default is None.

Type: str, pathlib.Path

__init__(onsets_s: ndarray, offsets_s: ndarray, labels: ndarray, annot_path: Path, notated_path=None) → None#

Method generated by attrs for class SimpleSeq.

Methods

`__init__`(onsets_s, offsets_s, labels, annot_path)	Method generated by attrs for class SimpleSeq.
`from_file`(annot_path[, notated_path, ...])	Load annotations from a file in the 'simple-seq' format.
`to_annot`([round_times, decimals])	Convert this annotation to a `crowsetta.Annotation`.
`to_file`(annot_path[, to_csv_kwargs])	Save this 'simple-seq' annotation to a csv file.
`to_seq`([round_times, decimals])	Convert this annotation to a `crowsetta.Sequence`.

Attributes

`onsets_s`
`offsets_s`
`labels`
`annot_path`
`notated_path`
`ext`
`name`

classmethod from_file(annot_path, notated_path=None, columns_map=None, read_csv_kwargs=None)#

Load annotations from a file in the ‘simple-seq’ format.

The annotations can be a csv or txt file; the format should have 3 columns that represent the onset and offset times in seconds and the labels of the segments in the annotated sequences.

The default is to assume a comma-separated values file with a header ‘onset_s, offset_s, label’, but this can be modified with keyword arguments.

This format also assumes that each annotation file corresponds to one annotated source file, i.e. a single audio or spectrogram file.

Parameters:

annot_path (str, pathlib.Path) – Path to an annotation file, with one of the extensions {‘.csv’, ‘.txt’}.
notated_path (str, pathlib.Path) – Path to file that annot_path annotates. E.g., an audio file, or an array file that contains a spectrogram generated from audio. Optional, default is None.
columns_map (dict-like) – Maps column names in the header of annot_path to the standardized names used by this format. E.g., {'begin_time': 'onset_s', 'end_time': 'offset_s', 'text': 'label'}. Optional, default is None, in which case the columns are assumed to already have the standardized names.
read_csv_kwargs (dict) – Keyword arguments passed to pandas.read_csv(). Default is None, in which case all defaults for pandas.read_csv() will be used.
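To make the role of columns_map concrete, the sketch below renames the columns of a small in-memory table to the standardized names. It uses only the standard library `csv` module as a stand-in; crowsetta reads the file with pandas, so this is not its implementation. The annotation text in `raw_text` is hypothetical, and the mapping is the one given as an example in the parameter description.

```python
import csv
import io

# Hypothetical annotation text whose header does not use the standardized names.
raw_text = """begin_time,end_time,text
0.0,0.125,a
0.25,0.41,b
"""

# Keys are the names found in the file; values are the standardized names.
columns_map = {"begin_time": "onset_s", "end_time": "offset_s", "text": "label"}

rows = []
for row in csv.DictReader(io.StringIO(raw_text)):
    # Rename each column via columns_map; names not in the map are kept as-is.
    rows.append({columns_map.get(name, name): value for name, value in row.items()})

onsets_s = [float(row["onset_s"]) for row in rows]
offsets_s = [float(row["offset_s"]) for row in rows]
labels = [row["label"] for row in rows]
```

After renaming, the three standardized columns can be read regardless of what the original header called them.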

Examples

>>> import crowsetta
>>> example = crowsetta.data.get('simple-seq')
>>> simple = crowsetta.formats.seq.SimpleSeq.from_file(example.annot_path,
...                                                    columns_map={'start_seconds': 'onset_s',
...                                                                 'stop_seconds': 'offset_s',
...                                                                 'name': 'label'},
...                                                    read_csv_kwargs={'index_col': 0})

to_annot(round_times: bool = True, decimals: int = 3) → Annotation[source]#

Convert this annotation to a crowsetta.Annotation.

Parameters:

round_times (bool) – If True, round onsets_s and offsets_s. Default is True.
decimals (int) – Number of decimal places to round floating point numbers to. Only meaningful if round_times is True. Default is 3, so that times are rounded to milliseconds.

Returns:

annot

Return type:

crowsetta.Annotation

Examples

>>> import crowsetta
>>> example = crowsetta.data.get('simple-seq')
>>> simple = crowsetta.formats.seq.SimpleSeq.from_file(example.annot_path,
...                                                    columns_map={'start_seconds': 'onset_s',
...                                                                 'stop_seconds': 'offset_s',
...                                                                 'name': 'label'},
...                                                    read_csv_kwargs={'index_col': 0})
>>> annot = simple.to_annot()

Notes

The round_times and decimals arguments are provided to reduce differences across platforms due to floating point error. For example, loading annotation files and then writing them to a csv file should give the same result on Windows and Linux.
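A small sketch of the kind of floating point error that rounding guards against. Python's built-in `round` is used here as a stand-in and is not necessarily the function crowsetta calls internally; the onset value is made up for illustration.

```python
decimals = 3  # the documented default: round to milliseconds

# A hypothetical onset computed by adding two times in floating point.
onset_s = 0.1 + 0.2

# The sum is not exactly 0.3 because of binary floating point representation.
is_exact = onset_s == 0.3

# Rounding to a fixed number of decimal places removes the tiny discrepancy.
rounded_onset_s = round(onset_s, decimals)
```

Here `is_exact` is False, while `rounded_onset_s` compares equal to 0.3, so values written to a file after rounding no longer depend on such small discrepancies.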

to_file(annot_path: str | bytes | PathLike | Path, to_csv_kwargs: Mapping | None = None) → None[source]#

Save this ‘simple-seq’ annotation to a csv file.

Parameters:

annot_path (str, pathlib.Path) – Path, including filename, of the csv file that should be saved.
to_csv_kwargs (dict-like) – Keyword arguments passed to pandas.DataFrame.to_csv(). Default is None, in which case the defaults for pandas.DataFrame.to_csv() are used, except that index is set to False.

to_seq(round_times: bool = True, decimals: int = 3) → Sequence[source]#

Convert this annotation to a crowsetta.Sequence.

Parameters:

round_times (bool) – If True, round onsets_s and offsets_s. Default is True.
decimals (int) – Number of decimal places to round floating point numbers to. Only meaningful if round_times is True. Default is 3, so that times are rounded to milliseconds.

Returns:

seq

Return type:

crowsetta.Sequence

Examples

>>> import crowsetta
>>> example = crowsetta.data.get('simple-seq')
>>> simple = crowsetta.formats.seq.SimpleSeq.from_file(example.annot_path,
...                                                    columns_map={'start_seconds': 'onset_s',
...                                                                 'stop_seconds': 'offset_s',
...                                                                 'name': 'label'},
...                                                    read_csv_kwargs={'index_col': 0})
>>> seq = simple.to_seq()

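Notes

The round_times and decimals arguments are provided to reduce differences across platforms due to floating point error. For example, loading annotation files and then writing them to a csv file should give the same result on Windows and Linux.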
