crowsetta.Transcriber
- class crowsetta.Transcriber(format: str | crowsetta.interface.SeqLike | crowsetta.interface.BBoxLike)
Bases: object
The crowsetta.Transcriber class provides a way to work with all annotation formats in crowsetta, without needing to know the names of classes that represent formats (e.g., crowsetta.formats.seq.AudSeq or crowsetta.formats.bbox.Raven).
When you make a Transcriber instance, you specify its format as a string name, one of the names returned by crowsetta.formats.as_list().
You can then use this Transcriber instance to load multiple annotation files in that format, by calling the from_file() method repeatedly, e.g., in a for loop or list comprehension. This creates multiple instances of the class that represents the annotation format, one instance for each annotation file. With method chaining you can convert each loaded file at the same time to a crowsetta.Annotation (the data structure used to work with annotations and convert between formats), and save annotations to comma-separated values (csv) files or other file formats. See examples below.
- format
If a string, name of annotation format that the Transcriber will use. Must be one of the shorthand string names returned by crowsetta.formats.as_list(). If a class, must be one of the classes in crowsetta.formats that the shorthand strings refer to. You can register your own class using crowsetta.formats.register_format(). All format classes must be either sequence-like or bounding-box-like, i.e., registered as either crowsetta.interface.seq.SeqLike or crowsetta.interface.bbox.BBoxLike.
- Type: str or class
- from_file : Loads annotations from a file
Examples
An example of loading a sequence-like format with the from_file() method.
>>> import crowsetta
>>> scribe = crowsetta.Transcriber(format='aud-seq')
>>> example = crowsetta.data.get('aud-seq')
>>> audseq = scribe.from_file(example.annot_path)
>>> annot = audseq.to_annot()
>>> annot
Annotation(annot_path=PosixPath('/home/pimienta/.local/share/crowsetta/5.0.0rc1/audseq/405_marron1_June_14_2016_69640887.audacity.txt'), notated_path=None, seq=<Sequence with 61 segments>)
An example of loading a bounding box-like format with the from_file() method. Notice this format has a parameter annot_col that we need to specify for it to load correctly. We can pass this additional parameter into the from_file() method as a keyword argument.
>>> import crowsetta
>>> scribe = crowsetta.Transcriber(format='raven')
>>> example = crowsetta.data.get('raven')
>>> raven = scribe.from_file(example.annot_path, annot_col='Species')
>>> annot = raven.to_annot()
>>> annot
Annotation(annot_path=PosixPath('/home/pimienta/.local/share/crowsetta/5.0.0rc1/raven/Recording_1_Segment_02.Table.1.selections.txt'), notated_path=None, bboxes=[BBox(onset=154.387792767, offset=154.911598217, low_freq=2878.2, high_freq=4049.0, label='EATO'), BBox(onset=167.526598245, offset=168.17302044, low_freq=2731.9, high_freq=3902.7, label='EATO'), BBox(onset=183.609636834, offset=184.097751553, low_freq=2878.2, high_freq=3975.8, label='EATO'), BBox(onset=250.527480604, offset=251.160710509, low_freq=2756.2, high_freq=3951.4, label='EATO'), BBox(onset=277.88724277, offset=278.480895806, low_freq=2707.5, high_freq=3975.8, label='EATO'), BBox(onset=295.52970757, offset=296.110168316, low_freq=2951.4, high_freq=3975.8, label='EATO')])
An example of loading a set of annotations in the NotMat format, converting them to Annotation instances at the same time with method chaining, and then finally saving them as a csv file, using the GenericSeq format.
>>> import pathlib
>>> import crowsetta
>>> notmat_paths = sorted(pathlib.Path('./data/bfsongrepo').glob('*.not.mat'))
>>> scribe = crowsetta.Transcriber('notmat')
>>> # next line, use method chaining to load NotMat and convert to crowsetta.Annotation all at once
>>> annots = [scribe.from_file(notmat_path).to_annot() for notmat_path in notmat_paths]
>>> generic_seq = crowsetta.formats.seq.GenericSeq(annots)
>>> generic_seq.to_csv('./data/bfsongrepo/notmats.csv')
- __init__(format: str | crowsetta.interface.SeqLike | crowsetta.interface.BBoxLike)
Initialize a new crowsetta.Transcriber instance.
- Parameters:
format (str or class) – If a string, name of annotation format that the Transcriber will use. Must be one of the shorthand string names returned by crowsetta.formats.as_list(). If a class, must be one of the classes in crowsetta.formats that the shorthand strings refer to. You can register your own class using crowsetta.formats.register_format(). All format classes must be either sequence-like or bounding-box-like, i.e., registered as either crowsetta.interface.seq.SeqLike or crowsetta.interface.bbox.BBoxLike.
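The string-or-class behavior of the format parameter can be pictured with a small stand-alone sketch. This is not crowsetta's implementation: the names FORMATS, FakeSeqFormat, and MiniTranscriber are hypothetical and exist only for illustration. It assumes a registry that maps shorthand names to format classes, and a from_file() that forwards extra arguments to the format class's own loader.

```python
# Simplified, hypothetical sketch of the pattern; not crowsetta's actual code.
from pathlib import Path


class FakeSeqFormat:
    """Stand-in for a sequence-like format class (hypothetical)."""

    def __init__(self, annot_path, labels):
        self.annot_path = annot_path
        self.labels = labels

    @classmethod
    def from_file(cls, annot_path, sep=","):
        # Format-specific options such as `sep` arrive through the
        # transcriber's **kwargs.
        text = Path(annot_path).read_text()
        return cls(Path(annot_path), text.strip().split(sep))


# Hypothetical registry mapping shorthand names to format classes.
FORMATS = {"fake-seq": FakeSeqFormat}


class MiniTranscriber:
    def __init__(self, format):
        if isinstance(format, str):
            # A string must be one of the registered shorthand names.
            if format not in FORMATS:
                raise ValueError(f"unknown format name: {format!r}")
            self._format_class = FORMATS[format]
        elif isinstance(format, type):
            # A class must be one that a shorthand name refers to.
            if format not in FORMATS.values():
                raise ValueError(f"format class is not registered: {format!r}")
            self._format_class = format
        else:
            raise TypeError("format must be a string name or a class")
        self.format = format

    def from_file(self, annot_path, *args, **kwargs):
        # Forward everything to the format class's own loader.
        return self._format_class.from_file(annot_path, *args, **kwargs)
```

In this sketch, MiniTranscriber("fake-seq") and MiniTranscriber(FakeSeqFormat) resolve to the same class, and a call such as from_file(path, sep=";") passes sep through unchanged.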
Methods
- __init__(format) : Initialize a new crowsetta.Transcriber instance.
- from_file(annot_path, *args, **kwargs) : Load annotations from a file.
- from_file(annot_path, *args, **kwargs) -> crowsetta.interface.SeqLike | crowsetta.interface.BBoxLike
Load annotations from a file.
- Parameters:
annot_path (str, pathlib.Path) – Path to file containing annotations.
- Returns:
annotations – An instance of the class referred to by self.format, with annotations loaded from annot_path.
- Return type:
class-instance
Examples
An example of loading a sequence-like format with the from_file() method.
>>> import crowsetta
>>> scribe = crowsetta.Transcriber(format='aud-seq')
>>> example = crowsetta.data.get('aud-seq')
>>> audseq = scribe.from_file(example.annot_path)
>>> annot = audseq.to_annot()
>>> annot
Annotation(annot_path=PosixPath('/home/pimienta/.local/share/crowsetta/5.0.0rc1/audseq/405_marron1_June_14_2016_69640887.audacity.txt'), notated_path=None, seq=<Sequence with 61 segments>)
An example of loading a bounding box-like format with the from_file() method. Notice this format has a parameter annot_col that we need to specify for it to load correctly. We can pass this additional parameter into the from_file() method as a keyword argument.
>>> import crowsetta
>>> scribe = crowsetta.Transcriber(format='raven')
>>> example = crowsetta.data.get('raven')
>>> raven = scribe.from_file(example.annot_path, annot_col='Species')
>>> annot = raven.to_annot()
>>> annot
Annotation(annot_path=PosixPath('/home/pimienta/.local/share/crowsetta/5.0.0rc1/raven/Recording_1_Segment_02.Table.1.selections.txt'), notated_path=None, bboxes=[BBox(onset=154.387792767, offset=154.911598217, low_freq=2878.2, high_freq=4049.0, label='EATO'), BBox(onset=167.526598245, offset=168.17302044, low_freq=2731.9, high_freq=3902.7, label='EATO'), BBox(onset=183.609636834, offset=184.097751553, low_freq=2878.2, high_freq=3975.8, label='EATO'), BBox(onset=250.527480604, offset=251.160710509, low_freq=2756.2, high_freq=3951.4, label='EATO'), BBox(onset=277.88724277, offset=278.480895806, low_freq=2707.5, high_freq=3975.8, label='EATO'), BBox(onset=295.52970757, offset=296.110168316, low_freq=2951.4, high_freq=3975.8, label='EATO')])
An example of loading a set of annotations in the NotMat format, converting them to Annotation instances at the same time with method chaining, and then finally saving them as a csv file, using the GenericSeq format.
>>> import pathlib
>>> import crowsetta
>>> notmat_paths = sorted(pathlib.Path('./data/bfsongrepo').glob('*.not.mat'))
>>> scribe = crowsetta.Transcriber('notmat')
>>> # next line, use method chaining to load NotMat and convert to crowsetta.Annotation all at once
>>> annots = [scribe.from_file(notmat_path).to_annot() for notmat_path in notmat_paths]
>>> generic_seq = crowsetta.formats.seq.GenericSeq(annots)
>>> generic_seq.to_csv('./data/bfsongrepo/notmats.csv')
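When loading many files in a loop, one unreadable file would otherwise abort the whole batch. The helper below is a minimal sketch of one way to handle that; load_annotations is a hypothetical name, not part of crowsetta. It only assumes that the object passed as scribe has a from_file() method whose result has a to_annot() method, as the examples above show for Transcriber.

```python
from pathlib import Path


def load_annotations(scribe, annot_paths, **from_file_kwargs):
    """Load every path with ``scribe`` and convert each result with to_annot().

    Hypothetical helper. Returns (annots, failures): ``annots`` maps each path
    to its converted annotation, ``failures`` maps each path that raised to
    the exception it raised.
    """
    annots, failures = {}, {}
    for annot_path in map(Path, annot_paths):
        try:
            loaded = scribe.from_file(annot_path, **from_file_kwargs)
            annots[annot_path] = loaded.to_annot()
        except Exception as err:
            # Deliberately broad for a batch job: record the error, move on.
            failures[annot_path] = err
    return annots, failures
```

With crowsetta this might be called as load_annotations(crowsetta.Transcriber('raven'), paths, annot_col='Species'), where paths is your own list of annotation files; the keyword argument is passed to from_file() unchanged.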