crowsetta.Transcriber
- class crowsetta.Transcriber(format: str | crowsetta.interface.SeqLike | crowsetta.interface.BBoxLike)
Bases: object

The crowsetta.Transcriber class provides a way to work with all annotation formats in crowsetta, without needing to know the names of the classes that represent formats (e.g., crowsetta.formats.seq.AudSeq or crowsetta.formats.bbox.Raven).

When you make a Transcriber instance, you specify its format, typically as a string name, one of the names returned by crowsetta.formats.as_list(). A format class is also accepted, as described under the format attribute.

You can then use this Transcriber instance to load multiple annotation files in that format by calling the from_file() method repeatedly, e.g., in a for loop or list comprehension. Each call creates one instance of the class that represents the annotation format, so you get one instance per annotation file. With method chaining, you can convert each loaded file at the same time to a crowsetta.Annotation (the data structure used to work with annotations and convert between formats), and then save the annotations to comma-separated values (csv) files or other file formats. See the examples below.

- format
If a string, the name of the annotation format that the Transcriber will use. Must be one of the shorthand string names returned by crowsetta.formats.as_list(). If a class, must be one of the classes in crowsetta.formats that the shorthand strings refer to. You can register your own class using crowsetta.formats.register_format(). All format classes must be either sequence-like or bounding-box-like, i.e., registered as either crowsetta.interface.seq.SeqLike or crowsetta.interface.bbox.BBoxLike.
- Type: str or class

- from_file : Loads annotations from a file
Examples

An example of loading a sequence-like format with the from_file() method.

>>> import crowsetta
>>> scribe = crowsetta.Transcriber(format='aud-seq')
>>> example = crowsetta.example('aud-seq')
>>> audseq = scribe.from_file(example.annot_path)
>>> annot = audseq.to_annot()
>>> annot
Annotation(annot_path=PosixPath('/home/pimienta/.local/share/crowsetta/5.0.0rc1/audseq/405_marron1_June_14_2016_69640887.audacity.txt'), notated_path=None, seq=<Sequence with 61 segments>)

An example of loading a bounding box-like format with the from_file() method. Notice that this format has a parameter, annot_col, that we need to specify for it to load correctly. We can pass this additional parameter into the from_file() method as a keyword argument.

>>> import crowsetta
>>> scribe = crowsetta.Transcriber(format='raven')
>>> example = crowsetta.example('raven')
>>> raven = scribe.from_file(example.annot_path, annot_col='Species')
>>> annot = raven.to_annot()
>>> annot
Annotation(annot_path=PosixPath('/home/pimienta/.local/share/crowsetta/5.0.0rc1/raven/Recording_1_Segment_02.Table.1.selections.txt'), notated_path=None, bboxes=[BBox(onset=154.387792767, offset=154.911598217, low_freq=2878.2, high_freq=4049.0, label='EATO'), BBox(onset=167.526598245, offset=168.17302044, low_freq=2731.9, high_freq=3902.7, label='EATO'), BBox(onset=183.609636834, offset=184.097751553, low_freq=2878.2, high_freq=3975.8, label='EATO'), BBox(onset=250.527480604, offset=251.160710509, low_freq=2756.2, high_freq=3951.4, label='EATO'), BBox(onset=277.88724277, offset=278.480895806, low_freq=2707.5, high_freq=3975.8, label='EATO'), BBox(onset=295.52970757, offset=296.110168316, low_freq=2951.4, high_freq=3975.8, label='EATO')])

An example of loading a set of annotations in the NotMat format, converting them to Annotation instances at the same time with method chaining, and then finally saving them as a csv file, using the GenericSeq format.

>>> import pathlib
>>> import crowsetta
>>> notmat_paths = sorted(pathlib.Path('./data/bfsongrepo').glob('*.not.mat'))
>>> scribe = crowsetta.Transcriber('notmat')
>>> # next line, use method chaining to load NotMat and convert to crowsetta.Annotation all at once
>>> annots = [scribe.from_file(notmat_path).to_annot() for notmat_path in notmat_paths]
>>> generic_seq = crowsetta.formats.seq.GenericSeq(annots)
>>> generic_seq.to_csv('./data/bfsongrepo/notmats.csv')
- __init__(format: str | crowsetta.interface.SeqLike | crowsetta.interface.BBoxLike)
Initialize a new crowsetta.Transcriber instance.

- Parameters:
format (str or class) – If a string, the name of the annotation format that the Transcriber will use. Must be one of the shorthand string names returned by crowsetta.formats.as_list(). If a class, must be one of the classes in crowsetta.formats that the shorthand strings refer to. You can register your own class using crowsetta.formats.register_format(). All format classes must be either sequence-like or bounding-box-like, i.e., registered as either crowsetta.interface.seq.SeqLike or crowsetta.interface.bbox.BBoxLike.
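
The rule that format may be either a shorthand name or a registered class can be illustrated with a small, self-contained sketch. This is not crowsetta's implementation: ToySeq, ToyBBox, REGISTRY, and resolve_format are hypothetical names, invented here only to show one way a name-or-class argument might be resolved to a class.

```python
# Hypothetical sketch of name-or-class resolution; NOT crowsetta's code.


class ToySeq:
    """Stand-in for a sequence-like format class (hypothetical)."""


class ToyBBox:
    """Stand-in for a bounding-box-like format class (hypothetical)."""


# Hypothetical registry of shorthand names. In crowsetta, the valid
# names are the ones returned by crowsetta.formats.as_list().
REGISTRY = {"toy-seq": ToySeq, "toy-bbox": ToyBBox}


def resolve_format(format):
    """Return the format class for a shorthand name or a registered class."""
    if isinstance(format, str):
        try:
            return REGISTRY[format]
        except KeyError:
            valid = ", ".join(sorted(REGISTRY))
            raise ValueError(
                f"unknown format {format!r}; valid names: {valid}"
            ) from None
    # A class is accepted only if some shorthand name refers to it
    if isinstance(format, type) and format in REGISTRY.values():
        return format
    raise TypeError(
        f"format must be a registered name or class, got {format!r}"
    )


print(resolve_format("toy-seq") is ToySeq)
print(resolve_format(ToyBBox) is ToyBBox)
```

In this sketch an unknown string raises ValueError and an unregistered object raises TypeError; which exceptions crowsetta itself raises is not stated in this page.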
Methods
- __init__(format) : Initialize a new crowsetta.Transcriber instance.
- from_file(annot_path, *args, **kwargs) : Load annotations from a file.
- from_file(annot_path, *args, **kwargs) -> crowsetta.interface.SeqLike | crowsetta.interface.BBoxLike
Load annotations from a file.

- Parameters:
annot_path (str, pathlib.Path) – Path to file containing annotations.
*args, **kwargs – Additional arguments that a format may need to load correctly, e.g., the annot_col keyword argument for the 'raven' format.

- Returns:
annotations – An instance of the class referred to by self.format, with annotations loaded from annot_path.

- Return type:
class-instance
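
To make the role of *args and **kwargs concrete, here is a minimal sketch of a loader that forwards extra arguments to a format class, used in a list comprehension over several files. Everything in it is an assumption made for illustration: ToyTable, ToyTranscriber, and load_demo are hypothetical names, the tab-separated files are toy data written to a temporary directory, and none of this is taken from crowsetta's source.

```python
import csv
import pathlib
import tempfile


class ToyTable:
    """Hypothetical format class: one label per row of a tab-separated file."""

    def __init__(self, labels, annot_path):
        self.labels = labels
        self.annot_path = annot_path

    @classmethod
    def from_file(cls, annot_path, annot_col="label"):
        # annot_col names the column that holds the labels
        annot_path = pathlib.Path(annot_path)
        with annot_path.open(newline="") as fp:
            rows = list(csv.DictReader(fp, delimiter="\t"))
        return cls([row[annot_col] for row in rows], annot_path)


class ToyTranscriber:
    """Hypothetical stand-in showing delegation; not crowsetta.Transcriber."""

    def __init__(self, format_class):
        self.format_class = format_class

    def from_file(self, annot_path, *args, **kwargs):
        # Extra positional and keyword arguments pass straight through
        # to the format class, which knows what they mean.
        return self.format_class.from_file(annot_path, *args, **kwargs)


def load_demo():
    """Write two toy annotation files, then load them with one loader."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i, label in enumerate(["EATO", "OTHER"]):
            path = pathlib.Path(tmp) / f"rec_{i}.txt"
            path.write_text(f"onset\tSpecies\n0.5\t{label}\n1.5\t{label}\n")
            paths.append(path)
        scribe = ToyTranscriber(ToyTable)
        # One format instance per file, as in a list comprehension
        return [scribe.from_file(p, annot_col="Species").labels for p in paths]


print(load_demo())
```

The design point of the sketch is that the loader itself needs no knowledge of format-specific parameters; it only forwards them, so a single from_file() signature can serve formats with different loading options.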