crowsetta is a tool to work with any format for annotating vocalizations, like birdsong or human speech. The goal of crowsetta is to make sure that your ability to work with a dataset of vocalizations does not depend on your ability to work with any given format for annotating that dataset.


Data types that help you write clean code

What crowsetta gives you is not yet another format for annotation (I promise!). Instead you get some nice data types that make it easier to work with any format: namely, Sequences made up of Segments. The code block below shows some of the features of these data types.

>>> from crowsetta import Segment, Sequence
>>> a_segment = Segment.from_keyword(
...     label='a',
...     onset_Hz=16000,
...     offset_Hz=32000,
...     file='bird21.wav'
...     )
>>> another_segment = Segment.from_keyword(
...     label='b',
...     onset_Hz=36000,
...     offset_Hz=48000,
...     file='bird21.wav'
...     )
>>> list_of_segments = [a_segment, another_segment]
>>> seq = Sequence.from_segments(segments=list_of_segments)
>>> print(seq)
<Sequence with 2 segments>
>>> for segment in seq.segments: print(segment)
Segment(label='a', file='bird21.wav', onset_s=None, offset_s=None, onset_Hz=16000, offset_Hz=32000)
Segment(label='b', file='bird21.wav', onset_s=None, offset_s=None, onset_Hz=36000, offset_Hz=48000)
>>> seq.file
>>> seq.onsets_Hz
array([16000, 36000])

You load annotations from your format of choice into Sequences of Segments (most conveniently with the Transcriber, as explained below) and then use the Sequences however you need to in your program.

For example, if you want to loop through the Segments of each Sequence to pull syllables out of a spectrogram, you can do something like this:

>>> list_of_sequences = my_sequence_loading_function(file='annotation.txt')
>>> syllables_from_sequences = []
>>> for a_sequence in list_of_sequences:
...     # get name of the audio file associated with the Sequence
...     audio_file = a_sequence.file
...     # then create a spectrogram from that audio file
...     spect = some_spectrogram_making_function(audio_file)
...     syllables = []
...     for segment in a_sequence.segments:
...         # spectrogram is a 2d numpy array, so we index into it using onset and offset from segment
...         syllable = spect[:, segment.onset_s:segment.offset_s]
...         syllables.append(syllable)
...     syllables_from_sequences.append(syllables)

This code is succinct compared to the data-munging code you usually write when dealing with audio files and annotation formats. It reads like idiomatic Python. For a deeper dive into why this is useful, see Background.
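The loop above slices the spectrogram directly with onset_s and offset_s, which only works if those values are already integer column indices. When onsets and offsets are in seconds, they first have to be mapped to spectrogram columns. Here is a minimal sketch of that step, assuming a hypothetical sorted list time_bins that holds the time in seconds of each spectrogram column. The helper name seconds_to_columns is made up for illustration and is not part of crowsetta.

```python
from bisect import bisect_left


def seconds_to_columns(time_bins, onset_s, offset_s):
    """Return (start, stop) column indices covering onset_s to offset_s.

    time_bins is a sorted sequence giving the time, in seconds, of each
    spectrogram column (an assumption; not something crowsetta provides).
    """
    # first column whose time is at or after the onset
    start = bisect_left(time_bins, onset_s)
    # first column whose time is at or after the offset (exclusive end)
    stop = bisect_left(time_bins, offset_s)
    return start, stop


# ten columns spaced 0.1 s apart: 0.0, 0.1, ..., 0.9
time_bins = [i / 10 for i in range(10)]
start, stop = seconds_to_columns(time_bins, onset_s=0.25, offset_s=0.65)
# with a 2d numpy array this would become: syllable = spect[:, start:stop]
```

If time_bins is a numpy array, numpy.searchsorted does the same lookup for many onsets at once.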

A Transcriber that makes it convenient to work with any annotation format

As mentioned, crowsetta provides you with a Transcriber that comes equipped with convenience functions to do the work of loading and saving annotations for you.

>>> annotation_files = [
...     '~/Data/bird1_day1/song1_2018-12-07_072135.not.mat',
...     '~/Data/bird1_day1/song2_2018-12-07_072316.not.mat',
...     '~/Data/bird1_day1/song3_2018-12-07_072749.not.mat'
... ]
>>> from crowsetta import Transcriber
>>> scribe = Transcriber()
>>> seq = scribe.to_seq(file=annotation_files, format='notmat')
>>> len(seq)
>>> print(seq[0])
<Sequence with 55 segments>

Easily use the Transcriber with your own annotation format

You can even easily tell the Transcriber to use your own in-house format, like so:

>>> my_config = {
...     'myformat_name': {
...         'module': '/home/MyUserName/Documents/Python/',
...         'to_seq': 'myformat2seq',
...         'to_csv': 'myformat2csv'
...     }
... }
>>> scribe = Transcriber(user_config=my_config)
>>> seq = scribe.to_seq(file='my_annotation.mat', file_format='myformat_name')

For more about how that works, please see How to use crowsetta with your own annotation format.
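To give a rough idea of the work a function such as myformat2seq has to do, here is a sketch for a made-up format in which each line of the annotation file holds an onset, an offset, and a label separated by whitespace. The format, the function name parse_myformat, and the choice to return plain dictionaries are all assumptions for illustration. A real conversion function would build Segments and a Sequence from these values, and the how-to linked above describes the interface crowsetta actually expects.

```python
def parse_myformat(lines, audio_file):
    """Parse 'onset_Hz offset_Hz label' lines into keyword dicts.

    Each dict uses the same keys passed to Segment.from_keyword earlier
    on this page: label, onset_Hz, offset_Hz, and file.
    """
    segments = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        onset, offset, label = line.split()
        segments.append({
            'label': label,
            'onset_Hz': int(onset),
            'offset_Hz': int(offset),
            'file': audio_file,
        })
    return segments


# made-up annotation lines, reusing the values from the Segment example
example_lines = ["16000 32000 a", "", "36000 48000 b"]
segment_kwargs = parse_myformat(example_lines, audio_file='bird21.wav')
# with crowsetta installed, each dict could then be unpacked into a Segment:
# segments = [Segment.from_keyword(**kw) for kw in segment_kwargs]
# seq = Sequence.from_segments(segments=segments)
```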

Save and load annotations in plain text files

If you need it to, crowsetta can save your Sequences of Segments as a plain text file in the comma-separated values (csv) format. This file format was chosen because it is widely considered to be a very robust way to share data.

>>> from crowsetta import Transcriber
>>> scribe = Transcriber(user_config=your_config)

An example csv looks like this:


Now that you have that, you can load it into a pandas dataframe or an Excel spreadsheet or an SQL database, or whatever you want.
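As a rough sketch of that last step, the snippet below reads annotation rows with Python's built-in csv module. The column names are an assumption that mirrors the Segment fields shown earlier (label, onset_Hz, offset_Hz, file), and the two rows are made-up stand-ins taken from the Segment example, not real crowsetta output. Check the header of your own csv file before relying on these names.

```python
import csv
import io

# made-up csv text standing in for a file saved to disk;
# the column names are assumed, not taken from crowsetta's output
csv_text = (
    "label,onset_Hz,offset_Hz,file\n"
    "a,16000,32000,bird21.wav\n"
    "b,36000,48000,bird21.wav\n"
)

# DictReader maps each row to a dict keyed by the header line
rows = list(csv.DictReader(io.StringIO(csv_text)))

# csv values are read as strings, so convert the numeric columns
durations = [int(row['offset_Hz']) - int(row['onset_Hz']) for row in rows]
labels = [row['label'] for row in rows]
```

With pandas installed, pandas.read_csv on the same file returns a dataframe with the numeric columns already converted.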

You might find this useful in any situation where you want to share audio files of song and some associated annotations, but you don’t want to require the user to install a large application in order to work with the annotation files.

Getting Started

Install crowsetta by running:

$ pip install crowsetta

If you are new to the library, start with Tutorial.

To see an example of using crowsetta to work with your own annotation format, see How to use crowsetta with your own annotation format.

Project Information

crowsetta was developed for use with the songdeck and hybrid-vocal-classifier libraries.


If you are having issues, please let us know.


The project is licensed under the BSD license.


You can see project history and work in progress in the CHANGELOG.


If you use crowsetta, please cite the DOI: