Background

Why is crowsetta needed?

The target audience of crowsetta is anyone who works with birdsong or any other vocalization that is annotated in some way, meaning someone took the time to figure out where elements of the vocalizations start and stop, and has assigned labels to those elements. Maybe you are a neuroscientist trying to figure out how songbirds learn their song, or why mice emit ultrasonic calls. Or maybe you’re an ecologist studying dialects of finches distributed across Asia, a linguist studying accents in the Antilles, or a speech pathologist looking for phonetic changes that indicate early-onset Alzheimer’s disease, and so on.

To run a computational analysis on this kind of data, you’ll need to get the annotation out of a file, which often means you’ll end up writing something like this:

import numpy as np
from scipy.io import loadmat  # function from scipy library for loading Matlab data files

annot = loadmat('bird1_experiment1_annotation_2018-11-17_083521.mat', squeeze_me=True)
onsets = annot['onsets']  # unpack from dictionary
onsets = np.asarray(onsets)  # convert to an array
onsets = onsets / 1000  # convert from milliseconds to seconds

This is verbose and not easy to read. You could do some of it in one line …

onsets = np.asarray(annot['onsets']) / 1000

… but now the next time you read that one-liner, you will have to mentally unpack it.

Such code quickly turns into boilerplate that you will write any time you need to work with this data. It becomes repetitive and presents many opportunities for easy-to-miss bugs. For example, you copy and paste the line above, forget to change off to on, and end up with a variable named offset where you meant to type onset of some syllable or phoneme.

And things can become even more complicated if you have to deal with annotation stored in other formats, such as a database. Here’s an example of one way to start:

import pymysql  # third-party MySQL client library
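As a rough sketch of how much hand-unpacking the database route can involve, the snippet below uses Python's built-in sqlite3 module as a stand-in for a MySQL server, so that it runs without any setup. The table name `segments`, its columns, and the millisecond units are all assumptions made for illustration, not part of any real annotation schema.

```python
import sqlite3

# In-memory database standing in for a real annotation database.
# The table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments (label TEXT, onset_ms REAL, offset_ms REAL)"
)
conn.executemany(
    "INSERT INTO segments VALUES (?, ?, ?)",
    [("i", 1278.0, 1351.0), ("i", 1452.0, 1536.0)],
)

# Pull the rows back out and unpack them by hand, column by column.
rows = conn.execute(
    "SELECT label, onset_ms, offset_ms FROM segments ORDER BY onset_ms"
).fetchall()
conn.close()

labels = [row[0] for row in rows]
onsets = [row[1] / 1000 for row in rows]   # milliseconds to seconds
offsets = [row[2] / 1000 for row in rows]  # milliseconds to seconds
```

Every column index and unit conversion here is one more place where an onset can silently be swapped for an offset.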

What would be nice is to have data types that represent annotation in a concise way, and that we can manipulate like we would some native Python data type like a list or a dictionary. crowsetta provides such data types: Sequences and Segments.
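To make the idea concrete, here is a minimal sketch of what such data types could look like, written with standard-library dataclasses. It is not crowsetta's actual implementation: the class bodies, the `labels` property, and the choice of fields (borrowed from the column names in crowsetta's csv output) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One labeled element of a vocalization, e.g. a syllable (hypothetical sketch)."""
    label: str
    onset_s: float   # onset time in seconds
    offset_s: float  # offset time in seconds


@dataclass
class Sequence:
    """An ordered series of Segments, e.g. one song bout (hypothetical sketch)."""
    segments: List[Segment]

    def __len__(self):
        return len(self.segments)

    def __getitem__(self, index):
        return self.segments[index]

    @property
    def labels(self):
        return [seg.label for seg in self.segments]


# A Sequence can be indexed and iterated much like a list.
seq = Sequence([Segment("i", 1.278, 1.351), Segment("i", 1.452, 1.536)])
first = seq[0]
durations = [seg.offset_s - seg.onset_s for seg in seq]
```

Because each value lives under a named attribute, mixing up an onset with an offset means typing the wrong name rather than the wrong index or dictionary key, which is easier to spot when reading the code.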

How crowsetta works

Internally, crowsetta takes a pile of files in whatever annotation format you give it, and turns them into a bunch of Sequences made up of Segments. For someone working with birdsong, the Sequences will be single audio files / song bouts, and the Segments will be syllables in those song bouts (99.9% of the time). Then, if you need it to, crowsetta can write out your Sequences of Segments to a simple text file in comma-separated values (csv) format. This file format was chosen because it is widely considered a robust way to share data: it is plain text that almost any tool can read.

An example csv looks like this:

label,onset_s,offset_s,onset_Hz,offset_Hz,audio_file,annot_file,sequence,annotation
i,1.278,1.351,40888,43239,gy6or6_baseline_230312_0808.138.cbin,gy6or6_baseline_230312_0808.138.cbin.not.mat,0,0
i,1.452,1.536,46478,49146,gy6or6_baseline_230312_0808.138.cbin,gy6or6_baseline_230312_0808.138.cbin.not.mat,0,0
i,1.605,1.712,51370,54774,gy6or6_baseline_230312_0808.138.cbin,gy6or6_baseline_230312_0808.138.cbin.not.mat,0,0
i,1.823,1.902,58336,60860,gy6or6_baseline_230312_0808.138.cbin,gy6or6_baseline_230312_0808.138.cbin.not.mat,0,0

Now that you have that, you can load it into a pandas dataframe or an Excel spreadsheet or a SQL database, or whatever you want.
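As one possible illustration of reading the csv back in, the sketch below parses two rows of the example csv above with Python's built-in csv module. The text is inlined through io.StringIO only so the snippet is self-contained; with a real file you would open it instead, and with pandas installed, `pandas.read_csv` on the file would be the usual one-line equivalent.

```python
import csv
import io

# Two rows copied from the example csv above, inlined so the sketch is self-contained.
CSV_TEXT = """label,onset_s,offset_s,onset_Hz,offset_Hz,audio_file,annot_file,sequence,annotation
i,1.278,1.351,40888,43239,gy6or6_baseline_230312_0808.138.cbin,gy6or6_baseline_230312_0808.138.cbin.not.mat,0,0
i,1.452,1.536,46478,49146,gy6or6_baseline_230312_0808.138.cbin,gy6or6_baseline_230312_0808.138.cbin.not.mat,0,0
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# csv.DictReader returns every field as a string, so convert the numeric columns.
labels = [row["label"] for row in rows]
onsets_s = [float(row["onset_s"]) for row in rows]
offsets_s = [float(row["offset_s"]) for row in rows]

# Both rows come from the same audio file, so this set has a single entry.
audio_files = {row["audio_file"] for row in rows}
```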