Adding csv with Tempo, Ornamentation, Rhythm, creating txt-based "dialogue" #2

morganrivers · 2024-09-09T18:53:26Z

Hi,

I wanted to use the data from the paper, but I saw that much of the useful categorization had not been applied to the full dataset. I didn't change any of your code; I only added some code and datasets that build on it.

First, I created an augmented CSV that contains the Rhythm, Ornamentation, and Tempo directly, so future researchers can access those values for each coda.

Also, I like the book of whale PDFs, but I thought it would be nice to condense the existing plots you generate into a text-based format.
Of course it's more useful for training LLMs, but it's also good for human readability in some ways.
So I made a text-based dialogue that looks like this (I also included the part of the book that this text corresponds to):

File: sw061b
Whale 1:  r3  r5  c3 \c3 -c3 -c3 \c3.
In chorus, whales 1, 2: -c3  a3.
In chorus, whales 1, 2: -c3  C4.
In chorus, whales 1, 2: /c3  a4.
Whale 2: /a4.
In chorus, whales 1, 2:  c4  a5.
In chorus, whales 1, 2: \c4  a4.
In chorus, whales 1, 2: -c4 \a4.
In chorus, whales 1, 2: \c4  E5.
Whale 2:  a4 /a4.
Whale 1: /c4.
Whale 2: -a4.
Whale 1: -c4.
Whale 2: /a4.
Whale 1: -c4.
In chorus, whales 1, 2: \c4 \a4.
In chorus, whales 1, 2: \c4 \a4.
Whale 2: /a4 \a4 -a4.
Whale 1:  r5.
In chorus, whales 1, 2: \R5 -a4.

(No vocalizations, 25 seconds)

Whale 2:  a3.
In chorus, whales 1, 2:  r5  a4.
In chorus, whales 1, 2:  r4  a3.
In chorus, whales 1, 2:  R5 -a3.
In chorus, whales 1, 2:  r4 -a3.
Whale 1: /r4  r5.
In chorus, whales 1, 2:  c3  r5.
Whale 1: -c3  Q4.

The /, -, or \ prefix indicates rubato; the letter distinguishes the 18 possible rhythms (a->0, ..., r->17); capitalization indicates an ornament; and the number indicates tempo, 1 through 5.
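As a rough illustration of this encoding, here is a minimal sketch of a token decoder. It is not the code from this PR: the names `Coda`, `TOKEN_RE`, and `decode_token` are made up for the example, and it assumes that a token with no /, -, or \ prefix simply carries no rubato marker. It keeps the raw marker symbol rather than guessing which symbol means which rubato direction, and it maps letters a through r to indices 0 through 17.

```python
import re
from typing import NamedTuple

# Hypothetical pattern: optional rubato marker, rhythm letter (a-r, either case), tempo digit 1-5.
TOKEN_RE = re.compile(r"([/\\-]?)([A-Ra-r])([1-5])")


class Coda(NamedTuple):
    rubato_marker: str  # "/", "-", "\\", or "" when the token has no marker (assumption)
    rhythm: int         # 0 for "a" through 17 for "r"
    ornamented: bool    # True when the rhythm letter is capitalized
    tempo: int          # 1 through 5


def decode_token(token: str) -> Coda:
    """Decode one dialogue token such as '\\c3' or ' R5'."""
    match = TOKEN_RE.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a valid coda token: {token!r}")
    marker, letter, digit = match.groups()
    return Coda(
        rubato_marker=marker,
        rhythm=ord(letter.lower()) - ord("a"),
        ornamented=letter.isupper(),
        tempo=int(digit),
    )
```

For example, `decode_token(" R5")` would give rhythm index 17, ornamented, tempo 5, with an empty rubato marker.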

I converted the whole dataset into this format.
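For anyone who wants to read the text file back into a structured form, a line of the dialogue could be split into speakers and tokens along these lines. This is only a sketch based on the sample shown above: `parse_line`, `LINE_RE`, and `TOKEN_RE` are hypothetical names, and the sketch makes no assumption about which token in a chorus line belongs to which whale.

```python
import re

# Hypothetical patterns derived from the sample dialogue above.
LINE_RE = re.compile(r"^(?:Whale (\d+)|In chorus, whales ([\d, ]+)):(.*)\.$")
TOKEN_RE = re.compile(r"[/\\-]?[A-Ra-r][1-5]")


def parse_line(line: str):
    """Return (whale_ids, tokens) for a vocalization line, or None for any other line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        # Covers "File: ..." headers, pause notes, and blank lines.
        return None
    single, chorus, body = match.groups()
    if single is not None:
        whales = [int(single)]
    else:
        whales = [int(part) for part in chorus.split(",")]
    return whales, TOKEN_RE.findall(body)
```

Lines that are not vocalizations, such as the file header or the "(No vocalizations, 25 seconds)" note, return `None` so the caller can handle them separately.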

You can look at the two Python files and the CSV and TXT files I added for more specifics.

morganrivers · 2024-10-02T14:01:38Z

Having looked at this in more detail, the order of the pickle files does not seem to match the chronological order of the timestamped whale data exactly, so while the "script" generally matches the book of whale PDFs, it does have some subtle issues. A colleague and I have been re-interpreting the raw ICIs into rhythm and ornamentation categories, and have trained a very small transformer to predict ICIs. That work lives in a separate repository, which should soon produce its own, more accurate script:
whale-gpt

Incidentally, if you could provide any more data with timestamps, that would be really amazing! LLMs are of course very data hungry. We would love to have more click data (critically, tagged with the originating whale and the timestamp of each click or coda).

morganrivers added 8 commits September 7, 2024 16:48

made a dataset with all 540 combinations encoded in text format, and classified

e384b96

now we have a nice dialogue (data/whale_dialogues.txt) which looks more like the script to a play, than a soulless csv file

217d3d8

fixed some issues with dialogues to make them more like the books. Also, rubato seems very wrong, at first glance. But, chorusing works with arbitrary whales!

7981c6c

added a pause

d62865b

more bug fixing, refactoring. Added timings printout between long pauses.

44deb2a

removed rubato from dataset/ data augmentation as was not working

1595582

rubato works

09dd39e

more cleanup removing rubato from parts of code

6e33ffd

small bug with tempos and ornaments corrected

2ef6013

NickleDave mentioned this pull request Dec 11, 2024

Availability of acoustic recordings #1

Open
