Arcangelo Corelli - Trio Sonatas (A corpus of annotated scores) (v2.0)

@johentsch johentsch released this 17 Jan 21:39
· 25 commits to main since this release

This corpus of annotated MuseScore files has been created within
the DCML corpus initiative and employs
the DCML harmony annotation standard. It was released together with, and as part
of, the "workflow paper":

Hentschel, J., Moss, F. C., Neuwirth, M., & Rohrmeier, M. A. (2021). A semi-automated workflow paradigm for the distributed creation and curation of expert annotations. Proceedings of the 22nd International Society for Music Information Retrieval Conference, ISMIR, 262–269. https://doi.org/10.5281/ZENODO.5624417

The corpus comprises 36 Sonate a tre, divided into 149 separate movements. Together they make up
three of the four famous cycles of 12 trio sonatas each:

Opus  Cycle                Publication  Included
1     12 sonate da chiesa  Rome 1681    Yes
2     12 sonate da camera  Rome 1685    No
3     12 sonate da chiesa  Rome 1689    Yes
4     12 sonate da camera  Rome 1694    Yes

Versions

Version 2.0

  • TSV files now come with the column quarterbeats, which gives each event's position as its distance from the
    beginning of the piece, measured in quarter notes.
  • Extracted notes now come with the columns name and octave.
  • Column volta (containing first and second endings) removed from pieces that don't have any.
  • metadata.tsv has been enriched with further columns, in particular information about each movement's dimensions,
    including its dimensions when repeats are unfolded (for instance, last_mn gives the number of
    measures and last_mn_unfolded the number of measures when all repeats are played).
  • The folder reviewed contains two files per movement:
    • A copy of the score where all out-of-label notes have been colored in red; in addition, changes
      are shown in these files in a diff-like manner (removed in red, added in green).
    • A copy of the harmonies TSV with six added columns that reflect the coloring of out-of-label notes ("coloring
      reports")
  • As long as ms3 review has any complaints, it stores them in the file warnings.log. Currently, the file
    lists those labels where over 60% of the notes in the segment have been colored in red and which
    probably need revisiting (Pull Requests welcome).
  • TSV files are automatically kept up to date using the new GitHub action
    dcml_corpus_workflow, which is the successor of the implementation
    used in the creation of this dataset.
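
As a minimal sketch of how the columns quarterbeats, name and octave might be used, the following Python snippet parses a small made-up excerpt of a notes table and selects the notes that fall within a given span of quarter notes. The sample values, the helper names read_events and events_between, and the assumption that quarterbeats is stored as an integer or as a fraction string such as 3/2 are illustrative only and are not taken from the corpus.

```python
import csv
import io
from fractions import Fraction

# Hypothetical excerpt of a notes table. The real TSV files contain more
# columns, and the values below are invented for illustration.
SAMPLE_TSV = (
    "mn\tquarterbeats\tname\toctave\n"
    "1\t0\tF\t4\n"
    "1\t1\tA\t4\n"
    "1\t3/2\tC\t5\n"
    "2\t4\tF\t5\n"
)


def read_events(tsv_text):
    """Parse TSV text and convert `quarterbeats` to exact fractions.

    Assumption: positions are written as integers or fraction strings
    (e.g. "3/2"), which Fraction() can parse directly.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    events = []
    for row in reader:
        row["quarterbeats"] = Fraction(row["quarterbeats"])
        row["octave"] = int(row["octave"])
        events.append(row)
    return events


def events_between(events, start, end):
    """Return events whose position lies in the half-open span [start, end)."""
    return [e for e in events if start <= e["quarterbeats"] < end]


events = read_events(SAMPLE_TSV)
# Notes sounding within the first four quarter notes of the piece.
first_span = events_between(events, 0, 4)
pitches = [f"{e['name']}{e['octave']}" for e in first_span]
print(pitches)
```

Because the position is a distance from the beginning rather than a measure-relative beat, a selection like this needs no knowledge of time signatures or measure lengths.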

Version 1.1

This release marks the moment where all 149 movements include a reviewed set of annotations that adhere to version
2.3.0 of the DCML harmony annotation standard. The metadata have not been
completed yet and the data were extracted one last time with the now deprecated version 0.4.11 of the
MuseScore parser ms3 for the sake of completeness and homogeneity. The purpose of this release is
mainly to substantiate the claim that the "semi-automated workflow paradigm", as it had been implemented at publication
time (see the ISMIR paper cited above), can indeed be put to effective use in the creation of a large dataset. This
version is, however, to be followed by a version with upgraded tabular data based on the more mature ms3 > 1.0.0.

Version 1.0

The first release reflects the state of the dataset when finalizing chapter 4 of the workflow paper cited above.