Wojciech Matejuk - Modelling emotional nuance of musical performance | PyData London 2024

Modeling emotional nuances of piano performances using MIDI data, machine learning, and large language models, uncovering the relationship between pianist, instrument, and music.

Key takeaways
  • The speaker proposes modeling emotional nuances of piano performances using MIDI data and machine learning.
  • The piano keyboard interface has 88 keys, and MIDI records each key press with a specific pitch, velocity, and timing (see the first sketch after this list).
  • The phrase “emotional nuances” refers to the relationship between the pianist and the instrument, and between the pianist and the music.
  • The speaker suggests training large language models (LLMs) on MIDI data and fine-tuning them using piano performances.
  • The speaker mentions the importance of understanding the mathematical nature of music and the need for a community of data scientists to work together on this problem.
  • The speaker introduces the concept of tokenization from natural language processing (NLP) and applies it to MIDI data.
  • The speaker uses a byte pair encoding (BPE) tokenizer so that frequent combinations of tokens are merged into sub-word units (see the tokenization sketch after this list).
  • The speaker shares results of initial experiments with GPT-2, showing that the model can generate continuations of musical fragments (a generation sketch follows this list).
  • The speaker mentions the potential of algorithmic music composition for musical analysis and generation.
  • The speaker shares plans for future experiments, including diffusion models and vector-quantized variational autoencoders (VQ-VAEs).
  • The speaker returns to and emphasizes the importance of understanding the relationship between pianist and instrument, and between pianist and music.
  • The speaker provides examples of musical scores and their interpretations, highlighting the importance of understanding harmony and musical structure.
  • The speaker closes by noting the challenges of modeling these emotional nuances and by repeating the call for a community of data scientists to work on the problem together.
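As a rough illustration of the note-event view of a performance, the sketch below reads a MIDI file and prints the pitch, velocity, and timing of each key press. The library (pretty_midi) and the file name are assumptions for the sake of the example, not necessarily the setup used in the talk.

```python
# A minimal sketch of reading piano note events from a MIDI file, assuming
# the pretty_midi library; "performance.mid" is a hypothetical file name.
import pretty_midi

midi = pretty_midi.PrettyMIDI("performance.mid")

# Each key press is a note event: pitch (MIDI numbers 21-108 cover the 88
# piano keys), velocity (how hard the key was struck), and start/end times.
for instrument in midi.instruments:
    for note in instrument.notes[:10]:
        print(
            f"pitch={note.pitch:3d}  "
            f"velocity={note.velocity:3d}  "
            f"start={note.start:7.3f}s  "
            f"end={note.end:7.3f}s"
        )
```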
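The tokenization and BPE steps could look roughly like the following sketch: note events are quantized into base tokens, each base token is mapped to a single character, and a BPE tokenizer (here the Hugging Face `tokenizers` library) merges frequent pairs into larger sub-word units. The token format, bin sizes, and vocabulary size are illustrative assumptions, not the talk's actual configuration.

```python
# A minimal sketch of MIDI tokenization plus BPE, assuming pretty_midi and
# the Hugging Face `tokenizers` library. Token format, bin sizes, and the
# vocabulary size are illustrative assumptions.
import pretty_midi
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer

midi = pretty_midi.PrettyMIDI("performance.mid")          # hypothetical file
all_performances = [inst.notes for inst in midi.instruments]


def notes_to_base_tokens(notes):
    """Quantize each note into time-shift / pitch / velocity base tokens."""
    tokens, previous_start = [], 0.0
    for note in sorted(notes, key=lambda n: n.start):
        dt_bin = min(int((note.start - previous_start) / 0.05), 99)  # 50 ms bins
        tokens += [f"DT_{dt_bin}", f"PITCH_{note.pitch}", f"VEL_{note.velocity // 8}"]
        previous_start = note.start
    return tokens


# Map every distinct base token to a single character so that BPE can merge
# frequent combinations (recurring chords, rhythmic figures) into one unit.
vocab = sorted({t for notes in all_performances for t in notes_to_base_tokens(notes)})
to_char = {tok: chr(0x100 + i) for i, tok in enumerate(vocab)}
corpus = [
    "".join(to_char[t] for t in notes_to_base_tokens(notes))
    for notes in all_performances
]

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
trainer = BpeTrainer(vocab_size=2000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train_from_iterator(corpus, trainer)

print(tokenizer.encode(corpus[0]).ids[:20])   # BPE-compressed token ids
```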
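Generating a continuation with a GPT-2-style model could then look like the sketch below. It uses a small, untrained GPT-2 configuration from Hugging Face `transformers` and a random placeholder prompt; in practice the prompt would be BPE ids from a real performance and the model would be trained or fine-tuned on tokenized performances. Model size, vocabulary size, and sampling settings are assumptions, not the talk's actual experiment.

```python
# A minimal sketch of continuing a music-token sequence with a GPT-2-style
# model, assuming Hugging Face `transformers`. The model here is untrained
# and the prompt ids are random placeholders standing in for BPE ids.
import torch
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(vocab_size=2000, n_layer=6, n_head=8, n_embd=512)
model = GPT2LMHeadModel(config)
model.eval()

# A 64-token prompt standing in for the opening of a tokenized performance.
prompt_ids = torch.randint(0, 2000, (1, 64))

with torch.no_grad():
    continuation = model.generate(
        prompt_ids,
        max_new_tokens=128,   # length of the generated continuation
        do_sample=True,       # sample instead of greedy decoding
        temperature=1.0,
        top_k=50,
        pad_token_id=0,
    )

# The generated ids would then be decoded back into note events for playback.
print(continuation.shape)   # (1, 64 + 128)
```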