Emotion & Genre Classifier
2024
2-person ML project · music mood & genre from audio + lyrics
Overview
A teammate and I built an ML pipeline that takes a single audio file and spits out the song’s mood, dominant emotion, and genre. We wanted to use both the sound and the lyrics—so the system fetches lyrics (via Genius), runs emotion models on the audio and the text, and does zero-shot genre prediction. One function, classify_song(), ties it all together: you drop in a file and get back mood tags, valence–arousal scores, and a genre label.
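The shape of that flow can be sketched roughly like this. All helper names, return shapes, and values below are illustrative stand-ins for the real models, not the project's actual API:

```python
from typing import Optional


def fetch_lyrics(title: str) -> Optional[str]:
    """Stand-in for the Genius lookup; returns None when no lyrics are found."""
    return None  # stub: the real pipeline queries the Genius API here


def audio_emotion(path: str) -> dict:
    """Stand-in for the audio emotion model (e.g. Music2Emotion)."""
    return {"valence": 0.7, "arousal": 0.4, "dominant": "calm"}


def lyric_emotion(lyrics: str) -> dict:
    """Stand-in for the text emotion classifier."""
    return {"dominant": "joy"}


def predict_genre(path: str) -> str:
    """Stand-in for zero-shot genre prediction."""
    return "pop"


def classify_song(path: str, title: str = "") -> dict:
    """Single entry point: one audio file in, mood/emotion/genre labels out."""
    audio = audio_emotion(path)
    result = {
        "valence": audio["valence"],
        "arousal": audio["arousal"],
        "dominant_emotion": audio["dominant"],
        "genre": predict_genre(path),
        # toy mood tagging from valence, just to show where tags come from
        "mood_tags": ["calm"] if audio["valence"] >= 0.5 else ["tense"],
    }
    lyrics = fetch_lyrics(title)
    if lyrics:  # lyric emotion only runs when lyrics were actually found
        result["lyric_emotion"] = lyric_emotion(lyrics)["dominant"]
    return result
```

The point of the single entry point is that callers never see the individual models, only the merged label dictionary.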
Figuring out how to wire the pieces together was the interesting part. We had to decide where each model lived in the flow, how to normalize the outputs, and how to handle missing or messy lyrics. I focused a lot on making the pipeline clear and easy to extend so we could swap or add models without breaking everything.
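The normalization and missing-lyrics handling can be illustrated with a small sketch. The fusion rule here (renormalize each model's scores, then average over the union of labels, falling back to audio alone) is an assumed simplification, not necessarily what the project shipped:

```python
from typing import Optional


def normalize(scores: dict) -> dict:
    """Rescale raw model scores so they sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


def fuse_emotions(audio_scores: dict, lyric_scores: Optional[dict]) -> dict:
    """Average audio and lyric emotion distributions over a shared label set;
    fall back to audio-only when lyrics are missing or unusable."""
    audio_scores = normalize(audio_scores)
    if not lyric_scores:
        return audio_scores  # graceful degradation: no lyrics, audio decides
    lyric_scores = normalize(lyric_scores)
    labels = set(audio_scores) | set(lyric_scores)
    return {
        label: (audio_scores.get(label, 0.0) + lyric_scores.get(label, 0.0)) / 2
        for label in labels
    }
```

Keeping the fusion in one small function is what makes swapping a model cheap: a new model only has to produce a score dictionary.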
It was a small team and we both had a hand in the architecture and the code. Getting the first end-to-end run to work—from raw audio to a coherent set of labels—felt like we’d actually built something you could hand to someone and say, “here, try it.”
Contributions
- Co-designed system architecture and integration flow
- Built unified classify_song() pipeline for audio and lyrics
- Integrated audio emotion, lyric emotion, and zero-shot genre prediction
Key Features
1. Multimodal Pipeline: Audio emotion recognition, lyric emotion classification, and zero-shot genre prediction
2. Unified classify_song(): A single entry point returning mood tags, valence-arousal scores, dominant emotion, and genre from one audio file
3. Lyric Integration: Automated lyric retrieval and emotion classification via the Genius API
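The zero-shot genre idea works by scoring each candidate genre against a hypothesis template like "This song is {genre}" and picking the best match. Below, a toy keyword-overlap scorer stands in for the real entailment model, purely to show the mechanics; the project itself used HuggingFace Transformers:

```python
def entailment_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model: crude word overlap between texts."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)


def zero_shot_genre(description: str, genres: list[str]) -> str:
    """Score each genre hypothesis against the description; return the best."""
    hypotheses = {g: f"this song is {g}" for g in genres}
    scores = {g: entailment_score(description, h) for g, h in hypotheses.items()}
    return max(scores, key=scores.get)
```

Because the candidate labels are just strings, adding or removing genres needs no retraining, which is the appeal of zero-shot classification here.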
Implementation
- Language: Python
- Models: HuggingFace Transformers, Music2Emotion
- API: Genius (lyrics)