Cross‑Cultural Dynamics of Emotional Expression in Lyrics

An interactive exploration of how lyrics express emotions across languages and time

Project Overview

This research project explores how emotions are expressed in song lyrics across different languages and cultures. Using computational analysis and data visualization, we examine patterns in emotional expression, linguistic features, and structural elements of lyrics from around the world.

Our interactive visualizations allow you to explore:

  • How lyrical tone has changed over time across different languages
  • The relationship between linguistic structure and emotional expression
  • Cross-cultural patterns in how emotions are conveyed through lyrics
  • The correlation between different emotional dimensions in various languages

Explore Our Visualizations

Emotion Across Culture

Explore how emotional expression in lyrics varies across different languages and cultures, including historical trends, valence-arousal profiles, and cross-linguistic comparisons.

View Visualizations

Emotion & Linguistics

Examine the relationship between linguistic features and emotional expression, including structural patterns, emotional episode analysis, and feature correlations.

View Visualizations

Research Background

Our research builds on a growing body of work in affective computing, computational linguistics, and musicology. We apply natural language processing techniques to analyze lyrics from multiple languages, examining how different cultures encode emotions in their musical expressions.

Key research questions include:

  • How do emotional expressions in lyrics differ across languages?
  • What temporal trends can be observed in emotional tone across decades?
  • How do linguistic structures correlate with emotional states?
  • Can we identify universal patterns in how emotions are expressed in lyrics?

Historical Trends in Lyrical Tone (1950s–2010s)

Percentage use of emotional and thematic word categories (negative sentiment, positive sentiment, neutral language, money, sex, and swear words), showing how lyrical tone has shifted across decades.
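As a rough illustration of how such per-decade percentages could be computed, the sketch below counts category hits per token and groups songs by decade. The word lists in CATEGORY_WORDS, the whitespace tokenizer, and the sample songs are all hypothetical stand-ins; the project's actual lexicon and preprocessing are not described here.

```python
from collections import Counter, defaultdict

# Hypothetical mini-lexicon for illustration only; not the project's word lists.
CATEGORY_WORDS = {
    "positive": {"love", "happy", "shine"},
    "negative": {"cry", "lonely", "pain"},
    "money": {"cash", "dollar", "rich"},
}

def category_percentages(songs):
    """songs: iterable of (year, lyrics).

    Returns {decade: {category: percent of all tokens in that decade}}.
    Tokenization is a plain whitespace split (an assumption).
    """
    hits = defaultdict(Counter)   # decade -> category -> matching tokens
    totals = Counter()            # decade -> total tokens
    for year, lyrics in songs:
        decade = (year // 10) * 10
        for token in lyrics.lower().split():
            totals[decade] += 1
            for category, words in CATEGORY_WORDS.items():
                if token in words:
                    hits[decade][category] += 1
    return {
        decade: {c: 100.0 * hits[decade][c] / totals[decade] for c in CATEGORY_WORDS}
        for decade in totals
    }

# Made-up sample data.
songs = [
    (1965, "love love me happy"),
    (1968, "cry for cash"),
    (2012, "rich and lonely pain"),
]
result = category_percentages(songs)
```

Dividing by the total token count per decade, rather than per song, keeps long songs from being under-weighted; a per-song average would be an equally reasonable choice.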

Trends of Lyrical Tone

Emotional Dimensions and Trends in Lyrics by Language

Each language's average valence (positivity), arousal (energy), and dominance (sense of control) as measured in song lyrics, alongside the regression coefficient of each emotion dimension over time. The trend coefficient is obtained via a linear regression of the yearly emotion scores, indicating the average annual increase or decrease in that emotional attribute.
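The trend coefficient described above is the slope of an ordinary least-squares line fitted to yearly scores. A minimal sketch, assuming one average score per year; the function name and the sample values are illustrative, not taken from the project's data:

```python
def trend_coefficient(years, scores):
    """Ordinary least-squares slope of scores regressed on years.

    The result is the average change in the score per year.
    """
    n = len(years)
    if n < 2 or n != len(scores):
        raise ValueError("need at least two paired observations")
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    return sxy / sxx

# Made-up yearly valence averages rising by 0.02 per year.
years = [2000, 2001, 2002, 2003]
valence = [0.50, 0.52, 0.54, 0.56]
slope = trend_coefficient(years, valence)
```

A positive slope means the attribute increased on average over the period, and a negative slope means it decreased.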

Emotional Profiles & Trends by Language

Valence–Arousal Profiles of Song Lyrics Across Languages

Each point represents a language's average valence (positivity) and arousal (energy/intensity) scores, placing languages in a two‑dimensional emotional space for cross‑cultural comparison.

Average Valence Across Languages

Temporal Emotion Trends Within Languages

This visualization allows for an in-depth look at how emotional expression has evolved over time within specific languages. By selecting a language and an emotional dimension, you can observe detailed trends in how that linguistic community has expressed emotions through music across decades.

Select a language and emotion metric to get started

You'll see how that emotional attribute in lyrics has changed over time.

Clustering Analysis: Mapping Structural Patterns to Emotional Episodes

This Sankey diagram links lyrical structure to emotion. We apply K-means clustering to the structural features of lyrics and map the resulting clusters to emotional episodes (EDR, CB, PEP). The diagram shows how each structure cluster corresponds to these episodes.
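A Sankey diagram of this kind is driven by a list of (source, target, weight) links. The sketch below shows one way to derive those links, assuming each song already has a cluster label and an episode label. The clustering step itself is not shown, and the labels are made-up examples.

```python
from collections import Counter

def sankey_links(cluster_labels, episode_labels):
    """Count how many items flow from each structure cluster to each episode.

    Returns (source, target, count) tuples, largest flows first.
    """
    if len(cluster_labels) != len(episode_labels):
        raise ValueError("label lists must have the same length")
    counts = Counter(zip(cluster_labels, episode_labels))
    links = [(f"cluster {c}", e, n) for (c, e), n in counts.items()]
    # Sort by descending count, then by names for a stable order.
    return sorted(links, key=lambda link: (-link[2], link[0], link[1]))

# Hypothetical labels: cluster ids (e.g. from K-means) and episode codes.
clusters = [0, 0, 1, 1, 1, 2]
episodes = ["EDR", "EDR", "CB", "PEP", "PEP", "CB"]
links = sankey_links(clusters, episodes)
```

Each tuple can be passed to a plotting library as one flow, with the count setting the flow's width.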

🎵 Episode Model: Functional Emotional Experiences of Music

This section introduces the Episode Model proposed by Eerola et al. (2024), which conceptualizes emotional engagement with music as functionally situated episodes. Each episode reflects how listeners use music to serve specific affective goals—ranging from relaxation to emotional insight.

In our research, we use this model to investigate how these emotional episodes are expressed linguistically in lyrics. By mapping line-level emotional experiences to episode categories, we aim to uncover relationships between linguistic structure (e.g., syntax, repetition, metaphor) and episode type. This bridges music psychology and computational text analysis.

1. Enjoyment–Distraction–Relaxation (EDR)

Function: Mood enhancement, stress relief, and pleasurable immersion.

Core Affect: Increased positive valence; arousal may rise or fall.

Context: Everyday listening to unwind, distract, or evoke bodily enjoyment (e.g., feel-good music, dancing).

2. Connection–Belonging (CB)

Function: Fostering emotional bonds and social inclusion.

Core Affect: Positive valence associated with sociality or shared identity.

Context: Communal settings, rituals, or solitary listening with a sense of imagined social presence.

3. Focus–Motivation (FM)

Function: Supporting goal-directed activities and cognitive engagement.

Core Affect: Positive valence; arousal aligned with task demands.

Context: Listening during work, study, or physical activity to boost energy or maintain flow.

4. Personal Emotional Processing (PEP)

Function: Emotional self-reflection, coping, and autobiographical resonance.

Core Affect: Varied valence; includes sadness, catharsis, or emotional clarity.

Context: Deep, introspective listening tied to identity, memories, and expressive lyrics.

5. Aesthetic–Interest–Awe (AIA)

Function: Evoking awe, aesthetic appreciation, and spiritual/emotional elevation.

Core Affect: Complex or transcendent affective states (e.g., chills, being moved).

Context: Rare but intense encounters with music’s beauty, novelty, or profundity.

Reference: Eerola, T., Kirts, C., & Saarikallio, S. (2024). Episode Model: The functional approach to emotional experiences of music. Psychology of Music. https://doi.org/10.1177/03057356241279763

Lexicon-Based VAD Correlation Matrix

This heatmap visualizes the correlation between different Valence, Arousal, and Dominance (VAD) lexicons across languages, highlighting similarities and differences in how emotional dimensions are captured in various linguistic resources.
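One cell of such a heatmap can be sketched as a Pearson correlation between two lexicons over the entries they share. How the project aligns entries across languages is not stated here; this example assumes both lexicons are keyed by a common concept label, and the scores are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def lexicon_correlation(lex_a, lex_b):
    """Correlate two {entry: score} lexicons over their shared entries."""
    shared = sorted(set(lex_a) & set(lex_b))
    return pearson([lex_a[w] for w in shared], [lex_b[w] for w in shared])

# Hypothetical valence scores from two lexicons (assumed shared keys).
lex_a = {"joy": 0.9, "calm": 0.7, "fear": 0.2, "grief": 0.1}
lex_b = {"joy": 0.8, "calm": 0.6, "fear": 0.3, "anger": 0.2}
r = lexicon_correlation(lex_a, lex_b)
```

Only entries present in both lexicons contribute, so low coverage overlap makes the coefficient less reliable.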

VAD Lexicon Correlation Analysis

🔍 Detailed Interpretation

Here we break down the lexical VAD distributions and provide interpretive insights based on their shapes and skews.

📈 Valence

  • Concentrated on the positive side: most lyrics fall in the 0.5–0.7 range, with only a thin tail toward low valence.
  • Slight bimodal shape (small bumps) could indicate two emotional “modes” (e.g., joy vs. calmness).
  • Very few extremely negative lyrics (valence < 0.3) — likely a reflection of:
    • Lexical limitations (some angry/sad words not covered)
    • Genre tendencies (e.g., fewer nihilistic tracks)

📈 Arousal

  • Highly peaked around ~0.4 — most songs cluster at moderate arousal levels.
  • Long tail to the right — very few high-energy lyrics.
  • This suggests the corpus emphasizes reflective or emotionally balanced music more than hyper-aroused content (e.g., metal, EDM).

📈 Dominance

  • Similar to arousal, with a dominant peak near 0.5 and a secondary bump.
  • Notably, there’s a small kink/spike at ~0.58, possibly reflecting lexicon bias or some repeated emotional themes (e.g., confident/empowered lyrics).
  • Overall, the distribution covers a broad range of perceived agency/control in lyrics.
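The direction of a distribution's tail can be checked numerically with the sample skewness, the third central moment divided by the 1.5 power of the second. This is a minimal sketch with invented scores, not the project's data. A positive value indicates a longer tail toward high scores, and a negative value a longer tail toward low scores.

```python
def skewness(values):
    """Moment-based sample skewness (Fisher-Pearson g1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant input")
    return m3 / m2 ** 1.5

# Made-up scores: most values near 0.4 with one high outlier (tail to the right).
tail_right = [0.35, 0.38, 0.40, 0.40, 0.42, 0.45, 0.80]
# Made-up scores: most values near 0.6 with one low outlier (tail to the left).
tail_left = [0.20, 0.55, 0.58, 0.60, 0.62, 0.65, 0.70]

skew_right = skewness(tail_right)  # positive
skew_left = skewness(tail_left)    # negative
```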

UMAP Clustering: Visualizing Lyrical Structure and Emotional Episodes

This visualization uses UMAP (Uniform Manifold Approximation and Projection) to represent high-dimensional lyrical data in a 2D space. The left plot shows data points colored by K-means clusters, while the right plot shows the same points colored by their dominant emotional episodes, revealing patterns in how structural clusters relate to emotional content.

UMAP Projection Colored by KMeans Cluster
UMAP of Lyrics Colored by Dominant Episode

Confusion Matrix: VAD to Episode Classification

This visualization shows the relationship between Valence-Arousal-Dominance (VAD) values and emotional episode classifications. The matrix displays how accurately VAD values predict different emotional episodes, highlighting patterns of confusion between similar emotional states.
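A confusion matrix of this kind can be built by counting pairs of reference and predicted episode labels. The sketch below assumes the VAD-based predictions already exist; the classifier that produces them is not described here, and the labels are made up.

```python
def confusion_matrix(true_labels, predicted_labels, labels):
    """Rows are reference episodes, columns are predicted episodes."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, predicted in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[predicted]] += 1
    return matrix

def accuracy(matrix):
    """Share of items on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical reference labels and VAD-based predictions.
labels = ["EDR", "CB", "PEP"]
true_episodes = ["EDR", "EDR", "CB", "PEP", "PEP", "CB"]
predicted = ["EDR", "CB", "CB", "PEP", "EDR", "CB"]

matrix = confusion_matrix(true_episodes, predicted, labels)
overall = accuracy(matrix)
```

Off-diagonal cells show which episodes are mistaken for one another; here, for example, one EDR item is predicted as CB.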

Confusion Matrix (VAD → Episode)

Correlation Between Structural and Emotional Features

This visualization explores the relationships between lyrical structure (verse length, rhyme patterns, repetition) and emotional dimensions. The heatmap reveals how formal aspects of lyrics correlate with the emotions they express, offering insights into how lyrical form shapes emotional impact.

Emotional Episode Signature Features

This visualization displays the signature features of different emotional episodes (EDR, CB, PEP). Each bar represents the relative strength of a selected feature across emotional episode types, helping identify the distinctive characteristics of each episode.

Episode Signature Features

Linguistic Signatures of Emotional Episodes

This visualization compares distinctive linguistic features across different emotional episode categories, revealing how specific language patterns characterize each emotional state.

Episode Signature Feature Comparison

Feature Signatures Across Emotional Episodes

This visualization compares linguistic and structural features across different emotional episodes in lyrics. Each bar represents the average value of a specific feature (such as metaphor density or pronoun usage) for each emotional episode category.
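The bar heights described above are per-episode averages of a feature. A minimal sketch, assuming each lyric is a record with an episode label and numeric feature values; the field names "episode" and "metaphor_density" and the sample rows are hypothetical.

```python
from collections import defaultdict

def mean_feature_by_episode(rows, feature):
    """Average one numeric feature within each episode category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["episode"]] += row[feature]
        counts[row["episode"]] += 1
    return {episode: sums[episode] / counts[episode] for episode in sums}

# Made-up records; field names are assumptions for this example.
rows = [
    {"episode": "EDR", "metaphor_density": 0.10},
    {"episode": "EDR", "metaphor_density": 0.20},
    {"episode": "PEP", "metaphor_density": 0.40},
]
means = mean_feature_by_episode(rows, "metaphor_density")
```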

Lyrical Feature Correlations

This visualization presents the Spearman rank correlation coefficients between different lyrical features. The heatmap reveals relationships between structural elements (like repetition and metaphor density) and emotional dimensions (valence, arousal, and dominance), helping identify which features tend to occur together in lyrics.
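Spearman's coefficient is the Pearson correlation of the ranks of two variables, with tied values sharing their average rank. The sketch below computes one cell of such a heatmap; the feature values are invented for illustration.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Made-up feature values for four songs.
repetition = [0.1, 0.4, 0.2, 0.8]
valence = [0.3, 0.6, 0.5, 0.9]
rho = spearman(repetition, valence)
```

Because it works on ranks, the coefficient captures any monotonic relationship, not only linear ones, and is less sensitive to outliers than Pearson's coefficient.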