Tuning Into the Future of VR
An Exploratory and Evaluative Study on Personalized Audio’s Impact on Engagement & Learning
My Role:
Solo UX Researcher
Duration:
Jan 2023 - Dec 2023
Research Type:
Primary: Evaluative Research (testing personalized audio’s impact)
Secondary: Generative Research (exploring new emotional design applications)
Data Analytics Method:
Qual :
Thematic Analysis
Discourse Analysis
Video Observation Analysis
Quant:
Descriptive & Inferential Statistics
ANOVA & T-tests – Compared performance across experimental groups
Correlation Analysis – Explored relationships between personalized audio, cognitive load, and performance metrics
Repeated-Measures ANOVA – Evaluated emotional engagement & presence over time
Mixed-Methods (Qual & Quant):
Qual :
Primary method -> Think-Aloud Protocols (real-time cognitive & emotional insights)
Semi-structured interviews (in-depth player experience analysis)
Quant:
PANAS-X Survey (emotional engagement)
NASA-TLX (cognitive load measurement)
Self-Assessment Manikin (SAM) (affective responses)
Presence Questionnaire (PQ) (VR immersion levels)
Game Performance Data (reaction time, accuracy, cognitive flexibility metrics)
Toolkit:
Oculus (Meta VR headset)
Python/R
Qualtrics
Unity data export
External audio platforms (Spotify / Apple)
Brief Overview
Most VR games focus on visuals to create immersion—but what if sound is the real key to keeping players engaged and focused?
In this user research study, I explored how personalized audio affects engagement, cognitive load, and learning outcomes in All You Can E.T., a VR game designed to train cognitive flexibility and problem-solving skills.
Instead of using generic background music, players had the option to bring their own curated playlists into the experience. I wanted to find out: Would self-selected music make VR more immersive, or would it be a distraction?
The results were clear—players who listened to their own music felt more engaged, more connected to the game, and less mentally fatigued. Personalized audio didn’t just enhance mood—it helped players stay focused longer, make faster decisions, and improve learning outcomes.
For VR designers, educators, and product teams, this research highlights an important opportunity: customizable audio experiences could be a simple yet powerful way to enhance user engagement and learning in VR.
Context:
Why VR can feel draining and ineffective for learning?
Virtual reality (VR) has transformed how we learn, train, and play, but something’s missing.
While visuals and interactivity dominate the conversation, audio is the invisible force shaping immersion, engagement, and cognitive performance. Yet, most VR experiences treat sound as an afterthought.
The Hidden Power of Sound
We know that sound directs attention, shapes emotions, and enhances memory—but in VR, audio does more than complement visuals.
Unlike traditional games, where audio is a passive backdrop, VR requires audio to work as an active guide. That’s where my research comes in. I believe audio can create spatial awareness, influences cognitive effort, and involves users deeply feel connected to a virtual space.
Why Study Audio in VR Learning?
The cognitive training game All You Can E.T., developed by CREATE Lab, was designed to enhance cognitive flexibility and executive function skills. However, players frequently struggled with cognitive overload and engagement loss over time
Figure 1. All You Can E.T. VR gameplay (3D) screenshots
Figure 2: All You Can E.T. 2D gameplay screenshot
While working with my supervisor Professor Jan Plass and his CREAT Lab at NYU. They looked into how personalized emotional visuals affect players in the VR game "All You Can E.T.", which inspired me to focus on audio effects.
For example, one study (Olsen & Plass, 2022) suggested that music's tempo, harmony, and melody can trigger different affective states, influencing cognitive effort and engagement.
Building on this, my study explores whether self-selected music in VR can optimize emotional regulation and enhance learning outcomes.
Figure 3. Displays Russell’s circumplex model of emotions and how music constituents can evoke specific affective states (Olsen & Plass, 2022)
Starting Broad: Identifying the Core Problem
Step 1: Clarifying the Problem – Deck Research
To understand the role of audio in VR learning, I conducted a literature review focusing on:
Personalized soundscapes (this aligned with my passion work with Spotify project)
VR game’s emotional design
Cognitive load theory (How different sensory inputs affect mental effort and information processing)
My initial findings from the secondary research show the limitations of Generic Audio, which helped me convince the stakeholders about the value of this research.
One major issue is players’ conflicting responses to background music (engagement vs. distraction).
Ambiguity about music’s emotional impact in VR
❓ Would a player feel more connected if the music matched their preferences?
❓ Could self-selected music reduce cognitive strain and improve learning efficiency?
No prior studies linking self-selected music to cognitive load and learning
⇒ This raised the critical design challenge: Personalization or Distraction?
Step 2: Narrowing the Focus
The Personalization or Distraction problem led me to investigate why existing, generic VR audio solutions fail to meet user needs
⇒ If the wrong audio increases mental effort or emotional detachment, it could make learning harder instead of easier.
Step 3: Developing the Hypothesis
Building on research in visual-emotional design and adaptive learning, I hypothesized that:
✧ Customizing the emotional tone of background music based on player performance and cognitive states would
⇧ Increase engagement (via emotional alignment),
⇩ Reduce cognitive load (by avoiding mismatched audio),
⇧ Improve learning efficiency (through adaptive support)
Step 4: Defining Research Goal & Questions
To validate this hypothesis, I decided to move beyond theory and structured my study around a clear and specific research goal, which led to 3 research questions (RQs)
Research Goals
The goal was to measure how different soundscapes influenced learning, cognitive effort, and emotional engagement.
RQ1: Emotional Engagement & Presence
How does personalized music, tailored to player preferences, influence emotional engagement and presence in VR?
RQ2: Cognitive Load & Focus
How does self-selected music affect a player’s cognitive focus and mental effort during gameplay?
RQ3: Learning Outcomes
Does integrating personalized music into VR gameplay improve learning outcomes, as demonstrated by pre- and post-game assessments?
Through my literature review, I uncovered 3 key gaps in VR audio research:
How does self-selected music impact engagement?
Does it reduce or increase cognitive load?
Can it enhance learning outcomes in VR?
These gaps led me to a central question
⇒ Can personalized audio enhance emotional engagement, cognitive focus, and learning in VR?
What if VR games adapted their soundscapes based on how players feel, what they’re learning, or even their personal music preferences?
Could it reduce cognitive overload, deepen immersion, and improve decision-making?
Mixed Approach
The main method used will be the think-aloud protocol, supported by interviews and surveys.
Why?
For the validation of the data, improving its reliability and validity.
The think-aloud protocol offers immediate insight into the cognitive experiences of players. Meanwhile, interviews and surveys provide a platform for participants to provide more detailed feedback, explain their in-game choices, and discuss how the audio design influenced their overall gaming experience.
To test this,
I designed a controlled study comparing 3 audio conditions
1️⃣ Standard Game Audio – The default music and sound effects
2️⃣ Sound Effects Only – No background music, just in-game sounds
3️⃣ Personalized Music Experience – Players used a custom playlist with their own selected songs
Each participant experienced only one condition to prevent carryover effects from multiple audio experiences.
Their playlists were integrated into the game, balancing familiarity with immersive soundscapes.
Participants
I recruited 20 players aged between 12 to 17-year-old (ideal for studying cognitive flexibility)
Study Groups & Conditions:
Experimental Group (Personalized Audio): Participants played All You Can E.T. with a self-curated music playlist
Control Group (Standard Audio): Participants played the same game with its default game soundtrack
What I Measured:
✔ Cognitive Load: Using NASA-TLX surveys and real-time behavioral observations
✔ Emotional Engagement: Tracked using Self-Assessment Manikin (SAM) and PANAS-X emotional surveys
✔ Learning Performance: Measured through pre- and post-game assessments (tracking accuracy, reaction times, and cognitive shifts)
✔ Player Experience: Captured through think-aloud protocols and post-game interviews
Table 1. Research questions aligned to proposed approaches
Table 2. Example of PANAS-X Survey
Figure 3. Example of NASA-TLX Survey
Why Measure These?
Instead of just asking participants how they felt, I quantified engagement and cognitive effort with a combination of subjective self-reports and objective performance tracking
NASA-TLX -> to break down cognitive strain into mental demand, effort, and frustration
SAM and PANAS-X -> to assess emotional responses in real-time, providing a moment-to-moment analysis of engagement
Tracking accuracy and reaction time changes over multiple rounds let me see how personalized audio affected decision-making speed and cognitive flexibility
In the Personalized Music group, players created their own playlists, choosing songs based on how they wanted to feel in the game—energizing music for engagement, calming tracks for focus. The idea was to see whether music tailored to their emotional state improved performance.
Primary Method: Think-Aloud Protocole
Pre-game Think-Aloud Training
Quick 5-minute warm-up with a board game helped players relax into verbalizing their thoughts
Empathetic Prompts
Instead of “Describe your strategy”, I asked, “What tip would you give a stuck friend?” or “What’s making this level harder or easier right now?”
Suddenly, they’d light up: “Ignore green aliens when the beat drops—it’s a trap!”
VR is all about the present moment—decisions, emotions, and focus shifts in real-time. So, how can I collect real-time reactions? I revised the research method design for several times, and do mini experiments with my close friends and team mates, finally get the answer — Think-Aloud Protocole is the best fit!
Why Not Just Use Surveys or Post-Game Interviews?
They miss the raw, unscripted moments.
Why Not Usability Testing?
Usability tests answer functional questions: “Is the menu intuitive?”
⮑ But I wanted deeper insights—how sound feels in real time
Why Think-Aloud?
It captures live reactions, like:
“This beat helps me reset when the game rules change.”
“Ugh, that sound is distracting!”
It also exposed subconscious reactions, for instance:
A player who shrugged “I didn’t notice the music”—but their arousal spiked on SAM surveys
A young player laughing “This song’s so bad it’s motivating me to win faster!”
Think aloud helped me observe how players process sound at the moment—whether it helped them focus, stressed them out, or influenced their engagement without them even realizing it.
Challenge: Narrating While Playing VR?
But VR is intense—asking players to talk mid-game wasn’t easy
✗ Some froze: “Wait—am I supposed to talk or play?!”
✗ Others gave robotic answers: “Fine” “Cool” “I like it”
How I Adapted the Think-Aloud Method?
Personalized sound checks
Letting players adjust music vs. game sounds prevented moments like:
“Alien screams drowning my vibe!”
How I Knew It Worked
I cross-checked Think-Aloud transcripts with PANAS-X emotional surveys and reaction times, uncovering deeper patterns. For example:
If a player said “This music keeps me calm” AND their performance improved, that was solid evidence
Even when a player claimed “I didn’t notice the music,” their arousal spiked on physiological metrics, it showed subconscious effects
Silent data collection
Wearable sensors tracked heart rate—providing data sources like stress levels even when players said:
“I’m chill” “This is fine”
Figure 4: Think Aloud Discussion Guide
Experimental Procedure
Each participant underwent a 60-minute testing session, including pre-task assessment, VR gameplay, and post-task evaluation. The procedure was structured as follows:
Pre-Task Phase (5 Minutes)
Informed Consent & Demographics Survey
mainly on their music preferences, gaming habits, and prior VR experience
Baseline Emotional & Cognitive Load Assessment
PANAS-X (Positive and Negative Affect Schedule) records their initial emotional state before gameplay.
Think-Aloud Training (5 Minutes) [Only for Personalized & Standard Audio Groups]
To ensure effective verbalization during gameplay, participants practiced the think-aloud protocol using the board game "Operation" as a warm-up exercise
They were instructed to continuously verbalize their thoughts while playing to capture real-time insights into their decision-making process.
VR Gameplay Session (15 Minutes)
Participants played "All You Can E.T." in a 3D VR environment using a head-mounted display (HMD) and motion controllers
The audio condition was assigned randomly to each participant
The facilitator observed and used non-directive cues to encourage verbalization without influencing responses
Player performance metrics (accuracy and reaction time) were recorded through Unity data export
Post-Task Phase (40 Minutes)
After gameplay, participants completed a series of assessments to evaluate cognitive load, emotional response, and learning performance:
✔ Post-Game PANAS-X Survey – Emotional changes compared to the pre-task phase
✔ NASA-TLX (Task Load Index) Survey – Measured mental demand, effort, and frustration experienced during gameplay
✔ Self-Assessment Manikin (SAM) Survey – Emotional arousal and valence experienced under each audio conditio
✔ Presence Questionnaire (PQ) – Perceived sense of presence in the VR environment
✔ Semi-Structured Interview (15 Minutes) – Reflected on their game experience, emotional connection to the music, and how audio influenced their cognitive effort
✔ Game Log Data Collection – Accuracy, reaction times, and error rates to determine learning performance under different audio conditions
Data Analysis Strategy
Once the data was collected, I combined quantitative statistical analysis and qualitative thematic analysis:
Quant Analysis (Objective Metrics)
✔ Repeated-Measures ANOVA – Compare cognitive load (NASA-TLX scores) and emotional engagement (PANAS-X, SAM, PQ scores) across the 3 audio conditions
✔ Paired T-tests – Compare pre- and post-performance metrics and analyzed heart rate data to determine changes in cognitive load during gameplay
✔ Correlation Analysis
✔ Unity Game Log Analysis – Measured player accuracy, reaction times, and decision-making speed to assess cognitive performance
Qual Analysis (Subjective Insights)
✔ Thematic Analysis (Using Miro/NVivo) – to understand their subjective experiences — did they find the audio system distracting, soothing, or motivating?
✔ Discourse & Nonverbal Analysis – Examined player verbal expressions and in-game behaviors for signs of cognitive strain or immersion breakdown, which was useful in follow up interview
Insights & Surprises
-
One of my biggest takeaways — When players heard their own music, they felt more connected to the game
Players described their experience as "more personal," "engaging," and "natural"
Sense of presence increased, with some players reporting that the game felt more "real" with familiar music
Music shaped emotional states: Faster-tempo songs boosted excitement, while slower music helped players stay calm and focused
This emotional connection helped sustain engagement over longer play sessions, making learning more enjoyable
-
While personalized audio increased engagement, it also had an unexpected effect on cognitive load:
Some players performed better with familiar music—it helped them stay in the flow and reduced distractions.
Others found that self-selected music competed for attention, making it harder to focus on gameplay.
The Sound Effects Only group had the lowest cognitive load but also reported a less immersive experience
Key takeaway: The right audio personalization can enhance learning, but too much stimulation might overload cognitive resources
-
Personalized music didn’t just change how players felt—it influenced how they learned
Players with calming music had better reaction times, likely due to reduced stress and better focus
High-energy music led to more errors, suggesting that overstimulation can interfere with decision-making
Players in the Personalized Music group improved more across multiple rounds, supporting the idea that emotionally resonant audio enhances memory retention
This reinforced that VR audio should be adaptive — not one-size-fits-all.
Unexpected Challenges in Initial Findings
✴ Personalization complexity
Some players’ playlists didn’t match their in-game experience, leading to distractions.
✴ Integration issues
Balancing custom music with in-game sound effects was tricky—too much personalization risked breaking immersion rather than enhancing it.
✴ Variability in player preferences
Some participants thrived with music, while others preferred silence—showing that static audio settings won’t work for all users.
𖤐These findings shifted my perspective on personalization.
It’s not just about letting players pick their music
Instead, it’s about dynamically adapting audio to their real-time cognitive and emotional state.
✴ Research Adjustment:
To refine this, I introduced adaptive personalization strategies, allowing users to:
✔ Adjust their music preferences in the mid-game
✔ Use “focus-enhancing” playlist recommendations based on prior session data
✔ Incorporate user feedback loops to refine personalization playlist or settings
Actionable Design Recommendations:
Transform my insights into actionable next steps, and socialize these point of view to align stakeholders
-
Auto-connect Spotify/Apple Music to pull user playlists directly into the game
AI-driven song recommendations based on past listening habits and gameplay context
Quick-swap feature for real-time track changes via voice command or gestures
-
Dynamically adjust background music based on cognitive load
Real-time tracking of reaction time and errors to switch between calming or energizing music
Focus mode toggle to simplify audio when cognitive load is high or more engagement is needed
Smooth crossfades between music and in-game sounds to prevent abrupt transitions
-
Let users control how music interacts with the game.
Preset audio modes like "Focus Mode" (soft instrumentals) or "Engagement Mode" (high-energy beats)
Adjustable sound layering to balance voice cues, game effects, and music
Context-based playlist recommendations aligned with task complexity and player preferences
Expected Impact
40%
Increase in engagement compared to those using standard game audio.
25%
reduction in perceived cognitive load (measured by NASA-TLX) compared to default audio settings.
30%
Better on post-game cognitive assessments, demonstrating improved decision-making speed and task accuracy.
How I Estimated the Impact
Engagement & Presence (+40%)
✦ Based on PANAS-X and SAM surveys, players in the personalized music condition reported:
Higher emotional engagement scores
A greater sense of presence in VR
➩ Further Testing: Compare session duration and in-game engagement metrics across different (3) audio conditions
Cognitive Load Reduction (+25%)
✦ Measured using NASA-TLX surveys, which showed:
Lower mental effort scores for personalized music users
Fewer signs of cognitive overload in the self-reported assessments
➩ Further Testing: Conduct think-aloud protocol analysis to assess real-time cognitive strain.
Learning Performance Improvement (+30%)
✦ Players using customized music playlists demonstrated:
Faster reaction times and higher accuracy in-game tasks
Better post-game performance on cognitive flexibility tests
➩ Further Testing: Analyze repeat play performance and long-term retention in follow-up sessions.
My Learnings
My study revealed the complex relationship between sound, cognition, and emotion, but it also raised new questions:
1️⃣ VR audio isn’t just about what players hear—it’s about how they feel, react, and perform in real-time
And sometimes, the most valuable insights aren’t in what users say. They’re in a sigh, a pause, or a sudden burst of frustration, so mixed method help me pair with other matrix to uncover insights, and also develop more holistic strategy
2️⃣ Adaptive audio is worth exploring for an immersive experience. Instead of fixed personalization, VR systems should adjust music dynamically—detecting player stress levels or specific emotion type and adapting sound accordingly
3️⃣ Playlist recommendations should be context-aware. Not all music improves focus—future VR games could suggest songs based on a player’s cognitive load and performance trends.
4️⃣ Audio needs to balance immersion & focus. Overloading a VR experience with music that competes for attention can hurt engagement rather than help it
What I will do differently next time?
Getting personalized music to work smoothly with game sound effects took precise tuning to avoid clashes and keep the experience immersive
Self-report limitations: This study combined several questionnaires for participants to self-report their experience. However, it could be overwheel, and thus inaccurate. Future studies could integrate physiological sensors (e.g., heart rate) for real-time emotional data
Ecological validity: Testing in natural settings (e.g., homes) could reveal how ambient environments affect audio personalization, next time, I wish to conduct the testing in the targeted user’s own settings to learn more contextual information