How to Rigorously Test Whether Morgan Wallen’s “Smile” Erodes Workplace Productivity

Study Reveals Morgan Wallen's 'Smile' as Worst Song for Work Productivity - Yahoo — Photo by Vitaly Gariev on Pexels

Every CFO knows that a dollar spent on research must earn back at least twice its price tag. The buzz around Morgan Wallen’s "Smile" and its alleged drag on office output is a perfect case study: a sensational headline, a flimsy dataset, and a missed opportunity to extract real monetary value. Below is a battle-tested, profit-centered roadmap that upgrades the methodology, quantifies the cost-benefit, and safeguards the bottom line.

Hook: Did the study’s sample size and metrics really prove ‘Smile’ sabotages your workday?

The short answer is no. The original research relied on a sample of 28 participants drawn from a single coworking space and used only a five-point self-report scale. That combination cannot reliably separate the effect of Morgan Wallen’s "Smile" from noise, so any claimed productivity loss is too uncertain to act on financially.

Key Takeaways

  • Under-powered samples miss real effects (Type II error), make the significant results they do produce less trustworthy, and waste research dollars.
  • Objective metrics translate directly into monetary impact on output.
  • Pre-registration and replication safeguard the ROI of each study cycle.

Conduct a priori sample size calculation using expected effect size from meta-analyses

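Power analysis begins with an effect size anchored in the broader literature. A 2020 meta-analysis of 57 experiments on background music and task performance reported an average Cohen’s d of 0.18, indicating a small but measurable benefit. Assuming a two-tailed independent-samples t-test, alpha = 0.05 and desired power = 0.80, an effect of d = 0.18 requires roughly 485 participants per condition. The figure of approximately 175 participants per condition used in the cost comparison (G*Power calculation) corresponds to a larger planning effect size of about d = 0.30, so treat it as a lower bound.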
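The sample-size arithmetic can be checked with a short script. This is a minimal sketch that assumes a between-subject, two-group comparison and uses the standard normal approximation; exact t-based tools such as G*Power return values one or two participants higher. The function name `n_per_group` is illustrative, not part of any package. It evaluates d = 0.18 and d = 0.30 to show how sensitive the required N is to the assumed effect size.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants per condition for a two-tailed,
    two-group comparison of means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-tailed test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


for d in (0.18, 0.30):
    print(f"d = {d:.2f}: about {n_per_group(d)} participants per condition")
# d = 0.18: about 485 participants per condition
# d = 0.30: about 175 participants per condition
```

Because the required N scales with 1/d², small changes in the assumed effect size move the recruitment budget substantially.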

Investing in this sample size translates to a predictable cost structure. Below is a cost comparison that frames the decision in dollar terms.

Scenario                     Sample Size   Recruitment Cost (US$)   Data Collection Cost (US$)   Total Cost (US$)
Under-powered (n=30)         30            $1,200                   $900                         $2,100
Adequately powered (n=350)   350           $14,000                  $10,500                      $24,500

The upfront spend for the larger study is higher, but the cost per participant is identical in both scenarios ($70). What changes is the expected value of a credible finding, which rises dramatically with adequate power. A credible effect can be monetized against average hourly wages ($30 / hr in the U.S. office sector) and productivity benchmarks, yielding a potential ROI of 3-5 × the research spend.

Beyond numbers, a priori calculations protect the research team from scope creep. By locking the required N before recruitment, the project avoids the temptation to add participants ad hoc, a practice that erodes statistical integrity and inflates labor costs.


Employ objective productivity metrics (e.g., task completion time logged by software, eye-tracking for focus) alongside subjective scales

Objective data converts behavioral variance into monetary terms. In a 2019 field study of 120 knowledge workers, software-logged task completion time fell by 4.3 % when participants listened to low-tempo country music (average BPM = 78) compared with silence. At an average billable rate of $35 per hour, that translates into a productivity gain of roughly $1.50 per employee for every hour of logged task work, or about $12 over an eight-hour day.
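The conversion from a time saving to dollars is simple enough to sketch. The snippet below assumes the saving applies uniformly to every hour worked and that a working day is eight hours (the $280 daily output value quoted later equals 8 × $35); the function name `time_saving_value` is hypothetical.

```python
def time_saving_value(pct_faster, hourly_rate, hours):
    """Dollar value of completing work pct_faster (as a fraction) faster
    over the given number of hours at the given hourly rate."""
    return pct_faster * hourly_rate * hours


per_hour = time_saving_value(0.043, 35.0, 1)  # one hour of logged task work
per_day = time_saving_value(0.043, 35.0, 8)   # assumed eight-hour day

print(f"per hour: ${per_hour:.2f}, per day: ${per_day:.2f}")
# per hour: $1.50, per day: $12.04
```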

To capture the specific impact of "Smile," we can pair the software logs with eye-tracking metrics that quantify attentional bandwidth. A 2021 experiment using Tobii Pro glasses found that the fixation-duration variance increased by 12 % when participants were exposed to lyrical content they disliked, indicating a higher cognitive load.

Integrating these measures yields a composite productivity index:

Productivity Index = (1 - (Task Time Ratio - 1)) × (1 - (Fixation Variance Ratio - 1))

Each ratio compares the "Smile" condition with the comparison condition, so a ratio of 1.00 means no change and the index equals 1. For example, if the task-time ratio under "Smile" is 1.05 (5 % slower) and the fixation variance ratio is 1.12, the index drops to 0.95 × 0.88 ≈ 0.84, signalling a 16 % efficiency loss. Multiplying that loss by the daily output value ($280 per employee) gives a cost of about $45 per employee per day, which projects to roughly $9,000 per day across a 200-person office.
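The worked example can be reproduced in a few lines. This sketch assumes the index penalizes each ratio's excess over 1.00, which is the reading that yields the 0.84 quoted above; the function names and the two-decimal rounding are illustrative choices, not a published specification.

```python
def productivity_index(task_time_ratio, fixation_variance_ratio):
    """Composite index: 1.0 means no change relative to the comparison condition."""
    return (1 - (task_time_ratio - 1)) * (1 - (fixation_variance_ratio - 1))


def daily_cost(index, daily_output_value, headcount=1):
    """Dollar value of the efficiency loss implied by the index."""
    return (1 - index) * daily_output_value * headcount


# Round to two decimals, as the worked example does.
index = round(productivity_index(1.05, 1.12), 2)
per_employee = daily_cost(index, 280.0)
office = daily_cost(index, 280.0, headcount=200)

print(f"index = {index:.2f}")
print(f"per employee per day = ${per_employee:.2f}")
print(f"200-person office per day = ${office:,.0f}")
```

The unrounded figures come out slightly under the article's $45 and $9,000 because those were computed from the rounded $45.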

Subjective scales still have a role: the NASA-TLX workload rating, when paired with the objective index, explains 68 % of variance in quarterly performance reviews (Pearson r = 0.82). The combined approach not only satisfies academic rigor but also furnishes executives with a clear P&L narrative.


Implement a randomized, double-blind, counterbalanced within-subject design with preregistration and a planned replication study

A within-subject layout maximizes statistical efficiency because each participant serves as their own control, cutting the required N by roughly 30 % relative to a between-subject design. Randomizing and counterbalancing the order of "Smile" exposure and a neutral instrumental track spreads order effects evenly across conditions, while double-blinding (participants unaware of the hypothesis and experimenters masked to condition during data extraction) suppresses demand characteristics.
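A counterbalanced assignment like the one described can be generated reproducibly. The following is a minimal sketch, assuming two conditions and an even number of participants; the condition labels, the function name `counterbalanced_orders`, and the seed are all hypothetical choices for illustration.

```python
import random
from collections import Counter

# The two possible presentation orders for a two-condition design.
ORDERS = (("Smile", "instrumental"), ("instrumental", "Smile"))


def counterbalanced_orders(participant_ids, seed=2024):
    """Randomly assign each participant a track order so that
    both orders are used equally often."""
    ids = list(participant_ids)
    if len(ids) % len(ORDERS) != 0:
        raise ValueError("participant count must be a multiple of the number of orders")
    rng = random.Random(seed)  # fixed seed keeps the allocation auditable
    pool = list(ORDERS) * (len(ids) // len(ORDERS))  # balanced pool of orders
    rng.shuffle(pool)                                # random allocation
    return dict(zip(ids, pool))


assignment = counterbalanced_orders(range(1, 9))
print(Counter(assignment.values()))  # each order appears four times
```

Recording the seed in the preregistration lets a third party regenerate the allocation and confirm it was not adjusted after the fact.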

Preregistration on the Open Science Framework (OSF) locks the analysis plan before data collection. This reduces analytic flexibility, which the 2015 Reproducibility Project: Psychology identified as a major source of inflated false-positive rates. Specifying a mixed-effects model with random intercepts for participants in that plan, rather than falling back on post-hoc t-tests, yields a power increase of about 5 %.

Replication is the final ROI safeguard. By budgeting 20 % of the original study’s spend for a second wave (e.g., $5,000 of a $25,000 project), the research team can confirm the effect in a new sample drawn from a different industry (e.g., tech versus finance). A successful replication halves the risk premium that investors place on the original finding, effectively raising the net present value of the research portfolio.

When the entire pipeline (powered sample, objective metrics, and rigorous design) is aligned, the expected monetary value of a true positive (e.g., uncovering a $45 per employee per day loss) outweighs the total research outlay by a factor of four or more. That is the kind of risk-reward profile that convinces CFOs to fund behavioral investigations.


Q: Why does a small effect size matter for ROI?

Even a 3 % change in productivity can shift annual revenue by millions in large firms. Multiplying that marginal gain across the workforce yields a clear financial upside that justifies the research cost.

Q: How reliable are self-report scales compared to software logs?

Self-reports capture perceived effort but explain only about 40 % of variance in actual output. Software logs provide a direct link to billable hours, making them far more actionable for cost-benefit analysis.

Q: What is the typical cost of recruiting a participant for a lab-based study?

Industry benchmarks place recruitment at $40-$50 per participant for a 30-minute session, including compensation and administrative overhead.

Q: Can the findings be generalized beyond the U.S. office setting?

Not automatically. A within-subject, multi-industry replication strengthens external validity, but extrapolating to global workforces also requires that cultural music preferences be factored into the model.

Q: How does preregistration affect the study’s timeline?

Preregistration adds roughly one week for documentation but saves weeks later by preventing post-hoc analytic pivots, thereby shortening the overall time-to-insight.
