Evidence-based quantified-self metrics for health domains beyond training load. Companion article to Training Metrics and Automated Coaching. Covers sleep science, cross-domain correlations, what can be computed from personal data, and actionable thresholds with clinical grounding.


Sleep Metrics

Sleep is arguably the highest-leverage health behaviour available to most people, yet it is among the least frequently tracked. Evidence from polysomnography (PSG) validation studies, longitudinal cohorts, and randomised controlled trials establishes a clear set of metrics that matter.

Core Objective Sleep Metrics

Metric | Definition | How Computed
Total Sleep Time (TST) | Actual minutes asleep | TIB minus WASO minus sleep latency
Time in Bed (TIB) | Lights-off to lights-on window | Device-measured or self-reported
Sleep Efficiency (SE) | Proportion of TIB spent asleep | (TST / TIB) × 100
Sleep Latency (SL) | Minutes to fall asleep after lights out | Device or diary
Wake After Sleep Onset (WASO) | Total minutes awake after first sleep | Device or PSG
Sleep Onset Time | Clock time at sleep start | Tracked for regularity analysis
Sleep Offset Time | Clock time at wake | Tracked for regularity analysis
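
The derived metrics in the table can be sketched in a few lines of Python. This is a minimal illustration that assumes all inputs are already in minutes; the `Night` class and its field names are hypothetical and not part of any tracker's export format.

```python
from dataclasses import dataclass


@dataclass
class Night:
    """One night's inputs, all in minutes (field names are illustrative)."""
    time_in_bed: float    # TIB: lights-off to lights-on
    sleep_latency: float  # SL: minutes to fall asleep
    waso: float           # WASO: minutes awake after first sleep


def total_sleep_time(night: Night) -> float:
    """TST = TIB - SL - WASO, floored at zero."""
    return max(0.0, night.time_in_bed - night.sleep_latency - night.waso)


def sleep_efficiency(night: Night) -> float:
    """SE = (TST / TIB) x 100, as a percentage."""
    if night.time_in_bed <= 0:
        raise ValueError("time_in_bed must be positive")
    return total_sleep_time(night) / night.time_in_bed * 100.0


night = Night(time_in_bed=480, sleep_latency=15, waso=33)
print(total_sleep_time(night))            # 432
print(round(sleep_efficiency(night), 1))  # 90.0
```

An 8-hour window with 15 minutes of latency and 33 minutes of WASO therefore clears the 85% efficiency target quoted later in this article.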

Sleep Stages

Polysomnography defines four stages. Consumer devices estimate these from accelerometry and photoplethysmography (PPG):

  • N1 (light) — Transition stage; typically <5% of TST. Easy to wake from.
  • N2 (light) — Dominant stage; ~50-60% of TST. Sleep spindles. Memory consolidation.
  • N3 (deep/slow-wave) — 15-20% of TST. Physical restoration, immune function, growth hormone release. Declines with age.
  • REM — 20-25% of TST. Dreams, emotional processing, procedural memory. Increases across the night; longest episodes in hours 6-8.

Consumer device accuracy vs. PSG (2024 validation studies):

Device | Four-stage kappa | N3 sensitivity | REM sensitivity | Notes
Oura Ring Gen 3 | 0.65 | 79.5% | 76-78% | Closest to PSG totals
Apple Watch Series 8 | 0.60 | 50.5% | 69-83% | Best relative MAPE (6.5%); overestimates light by 45 min
Fitbit Sense 2 | 0.55 | 61.7% | 61-62% | Underestimates deep by 15 min
WHOOP 4.0 / Fitbit Charge 5 | 0.21-0.53 | Variable | 59-62% | Garmin poor (REM 33%)

All consumer devices achieve >90% sensitivity for sleep-vs-wake detection but struggle with wake specificity (29-52%). Sleep stage totals should be treated as estimates — trends matter more than nightly stage counts.

Sleep Regularity Metrics

Beyond duration, the timing and consistency of sleep are independently predictive of health outcomes. Two validated instruments:

Sleep Regularity Index (SRI)

Measures the probability that a person is in the same sleep/wake state at any two time points 24 hours apart, averaged over the recording period. Derived from actigraphy or device data.

  • SRI = 100: Perfect regularity (same sleep pattern every day)
  • SRI < 50: Marked irregularity; associated with poor health outcomes
  • Highest regularity quartile: ~30% lower all-cause mortality and 38% lower cardiometabolic mortality vs. lowest quartile (Oxford study, n=72,269; Sleep, 2024)
  • Sleep regularity was a better predictor of all-cause mortality than sleep duration in direct comparison
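
A rough sketch of how an SRI-style score could be computed from epoch-level sleep/wake data follows. It assumes the commonly published scaling, in which the match probability is rescaled so that 100 means identical days and 0 means chance-level agreement; the function name and the input layout (one list of 0/1 epochs per day) are illustrative assumptions, not a tracker API.

```python
def sleep_regularity_index(states: list[list[int]]) -> float:
    """states[d][e] is 1 if asleep and 0 if awake for day d, epoch e.

    Every day must contain the same number of epochs (for example 24 hourly
    epochs, or 2880 thirty-second epochs).
    """
    if len(states) < 2:
        raise ValueError("need at least two days of data")
    epochs = len(states[0])
    if epochs == 0 or any(len(day) != epochs for day in states):
        raise ValueError("all days must have the same, non-zero epoch count")

    matches = 0
    comparisons = 0
    # Compare each epoch with the same epoch 24 hours later.
    for today, tomorrow in zip(states, states[1:]):
        for a, b in zip(today, tomorrow):
            matches += int(a == b)
            comparisons += 1

    # Rescale match probability from [0, 1] to [-100, 100].
    return -100.0 + 200.0 * matches / comparisons


regular = [[1] * 8 + [0] * 16 for _ in range(3)]            # same 8 h every day
shifted = [[1] * 8 + [0] * 16, [0] * 2 + [1] * 8 + [0] * 14]  # second night 2 h later
print(sleep_regularity_index(regular))            # 100.0
print(round(sleep_regularity_index(shifted), 1))  # 66.7
```

Hourly epochs are used here only to keep the example readable; finer epochs give a more faithful score.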

Social Jetlag (SJL)

Misalignment between biological clock and social schedule. Computed from the Munich ChronoType Questionnaire (MCTQ):

SJL = |mid-sleep on free days − mid-sleep on workdays|

  • Mean SJL in working adults: ~2 hours
  • Each additional hour of SJL: 11-31% increase in cardiovascular risk (dose-response)
  • 4 hours SJL: elevated C-reactive protein, blood pressure, insulin resistance
  • Night shift workers: 54% experience ≥4 hours SJL

Practical proxy: the standard deviation of sleep onset time across 7 days gives a useful “regularity score”, computable from Sleep as Android or any continuous sleep tracker.
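
Both the SJL formula and the onset-time proxy involve clock times that wrap around midnight, so a naive mean or standard deviation can be badly wrong (23:30 and 00:30 would average to noon). The sketch below handles the wrap with a circular mean. All function names are illustrative, and times are assumed to be local "HH:MM" strings.

```python
import math


def to_minutes(hhmm: str) -> int:
    """Convert 'HH:MM' clock time to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def mid_sleep(onset: str, offset: str) -> float:
    """Mid-sleep in minutes after midnight; handles sleep that crosses midnight."""
    start = to_minutes(onset)
    duration = (to_minutes(offset) - start) % 1440
    return (start + duration / 2) % 1440


def circular_mean(minutes: list[float]) -> float:
    """Mean clock time on a 24-hour circle, so 23:30 and 00:30 average to 00:00."""
    angles = [2 * math.pi * m / 1440 for m in minutes]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    return (math.atan2(sin_sum, cos_sum) * 1440 / (2 * math.pi)) % 1440


def wrapped_diff(a: float, b: float) -> float:
    """Signed difference a - b in minutes, wrapped into [-720, 720)."""
    return (a - b + 720) % 1440 - 720


def social_jetlag_hours(work_mid: list[float], free_mid: list[float]) -> float:
    """SJL = |mid-sleep on free days - mid-sleep on workdays|, in hours."""
    return abs(wrapped_diff(circular_mean(free_mid), circular_mean(work_mid))) / 60


def onset_sd_minutes(onsets: list[str]) -> float:
    """Sample standard deviation of sleep onset (minutes) around the circular mean."""
    if len(onsets) < 2:
        raise ValueError("need at least two onset times")
    values = [to_minutes(t) for t in onsets]
    center = circular_mean(values)
    squared = [wrapped_diff(v, center) ** 2 for v in values]
    return math.sqrt(sum(squared) / (len(values) - 1))


work = [mid_sleep("23:00", "07:00")] * 5  # mid-sleep 03:00
free = [mid_sleep("01:00", "09:00")] * 2  # mid-sleep 05:00
print(round(social_jetlag_hours(work, free), 2))                         # 2.0
print(round(onset_sd_minutes(["23:30", "23:45", "00:15", "00:30"]), 1))  # 27.4
```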

HRV During Sleep

Heart Rate Variability (RMSSD — root mean square of successive differences) reflects parasympathetic nervous system activity. Sleep is the optimal measurement window because it removes confounders from exercise, caffeine, and meals.

The sleep-HRV relationship:

  • Sleep deprivation significantly reduces RMSSD (p < 0.05) and increases LF/HF ratio (SMD = 1.47, p = 0.0007)
  • Poor sleep quality correlates with reduced overall HRV in 83% of 35+ reviewed studies
  • Pre-sleep RMSSD predicts chronic insomnia with 96-97% accuracy (AUC = 0.997)
  • Pre-sleep RMSSD moderately predicts sleep efficiency (R² = 0.481, p < 0.001)

Measurement protocol (morning, supine):

  1. Immediately upon waking, before rising
  2. Lie flat (supine); breathe naturally — do not deep breathe
  3. 1-5 minutes suffices; lnRMSSD (log-transformed) preferred for trend analysis
  4. Chest strap ECG is gold standard; Oura Ring provides validated PPG-based RMSSD
  5. Build 30-60 day personal baseline; compare current 7-day rolling average to baseline

No universal RMSSD threshold exists — individual baselines vary 10-300 ms. Interpretation is always relative to personal norm:

  • Downward trend for 7+ days below baseline = recovery issue or illness brewing
  • Each standard deviation drop below baseline is a meaningful signal
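
The baseline comparison described above might look like the following sketch. It assumes one morning RMSSD reading per day in milliseconds, a 30-day baseline immediately preceding the most recent 7 days, and a flag when the recent mean sits at least one baseline SD low or when every recent day is below the baseline mean. Those window sizes and the flag rule are illustrative readings of the text, not validated cut-offs.

```python
import math
from statistics import mean, stdev


def hrv_status(rmssd_ms: list[float], baseline_days: int = 30, recent_days: int = 7) -> dict:
    """Compare the recent mean lnRMSSD with the preceding personal baseline."""
    if len(rmssd_ms) < baseline_days + recent_days:
        raise ValueError("not enough readings to form a baseline")

    ln_values = [math.log(v) for v in rmssd_ms]  # lnRMSSD for trend analysis
    baseline = ln_values[-(baseline_days + recent_days):-recent_days]
    recent = ln_values[-recent_days:]

    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    z_score = (mean(recent) - base_mean) / base_sd if base_sd > 0 else 0.0

    # Count consecutive most-recent days below the baseline mean.
    days_below = 0
    for value in reversed(recent):
        if value >= base_mean:
            break
        days_below += 1

    return {
        "z_score": z_score,
        "days_below_baseline": days_below,
        # Assumed rule: a full recent window below baseline, or a 1 SD drop.
        "flag": days_below >= recent_days or z_score <= -1.0,
    }


suppressed = [50, 60] * 15 + [40] * 7
status = hrv_status(suppressed)
print(status["days_below_baseline"], status["flag"])  # 7 True
```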

Cross-Domain Health Correlations

Sleep ↔ Training Performance

The strongest evidence base of any lifestyle↔performance correlation:

Sleep manipulation | Performance effect
10-hour extension (basketball) | Sprint speed +4.5%, free throw % +9%, 3-point % +9% (Stanford sleep extension study, Mah et al.)
10-hour extension (swimming) | Faster reaction time off blocks, improved turn times, kick strokes
9+ hours (tennis) | Serving accuracy 36% → 42%
Sleep extension (general) | Reaction time improved 15%
Single night partial restriction (4 hrs) | Decline in average and max anaerobic power output
Chronic poor sleep | Athletes 3× more likely to report poor performance
Sleep deprivation | Shooting accuracy can drop 50% vs. 10% gain with extension (60% differential)

The mechanism is multifactorial: slower reaction time, impaired cognitive decision-making, decreased pain tolerance, blunted growth hormone release (which peaks in N3), and elevated cortisol (catabolic).

Agent application: A sleep quality flag in the days before a key workout or event should modify effort targets and expectations downward.

HRV ↔ Recovery Readiness

(See also Training Metrics.) The key cross-domain insight:

  • HRV is bidirectionally linked to sleep: poor sleep suppresses HRV; elevated training stress suppresses HRV and sleep quality
  • Combined poor sleep + poor HRV = multiplicative, not additive, risk for illness and injury
  • HRV downward trend + elevated RHR + shortened TST for 3+ nights = strong overreaching signal

Resting HR ↔ Recovery

Resting Heart Rate (RHR) — most reliable when measured at the same time each morning before rising:

  • Gradual decline over weeks-months: Indicates improving aerobic fitness (VO₂max adaptation)
  • ≥10% acute elevation above personal baseline: Recovery impairment signal; body working overtime (fighting infection, poorly recovered from training, dehydration, poor sleep)
  • <10% elevation: Minor fluctuation; examine context (hydration, diet, stress)

RHR is noisier than HRV (more influenced by genetics, hydration, caffeine) so use 7-day rolling baseline comparison, not single readings.
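
A minimal sketch of that comparison, assuming one morning reading per day in bpm: the baseline is the mean of the 7 readings before the most recent 3, and the flag is raised only if all 3 recent readings are at least 10% above it. The 3-day run length is taken from the threshold summary later in this article; the function and parameter names are illustrative.

```python
from statistics import mean


def rhr_flag(
    rhr_bpm: list[float],
    baseline_days: int = 7,
    run_days: int = 3,
    threshold_pct: float = 10.0,
) -> tuple[bool, list[float]]:
    """Return (flag, elevations) where elevations are percent above baseline."""
    if len(rhr_bpm) < baseline_days + run_days:
        raise ValueError("not enough readings")

    # Baseline excludes the recent run so a spike cannot inflate its own reference.
    baseline = mean(rhr_bpm[-(baseline_days + run_days):-run_days])
    recent = rhr_bpm[-run_days:]
    elevations = [(value - baseline) / baseline * 100 for value in recent]
    return all(e >= threshold_pct for e in elevations), elevations


flag, elevations = rhr_flag([50] * 7 + [56, 57, 58])
print(flag, [round(e) for e in elevations])  # True [12, 14, 16]

flag, _ = rhr_flag([50] * 9 + [60])          # single-day spike only
print(flag)                                  # False
```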

Weight ↔ Physical Activity

Longitudinal evidence from 33-year cohort data and RCT meta-analyses:

Relationship | Evidence
Diet + exercise vs. diet alone | 20% greater weight loss at 1-year follow-up
High PA (>2,500 kcal/week) | 2.9 kg regain at 30 months vs. >6 kg for lower activity
Over 33 years, active vs. inactive | Men: 5.6 kg vs. 9.1 kg gained; women: 3.8 kg vs. 9.5 kg
Aerobic vs. resistance for fat loss | Aerobic: −1.76 kg fat mass; resistance: −0.83 kg
Exercise preserves lean mass | Diet-only restriction reduces lean body mass and VO₂max
Compensatory behaviour | Increased exercise → reduced non-exercise activity (partial offset)

Critical nuance: fitness is a stronger predictor of mortality than weight loss. Increased cardiorespiratory fitness reduces mortality and cardiovascular risk more than intentional weight reduction alone.

Sleep Regularity ↔ Metabolic Health

Irregular sleep patterns disrupt the hormonal oscillations of cortisol, insulin, and glucagon. Circadian misalignment (even 4 days of ±1 hour shift while maintaining 8-hour duration) produces measurable sympathetic nervous system activation.

  • Irregular sleep → higher BMI, insulin resistance, type 2 diabetes risk
  • Each 1-hour shift in sleep timing → 11-31% cardiovascular risk increase (the same dose-response range given under Social Jetlag)
  • CPAP treatment for sleep apnea improves both sleep quality and HRV, suggesting reversibility

What We Can Compute From Existing Data

Assessment of the personal MCP server domains (as of 2026-02-18):

Available Domains

Domain | Status | Records | Date Range | Key Fields
fitness (Strava) | ✅ Available | 500+ | Apr 2020 – Feb 2026 | activity_type, distance, duration, avg_heartrate*, max_heartrate*
weight (Withings) | ✅ Available | 500+ | Feb 2020 – Feb 2026 | weight_kg, bmi, fat_ratio*, muscle_mass*
music (Last.fm) | ✅ Available |  |  | scrobbles, timestamps
gaming (Steam/RetroAch) | ✅ Available |  |  | sessions, achievements
films (Letterboxd) | ✅ Available |  |  | views, ratings
boardgames (BGG) | ✅ Available |  |  | plays
sleep (Sleep as Android) | ❌ Unavailable | 0 |  | (data file not loaded)

*Fields present in schema but currently null — Strava heart rate requires device capture at activity time; Withings body composition requires Body+ or higher model with wet feet contact.

Current Data Gaps

Heart rate data in fitness: All sampled activities show average_heartrate: null. This means:

  • Cannot currently compute aerobic decoupling
  • Cannot compute exercise HR zones or training stress score
  • Cannot track resting HR trends via Strava
  • Fix: Pair a HR monitor (chest strap or GPS watch) with Strava recording

Body composition in weight: All sampled records show fat_ratio: null, muscle_mass_kg: null. This means:

  • Currently tracking only raw weight + BMI
  • Cannot distinguish fat loss from muscle loss/gain
  • Fix: Withings Body+ or Body Comp model (bioimpedance analysis) with bare feet on wet pads

Sleep data: Domain loaded as unavailable (file path /data/sleep not populated). Cannot currently track any sleep metrics.

  • Fix: Export Sleep as Android data, add to personal MCP server, enable domain

Computable Metrics from Current Data

Despite gaps, meaningful analytics are possible today:

From Fitness Data

# Activity Frequency (days/week, rolling 4 weeks)
Active days/week = count(distinct activity dates in 28-day window) / 4

# Weekly Volume Trend (proxy for training load)
Weekly minutes = sum(moving_time_seconds/60) grouped by ISO week

# Activity Type Balance
Ratio of Ride:Hike:Walk:WeightTraining across rolling 4-week window

# ACWR Proxy (no HR, use duration as load proxy)
Acute load (7d) = sum(moving_time_seconds, last 7 days)
Chronic load (28d) = sum(moving_time_seconds, last 28 days) / 4    # mean weekly load
ACWR = Acute / Chronic
Alert if ACWR > 1.3

# Rest Day Distribution
Days between activities (detect extended gaps >5 days = low streak periods)

# Consistency Score
Active weeks / Total weeks in date range
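
A possible Python rendering of three of the fitness formulas above. It assumes activities have already been reduced to `(date, moving_time_seconds)` pairs; that tuple layout and the function names are illustrative and do not reflect the actual MCP server schema.

```python
from collections import defaultdict
from datetime import date, timedelta

Activity = tuple[date, float]  # (activity date, moving_time_seconds)


def weekly_minutes(activities: list[Activity]) -> dict[tuple[int, int], float]:
    """Sum moving time in minutes per (ISO year, ISO week)."""
    totals: dict[tuple[int, int], float] = defaultdict(float)
    for day, seconds in activities:
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += seconds / 60
    return dict(totals)


def active_days_per_week(activities: list[Activity], today: date, window_days: int = 28) -> float:
    """Distinct active dates in the window, expressed per week."""
    days = {d for d, _ in activities if 0 <= (today - d).days < window_days}
    return len(days) / (window_days / 7)


def acwr(activities: list[Activity], today: date):
    """Acute (7 d) load over chronic (mean weekly load across 28 d), duration as load proxy."""
    acute = sum(s for d, s in activities if 0 <= (today - d).days < 7)
    chronic = sum(s for d, s in activities if 0 <= (today - d).days < 28) / 4
    return acute / chronic if chronic > 0 else None


today = date(2026, 2, 15)
steady = [(today - timedelta(days=i), 3600) for i in range(28)]
spike = [(today - timedelta(days=i), 7200 if i < 7 else 3600) for i in range(28)]
print(acwr(steady, today))                  # 1.0
print(acwr(spike, today))                   # 1.6, above the 1.3 alert level
print(active_days_per_week(steady, today))  # 7.0
```

Note that the acute week is included in the chronic window here, which matches the formula as written above; some ACWR variants exclude it.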

From Weight Data

# 7-Day Moving Average (noise filter)
trend[n] = mean(weight[n-6 .. n])
Display trend line, not daily values

# Rate of Change (weekly)
weekly_delta = trend[day7] - trend[day0]
Alert if weekly_delta > +0.5 kg/week sustained 3+ weeks (assuming non-training context)
Alert if weekly_delta < -1.0 kg/week (loss too aggressive, lean mass risk)

# BMI Trajectory
Track BMI category transitions over 90-day windows

# Long-term Weight Arc
Rolling 90-day regression slope — positive/negative/flat determination
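
The weight formulas above could be sketched as follows, assuming one weigh-in per day with no gaps (real Withings data would need resampling or interpolation first). The alert helper checks a single weekly delta only; the "sustained 3+ weeks" condition would need the delta tracked over consecutive weeks. All names are illustrative.

```python
def moving_average(values: list[float], window: int = 7) -> list[float]:
    """Trailing mean over `window` days; the first window-1 days are dropped."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def weekly_delta(trend: list[float]) -> float:
    """Change in the trend line over the last 7 days (kg/week)."""
    if len(trend) < 8:
        raise ValueError("need at least 8 trend points")
    return trend[-1] - trend[-8]


def regression_slope(values: list[float]) -> float:
    """Ordinary least-squares slope against day index (kg/day)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def weight_alert(delta_kg_per_week: float):
    """Apply the two weekly thresholds; returns None when neither is crossed."""
    if delta_kg_per_week > 0.5:
        return "gain above +0.5 kg/week"
    if delta_kg_per_week < -1.0:
        return "loss faster than 1.0 kg/week"
    return None


weights = [80.0 + 0.1 * day for day in range(30)]  # steady 0.1 kg/day gain
trend = moving_average(weights)
print(round(weekly_delta(trend), 2))         # 0.7
print(round(regression_slope(weights), 3))   # 0.1
print(weight_alert(weekly_delta(trend)))     # gain above +0.5 kg/week
```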

Cross-Domain Correlations (Fitness × Weight)

# Exercise-Week vs Weight-Change Correlation
For each week:
  - Compute weekly active minutes (fitness)
  - Compute week-over-week weight change (weight)
  
Pearson r between these two variables across all overlapping weeks
Expected: negative correlation (more activity → less weight gain or more loss)

# Active vs Sedentary Month Comparison
Months with >8 active days: mean weight change
Months with <4 active days: mean weight change
Expected: statistically significant difference

# Weight Trend After Gaps
Track weight trend in 2 weeks following >7-day inactivity gaps
Expected: weight gain correlation with training gaps
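
The first of these comparisons, the weekly exercise versus weight-change correlation, might be sketched as below. Weeks are assumed to be keyed by a simple integer index, with one active-minutes total and one mean trend weight per week; that keying and the sample figures are made up for illustration.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    x_var = sum((x - x_mean) ** 2 for x in xs)
    y_var = sum((y - y_mean) ** 2 for y in ys)
    if x_var == 0 or y_var == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(x_var * y_var)


def weekly_pairs(minutes: dict[int, float], weight: dict[int, float]):
    """Pair each week's active minutes with its week-over-week weight change.

    Only weeks with activity data and weight data for both that week and the
    previous one are kept.
    """
    xs, ys = [], []
    for week in sorted(minutes):
        if week in weight and week - 1 in weight:
            xs.append(minutes[week])
            ys.append(weight[week] - weight[week - 1])
    return xs, ys


minutes = {1: 300, 2: 200, 3: 100, 4: 0}                # hypothetical data
weight = {0: 80.0, 1: 79.7, 2: 79.6, 3: 79.7, 4: 80.0}  # hypothetical data
xs, ys = weekly_pairs(minutes, weight)
print(round(pearson_r(xs, ys), 3))  # -1.0
```

With real data the coefficient will be far noisier, and a correlation across weeks says nothing about causation; compensatory eating and water retention both blur the signal.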

From Music/Gaming Data (Behavioural Proxies)

While not direct health metrics, late-night gaming/listening sessions correlate with:

  • Sleep delay (entertainment displacing sleep)
  • Later sleep onset time (computable from last-scrobble/session timestamp)

# Approximate Sleep Delay Proxy
Late-night activity (gaming/music after midnight) on day N
→ Check weight trend and next-day activity on days N+1, N+2
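
One way to derive the late-night flag from scrobble or session timestamps is sketched here. It assumes timestamps are naive local datetimes, treats anything between midnight and 04:00 as belonging to the previous evening (day N), and keeps the latest such timestamp per night as a rough lower bound on sleep onset. The 04:00 cutoff is an arbitrary illustrative choice.

```python
from datetime import date, datetime, time, timedelta


def late_nights(timestamps: list[datetime], cutoff: time = time(4, 0)) -> dict[date, datetime]:
    """Map each night (the date the evening started) to its last after-midnight activity."""
    nights: dict[date, datetime] = {}
    for ts in timestamps:
        if ts.time() < cutoff:
            night = ts.date() - timedelta(days=1)  # attribute to day N, not N+1
            if night not in nights or ts > nights[night]:
                nights[night] = ts
    return nights


events = [
    datetime(2026, 2, 10, 23, 50),  # before midnight: not counted
    datetime(2026, 2, 11, 0, 40),
    datetime(2026, 2, 11, 1, 15),   # last activity of the night of 10 Feb
    datetime(2026, 2, 12, 22, 0),
]
for night, last_seen in late_nights(events).items():
    print(night, last_seen.time())  # 2026-02-10 01:15:00
```

The resulting night dates can then be joined against next-day activity and weight trend on days N+1 and N+2.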

Actionable Thresholds Summary

Sleep

Metric | Target | Concern | Action
Total Sleep Time | 7-9 hrs/night | <6 hrs or >9 hrs | Flag; U-shaped mortality curve peaks at <5 hrs (HR 1.74 all-cause) and >9 hrs (HR 2.11 cardiovascular)
Sleep Efficiency | ≥85% | <80% | Cognitive behavioural therapy for insomnia (CBT-I) most effective; restrict time in bed initially
Sleep Latency | <20 min | >30 min chronic | Sleep hygiene review; rule out anxiety, delayed circadian phase
WASO | <20 min | >45 min | Sleep maintenance issue; rule out sleep apnea, pain, or nocturia
N3 (deep sleep) | 15-20% of TST | <10% | Age-related (expected decline); worsened by alcohol, sedatives
REM | 20-25% of TST | <15% | Often suppressed by alcohol, SSRIs, certain antihistamines
Sleep Onset Consistency | SD of onset ≤30 min | SD >60 min | Social jetlag; target consistent bedtime even on weekends
Sleep Regularity Index | SRI > 80 | SRI < 50 | Significant circadian disruption; prioritise schedule consistency
Social Jetlag | <1 hr | >2 hrs | Each additional hour = 11-31% cardiovascular risk increase

HRV / Autonomic

Metric | Interpretation | Threshold
Morning RMSSD | Relative to personal baseline | >1 SD below 7-day average = concern
7-day RMSSD trend | Sustained decline | Decline >7 consecutive days below baseline = recovery issue
LF/HF ratio | Sympathetic/parasympathetic balance | Rising trend = autonomic stress; no absolute threshold
lnRMSSD | Normalised for trend analysis | Use log-transformed value; track SD bands, not absolute value

No population-wide RMSSD threshold is evidence-based — individual baselines vary too widely. Establish personal norm over 30-60 days before acting on single readings.

Resting Heart Rate

Level | Interpretation | Action
Gradual decline over months | Improving fitness | Continue training stimulus
Stable within personal range | Normal | No action
≥10% above baseline (3+ days) | Recovery/illness/overtraining signal | Reduce training load; assess sleep, hydration, illness
Acute single-day spike | Insufficient data | Note context; observe trend

Weight

Signal | Threshold | Interpretation
7-day trend rate | > +0.5 kg/week | Caloric surplus; assess via activity + intake
7-day trend rate | < −1.0 kg/week (loss faster than 1.0 kg/week) | Excessive deficit; lean mass risk
Monthly comparison | Active vs sedentary months | Expect ~0.5-1 kg difference over consistent tracking
BMI trajectory | 90-day regression slope | Any positive slope + declining activity = intervention signal

Training / Activity

(From companion article Training Metrics and Automated Coaching)

Metric | Target | Alert
ACWR (duration proxy) | 0.8–1.3 | >1.3 for 5+ days = overreaching risk
TSB | −10 to −30 | Below −30 sustained >7 days = deload
Active weeks/month | ≥3 of 4 | <2 active weeks/month = deconditioning
Weekly volume change | ≤10-15% increase | Connective tissue risk above this rate

Instrumentation Recommendations

For meaningful cross-domain health analytics, the following additions close the largest data gaps:

Priority 1: Heart Rate During Exercise

  • Gap: Strava activities show no HR data.
  • Solution: Any ANT+/BLE chest strap (Garmin, Wahoo, Polar) or GPS watch paired with the device recording to Strava.
  • Unlocks: Training stress score, zone distribution, HRV-informed load management, aerobic decoupling (cardiac drift).

Priority 2: Sleep Tracking

  • Gap: Sleep domain not loaded in personal MCP.
  • Solution: Sleep as Android (already named in domain config); needs data export path populated, or manual sync.
  • Unlocks: TST, sleep efficiency, staging, onset consistency, SRI proxy, late-night activity correlation.
  • Alternatives: Oura Ring Gen 3 (most validated consumer device; kappa 0.65 vs. PSG); WHOOP 4.0.

Priority 3: Body Composition

  • Gap: Withings scale shows no fat/muscle data (all null).
  • Solution: Withings Body+ or Body Comp model (BIA electrodes). Alternatively, manual tape measurements (waist-hip ratio is a validated cardiovascular risk proxy independent of BMI).
  • Unlocks: Fat mass vs. lean mass trends, visceral fat tracking, whether weight changes are compositionally healthy.

Priority 4: Morning HRV

  • Gap: No current HRV measurement.
  • Solution: Polar H10 chest strap + HRV4Training app (validated, 1-minute morning reading); or Oura Ring (passive overnight measurement).
  • Unlocks: Autonomic recovery signal; integrates with training load for deload decisions.


Data Model for Cross-Domain Correlation Analysis

When sleep, weight, and fitness data coexist, the following correlation matrix becomes computable:

Variables (daily/weekly):
A = daily_active_minutes (fitness)
W = weight_trend_delta (weight)
S = total_sleep_time (sleep, not yet available)
E = sleep_efficiency (sleep, not yet available)
H = morning_rmssd (HRV, not yet available)
R = resting_heartrate (HR monitor, not yet available)
L = late_night_activity (music/gaming proxy, available)

Validated correlations to compute:
- A ↔ W (inverse): more activity → less weight gain
- S ↔ A (positive): better sleep → more/harder training
- H ↔ A (positive next day): higher HRV → higher training load appropriate
- S ↔ H (positive): longer sleep → higher next-morning HRV
- L ↔ S (inverse): late-night activity → shorter sleep (proxy)
- R ↔ H (negative): higher RHR → lower HRV (stress)

The current dataset supports A↔W. All others require additional sensors.
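
As a rough sketch of this data model, the snippet below represents one day as a record with optional fields and reports which of the listed variable pairs have enough overlapping non-null days to correlate. The class name, field names, and the 14-day minimum overlap are assumptions made for illustration; none of them come from the MCP server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DailyRecord:
    """One day across domains; None means the sensor or domain had no value."""
    active_minutes: Optional[float] = None       # A
    weight_trend_delta: Optional[float] = None   # W
    total_sleep_time: Optional[float] = None     # S
    sleep_efficiency: Optional[float] = None     # E
    morning_rmssd: Optional[float] = None        # H
    resting_heartrate: Optional[float] = None    # R
    late_night_activity: Optional[float] = None  # L


# The six correlations listed above, as field-name pairs.
PAIRS = [
    ("active_minutes", "weight_trend_delta"),
    ("total_sleep_time", "active_minutes"),
    ("morning_rmssd", "active_minutes"),
    ("total_sleep_time", "morning_rmssd"),
    ("late_night_activity", "total_sleep_time"),
    ("resting_heartrate", "morning_rmssd"),
]


def computable_pairs(records: list[DailyRecord], min_overlap: int = 14) -> list[tuple[str, str]]:
    """Return the pairs with at least `min_overlap` days where both fields are present."""
    usable = []
    for left, right in PAIRS:
        overlap = sum(
            1 for r in records
            if getattr(r, left) is not None and getattr(r, right) is not None
        )
        if overlap >= min_overlap:
            usable.append((left, right))
    return usable


# Only fitness, weight, and the late-night proxy are populated today.
today_data = [
    DailyRecord(active_minutes=30.0, weight_trend_delta=-0.02, late_night_activity=0.0)
    for _ in range(20)
]
print(computable_pairs(today_data))  # [('active_minutes', 'weight_trend_delta')]
```

The late-night proxy is populated but still yields no pair, because its only listed partner (total sleep time) is missing.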


Sources

Primary Sources

  • Buysse et al. (1989). Pittsburgh Sleep Quality Index: validation. Psychiatry Research. (89.6% sensitivity, 86.5% specificity at PSQI > 5)
  • Cardinali et al. / Chellappa et al. studies on circadian misalignment and cardiovascular autonomic changes.
  • Lunsford-Avery et al. (2018). Sleep Regularity Index derivation and validation in young adults.
  • Piotrowicz et al. on HRV-sleep interaction and morning measurement protocols.

Secondary Sources

  • Oxford population cohort (2024). Sleep regularity and mortality (n=72,269). Sleep. DOI: 10.1093/sleep/zsad285. “Highest regularity: ~30% lower all-cause mortality, 38% lower cardiometabolic mortality.”
  • JAMA Network Open cohort (2021). Sleep duration trajectories and cardiovascular outcomes (n=52,599).
  • Frontiers in Public Health (2022). U-shaped sleep duration and mortality meta-analysis.
  • Stanford Basketball sleep extension study (Mah et al.). Sprint speed +4.5%, shooting accuracy +9%.
  • Social jetlag and cardiovascular risk: PMC9286443; AASM 2017 statement.
  • Consumer device validation: PMC11511193 (2024); Oura Ring validation study (2024, ouraring.com); Sleep Advances (2024).
  • HRV-sleep meta-analysis: PMC12394884 (2025). RMSSD and sleep deprivation.
  • Weight-activity longitudinal: PMC4578965, PMC5556592 (33-year cohort).
  • Daily weigh-in evidence: PMC4380831; JMIR 2021/e25529.
  • Altini, M. “How should you measure your morning HRV?” (substack, evidence-based protocol).

Further Reading

  • Walker, Matthew. Why We Sleep (2017). Accessible synthesis of sleep science.
  • Marco Altini’s newsletter: marcoaltini.substack.com — most rigorous consumer HRV commentary.
  • Oura Ring Science pages — validation methodology transparency.
  • National Sleep Foundation consensus statements on sleep duration (Hirshkowitz et al., 2015).
  • Mah et al. (2011). “The effects of sleep extension on the athletic performance of collegiate basketball players.” SLEEP.