Evidence-based quantified-self metrics for health domains beyond training load. Companion article to Training Metrics and Automated Coaching. Covers sleep science, cross-domain correlations, what can be computed from personal data, and actionable thresholds with clinical grounding.
Sleep Metrics
Sleep is the highest-leverage health behaviour available to most people, yet it is often the one left untracked. Evidence from polysomnography (PSG) validation studies, longitudinal cohorts, and randomised controlled trials establishes a clear set of metrics that matter.
Core Objective Sleep Metrics
| Metric | Definition | How Computed |
|---|---|---|
| Total Sleep Time (TST) | Actual minutes asleep | TIB minus WASO minus sleep latency |
| Time in Bed (TIB) | Lights-off to lights-on window | Device-measured or self-reported |
| Sleep Efficiency (SE) | Proportion of TIB spent asleep | (TST / TIB) × 100 |
| Sleep Latency (SL) | Minutes to fall asleep after lights out | Device or diary |
| Wake After Sleep Onset (WASO) | Total minutes awake after first sleep | Device or PSG |
| Sleep Onset Time | Clock time at sleep start | Tracked for regularity analysis |
| Sleep Offset Time | Clock time at wake | Tracked for regularity analysis |
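The two derived rows in the table (TST and SE) follow directly from the other fields. A minimal sketch, assuming all inputs are in minutes for a single night; the function name `sleep_summary` and the dictionary keys are illustrative, not part of any tracker's API:

```python
def sleep_summary(tib_min: float, latency_min: float, waso_min: float) -> dict:
    """Derive Total Sleep Time and Sleep Efficiency for one night."""
    if tib_min <= 0:
        raise ValueError("time in bed must be positive")
    # TST = TIB - sleep latency - WASO (floored at zero to absorb bad inputs)
    tst_min = max(float(tib_min) - latency_min - waso_min, 0.0)
    # SE = (TST / TIB) x 100
    efficiency_pct = 100.0 * tst_min / tib_min
    return {"tst_min": tst_min, "efficiency_pct": efficiency_pct}


# 8 h in bed, 15 min to fall asleep, 33 min awake during the night
print(sleep_summary(tib_min=480, latency_min=15, waso_min=33))
# {'tst_min': 432.0, 'efficiency_pct': 90.0}
```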
Sleep Stages
Polysomnography defines four stages. Consumer devices estimate these from accelerometry and photoplethysmography (PPG):
- N1 (light) — Transition stage; typically <5% of TST. Easy to wake from.
- N2 (light) — Dominant stage; ~50-60% of TST. Sleep spindles. Memory consolidation.
- N3 (deep/slow-wave) — 15-20% of TST. Physical restoration, immune function, growth hormone release. Declines with age.
- REM — 20-25% of TST. Dreams, emotional processing, procedural memory. Increases across the night; longest episodes in hours 6-8.
Consumer device accuracy vs. PSG (2024 validation studies):
| Device | Four-stage kappa | N3 sensitivity | REM sensitivity | Notes |
|---|---|---|---|---|
| Oura Ring Gen 3 | 0.65 | 79.5% | 76-78% | Closest to PSG totals |
| Apple Watch Series 8 | 0.60 | 50.5% | 69-83% | Best relative MAPE (6.5%); overestimates light by 45 min |
| Fitbit Sense 2 | 0.55 | 61.7% | 61-62% | Underestimates deep by 15 min |
| WHOOP 4.0 / Fitbit Charge 5 | 0.21-0.53 | Variable | 59-62% | Garmin poor (REM 33%) |
All consumer devices achieve >90% sensitivity for sleep-vs-wake detection but struggle with wake specificity (29-52%). Sleep stage totals should be treated as estimates — trends matter more than nightly stage counts.
Sleep Regularity Metrics
Beyond duration, the timing and consistency of sleep are independently predictive of health outcomes. Two validated instruments:
Sleep Regularity Index (SRI)
Measures probability that sleep/wake state is the same at any two time points 24 hours apart. Derived from actigraphy or device data.
- SRI = 100: Perfect regularity (same sleep pattern every day)
- SRI < 50: Marked irregularity; associated with poor health outcomes
- Highest regularity quartile: ~30% lower all-cause mortality and 38% lower cardiometabolic mortality vs. lowest quartile (Oxford study, n=72,269; Sleep, 2024)
- Sleep regularity was a better predictor of all-cause mortality than sleep duration in direct comparison
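The SRI definition above can be sketched as a match rate between each epoch and the same epoch 24 hours later. This sketch assumes sleep/wake states have already been binned into equal-length epochs per calendar day (for example 1,440 one-minute epochs), and it uses the commonly published rescaling in which identical days score 100 and chance-level agreement scores 0. The function name and input layout are illustrative.

```python
def sleep_regularity_index(days: list[list[int]]) -> float:
    """SRI from per-day epoch states (1 = asleep, 0 = awake).

    Every day must use the same epoch grid, e.g. 1440 one-minute epochs.
    """
    if len(days) < 2:
        raise ValueError("need at least two days of data")
    epochs = len(days[0])
    if epochs == 0 or any(len(day) != epochs for day in days):
        raise ValueError("all days must share the same non-zero epoch count")

    matches = 0
    pairs = 0
    # Compare each epoch with the epoch at the same clock time on the next day
    for today, tomorrow in zip(days, days[1:]):
        for state_now, state_next in zip(today, tomorrow):
            matches += int(state_now == state_next)
            pairs += 1

    # Rescale the match probability: identical days -> 100, coin-flip agreement -> 0
    return -100.0 + 200.0 * matches / pairs
```

Real exports usually need gap handling (missing nights, device off-wrist) before this calculation is meaningful; that step is left out here.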
Social Jetlag (SJL)
Misalignment between the biological clock and the social schedule. Computed from the Munich ChronoType Questionnaire (MCTQ):
SJL = |mid-sleep on free days − mid-sleep on workdays|
- Mean SJL in working adults: ~2 hours
- Each additional hour of SJL: 11-31% increase in cardiovascular risk (dose-response)
- 4 hours SJL: elevated C-reactive protein, blood pressure, insulin resistance
- Night shift workers: 54% experience ≥4 hours SJL
Practical proxy: Standard deviation of sleep onset time across 7 days. A useful “regularity score” computable from Sleep as Android or any continuous sleep tracker.
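Both the SJL formula and the onset-time proxy need care around midnight, since 23:30 and 00:30 are one hour apart rather than 23. A minimal sketch, assuming clock times are supplied as `datetime.time` values in local time; the helper names are illustrative, and anchoring the axis at noon is a simplification that works as long as no sleep onset or mid-sleep falls near midday:

```python
from datetime import time
from statistics import mean, stdev


def minutes_after_noon(t: time) -> int:
    """Map a clock time onto a noon-anchored axis so 23:30 and 00:30 sit 60 min apart."""
    return (t.hour * 60 + t.minute - 720) % 1440


def social_jetlag_hours(free_midsleeps: list[time], work_midsleeps: list[time]) -> float:
    """SJL = |mean mid-sleep on free days - mean mid-sleep on workdays|, in hours."""
    free = mean([minutes_after_noon(t) for t in free_midsleeps])
    work = mean([minutes_after_noon(t) for t in work_midsleeps])
    return abs(free - work) / 60.0


def onset_sd_minutes(onsets: list[time]) -> float:
    """Sample standard deviation of sleep onset time, in minutes."""
    return stdev([minutes_after_noon(t) for t in onsets])


week = [time(23, 30), time(23, 45), time(0, 10), time(23, 20),
        time(1, 30), time(2, 0), time(23, 40)]
print(round(onset_sd_minutes(week), 1))
```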
HRV During Sleep
Heart Rate Variability (RMSSD — root mean square of successive differences) reflects parasympathetic nervous system activity. Sleep is the optimal measurement window because it removes confounders from exercise, caffeine, and meals.
The sleep-HRV relationship:
- Sleep deprivation significantly reduces RMSSD (p < 0.05) and increases LF/HF ratio (SMD = 1.47, p = 0.0007)
- Poor sleep quality correlates with reduced overall HRV in 83% of 35+ reviewed studies
- Pre-sleep RMSSD predicts chronic insomnia with 96-97% accuracy (AUC = 0.997)
- Pre-sleep RMSSD moderately predicts sleep efficiency (R² = 0.481, p < 0.001)
Measurement protocol (morning, supine):
- Immediately upon waking, before rising
- Lie flat (supine); breathe naturally — do not deep breathe
- 1-5 minutes suffices; lnRMSSD (log-transformed) preferred for trend analysis
- Chest strap ECG is gold standard; Oura Ring provides validated PPG-based RMSSD
- Build 30-60 day personal baseline; compare current 7-day rolling average to baseline
No universal RMSSD threshold exists — individual baselines vary 10-300 ms. Interpretation is always relative to personal norm:
- Downward trend for 7+ days below baseline = recovery issue or illness brewing
- Each standard deviation drop below baseline is a meaningful signal
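The baseline comparison described above can be sketched as follows. RMSSD is computed from successive RR intervals, log-transformed, and the most recent 7-day mean is expressed in standard deviations relative to the preceding baseline window. The window lengths follow the protocol in the text; the function names and the choice to exclude the recent week from the baseline are assumptions of this sketch.

```python
import math
from statistics import mean, stdev


def rmssd(rr_ms: list[float]) -> float:
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(mean([d * d for d in diffs]))


def baseline_z(daily_rmssd: list[float], baseline_days: int = 30, recent_days: int = 7) -> float:
    """Z-score of the recent lnRMSSD mean against the preceding baseline window."""
    if len(daily_rmssd) < baseline_days + recent_days:
        raise ValueError("not enough days to form baseline and recent windows")
    ln_values = [math.log(v) for v in daily_rmssd]
    baseline = ln_values[-(baseline_days + recent_days):-recent_days]
    recent = ln_values[-recent_days:]
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation")
    return (mean(recent) - mean(baseline)) / spread
```

A result at or below -1 corresponds to the "one standard deviation below baseline" signal; a single low morning reading should not be fed in on its own.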
Cross-Domain Health Correlations
Sleep ↔ Training Performance
The strongest evidence base of any lifestyle↔performance correlation:
| Sleep manipulation | Performance effect |
|---|---|
| 10-hour extension (basketball) | Sprint speed +4.5%, free throw % +9%, 3-point % +9% (Stanford RCT) |
| 10-hour extension (swimming) | Faster reaction time off blocks, improved turn times, kick strokes |
| 9+ hours (tennis) | Serving accuracy 36% → 42% |
| Sleep extension (general) | Reaction time improved 15% |
| Single night partial restriction (4 hrs) | Decline in average and max anaerobic power output |
| Chronic poor sleep | Athletes 3× more likely to report poor performance |
| Sleep deprivation | Shooting accuracy can drop 50% vs. 10% gain with extension (60% differential) |
The mechanism is multifactorial: slower reaction time, impaired cognitive decision-making, decreased pain tolerance, blunted growth hormone release (which peaks in N3), and elevated cortisol (catabolic).
Agent application: A sleep quality flag in the days before a key workout or event should modify effort targets and expectations downward.
HRV ↔ Recovery Readiness
(See also Training Metrics.) The key cross-domain insight:
- HRV is bidirectionally linked to sleep: poor sleep suppresses HRV; elevated training stress suppresses HRV and sleep quality
- Combined poor sleep + poor HRV = multiplicative, not additive, risk for illness and injury
- HRV downward trend + elevated RHR + shortened TST for 3+ nights = strong overreaching signal
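The combined signal in the last bullet can be sketched as a run-length check. This is a simplified reading: "downward trend" is approximated as RMSSD below baseline on each night, and the three baselines are supplied by the caller. The `DayReading` type and `overreaching_flag` function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class DayReading:
    rmssd_ms: float   # morning or overnight RMSSD
    rhr_bpm: float    # resting heart rate
    tst_hours: float  # total sleep time


def overreaching_flag(days: list[DayReading], base_rmssd: float, base_rhr: float,
                      base_tst: float, min_run: int = 3) -> bool:
    """True if low HRV, elevated RHR and short sleep coincide for min_run consecutive days."""
    run = 0
    for day in days:
        all_three = (day.rmssd_ms < base_rmssd
                     and day.rhr_bpm > base_rhr
                     and day.tst_hours < base_tst)
        run = run + 1 if all_three else 0  # any normal day resets the streak
        if run >= min_run:
            return True
    return False
```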
Resting HR ↔ Recovery
Resting Heart Rate (RHR) — most reliable when measured at the same time each morning before rising:
- Gradual decline over weeks-months: Indicates improving aerobic fitness (VO₂max adaptation)
- ≥10% acute elevation above personal baseline: Recovery impairment signal; body working overtime (fighting infection, poorly recovered from training, dehydration, poor sleep)
- <10% elevation: Minor fluctuation; examine context (hydration, diet, stress)
RHR is noisier than HRV (more influenced by genetics, hydration, caffeine) so use 7-day rolling baseline comparison, not single readings.
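A minimal sketch of the rolling-baseline comparison, assuming one RHR reading per morning in beats per minute. The 10% cut-off comes from the list above; the function names and label strings are illustrative.

```python
from statistics import mean


def rhr_elevation_pct(prior_rhr: list[float], today_rhr: float, window: int = 7) -> float:
    """Percent difference between today's RHR and the mean of the previous `window` days."""
    if len(prior_rhr) < window:
        raise ValueError("not enough prior readings for the baseline window")
    baseline = mean(prior_rhr[-window:])
    return 100.0 * (today_rhr - baseline) / baseline


def classify_rhr(elevation_pct: float) -> str:
    if elevation_pct >= 10.0:
        return "recovery impairment signal"
    if elevation_pct > 0.0:
        return "minor fluctuation"
    return "at or below baseline"


print(classify_rhr(rhr_elevation_pct([50, 50, 50, 50, 50, 50, 50], 55)))
# recovery impairment signal
```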
Weight ↔ Physical Activity
Longitudinal evidence from 33-year cohort data and RCT meta-analyses:
| Relationship | Evidence |
|---|---|
| Diet + exercise vs. diet alone | 20% greater weight loss at 1-year follow-up |
| High PA (>2,500 kcal/week) | 2.9 kg regain at 30 months vs. >6 kg for lower activity |
| Over 33 years, active vs. inactive | Men: 5.6 kg vs. 9.1 kg gained; women: 3.8 kg vs. 9.5 kg |
| Aerobic vs. resistance for fat loss | Aerobic: −1.76 kg fat mass; resistance: −0.83 kg |
| Exercise preserves lean mass | Diet-only restriction reduces lean body mass and VO₂max |
| Compensatory behaviour | Increased exercise → reduced non-exercise activity (partial offset) |
Critical nuance: fitness is a stronger predictor of mortality than weight loss. Increased cardiorespiratory fitness reduces mortality and cardiovascular risk more than intentional weight reduction alone.
Sleep Regularity ↔ Metabolic Health
Irregular sleep patterns disrupt the hormonal oscillations of cortisol, insulin, and glucagon. Circadian misalignment (even 4 days of ±1 hour shift while maintaining 8-hour duration) produces measurable sympathetic nervous system activation.
- Irregular sleep → higher BMI, insulin resistance, type 2 diabetes risk
- Each 1-hour shift in sleep timing → up to 31% cardiovascular risk increase (the upper end of the 11-31% range cited for social jetlag)
- CPAP treatment for sleep apnea improves both sleep quality and HRV, suggesting reversibility
What We Can Compute From Existing Data
Assessment of the personal MCP server domains (as of 2026-02-18):
Available Domains
| Domain | Status | Records | Date Range | Key Fields |
|---|---|---|---|---|
| fitness (Strava) | ✅ Available | 500+ | Apr 2020 – Feb 2026 | activity_type, distance, duration, avg_heartrate*, max_heartrate* |
| weight (Withings) | ✅ Available | 500+ | Feb 2020 – Feb 2026 | weight_kg, bmi, fat_ratio*, muscle_mass* |
| music (Last.fm) | ✅ Available | — | — | scrobbles, timestamps |
| gaming (Steam/RetroAch) | ✅ Available | — | — | sessions, achievements |
| films (Letterboxd) | ✅ Available | — | — | views, ratings |
| boardgames (BGG) | ✅ Available | — | — | plays |
| sleep (Sleep as Android) | ❌ Unavailable | 0 | — | (data file not loaded) |
*Fields present in schema but currently null — Strava heart rate requires device capture at activity time; Withings body composition requires Body+ or higher model with wet feet contact.
Current Data Gaps
Heart rate data in fitness: All sampled activities show average_heartrate: null. This means:
- Cannot currently compute aerobic decoupling
- Cannot compute exercise HR zones or training stress score
- Cannot track resting HR trends via Strava
- Fix: Pair a HR monitor (chest strap or GPS watch) with Strava recording
Body composition in weight: All sampled records show fat_ratio: null, muscle_mass_kg: null. This means:
- Currently tracking only raw weight + BMI
- Cannot distinguish fat loss from muscle loss/gain
- Fix: Withings Body+ or Body Comp model (bioimpedance analysis) with bare feet on wet pads
Sleep data: Domain loaded as unavailable (file path /data/sleep not populated). Cannot currently track any sleep metrics.
- Fix: Export Sleep as Android data, add to personal MCP server, enable domain
Computable Metrics from Current Data
Despite gaps, meaningful analytics are possible today:
From Fitness Data
# Activity Frequency (days/week, rolling 4 weeks)
Active days/week = count(distinct activity dates in 28-day window) / 4
# Weekly Volume Trend (proxy for training load)
Weekly minutes = sum(moving_time_seconds/60) grouped by ISO week
# Activity Type Balance
Ratio of Ride:Hike:Walk:WeightTraining across rolling 4-week window
# ACWR Proxy (no HR, use duration as load proxy)
Acute load (7d) = sum(moving_time_seconds, last 7 days)
Chronic load (28d) = mean weekly sum(moving_time_seconds, last 28 days)
ACWR = Acute / Chronic
Alert if > 1.3
# Rest Day Distribution
Days between activities (detect extended gaps >5 days = low streak periods)
# Consistency Score
Active weeks / Total weeks in date range
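The weekly-volume and ACWR formulas above could be implemented along these lines. The sketch assumes activities have been reduced to `(date, moving_time_seconds)` pairs held in a list; how those are pulled from the fitness domain is not shown, and the function names are illustrative.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_minutes(activities):
    """Sum moving time in minutes per (ISO year, ISO week)."""
    totals = defaultdict(float)
    for day, seconds in activities:
        iso_year, iso_week = day.isocalendar()[:2]
        totals[(iso_year, iso_week)] += seconds / 60
    return dict(totals)


def acwr(activities, today):
    """Acute:chronic workload ratio with duration as the load proxy."""
    def load(days):
        start = today - timedelta(days=days - 1)
        return sum(sec for day, sec in activities if start <= day <= today)

    acute = load(7)            # total load over the last 7 days
    chronic = load(28) / 4     # mean weekly load over the last 28 days
    return acute / chronic if chronic else None  # None when there is no recent load


today = date(2026, 2, 15)
steady = [(today - timedelta(days=i), 3600) for i in range(28)]
print(acwr(steady, today))  # 1.0
```

A return value above 1.3 corresponds to the alert condition; `None` signals that there was no activity in the 28-day window rather than a ratio of zero.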
From Weight Data
# 7-Day Moving Average (noise filter)
trend[n] = mean(weight[n-6 .. n])
Display trend line, not daily values
# Rate of Change (weekly)
weekly_delta = trend[day7] - trend[day0]
Alert if > +0.5 kg/week sustained 3+ weeks (assuming non-training context)
Alert if < -1.0 kg/week (loss faster than 1 kg/week is too aggressive, lean mass risk)
# BMI Trajectory
Track BMI category transitions over 90-day windows
# Long-term Weight Arc
Rolling 90-day regression slope — positive/negative/flat determination
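The moving-average and rate-of-change rules can be sketched as below, assuming one weight reading per day with no missing days (real scale data would need gap filling first). The "sustained 3+ weeks" condition on the gain alert is left out for brevity, and the function names are illustrative.

```python
from statistics import mean


def trend_line(daily_kg, window=7):
    """Trailing moving average; one value per day once the window is full."""
    return [mean(daily_kg[i - window + 1 : i + 1])
            for i in range(window - 1, len(daily_kg))]


def weekly_delta(trend):
    """Change in the trend over the most recent 7 days (trend[day7] - trend[day0])."""
    if len(trend) < 8:
        raise ValueError("need at least 8 trend points")
    return trend[-1] - trend[-8]


def rate_flag(delta_kg_per_week):
    if delta_kg_per_week > 0.5:
        return "gain above +0.5 kg/week"
    if delta_kg_per_week < -1.0:
        return "loss faster than 1.0 kg/week"
    return "within range"


weights = [80.0 + 0.1 * i for i in range(21)]  # steady 0.7 kg/week gain
print(rate_flag(weekly_delta(trend_line(weights))))
# gain above +0.5 kg/week
```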
Cross-Domain Correlations (Fitness × Weight)
# Exercise-Week vs Weight-Change Correlation
For each week:
- Compute weekly active minutes (fitness)
- Compute week-over-week weight change (weight)
Pearson r between these two variables across all overlapping weeks
Expected: negative correlation (more activity → less weight gain or more loss)
# Active vs Sedentary Month Comparison
Months with >8 active days: mean weight change
Months with <4 active days: mean weight change
Expected: statistically significant difference
# Weight Trend After Gaps
Track weight trend in 2 weeks following >7-day inactivity gaps
Expected: weight gain correlation with training gaps
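The exercise-week versus weight-change correlation reduces to a Pearson r over paired weekly values. A standard-library sketch, assuming the two lists are already aligned week by week; the sample numbers are invented purely to show the call shape:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (spread_x * spread_y)


# Hypothetical weeks: active minutes vs. week-over-week weight change (kg)
active_minutes = [90, 240, 30, 310, 180, 0]
weight_change = [0.2, -0.3, 0.4, -0.5, -0.1, 0.6]
print(round(pearson_r(active_minutes, weight_change), 2))
```

With only a few dozen overlapping weeks, r is a descriptive summary rather than proof of an effect; a negative sign is the expected direction.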
From Music/Gaming Data (Behavioural Proxies)
While not direct health metrics, late-night gaming/listening sessions correlate with:
- Sleep delay (entertainment displacing sleep)
- Later sleep onset time (computable from last-scrobble/session timestamp)
# Approximate Sleep Delay Proxy
Late-night activity (gaming/music after midnight) on day N
→ Check weight trend and next-day activity on days N+1, N+2
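One way to derive the late-night flag is to attribute any scrobble or session timestamp after midnight to the preceding evening. This sketch assumes timestamps are naive `datetime` values already converted to local time (API timestamps are often UTC and would need converting first). The 04:00 cut-off and the function name are assumptions of the sketch.

```python
from datetime import datetime, timedelta


def late_night_dates(timestamps, cutoff_hour=4):
    """Return the set of dates (day N) whose night ran past midnight.

    An event at 00:45 on the 10th counts as late-night activity for the 9th.
    """
    nights = set()
    for ts in timestamps:
        if ts.hour < cutoff_hour:
            nights.add((ts - timedelta(days=1)).date())
    return nights


events = [
    datetime(2026, 2, 9, 22, 15),   # evening listening, not flagged
    datetime(2026, 2, 10, 0, 45),   # past midnight -> flags 9 Feb
    datetime(2026, 2, 12, 2, 30),   # past midnight -> flags 11 Feb
]
print(sorted(late_night_dates(events)))
# [datetime.date(2026, 2, 9), datetime.date(2026, 2, 11)]
```

The flagged dates can then be joined against weight trend and activity on days N+1 and N+2.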
Actionable Thresholds Summary
Sleep
| Metric | Target | Concern | Action |
|---|---|---|---|
| Total Sleep Time | 7-9 hrs/night | <6 hrs or >9 hrs | Flag; mortality risk is U-shaped, highest at <5 hrs (HR 1.74 all-cause) and >9 hrs (HR 2.11 cardiovascular) |
| Sleep Efficiency | ≥85% | <80% | Cognitive behavioural therapy for insomnia (CBT-I) most effective; restrict time in bed initially |
| Sleep Latency | <20 min | >30 min chronic | Sleep hygiene review; rule out anxiety, delayed circadian phase |
| WASO | <20 min | >45 min | Sleep maintenance issue; rule out sleep apnea, pain, or nocturia |
| N3 (deep sleep) | 15-20% of TST | <10% | Age-related (expected decline); worsened by alcohol, sedatives |
| REM | 20-25% of TST | <15% | Often suppressed by alcohol, SSRIs, certain antihistamines |
| Sleep Onset Consistency | SD of onset ≤30 min | SD >60 min | Social jetlag; target consistent bedtime even on weekends |
| Sleep Regularity Index | SRI > 80 | SRI < 50 | Significant circadian disruption; prioritise schedule consistency |
| Social Jetlag | <1 hr | >2 hrs | Each additional hour = 11-31% cardiovascular risk increase |
HRV / Autonomic
| Metric | Interpretation | Threshold |
|---|---|---|
| Morning RMSSD | Relative to personal baseline | >1 SD below 7-day average = concern |
| 7-day RMSSD trend | Sustained decline | Decline for 7+ consecutive days below baseline = recovery issue |
| LF/HF ratio | Sympathetic/parasympathetic balance | Rising trend = autonomic stress; no absolute threshold |
| lnRMSSD | Normalised for trend analysis | Use log-transformed value; track SD bands, not absolute value |
No population-wide RMSSD threshold is evidence-based — individual baselines vary too widely. Establish personal norm over 30-60 days before acting on single readings.
Resting Heart Rate
| Level | Interpretation | Action |
|---|---|---|
| Gradual decline over months | Improving fitness | Continue training stimulus |
| Stable within personal range | Normal | No action |
| ≥10% above baseline (3+ days) | Recovery/illness/overtraining signal | Reduce training load; assess sleep, hydration, illness |
| Acute single-day spike | Insufficient data | Note context; observe trend |
Weight
| Signal | Threshold | Interpretation |
|---|---|---|
| 7-day trend rate | > +0.5 kg/week | Caloric surplus; assess via activity + intake |
| 7-day trend rate | < −1.0 kg/week | Excessive deficit; lean mass risk |
| Monthly comparison | Active vs sedentary months | Expect ~0.5-1 kg difference over consistent tracking |
| BMI trajectory | 90-day regression slope | Any positive slope + declining activity = intervention signal |
Training / Activity
(From companion article Training Metrics and Automated Coaching)
| Metric | Target | Alert |
|---|---|---|
| ACWR (duration proxy) | 0.8–1.3 | >1.3 for 5+ days = overreaching risk |
| TSB | −10 to −30 | Below −30 sustained >7 days = deload |
| Active weeks/month | ≥3 of 4 | <2 active weeks/month = deconditioning |
| Weekly volume change | ≤10-15% increase | Connective tissue risk above this rate |
Instrumentation Recommendations
For meaningful cross-domain health analytics, the following additions close the largest data gaps:
Priority 1: Heart Rate During Exercise
- Gap: Strava activities show no HR data.
- Solution: Any ANT+/BLE chest strap (Garmin, Wahoo, Polar) or GPS watch paired with Strava recording.
- Unlocks: Training stress score, zone distribution, HRV-informed load management, aerobic decoupling (cardiac drift).
Priority 2: Sleep Tracking
- Gap: Sleep domain not loaded in personal MCP.
- Solution: Sleep as Android (already named in domain config); needs the data export path populated, or a manual sync.
- Unlocks: TST, sleep efficiency, staging, onset consistency, SRI proxy, late-night activity correlation.
- Alternatives: Oura Ring Gen 3 (most validated consumer device; kappa 0.65 vs. PSG); WHOOP 4.0.
Priority 3: Body Composition
- Gap: Withings scale shows no fat/muscle data (all null).
- Solution: Withings Body+ or Body Comp model (BIA electrodes). Alternatively, manual tape measurements (waist-hip ratio is a validated cardiovascular risk proxy independent of BMI).
- Unlocks: Fat mass vs. lean mass trends, visceral fat tracking, whether weight changes are compositionally healthy.
Priority 4: Morning HRV
- Gap: No current HRV measurement.
- Solution: Polar H10 chest strap + HRV4Training app (validated, 1-minute morning reading); or Oura Ring (passive overnight measurement).
- Unlocks: Autonomic recovery signal; integrates with training load for deload decisions.
Data Model for Cross-Domain Correlation Analysis
When sleep, weight, and fitness data coexist, the following correlation matrix becomes computable:
Variables (daily/weekly):
A = daily_active_minutes (fitness)
W = weight_trend_delta (weight)
S = total_sleep_time (sleep, not yet available)
E = sleep_efficiency (sleep, not yet available)
H = morning_rmssd (HRV, not yet available)
R = resting_heartrate (HR monitor, not yet available)
L = late_night_activity (music/gaming proxy, available)
Validated correlations to compute:
- A ↔ W (inverse): more activity → less weight gain
- S ↔ A (positive): better sleep → more/harder training
- H ↔ A (positive next day): higher HRV → higher training load appropriate
- S ↔ H (positive): longer sleep → higher next-morning HRV
- L ↔ S (inverse): late-night activity → shorter sleep (proxy)
- R ↔ H (negative): higher RHR → lower HRV (stress)
The current dataset supports A↔W. All others require additional sensors.
See Also
- Training Metrics and Automated Coaching — companion article on TSB/CTL/ATL framework and automated coaching logic
- Model Context Protocol — the MCP standard used to expose personal data domains to agents
- API Sync Pattern — architecture for continuous data collection (Strava, Withings, and other integrations)
- Last.fm API — music listening data, used here as a behavioral proxy for late-night activity
- Data Visualization for Agents — tools for rendering correlation analysis and health dashboards
Sources
Primary Sources
- Buysse et al. (1989). Pittsburgh Sleep Quality Index: validation. Psychiatry Research. (89.6% sensitivity, 86.5% specificity at PSQI > 5)
- Cardinali et al. / Chellappa et al. studies on circadian misalignment and cardiovascular autonomic changes.
- Lunsford-Avery et al. (2018). Sleep Regularity Index derivation and validation in young adults.
- Piotrowicz et al. on HRV-sleep interaction and morning measurement protocols.
Secondary Sources
- Oxford population cohort (2024). Sleep regularity and mortality (n=72,269). Sleep. DOI: 10.1093/sleep/zsad285. “Highest regularity: ~30% lower all-cause mortality, 38% lower cardiometabolic mortality.”
- JAMA Network Open cohort (2021). Sleep duration trajectories and cardiovascular outcomes (n=52,599).
- Frontiers in Public Health (2022). U-shaped sleep duration and mortality meta-analysis.
- Stanford Basketball sleep extension study (Mah et al.). Sprint speed +4.5%, shooting accuracy +9%.
- Social jetlag and cardiovascular risk: PMC9286443; AASM 2017 statement.
- Consumer device validation: PMC11511193 (2024); Oura Ring validation study (2024, ouraring.com); Sleep Advances (2024).
- HRV-sleep meta-analysis: PMC12394884 (2025). RMSSD and sleep deprivation.
- Weight-activity longitudinal: PMC4578965, PMC5556592 (33-year cohort).
- Daily weigh-in evidence: PMC4380831; JMIR 2021/e25529.
- Altini, M. “How should you measure your morning HRV?” (substack, evidence-based protocol).
Further Reading
- Walker, Matthew. Why We Sleep (2017). Accessible synthesis of sleep science.
- Marco Altini’s newsletter: marcoaltini.substack.com — most rigorous consumer HRV commentary.
- Oura Ring Science pages — validation methodology transparency.
- National Sleep Foundation consensus statements on sleep duration (Hirshkowitz et al., 2015).
- Mah et al. (2011). “The effects of sleep extension on the athletic performance of collegiate basketball players.” SLEEP.