Bayesian Panel Data Model: Analysis of HALE and Life Expectancy Gender Gaps (Extended Through 2023 with IHME HALE)

Purpose

This report presents results from a Bayesian hierarchical panel model that analyzes gender gaps in Healthy Life Expectancy (HALE) and Life Expectancy using both temporal variation and cross-country variation simultaneously. Unlike the cross-sectional Elastic Net models, this panel approach leverages data from all country-year combinations, providing more statistical power and allowing us to assess whether predictors that matter cross-sectionally also matter within countries over time.

This extended analysis uses IHME HALE data (2000-2023) instead of WHO HALE data, providing methodological consistency with the IHME predictor indicators and extending the temporal range through 2023. The Life Expectancy model now uses OWID data (2000-2023), which combines Human Mortality Database and UN World Population Prospects, extending coverage through 2023 to match the IHME HALE temporal range.

Key Questions Addressed:

Model Design

Data Structure

The panel dataset transforms from country-level (one row per country) to panel structure (one row per country-year):

Rationale for Using IHME HALE

Why switch from WHO HALE to IHME HALE?

  1. Methodological Consistency: All predictor variables (alcohol, suicide, homicide, cardiovascular, etc.) come from IHME’s Global Burden of Disease (GBD) database. Using IHME HALE ensures that the target variable and predictors are methodologically consistent, using the same data collection processes, estimation methods, and quality standards.

  2. Extended Temporal Coverage: IHME HALE data extends through 2023, providing two additional years of post-acute-COVID data compared to WHO’s 2021 cutoff. This allows us to assess whether COVID-19’s effects on gender gaps persisted or attenuated in 2022-2023.

  3. Data Quality: The correlation between WHO and IHME HALE is very high (r > 0.95), indicating excellent agreement. Both sources are high-quality, but IHME provides the advantages above.

  4. Reproducibility: Using a single data source (IHME) for all cause-specific mortality measures and HALE improves transparency and reproducibility.

Note on OWID Life Expectancy Data: The Life Expectancy model now uses OWID data, which combines Human Mortality Database (HMD) and UN World Population Prospects. This provides extended temporal coverage through 2023 (vs 2021 for WHO), matching the IHME HALE temporal range. OWID LE shows high correlation with WHO LE (r = 0.993) and provides 100% complete data for all OECD countries.

Standardization Strategy

Predictors (Standardized - Full Z-Scores):

Targets (Centered Only, Do Not Scale):
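
The strategy named in the two headings above (full z-scores for predictors, centering without scaling for targets) can be sketched as follows. This is a minimal illustration, not the report's code: the toy values are hypothetical, and the use of the population standard deviation is an assumption. Because the target is only centered, coefficients stay in years of gap per one standard deviation of the predictor.

```python
from statistics import mean, pstdev

def zscore(values):
    """Full z-score: subtract the mean, then divide by the standard deviation."""
    mu, sd = mean(values), pstdev(values)  # population SD is an assumption
    return [(v - mu) / sd for v in values]

def center(values):
    """Center only: subtract the mean but keep the original units (years)."""
    mu = mean(values)
    return [v - mu for v in values]

# Hypothetical toy values, not taken from the report's data
gap_predictor = [13.0, 2.0, 6.5, 9.1]   # a mortality-gap predictor
hale_gap = [2.0, 1.1, 1.7, 2.4]         # target, in years

x_std = zscore(gap_predictor)   # unitless, mean 0, SD 1
y_ctr = center(hale_gap)        # still in years, mean 0
```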

Model Specification

Model Structure: Bayesian hierarchical model with country-level random intercepts and shared slopes.

Notation:

Model:

y*_{it} ~ N(α_i + X*_{it}β, σ)
α_i ~ N(0, σ_α)
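
To make the two lines above concrete, the sketch below simulates data from the same generative structure: one random intercept per country and slopes shared by all countries. It is an illustration only. The panel sizes, slope values, and scale parameters are made-up numbers, and σ and σ_α are treated as standard deviations, which is an assumption about the notation.

```python
import random

random.seed(0)

N_COUNTRIES, N_YEARS = 5, 24        # toy sizes, not the report's panel
BETA = [0.46, -0.27]                # illustrative shared slopes for two predictors
SIGMA_ALPHA, SIGMA = 0.5, 0.2       # assumed standard deviations

rows = []
for i in range(N_COUNTRIES):
    # alpha_i ~ N(0, sigma_alpha): one intercept per country
    alpha_i = random.gauss(0.0, SIGMA_ALPHA)
    for t in range(N_YEARS):
        # Standardized predictors X*_it (drawn at random for illustration)
        x = [random.gauss(0.0, 1.0) for _ in BETA]
        # Linear predictor alpha_i + X*_it beta, with the same beta for every country
        mu = alpha_i + sum(b * xk for b, xk in zip(BETA, x))
        # y*_it ~ N(mu, sigma)
        y = random.gauss(mu, SIGMA)
        rows.append((i, t, x, y))
```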

Priors:

Why This Model?

  1. Answers the primary scientific question: Does alcohol matter because countries differ from each other, or because countries that reduce alcohol mortality see their gaps narrow? This model can answer both.

  2. Seamlessly extends the cross-sectional Elastic Net model: Provides posterior distributions for β instead of penalized point estimates, with natural interpretation as global “effect size” averaged over space and time.

  3. Preserves counterfactual framework: Produces posterior predictive distributions for country-level counterfactuals.

  4. Computationally feasible: Hierarchical linear model runs efficiently in PyMC using the nutpie sampler.

  5. Uses both within-country and between-country variation: Leverages both sources of information.

  6. Controls for time-invariant country-level factors: Random intercepts account for country-specific characteristics.

  7. Includes full COVID period: By extending through 2023 for HALE, we can assess whether COVID-19’s effects persisted or attenuated.
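
Items 1 and 5 refer to between-country and within-country variation. As a small illustration of what those two terms mean, the sketch below splits a predictor into country means (between) and deviations from those means (within). The numbers are hypothetical, and the report does not state that this decomposition was computed.

```python
from statistics import mean

# Hypothetical predictor values per country over three years
panel = {
    "A": [10.0, 9.0, 8.0],
    "B": [4.0, 4.5, 3.5],
}

# Between-country variation: how country averages differ from each other
between = {country: mean(values) for country, values in panel.items()}

# Within-country variation: how each year deviates from its own country's average
within = {
    country: [v - between[country] for v in values]
    for country, values in panel.items()
}
```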

Model Implementation

Software and Methods

Data Preparation

The panel datasets include:

HALE Model (IHME data, 2000-2023):

Life Expectancy Model (OWID data, 2000-2023):

Predictors (both models):

Note on COVID-19 Predictor: COVID-19 death rates are included as a predictor to assess how the pandemic affected gender gaps. COVID-19 data is available for 2020-2023 (IHME), with zeros for all years before 2020. Both HALE and LE models now include the full COVID period and post-acute recovery phase (2020-2023), enabling assessment of whether pandemic effects persisted or attenuated.
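
The panel layout and the zero-fill rule for the COVID-19 predictor can be sketched as below. The 37-country count is inferred rather than stated: the 888 observations reported later, divided by 24 years (2000-2023), give 37, which matches 38 OECD members minus Turkey. The country codes and the single COVID value are placeholders.

```python
YEARS = range(2000, 2024)                        # 2000-2023 inclusive, 24 years
COUNTRIES = [f"C{i:02d}" for i in range(37)]     # placeholder codes; 37 is inferred

# Hypothetical lookup of COVID-19 gap values, available only for 2020-2023
covid_rates = {("C00", 2021): 12.3}

def covid_value(country, year):
    """Return 0.0 before 2020; otherwise look up the value (None if absent)."""
    if year < 2020:
        return 0.0
    return covid_rates.get((country, year))

# One row per country-year
panel = [
    {"country": c, "year": y, "Gap_COVID": covid_value(c, y)}
    for c in COUNTRIES
    for y in YEARS
]
```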

Results: HALE Gap Model

Model Specification:

Model Diagnostics

Convergence and Sampling Quality: The model converged successfully with:

Predictor Coefficients (Beta)

The following table shows the posterior distributions of predictor coefficients. Since predictors are standardized (z-scores), coefficients represent the effect of a 1-standard-deviation change in the predictor on the gender gap in HALE (in years).

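| Predictor | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| Gap_Alcohol | 0.131 | 0.029 | 0.077 | 0.184 | 0 | 0 | 3.88e+03 | 3e+03 | 1 |
| Gap_Suicide | 0.363 | 0.036 | 0.292 | 0.427 | 0.001 | 0.001 | 3.44e+03 | 2.74e+03 | 1 |
| Gap_Homicide | 0.309 | 0.02 | 0.27 | 0.347 | 0 | 0 | 4.8e+03 | 2.49e+03 | 1 |
| Gap_RoadTraffic | 0.464 | 0.022 | 0.425 | 0.506 | 0 | 0 | 4.02e+03 | 2.9e+03 | 1 |
| Gap_Cardiovascular | -0.273 | 0.023 | -0.316 | -0.228 | 0 | 0 | 2.91e+03 | 2.99e+03 | 1 |
| Gap_Diabetes | -0.13 | 0.02 | -0.167 | -0.093 | 0 | 0 | 3.72e+03 | 2.61e+03 | 1 |
| Gap_Neoplasms | 0.237 | 0.045 | 0.153 | 0.321 | 0.001 | 0.001 | 2.42e+03 | 2.49e+03 | 1 |
| Gap_ChronicRespiratory | 0.368 | 0.033 | 0.303 | 0.427 | 0.001 | 0.001 | 3.18e+03 | 2.96e+03 | 1 |
| Gap_LiverDisease | 0.191 | 0.028 | 0.138 | 0.241 | 0 | 0 | 3.9e+03 | 2.77e+03 | 1 |
| Gap_UnintentionalInjury | 0.195 | 0.033 | 0.131 | 0.255 | 0.001 | 0.001 | 3.39e+03 | 2.85e+03 | 1 |
| Gap_DrugDisorder | 0.056 | 0.017 | 0.024 | 0.086 | 0 | 0 | 4.75e+03 | 3.16e+03 | 1 |
| Gap_COVID | 0.06 | 0.008 | 0.044 | 0.075 | 0 | 0 | 5.03e+03 | 3.3e+03 | 1 |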

Key Findings:

Strongest Positive Effects (larger gender gaps in predictor → larger HALE gap, i.e., women live longer):

  1. Gap_RoadTraffic (β = 0.464, 94% HDI: [0.425, 0.506]): The strongest predictor. Countries with larger male-female gaps in road traffic mortality have larger gender gaps in HALE.

  2. Gap_ChronicRespiratory (β = 0.368, 94% HDI: [0.303, 0.427]): Gender gaps in chronic respiratory disease mortality show a strong association, notably stronger than in the WHO-based 2021 model (β = 0.301).

  3. Gap_Suicide (β = 0.363, 94% HDI: [0.292, 0.427]): The third strongest predictor. Gender gaps in suicide mortality are strongly associated with gender gaps in HALE.

  4. Gap_Homicide (β = 0.309, 94% HDI: [0.270, 0.347]): Gender gaps in homicide mortality are associated with HALE gaps, weaker than in the WHO-based model (β = 0.384).

  5. Gap_Neoplasms (β = 0.237, 94% HDI: [0.153, 0.321]): Gender gaps in cancer mortality contribute to HALE gaps, substantially weaker than in the WHO-based model (β = 0.349).

Moderate Positive Effects:

Negative Effects (larger gender gaps in predictor → smaller HALE gap):

Interpretation:

Predictor Importance on the Original Scale

Standardized coefficients allow direct comparison of effect sizes, but they do not account for how much each predictor typically varies across countries and years. To capture both effect size and real-world variation, we compute an importance measure:

Importance = |β_standardized| × SD_original

This quantity reflects how much a predictor can contribute to explaining variation in gender gaps given the amount of variation that predictor exhibits in the data.

| Predictor | SD_original | Beta_standardized | Importance |
|---|---|---|---|
| Gap_Cardiovascular | 36.9 | -0.273 [-0.316, -0.228] | 10.093 [8.497, 11.76] |
| Gap_Neoplasms | 38.2 | 0.237 [0.153, 0.321] | 9.03 [5.738, 12.162] |
| Gap_Homicide | 13.5 | 0.309 [0.27, 0.347] | 4.167 [3.649, 4.682] |
| Gap_ChronicRespiratory | 10.7 | 0.368 [0.303, 0.427] | 3.945 [3.271, 4.607] |
| Gap_Suicide | 9.55 | 0.363 [0.292, 0.427] | 3.471 [2.802, 4.103] |
| Gap_UnintentionalInjury | 15.2 | 0.195 [0.131, 0.255] | 2.97 [2.015, 3.895] |
| Gap_RoadTraffic | 5.91 | 0.464 [0.425, 0.506] | 2.744 [2.509, 2.99] |
| Gap_LiverDisease | 9.72 | 0.191 [0.138, 0.241] | 1.853 [1.334, 2.342] |
| Gap_Alcohol | 6.21 | 0.131 [0.077, 0.184] | 0.813 [0.48, 1.15] |
| Gap_COVID | 10.6 | 0.06 [0.044, 0.075] | 0.638 [0.475, 0.802] |
| Gap_Diabetes | 3.53 | -0.13 [-0.167, -0.093] | 0.458 [0.328, 0.588] |
| Gap_DrugDisorder | 2.79 | 0.056 [0.024, 0.086] | 0.156 [0.069, 0.243] |
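
The importance measure can be recomputed from the posterior means and original-scale standard deviations in the table above. The sketch uses only point estimates, so it will not match the reported values exactly: those appear to come from the full posterior and unrounded standard deviations (for example, 36.9 × 0.273 ≈ 10.07 against the reported 10.093).

```python
# (SD_original, posterior mean of the standardized beta), copied from the table
predictors = {
    "Gap_Cardiovascular": (36.9, -0.273),
    "Gap_Neoplasms": (38.2, 0.237),
    "Gap_Homicide": (13.5, 0.309),
    "Gap_ChronicRespiratory": (10.7, 0.368),
    "Gap_Suicide": (9.55, 0.363),
    "Gap_UnintentionalInjury": (15.2, 0.195),
    "Gap_RoadTraffic": (5.91, 0.464),
    "Gap_LiverDisease": (9.72, 0.191),
    "Gap_Alcohol": (6.21, 0.131),
    "Gap_COVID": (10.6, 0.06),
    "Gap_Diabetes": (3.53, -0.13),
    "Gap_DrugDisorder": (2.79, 0.056),
}

# Importance = |beta_standardized| * SD_original
importance = {name: abs(beta) * sd for name, (sd, beta) in predictors.items()}

# Rank predictors from most to least important
ranked = sorted(importance, key=importance.get, reverse=True)
```

Road traffic has the largest standardized coefficient but ranks seventh here, because its gap varies far less across country-years than the cardiovascular or neoplasm gaps.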

Key Findings:

Results: Life Expectancy Gap Model

Model Specification:

Note: The Life Expectancy model has been updated to use OWID LE data (2000-2023), which provides extended temporal coverage matching the IHME HALE model. OWID LE shows high correlation with WHO LE (r = 0.993) and extends the analysis through the post-acute COVID recovery period.

Model Diagnostics

Convergence and Sampling Quality: The model converged successfully with:

Predictor Coefficients (Beta)

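| Predictor | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| Gap_Alcohol | 0.139 | 0.034 | 0.074 | 0.202 | 0.001 | 0.001 | 3.83e+03 | 2.69e+03 | 1 |
| Gap_Suicide | 0.364 | 0.041 | 0.289 | 0.441 | 0.001 | 0.001 | 3.1e+03 | 2.7e+03 | 1 |
| Gap_Homicide | 0.44 | 0.023 | 0.398 | 0.485 | 0 | 0 | 4.03e+03 | 2.85e+03 | 1 |
| Gap_RoadTraffic | 0.446 | 0.025 | 0.398 | 0.492 | 0 | 0 | 3.41e+03 | 3.07e+03 | 1 |
| Gap_Cardiovascular | -0.188 | 0.027 | -0.24 | -0.14 | 0 | 0 | 3.18e+03 | 3.3e+03 | 1 |
| Gap_Diabetes | -0.106 | 0.023 | -0.148 | -0.06 | 0 | 0 | 4.4e+03 | 3.21e+03 | 1 |
| Gap_Neoplasms | 0.313 | 0.054 | 0.203 | 0.409 | 0.001 | 0.001 | 2.16e+03 | 2.22e+03 | 1 |
| Gap_ChronicRespiratory | 0.296 | 0.038 | 0.23 | 0.374 | 0.001 | 0.001 | 3.01e+03 | 2.57e+03 | 1 |
| Gap_LiverDisease | 0.25 | 0.032 | 0.196 | 0.317 | 0.001 | 0 | 3.82e+03 | 2.82e+03 | 1 |
| Gap_UnintentionalInjury | 0.162 | 0.039 | 0.089 | 0.235 | 0.001 | 0.001 | 3.13e+03 | 2.79e+03 | 1 |
| Gap_DrugDisorder | 0.091 | 0.019 | 0.055 | 0.128 | 0 | 0 | 4.91e+03 | 2.94e+03 | 1 |
| Gap_COVID | 0.108 | 0.01 | 0.089 | 0.127 | 0 | 0 | 4.22e+03 | 2.71e+03 | 1.01 |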

Key Findings: The pattern of coefficients for Life Expectancy is broadly similar to HALE, with some notable differences:

Comparison with WHO-Based 2021 Model

Key Changes in HALE Model

Data Source Changes:

Coefficient Comparison

Comparing the IHME-based 2023 model to the WHO-based 2021 model:

| Predictor | WHO 2021 (β) | IHME 2023 (β) | Change | Interpretation |
|---|---|---|---|---|
| Gap_Neoplasms | 0.349 | 0.237 | -0.112 | Largest drop - cancer gap effects weaker in IHME data or changing over time |
| Gap_Homicide | 0.384 | 0.309 | -0.075 | Major drop - may reflect narrowing homicide gaps in recent years or methodological differences |
| Gap_Suicide | 0.424 | 0.363 | -0.061 | Moderate decrease with IHME data |
| Gap_ChronicRespiratory | 0.301 | 0.368 | +0.067 | Increased - respiratory disease gaps more important with IHME data/extended period |
| Gap_UnintentionalInjury | 0.152 | 0.195 | +0.043 | Increased importance |
| Gap_DrugDisorder | 0.081 | 0.056 | -0.025 | Reduced effect |
| Gap_Cardiovascular | -0.252 | -0.273 | -0.021 | Slightly stronger protective (competing risk) effect |
| Gap_LiverDisease | 0.209 | 0.191 | -0.018 | Small decrease |
| Gap_Alcohol | 0.145 | 0.131 | -0.014 | Small decrease |
| Gap_RoadTraffic | 0.476 | 0.464 | -0.012 | Small decrease, remains strongest predictor |
| Gap_COVID | 0.054 | 0.060 | +0.006 | Slightly higher with 2 more years of data |
| Gap_Diabetes | -0.129 | -0.130 | -0.001 | Nearly identical - remarkably stable |

Key Observations:

  1. Coefficient Stability: Despite changing data sources (WHO → IHME HALE) and adding 2 years, seven of the twelve coefficients change by 0.025 or less in absolute value, indicating relationships that are largely robust across datasets and time periods.

  2. Notable Shifts:

    • Neoplasms (-0.112): Largest decrease suggests either methodological differences between WHO and IHME HALE or evolving cancer dynamics from 2021-2023

    • Homicide (-0.075): Major decrease may reflect narrowing violence gaps in recent years or measurement differences between data sources

    • Chronic Respiratory (+0.067): Increased importance, possibly due to COVID-19’s lingering respiratory impact through 2023 or IHME methodology

  3. Remarkably Stable Predictors:

    • Diabetes (-0.001): Nearly identical coefficient across data sources suggests very stable competing-risk relationship

    • Road Traffic (-0.012): Minimal change, remains the strongest predictor

    • Cardiovascular (-0.021): Competing risk effect remains consistent

    • COVID-19 (+0.006): Similar effect with longer temporal coverage (2020-2023 vs 2020-2021)

  4. Model Performance:

    • Both models achieve excellent fit (R² > 0.98)

    • IHME-based model has 74 more observations (888 vs 814) due to extended temporal range

    • Both models have similar number of effective parameters (~55-56)

    • Slightly different WAIC reflects different data sources and temporal coverage, not worse fit quality
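
The changes discussed above are simple differences between the two sets of posterior means. The sketch below recomputes them from the coefficient comparison table and counts how many fall under a small threshold. The 0.03 cutoff is an arbitrary choice made for this illustration, not a criterion used in the report.

```python
# predictor: (WHO 2021 beta, IHME 2023 beta), copied from the comparison table
coefs = {
    "Gap_Neoplasms": (0.349, 0.237),
    "Gap_Homicide": (0.384, 0.309),
    "Gap_Suicide": (0.424, 0.363),
    "Gap_ChronicRespiratory": (0.301, 0.368),
    "Gap_UnintentionalInjury": (0.152, 0.195),
    "Gap_DrugDisorder": (0.081, 0.056),
    "Gap_Cardiovascular": (-0.252, -0.273),
    "Gap_LiverDisease": (0.209, 0.191),
    "Gap_Alcohol": (0.145, 0.131),
    "Gap_RoadTraffic": (0.476, 0.464),
    "Gap_COVID": (0.054, 0.060),
    "Gap_Diabetes": (-0.129, -0.130),
}

# Change = IHME 2023 minus WHO 2021, rounded to the table's three decimals
change = {name: round(new - old, 3) for name, (old, new) in coefs.items()}

THRESHOLD = 0.03  # illustrative cutoff for "stable"
stable = [name for name, d in change.items() if abs(d) < THRESHOLD]
shifted = [name for name, d in change.items() if abs(d) >= THRESHOLD]
```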

Life Expectancy Model: Extended Through 2023

The Life Expectancy model has been updated with OWID LE data (2000-2023), providing:

Key LE Model Coefficients (2023, sorted by magnitude):

The LE model coefficients are broadly consistent with the HALE model, with COVID-19 showing a larger effect on LE gaps (β = 0.108) than on HALE gaps (β = 0.060), suggesting the pandemic affected overall lifespan more than healthy lifespan.

R² and Residual Analysis

R² Summary

The Bayesian panel models achieve excellent fit:

| Model | R² (mean) | R² (94% HDI lower) | R² (94% HDI upper) | MAE (years) | Residual Std (years) |
|---|---|---|---|---|---|
| HALE Gap | 0.982 | 0.981 | 0.982 | 0.174 | 0.227 |
| Life Expectancy Gap | 0.978 | 0.978 | 0.979 | 0.2 | 0.267 |

Key Findings:

Interpretation:

Residual Analysis

Residual analysis for the IHME-based HALE model shows:

| Statistic | Value (years) |
|---|---|
| Mean | -0.0001 |
| Std | 0.227 |
| Min | -0.945 |
| 25% | -0.137 |
| Median | 0.0024 |
| 75% | 0.14 |
| Max | 0.661 |
| MAE | 0.174 |

Key Findings:

Residual Diagnostics:

Figure 1: Residuals vs. predicted values for HALE gap model (IHME, 2000-2023).

Figure 2: Residuals vs. year for HALE gap model (IHME, 2000-2023).

Counterfactual Analysis: United States

This section presents counterfactual analysis for the United States using 2023 as the reference year for both HALE and Life Expectancy (the latest available year in both the IHME HALE and OWID LE datasets).

For each gap predictor, we compute what would happen to the predicted gap if we adjusted that predictor to the best attainable value observed across all country-years, while keeping all other predictors constant. The analysis uses posterior distributions to quantify uncertainty.
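
For a linear model with z-scored predictors, moving one predictor from its current value to a target changes the predicted gap by β × (target − current) / SD_original. The sketch below applies that formula to posterior means and values taken from this report's tables. It is a point-estimate approximation only: the report propagates the full posterior, and the formula is inferred from the model structure rather than quoted from the analysis code. It reproduces the reported central values to within about 0.005 years.

```python
def counterfactual_change(beta_std, sd_original, current, target):
    """Change in predicted gap (years) when one predictor moves to its target.

    beta_std is the coefficient per one standard deviation of the predictor,
    so the shift is first converted to standard-deviation units.
    """
    return beta_std * (target - current) / sd_original

# USA 2023, HALE model: road traffic gap 13 -> 1.92 (Iceland 2017)
road = counterfactual_change(0.464, 5.91, current=13.0, target=1.92)

# USA 2023, HALE model: diabetes gap 7.73 -> 0; the negative beta gives a positive change
diabetes = counterfactual_change(-0.130, 3.53, current=7.73, target=0.0)
```

The sign of the result depends on both the coefficient and the direction of the shift, which is why a predictor with a positive coefficient can still appear among the gap-widening factors when its current gap is below the target.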

Key Findings: USA HALE Gap (2023)

Gap-Closing Factors (negative values = reduce HALE gap):

  1. Road Traffic (-0.868 years [-0.943, -0.792]): The largest opportunity for reducing the HALE gap. If the USA could achieve Iceland’s 2017 road traffic gender gap (1.92), the HALE gap would shrink by about 0.87 years.

  2. Suicide (-0.522 years [-0.618, -0.429]): The second-largest factor. Achieving Greece’s 2002 suicide gender gap (4.05) would reduce the HALE gap by over half a year.

  3. Drug Disorders (-0.467 years [-0.715, -0.180]): A major contributor. Achieving Japan’s 2013 drug disorder gap (essentially zero) would reduce the HALE gap by nearly half a year.

  4. Homicide (-0.203 years [-0.229, -0.178]): Reducing the homicide gender gap to zero would reduce the HALE gap by about 0.2 years.

  5. Liver Disease (-0.163 years [-0.209, -0.122]): Achieving Iceland’s 2001 liver disease gap would provide a modest reduction.

  6. Alcohol (-0.150 years [-0.211, -0.086]): Achieving Colombia’s 2016 alcohol gap would reduce the HALE gap by about 0.15 years.

  7. Neoplasms (-0.145 years [-0.198, -0.093]): Eliminating the cancer gender gap would provide a modest reduction.

  8. Unintentional Injury (-0.075 years [-0.099, -0.052]): A smaller but measurable opportunity.

  9. COVID-19 (-0.007 years [-0.009, -0.005]): By 2023, COVID-19’s contribution to the gap is minimal, indicating recovery from the pandemic’s acute phase.

Gap-Widening Factors (positive values = increase HALE gap):

  1. Diabetes (+0.281 years [0.200, 0.362]): The competing risk effect. Eliminating the diabetes gender gap would actually widen the HALE gap, reflecting that diabetes primarily affects people who survive other causes.

  2. Cardiovascular (+0.227 years [0.193, 0.264]): Similar competing risk pattern. Women who survive other causes live to older ages where cardiovascular disease dominates.

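  3. Chronic Respiratory (+0.217 years [0.182, 0.254]): Unlike diabetes and cardiovascular disease, this predictor has a positive coefficient. The widening arises because the US chronic respiratory gap is negative (-6.32, reflecting worse outcomes for women), so moving it to the zero target raises the predicted HALE gap.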

Total Potential:

Comparison with 2021 WHO-Based Analysis

Notable changes when comparing IHME 2023 results to WHO 2021 results:

These changes reflect a combination of:

Key Findings: USA Life Expectancy Gap (2023)

Gap-Closing Factors (negative values = reduce LE gap):

  1. Road Traffic (-0.833 years [-0.919, -0.743]): The largest opportunity for reducing the LE gap, similar to HALE. If the USA could achieve Iceland’s 2017 road traffic gender gap, the LE gap would shrink by over 0.8 years.

  2. Drug Disorders (-0.770 years [-1.080, -0.464]): The second-largest factor for LE (vs third for HALE). Drug disorders have a larger effect on LE than HALE (0.77 vs 0.47 years), suggesting they affect lifespan more than healthy lifespan.

  3. Suicide (-0.521 years [-0.632, -0.414]): Nearly identical effect to HALE (-0.522 years), showing suicide affects both lifespan and healthy lifespan equally.

  4. Homicide (-0.289 years [-0.319, -0.262]): Larger effect on LE than HALE (0.29 vs 0.20 years), as homicides disproportionately affect younger individuals, reducing total lifespan more than healthy years.

  5. Liver Disease (-0.215 years [-0.271, -0.167]): Slightly larger effect on LE than HALE (0.22 vs 0.16 years).

  6. Neoplasms (-0.193 years [-0.252, -0.125]): Larger effect on LE than HALE (0.19 vs 0.15 years), suggesting cancer affects total lifespan more than healthy lifespan.

  7. Alcohol (-0.159 years [-0.232, -0.085]): Similar to HALE effect (0.15 years).

  8. Unintentional Injury (-0.063 years [-0.091, -0.035]): Similar to HALE effect (0.08 years).

  9. COVID-19 (-0.013 years [-0.015, -0.011]): By 2023, COVID-19’s contribution is minimal but nearly double the HALE effect (0.013 vs 0.007 years), indicating the pandemic affected lifespan more than healthy lifespan.

Gap-Widening Factors (positive values = increase LE gap):

  1. Diabetes (+0.232 years [0.132, 0.323]): Competing risk effect, smaller for LE than HALE (0.23 vs 0.28 years).

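  2. Chronic Respiratory (+0.175 years [0.136, 0.220]): Smaller for LE than HALE (0.18 vs 0.22 years). The coefficient is positive (β = 0.296), so this is not a competing-risk effect; the gap widens because the US chronic respiratory gap is negative (-6.32) and the target is zero.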

  3. Cardiovascular (+0.157 years [0.116, 0.200]): Competing risk effect, smaller for LE than HALE (0.16 vs 0.23 years).

Total Potential:

Comparison: HALE vs LE Counterfactuals (2023)

Key Differences:

  1. Drug Disorders: Much larger effect on LE (-0.770 years) than HALE (-0.467 years), a difference of 0.30 years. This suggests drug-related deaths disproportionately reduce total lifespan compared to healthy years, possibly because they affect younger individuals who would otherwise have many healthy years ahead.

  2. Homicide: Larger effect on LE (-0.289 years) than HALE (-0.203 years), a difference of 0.09 years. Similar to drug disorders, homicides affect younger individuals, reducing total lifespan more than healthy years.

  3. Cardiovascular: Larger competing-risk effect for HALE (+0.227 years) than LE (+0.157 years), a difference of 0.07 years. This suggests cardiovascular disease disproportionately affects healthy years in older age.

  4. Diabetes: Larger competing-risk effect for HALE (+0.281 years) than LE (+0.232 years), a difference of 0.05 years. Similar pattern to cardiovascular disease.

  5. Suicide, Road Traffic, Alcohol: Nearly identical effects for both HALE and LE, indicating these factors affect lifespan and healthy lifespan proportionally.

Overall Pattern:

Counterfactual Effects for All Indicators

HALE Gap Counterfactuals (2023):

| Indicator | Current gap | Target gap | Target Country-Year | Change in HALE gap (years) |
|---|---|---|---|---|
| Cardiovascular | 30.7 | 0 | | 0.227 [0.193, 0.264] |
| Neoplasms | 23.5 | 0 | | -0.145 [-0.198, -0.093] |
| Homicide | 8.85 | 0 | | -0.203 [-0.229, -0.178] |
| ChronicRespiratory | -6.32 | 0 | | 0.217 [0.182, 0.254] |
| Suicide | 17.7 | 4.05 | Greece (2002) | -0.522 [-0.618, -0.429] |
| UnintentionalInjury | 5.9 | 0 | | -0.075 [-0.099, -0.052] |
| RoadTraffic | 13 | 1.92 | Iceland (2017) | -0.868 [-0.943, -0.792] |
| LiverDisease | 9.06 | 0.729 | Iceland (2001) | -0.163 [-0.209, -0.122] |
| Alcohol | 7.37 | 0.232 | Colombia (2016) | -0.150 [-0.211, -0.086] |
| COVID | 1.28 | 0 | | -0.007 [-0.009, -0.005] |
| Diabetes | 7.73 | 0 | | 0.281 [0.200, 0.362] |
| DrugDisorder | 23.6 | 0.0028 | Japan (2013) | -0.467 [-0.715, -0.180] |

Life Expectancy Gap Counterfactuals (2023):

| Indicator | Current gap | Target gap | Target Country-Year | Change in Life Expectancy gap (years) |
|---|---|---|---|---|
| Neoplasms | 23.5 | 0 | | -0.193 [-0.252, -0.125] |
| Cardiovascular | 30.7 | 0 | | 0.157 [0.116, 0.200] |
| Homicide | 8.85 | 0 | | -0.289 [-0.319, -0.262] |
| Suicide | 17.7 | 4.05 | Greece (2002) | -0.521 [-0.632, -0.414] |
| ChronicRespiratory | -6.32 | 0 | | 0.175 [0.136, 0.220] |
| RoadTraffic | 13 | 1.92 | Iceland (2017) | -0.833 [-0.919, -0.743] |
| UnintentionalInjury | 5.9 | 0 | | -0.063 [-0.091, -0.035] |
| LiverDisease | 9.06 | 0.729 | Iceland (2001) | -0.215 [-0.271, -0.167] |
| COVID | 1.28 | 0 | | -0.013 [-0.015, -0.011] |
| Alcohol | 7.37 | 0.232 | Colombia (2016) | -0.159 [-0.232, -0.085] |
| Diabetes | 7.73 | 0 | | 0.232 [0.132, 0.323] |
| DrugDisorder | 23.6 | 0.0028 | Japan (2013) | -0.770 [-1.080, -0.464] |

Counterfactual Visualizations

HALE Gap (2023):

Figure 3: Forest plot showing counterfactual effects for USA HALE gap (2023) with 94% credible intervals.

Figure 4: Two-panel plot separating gap-closing (left) and gap-widening (right) factors for USA HALE gap (2023).

Figure 5: Bar chart of counterfactual effects sorted by magnitude for USA HALE gap (2023).

Life Expectancy Gap (2023):

Figure 6: Forest plot showing counterfactual effects for USA Life Expectancy gap (2023) with 94% credible intervals.

Figure 7: Two-panel plot separating gap-closing (left) and gap-widening (right) factors for USA Life Expectancy gap (2023).

Figure 8: Bar chart of counterfactual effects sorted by magnitude for USA Life Expectancy gap (2023).

Positive-Contributing Factors Over Time

The following analysis shows how gap-closing factors (positive-contributing indicators) have evolved over time for the United States. Each factor’s contribution is computed as the reduction in the gap that would occur if that factor were set to its best attainable value.

HALE Gap - Positive Contributions Over Time (IHME, 2000-2023):

Figure 9: Stacked area chart showing contributions of gap-closing factors over time for USA HALE gap (2000-2023). The chart shows how different factors have contributed to explaining the HALE gap across the full temporal range.

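| Year | Neoplasms | Homicide | Suicide | UnintentionalInjury | RoadTraffic | LiverDisease | Alcohol | COVID | DrugDisorder | Predicted Total | Actual Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | 0.134 | 0.152 | 0.546 | 0.0662 | 0.965 | 0.163 | 0.0749 | 0 | 0.0884 | 2.49 | 2.34 |
| 2001 | 0.129 | 0.156 | 0.55 | 0.0638 | 0.992 | 0.165 | 0.0753 | 0 | 0.0947 | 2.47 | 2.31 |
| 2002 | 0.13 | 0.16 | 0.553 | 0.0629 | 1.01 | 0.17 | 0.0772 | 0 | 0.102 | 2.46 | 2.31 |
| 2003 | 0.127 | 0.161 | 0.544 | 0.0621 | 1 | 0.173 | 0.0755 | 0 | 0.109 | 2.39 | 2.28 |
| 2004 | 0.128 | 0.16 | 0.527 | 0.0631 | 0.986 | 0.17 | 0.0749 | 0 | 0.113 | 2.3 | 2.24 |
| 2005 | 0.127 | 0.169 | 0.534 | 0.0601 | 1.02 | 0.175 | 0.076 | 0 | 0.12 | 2.32 | 2.26 |
| 2006 | 0.128 | 0.175 | 0.538 | 0.0634 | 1.03 | 0.174 | 0.0761 | 0 | 0.127 | 2.28 | 2.26 |
| 2007 | 0.135 | 0.17 | 0.545 | 0.0597 | 1 | 0.179 | 0.0738 | 0 | 0.128 | 2.24 | 2.23 |
| 2008 | 0.138 | 0.162 | 0.56 | 0.0562 | 0.936 | 0.182 | 0.0757 | 0 | 0.128 | 2.15 | 2.18 |
| 2009 | 0.147 | 0.15 | 0.569 | 0.0566 | 0.845 | 0.183 | 0.075 | 0 | 0.128 | 1.99 | 2.13 |
| 2010 | 0.144 | 0.142 | 0.578 | 0.0555 | 0.794 | 0.186 | 0.0779 | 0 | 0.128 | 1.92 | 2.08 |
| 2011 | 0.144 | 0.141 | 0.588 | 0.0505 | 0.797 | 0.188 | 0.0789 | 0 | 0.134 | 1.89 | 2.03 |
| 2012 | 0.148 | 0.144 | 0.591 | 0.0503 | 0.8 | 0.193 | 0.081 | 0 | 0.14 | 1.88 | 2.02 |
| 2013 | 0.155 | 0.135 | 0.591 | 0.0503 | 0.792 | 0.195 | 0.0846 | 0 | 0.152 | 1.85 | 2 |
| 2014 | 0.158 | 0.136 | 0.596 | 0.0523 | 0.801 | 0.193 | 0.0877 | 0 | 0.168 | 1.84 | 1.98 |
| 2015 | 0.152 | 0.154 | 0.61 | 0.0546 | 0.853 | 0.191 | 0.0935 | 0 | 0.196 | 1.91 | 2 |
| 2016 | 0.151 | 0.168 | 0.64 | 0.0597 | 0.904 | 0.186 | 0.102 | 0 | 0.237 | 2.01 | 2.07 |
| 2017 | 0.149 | 0.168 | 0.664 | 0.0608 | 0.902 | 0.18 | 0.107 | 0 | 0.266 | 1.99 | 2.06 |
| 2018 | 0.15 | 0.158 | 0.661 | 0.0605 | 0.862 | 0.177 | 0.108 | 0 | 0.282 | 1.9 | 2.01 |
| 2019 | 0.159 | 0.167 | 0.659 | 0.0671 | 0.878 | 0.177 | 0.116 | 0 | 0.317 | 1.96 | 2.01 |
| 2020 | 0.159 | 0.213 | 0.694 | 0.0726 | 0.997 | 0.184 | 0.144 | 0.116 | 0.385 | 2.38 | 2.32 |
| 2021 | 0.16 | 0.228 | 0.709 | 0.0791 | 1.06 | 0.188 | 0.158 | 0.227 | 0.445 | 2.6 | 2.48 |
| 2022 | 0.152 | 0.216 | 0.708 | 0.0741 | 1.05 | 0.181 | 0.158 | 0.0829 | 0.472 | 2.39 | 2.2 |
| 2023 | 0.145 | 0.203 | 0.676 | 0.0754 | 1.02 | 0.177 | 0.155 | 0.00724 | 0.467 | 2.17 | 2.01 |

Figure 10: Percentage of actual HALE gap explained by positive-contributing (gap-closing) factors over time. This shows what proportion of the observed gap could be reduced by addressing these factors.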

Life Expectancy Gap - Positive Contributions Over Time (OWID, 2000-2023):

Figure 11: Stacked area chart showing contributions of gap-closing factors over time for USA Life Expectancy gap (2000-2023). The extended temporal range now includes the full COVID period and post-acute recovery.

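| Year | Neoplasms | Homicide | Suicide | RoadTraffic | UnintentionalInjury | LiverDisease | COVID | Alcohol | DrugDisorder | Predicted Total | Actual Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | 0.177 | 0.216 | 0.545 | 0.926 | 0.0552 | 0.215 | 0 | 0.0795 | 0.146 | 5.16 | 5.31 |
| 2001 | 0.171 | 0.223 | 0.55 | 0.952 | 0.0532 | 0.217 | 0 | 0.0799 | 0.156 | 5.16 | 5.25 |
| 2002 | 0.173 | 0.229 | 0.552 | 0.968 | 0.0525 | 0.223 | 0 | 0.082 | 0.169 | 5.17 | 5.23 |
| 2003 | 0.168 | 0.229 | 0.543 | 0.959 | 0.0518 | 0.228 | 0 | 0.0801 | 0.18 | 5.11 | 5.17 |
| 2004 | 0.169 | 0.229 | 0.526 | 0.946 | 0.0526 | 0.223 | 0 | 0.0795 | 0.186 | 5.04 | 5.1 |
| 2005 | 0.169 | 0.241 | 0.533 | 0.982 | 0.0502 | 0.23 | 0 | 0.0807 | 0.198 | 5.07 | 5.12 |
| 2006 | 0.169 | 0.249 | 0.538 | 0.985 | 0.0529 | 0.229 | 0 | 0.0808 | 0.209 | 5.06 | 5.09 |
| 2007 | 0.179 | 0.242 | 0.545 | 0.96 | 0.0498 | 0.235 | 0 | 0.0784 | 0.21 | 5.03 | 5.05 |
| 2008 | 0.183 | 0.231 | 0.56 | 0.898 | 0.0469 | 0.239 | 0 | 0.0803 | 0.211 | 4.95 | 4.94 |
| 2009 | 0.195 | 0.213 | 0.568 | 0.81 | 0.0472 | 0.24 | 0 | 0.0797 | 0.211 | 4.81 | 4.89 |
| 2010 | 0.19 | 0.203 | 0.577 | 0.762 | 0.0463 | 0.245 | 0 | 0.0827 | 0.211 | 4.74 | 4.82 |
| 2011 | 0.191 | 0.2 | 0.587 | 0.765 | 0.0422 | 0.247 | 0 | 0.0838 | 0.221 | 4.73 | 4.74 |
| 2012 | 0.196 | 0.206 | 0.59 | 0.767 | 0.042 | 0.253 | 0 | 0.086 | 0.232 | 4.74 | 4.74 |
| 2013 | 0.205 | 0.193 | 0.591 | 0.76 | 0.042 | 0.257 | 0 | 0.0898 | 0.251 | 4.72 | 4.75 |
| 2014 | 0.209 | 0.194 | 0.596 | 0.769 | 0.0437 | 0.253 | 0 | 0.0931 | 0.276 | 4.73 | 4.77 |
| 2015 | 0.201 | 0.219 | 0.609 | 0.818 | 0.0456 | 0.251 | 0 | 0.0993 | 0.324 | 4.84 | 4.81 |
| 2016 | 0.199 | 0.239 | 0.639 | 0.867 | 0.0499 | 0.244 | 0 | 0.108 | 0.39 | 4.97 | 4.96 |
| 2017 | 0.197 | 0.24 | 0.664 | 0.865 | 0.0507 | 0.236 | 0 | 0.113 | 0.438 | 4.99 | 5 |
| 2018 | 0.199 | 0.225 | 0.66 | 0.827 | 0.0505 | 0.233 | 0 | 0.115 | 0.466 | 4.92 | 5.01 |
| 2019 | 0.211 | 0.238 | 0.658 | 0.842 | 0.056 | 0.233 | 0 | 0.124 | 0.522 | 5.01 | 5.01 |
| 2020 | 0.21 | 0.304 | 0.693 | 0.956 | 0.0606 | 0.242 | 0.209 | 0.152 | 0.635 | 5.59 | 5.53 |
| 2021 | 0.212 | 0.324 | 0.708 | 1.01 | 0.0661 | 0.248 | 0.409 | 0.168 | 0.734 | 5.96 | 5.74 |
| 2022 | 0.201 | 0.308 | 0.707 | 1.01 | 0.0619 | 0.238 | 0.149 | 0.168 | 0.779 | 5.65 | 5.46 |
| 2023 | 0.193 | 0.289 | 0.676 | 0.978 | 0.0629 | 0.233 | 0.013 | 0.164 | 0.77 | 5.37 | 4.98 |

Figure 12: Percentage of actual Life Expectancy gap explained by positive-contributing (gap-closing) factors over time through 2023.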

Conclusions

Key Findings

  1. Successful Data Source Transition: The switch from WHO HALE to IHME HALE was successful, maintaining methodological consistency with all predictor variables while extending temporal coverage to 2023.

  2. Extended COVID-19 Period: Including 2022-2023 data shows that COVID-19’s effect on gender gaps persisted into the post-acute phase, with a small but consistent positive coefficient (β = 0.060).

  3. Coefficient Stability with Notable Shifts: Most coefficients remained stable when switching data sources, but three showed substantial changes:

    • Neoplasms decreased (-0.112): Cancer gaps may be less important in IHME data or evolved 2021-2023

    • Homicide decreased (-0.075): Violence gaps may be narrowing or measured differently

    • Chronic Respiratory increased (+0.067): Respiratory disease gaps became more important, possibly due to COVID-19’s long-term effects

  4. Diabetes Coefficient Nearly Identical: The diabetes coefficient was virtually unchanged (β = -0.129 → -0.130), demonstrating a remarkably robust competing-risk relationship across data sources.

  5. Model Performance: Both the IHME-based HALE model (R² = 0.982) and the OWID-based LE model (R² = 0.978) achieve excellent fit, explaining nearly all variation in gender gaps across country-years.

  6. Aligned Temporal Coverage: Both models now span 2000-2023 with 888 observations each, enabling direct comparison of HALE vs LE gap drivers throughout the full COVID period and post-acute recovery phase.

Advantages of IHME HALE and OWID LE Data

IHME HALE:

  1. Methodological Consistency: All variables (HALE and predictors) come from the same IHME GBD methodology

  2. Extended Temporal Range: Two additional years (2022-2023) beyond WHO capture post-acute COVID dynamics

  3. No Extreme Outliers: Unlike WHO data (Israel 2021), IHME data showed no extreme residuals

  4. Maintained Quality: High correlation with WHO HALE (r > 0.95) confirms data quality

OWID LE:

  1. Extended Temporal Coverage: Extends through 2023, matching IHME HALE temporal range (vs 2021 for WHO LE)

  2. Complete OECD Coverage: 100% complete data for all 38 OECD countries including Turkey

  3. High-Quality Sources: Combines authoritative data from Human Mortality Database and UN World Population Prospects

  4. Validated Quality: High correlation with WHO LE (r = 0.993) confirms excellent agreement

  5. Temporal Alignment: Both HALE and LE models now span identical time periods (2000-2023)

Limitations and Future Work

  1. Data Source Differences: Some coefficient changes may reflect methodological differences between WHO HALE and IHME HALE rather than temporal evolution. Similarly, OWID LE combines multiple sources (HMD + UN WPP) vs WHO’s direct estimates. Future work could decompose these methodological effects.

  2. Limited Post-COVID Data: Only 4 years of COVID data (2020-2023) limits assessment of long-term pandemic effects. As more post-2023 data becomes available, tracking whether coefficient shifts persist will be valuable.

  3. Turkey Exclusion: Turkey is excluded from this analysis because it was identified as an outlier with very low likelihood in the Bayesian model. This decision was made based on model diagnostics, not data availability.

  4. OWID vs WHO LE Comparison: OWID LE shows high correlation with WHO LE (r = 0.993) but some country-year combinations differ by up to 3 years. Most differences are within expected bounds for different estimation methodologies.

Recommendations

  1. Continue with IHME HALE: Maintain IHME HALE as the primary target for future analyses to ensure methodological consistency with IHME predictors and maximize temporal coverage.

  2. Continue with OWID LE: Use OWID LE data for extended temporal coverage matching IHME HALE. The high correlation with WHO LE (r = 0.993) confirms data quality while providing the advantage of complete temporal alignment.

  3. Monitor Coefficient Evolution: As more post-2023 data becomes available, track whether the coefficient shifts (especially Neoplasms and Chronic Respiratory) represent lasting changes or transient effects.

  4. Update Annually: As IHME updates its GBD database and OWID incorporates new UN WPP data, rerun models to incorporate new data and assess temporal stability.

  5. Investigate Respiratory Disease: The increased importance of chronic respiratory disease gaps warrants further investigation, particularly regarding COVID-19’s long-term respiratory effects through 2023.

  6. COVID-19 Effect Monitoring: The larger COVID effect in LE (β = 0.108) vs HALE (β = 0.060) suggests pandemic impacts on lifespan exceeded impacts on healthy lifespan. Monitor whether this pattern persists or changes in future years.

  7. Age-Dependent Effects: Counterfactual analysis reveals that causes affecting younger individuals (drug disorders, homicide) have larger effects on LE than HALE, while competing-risk causes in older age (diabetes, cardiovascular) have larger effects on HALE than LE. This pattern provides insights into how different causes affect lifespan vs healthy lifespan across the life course.