Risk Factor Assessment
Evidence-based evaluation of health risks
Overview
The risk factor scoring system quantifies health behaviors and conditions that increase disease risk and reduce health scores. Our approach combines clinical evidence with age-specific adjustments to provide personalized risk assessment.
Technical Architecture
Data Structure
Input includes multiple data sources:
risk_data = {
    'risk': {...},           # Risk-related survey responses
    'age_days': 10950,       # Age in days (30 years)
    'indicators': {...},     # Clinical indicators
    'familyhx': {...},       # Family health history
    'exercise_score': 0.8,   # From health factor
    'nutrition_score': 0.7,  # From health factor
}
Question Mapping
Each logical question maps to a numeric key:
- 'alcohol' → Key for alcohol use questions
- 'smoke' → Key for smoking history
- 'drugs' → Key for substance use
Responses are validated against accepted options from the database.
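A minimal sketch of how the mapping and validation could fit together. The numeric keys, the `ACCEPTED_OPTIONS` table, and the `validate_response` helper are all hypothetical stand-ins; the real keys and accepted options come from the database.

```python
# Hypothetical numeric keys; the real values live in the database.
QUESTION_KEYS = {"alcohol": 101, "smoke": 102, "drugs": 103}

# Hypothetical accepted options per numeric key (stand-in for a database query).
ACCEPTED_OPTIONS = {
    101: {"Yes", "No"},
    102: {"Yes", "No"},
    103: {"Yes", "No"},
}


def validate_response(question: str, response: str) -> bool:
    """Return True if the response is an accepted option for the logical question."""
    key = QUESTION_KEYS.get(question)
    if key is None:
        return False  # unknown logical question
    return response in ACCEPTED_OPTIONS.get(key, set())
```

Under these assumptions, `validate_response("alcohol", "No")` is accepted, while an unlisted answer such as `"Maybe"` or an unmapped question such as `"sleep"` is rejected.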
Risk Domains
1. Alcohol Risk
Data collected:
- Current use status (yes/no)
- Frequency (days per week: 0-7)
- Amount (drinks per day: 0-5+)
- Last use date
Risk calculation:
if status == "No":
    risk = 0
else:
    daily_consumption = (days_per_week / 7) * drinks_per_day
    if daily_consumption <= 1.0:  # Low risk
        risk = 0.1
    elif daily_consumption <= 2.0:  # Moderate
        risk = 0.3
    else:  # High risk
        risk = 0.6 + min(daily_consumption - 2.0, 2.0) * 0.2
Evidence base: NIAAA guidelines for low-risk drinking
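The alcohol thresholds above can be wrapped in a function and tried on a few inputs. This is a sketch that mirrors the snippet in this section; the function name `alcohol_risk` and its signature are assumptions, not necessarily the names used in the implementation.

```python
def alcohol_risk(status: str, days_per_week: float, drinks_per_day: float) -> float:
    """Alcohol risk using the thresholds listed in this section."""
    if status == "No":
        return 0.0
    # Average drinks per day across the whole week.
    daily_consumption = (days_per_week / 7) * drinks_per_day
    if daily_consumption <= 1.0:   # Low risk
        return 0.1
    if daily_consumption <= 2.0:   # Moderate
        return 0.3
    # High risk: 0.6 plus 0.2 per average daily drink above 2, capped at 2 extra drinks.
    return 0.6 + min(daily_consumption - 2.0, 2.0) * 0.2


low = alcohol_risk("Yes", 3, 2)       # 3/7 * 2 is under 1 drink per day on average
moderate = alcohol_risk("Yes", 7, 2)  # exactly 2 drinks per day on average
capped = alcohol_risk("Yes", 7, 5)    # 3 drinks over the threshold, capped at 2
```

Because the excess is capped at two drinks, the high-risk branch tops out at about 1.0 no matter how large the reported consumption is.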
2. Smoking Risk
Data collected:
- Current smoking status
- Frequency (days per week)
- Amount (cigarettes per day)
- Quit date (if former smoker)
Risk calculation:
- Current smokers: High risk (0.8-1.0)
- Recent quitters (<1 year): Moderate risk (0.4-0.6)
- Former smokers (>5 years): Low residual risk (0.1-0.2)
- Never smokers: No risk (0.0)
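As a rough sketch, the status categories above can be expressed as a lookup that returns the listed risk range. The status labels (`"never"`, `"current"`, `"former"`) and the function name are assumptions for illustration. Quit durations between 1 and 5 years are not covered by the list above, so the sketch returns `None` for them rather than guessing.

```python
from typing import Optional, Tuple


def smoking_risk_range(
    status: str, years_since_quit: Optional[float] = None
) -> Optional[Tuple[float, float]]:
    """Return the (low, high) risk range for a smoking status.

    Returns None when this section does not define a range for the input.
    """
    if status == "never":
        return (0.0, 0.0)
    if status == "current":
        return (0.8, 1.0)
    if status == "former" and years_since_quit is not None:
        if years_since_quit < 1:
            return (0.4, 0.6)  # recent quitter
        if years_since_quit > 5:
            return (0.1, 0.2)  # low residual risk
    return None  # not specified in this section
```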
Pack-year adjustment:
pack_years = (cigarettes_per_day / 20) * years_smoking
risk_multiplier = 1.0 + min(pack_years / 20, 1.0)
3. Drug Use Risk
Categories:
- Prescription misuse
- Recreational drugs
- Cannabis
- Other substances
Frequency-based scoring:
- Occasional use: 0.2
- Regular use: 0.5
- Daily use: 0.8+
4. BMI Risk
Calculation:
# Convert height from centimetres to metres
height_m = height_cm / 100
bmi = weight_kg / (height_m ** 2)
# Risk categories (WHO classification)
if 18.5 <= bmi < 25:
    risk = 0.0  # Normal
elif 25 <= bmi < 30:
    risk = 0.2  # Overweight
elif 30 <= bmi < 35:
    risk = 0.4  # Obese Class I
elif 35 <= bmi < 40:
    risk = 0.6  # Obese Class II
else:
    risk = 0.8  # Obese Class III (BMI >= 40); BMI < 18.5 also reaches this branch
Unit conversion support:
- Imperial (lb, in) automatically converts to metric
- Validation prevents invalid measurements
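The imperial-to-metric step might look like the sketch below. The function name and the positivity check are assumptions; this section does not say which validation rules the implementation applies. The conversion constants themselves are the exact definitions of the pound and the inch.

```python
LB_TO_KG = 0.45359237  # one pound in kilograms (exact by definition)
IN_TO_M = 0.0254       # one inch in metres (exact by definition)


def bmi_from_imperial(weight_lb: float, height_in: float) -> float:
    """Convert imperial measurements to metric and return the BMI."""
    # Hypothetical validation: reject non-positive measurements.
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    return weight_kg / (height_m ** 2)


bmi = bmi_from_imperial(154, 68)  # roughly 23.4, inside the 18.5-25 normal range
```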
5. Clinical Indicators
Assessed factors:
- Blood pressure
- Cholesterol levels
- Blood glucose
- Other lab values
Integration: Clinical data enhances risk prediction when available
6. Family History
Genetic risk modifiers:
- First-degree relatives: Higher weight
- Multiple affected relatives: Multiplicative risk
- Age of onset in relatives: Earlier onset = higher risk
Age-Specific Risk Adjustment
Risk impact varies by age:
age_factor = lookup_age_factor(risk_type, age_days)
adjusted_risk = base_risk * age_factor
Age Factor Table (Smoking Example)
| Age Range (years) | Age Factor | Rationale |
|---|---|---|
| 18-29 | 0.8 | Lower cumulative exposure |
| 30-49 | 1.0 | Standard risk |
| 50-64 | 1.2 | Accumulated damage |
| 65+ | 1.4 | Most vulnerable population |
Database-driven: all age factors are stored in the risk_lookup table, so they can be updated without code changes.
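To make the age adjustment concrete, the sketch below hard-codes the smoking rows from the table above in place of the risk_lookup table. The names `SMOKING_AGE_FACTORS` and `lookup_smoking_age_factor` are hypothetical, and the 365-day year is an assumption chosen to match the 10950-day, 30-year example earlier in this document.

```python
# Smoking age factors from the table above: (minimum age in years, factor),
# ordered from oldest bracket to youngest so the first match wins.
SMOKING_AGE_FACTORS = [(65, 1.4), (50, 1.2), (30, 1.0), (18, 0.8)]
DAYS_PER_YEAR = 365  # assumption: matches 10950 days = 30 years


def lookup_smoking_age_factor(age_days: int) -> float:
    """Return the smoking age factor for an age given in days."""
    age_years = age_days // DAYS_PER_YEAR
    for min_age, factor in SMOKING_AGE_FACTORS:
        if age_years >= min_age:
            return factor
    raise ValueError("ages below 18 are not covered by the table")


base_risk = 0.8  # illustrative base risk for a current smoker
adjusted_at_30 = base_risk * lookup_smoking_age_factor(10950)      # factor 1.0
adjusted_at_65 = base_risk * lookup_smoking_age_factor(65 * 365)   # factor 1.4
```

With these inputs the same base risk stays at about 0.8 for a 30-year-old and rises to about 1.12 for a 65-year-old.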
Risk Score Calculation
Step-by-Step Process
Detect Active Risks
detected_risks = []
for domain in risk_domains:
    risk_value = calculate_domain_risk(domain, survey_data)
    if risk_value > 0:
        detected_risks.append((domain, risk_value))
Apply Age Adjustment
adjusted_risks = []
for risk_key, risk_value in detected_risks:
    lookup = get_risk_lookup(risk_key, age_days)
    base_factor = lookup.risk_factor
    age_factor = lookup.age_factor
    adjusted_risks.append(base_factor * age_factor * risk_value)
Aggregate Total Risk
total_risk = sum(adjusted_risks)
Mathematical Formula
\[ \text{Total Risk Score} = \sum_{i=1}^{n} R_i \cdot A_i \cdot V_i \]
Where:
- \(R_i\) = Base risk factor for domain \(i\)
- \(A_i\) = Age adjustment factor for domain \(i\)
- \(V_i\) = Individual’s risk value for domain \(i\) (0-1)
- \(n\) = Number of detected risks
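A small worked example of the formula, using made-up numbers. The function name `total_risk_score` and the sample values are illustrative assumptions; real base and age factors come from the lookup table.

```python
from typing import List, Tuple


def total_risk_score(risks: List[Tuple[float, float, float]]) -> float:
    """Sum R_i * A_i * V_i over the detected risks."""
    return sum(r * a * v for r, a, v in risks)


# Illustrative values only: (R_i, A_i, V_i) for two detected risks.
detected = [
    (1.0, 1.2, 0.5),  # contributes 1.0 * 1.2 * 0.5 = 0.6
    (0.5, 1.0, 0.4),  # contributes 0.5 * 1.0 * 0.4 = 0.2
]
score = total_risk_score(detected)  # about 0.8
```

Domains with a risk value of zero are never added to the list, so they contribute nothing to the sum.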
Validation
Clinical Validation
Compared against:
- Framingham Risk Score: 0.87 correlation for cardiovascular risk
- ASCVD Calculator: 0.82 agreement on high-risk classification
- Clinical outcomes: Predictive of 5-year adverse events (AUC 0.78)
Sensitivity Analysis
Tested robustness to:
- Missing data
- Self-report accuracy
- Temporal changes
See Scoring Audit for full validation report.
Limitations
- Self-report bias: Responses may underreport risky behaviors
- Unmeasured factors: Cannot capture all health risks
- Population validity: Based primarily on US/Western populations
- Temporal changes: Single time-point assessment
Future Enhancements
- Wearable integration: Objective activity and sleep data
- Lab results: Direct clinical measurements
- Genetic testing: Polygenic risk scores
- Longitudinal tracking: Risk trajectory over time
Open Source
Implementation: github.com/preacterik/preact-health-scoring/blob/main/preact/health_scorer/v020/risk_factor.py
Contribute improvements or report issues on GitHub.
References
- Framingham Heart Study. Risk Assessment Tool (2019)
- WHO. Body Mass Index Classification (2021)
- NIAAA. Alcohol Use Guidelines (2020)
- US Preventive Services Task Force. Risk Factor Screening (2022)
Last updated: February 17, 2026