This dataset contains health and lifestyle survey responses for studying how demographic, health, and lifestyle factors relate to diabetes status. It is derived from the 2015 Behavioral Risk Factor Surveillance System (BRFSS), an annual health-related telephone survey conducted by the Centers for Disease Control and Prevention (CDC).
- Source: Behavioral Risk Factor Surveillance System (BRFSS) 2015, CDC
- Instances: 253,680 survey responses
- Features: 21 variables, including demographic information, health indicators, and lifestyle factors
The dataset includes the following variables:
- Diabetes_binary: Target variable indicating diabetes status (0 = no diabetes, 1 = prediabetes or diabetes)
- HighBP: High blood pressure status (0 = no, 1 = yes)
- HighChol: High cholesterol status (0 = no, 1 = yes)
- CholCheck: Whether cholesterol was checked in the past 5 years (0 = no, 1 = yes)
- BMI: Body Mass Index (continuous variable)
- Smoker: Whether the respondent has smoked at least 100 cigarettes in their lifetime (0 = no, 1 = yes)
- Stroke: History of stroke (0 = no, 1 = yes)
- HeartDiseaseorAttack: History of heart disease or heart attack (0 = no, 1 = yes)
- PhysActivity: Physical activity in the past 30 days, not including job (0 = no, 1 = yes)
- Fruits: Consumption of fruits at least once per day (0 = no, 1 = yes)
- Veggies: Consumption of vegetables at least once per day (0 = no, 1 = yes)
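- HvyAlcoholConsump: Heavy alcohol consumption, defined as more than 14 drinks per week for adult men or more than 7 drinks per week for adult women (0 = no, 1 = yes)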
- AnyHealthcare: Having any kind of healthcare coverage (0 = no, 1 = yes)
- NoDocbcCost: Needed to see a doctor in the past 12 months but could not because of cost (0 = no, 1 = yes)
- GenHlth: General health rating (1 = excellent, 2 = very good, 3 = good, 4 = fair, 5 = poor)
- MentHlth: Number of days in the past 30 days when mental health was not good (0 to 30)
- PhysHlth: Number of days in the past 30 days when physical health was not good (0 to 30)
- DiffWalk: Difficulty walking or climbing stairs (0 = no, 1 = yes)
- Sex: Sex of the respondent (0 = female, 1 = male)
- Age: Age category in 13 levels, using five-year bands from age 25 upward (1 = 18-24, 2 = 25-29, ..., 13 = 80 or older)
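
The coding scheme above can be checked programmatically before modeling. The following is a minimal sketch using only the Python standard library. The helper names `age_label` and `validate_row` are hypothetical and are not part of the dataset or of any library. The mapping of Age codes 2 through 12 to consecutive five-year bands is an assumption based on the two endpoints shown in the list.

```python
# Columns described above as 0/1 indicators.
BINARY_COLUMNS = [
    "Diabetes_binary", "HighBP", "HighChol", "CholCheck", "Smoker", "Stroke",
    "HeartDiseaseorAttack", "PhysActivity", "Fruits", "Veggies",
    "HvyAlcoholConsump", "AnyHealthcare", "NoDocbcCost", "DiffWalk", "Sex",
]

# GenHlth codes and labels as given in the variable list.
GENHLTH_LABELS = {1: "excellent", 2: "very good", 3: "good", 4: "fair", 5: "poor"}

# Valid Age codes (1 through 13).
AGE_CODES = set(range(1, 14))


def age_label(code):
    """Map an Age code to its band (assumes five-year bands for codes 2-12)."""
    if code == 1:
        return "18-24"
    if code == 13:
        return "80 or older"
    if code in AGE_CODES:
        low = 25 + 5 * (int(code) - 2)
        return f"{low}-{low + 4}"
    raise ValueError(f"Age code out of range: {code}")


def validate_row(row):
    """Return a list of problems found in one record (dict of column -> number).

    Columns missing from the record are skipped rather than reported.
    """
    problems = []
    for col in BINARY_COLUMNS:
        if col in row and row[col] not in (0, 1):
            problems.append(f"{col}: expected 0 or 1, got {row[col]}")
    if "GenHlth" in row and row["GenHlth"] not in GENHLTH_LABELS:
        problems.append(f"GenHlth: expected 1-5, got {row['GenHlth']}")
    for col in ("MentHlth", "PhysHlth"):
        if col in row and not 0 <= row[col] <= 30:
            problems.append(f"{col}: expected 0-30 days, got {row[col]}")
    if "Age" in row and row["Age"] not in AGE_CODES:
        problems.append(f"Age: expected 1-13, got {row['Age']}")
    return problems


# Illustrative record with one deliberately invalid value (PhysHlth = 31).
example = {"Diabetes_binary": 0, "HighBP": 1, "GenHlth": 3,
           "MentHlth": 5, "PhysHlth": 31, "Age": 9}
print(validate_row(example))      # one problem, for PhysHlth
print(age_label(example["Age"]))  # 60-64
```

Values read from a CSV file arrive as strings, so they would need to be converted to numbers (for example with `float`) before being passed to `validate_row`.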