-
+
-
+
-Column
-Missing Values
+Column
+Missing Values
-0
-age
-0
+0
+age
+0
-1
-anaemia
-0
+1
+anaemia
+0
-2
-creatinine_phosphokinase
-0
+2
+creatinine_phosphokinase
+0
-3
-diabetes
-0
+3
+diabetes
+0
-4
-ejection_fraction
-0
+4
+ejection_fraction
+0
-5
-high_blood_pressure
-0
+5
+high_blood_pressure
+0
-6
-platelets
-0
+6
+platelets
+0
-7
-serum_creatinine
-0
+7
+serum_creatinine
+0
-8
-serum_sodium
-0
+8
+serum_sodium
+0
-9
-sex
-0
+9
+sex
+0
-10
-smoking
-0
+10
+smoking
+0
-11
-time
-0
+11
+time
+0
-12
-DEATH_EVENT
-0
+12
+DEATH_EVENT
+0
@@ -3334,7 +3339,7 @@
-
+
@@ -3373,12 +3378,18 @@ Model
We compared Decision Tree, KNN, and Logistic Regression, and selected Logistic Regression for its interpretability and its stronger cross-validation performance. As a regularised linear model, it works well with few features and is less prone to overfitting than more flexible models such as Decision Trees or KNN, especially when the dataset is relatively small.
Hyperparameter tuning to find the best Logistic Regression model:
-
-
+
+
+
+
+Table 4: Logistic Regression Scores
+
+
+
-
+
@@ -3424,26 +3435,30 @@ Model
+
+
+
The model performs well with C = 0.0001: its CV score of 0.83 is close to the train score, indicating that the model is generalising well.
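The tuning step described above can be sketched roughly as follows. This is a minimal illustration, not the report's actual code: the synthetic data from `make_classification`, the grid of `C` values, the 5-fold setting, and the default accuracy scoring are all assumptions, since the report only states the selected `C` and the resulting CV score.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the heart failure data (299 rows, 12 features).
# Assumption: the real analysis loads the clinical records CSV instead.
X, y = make_classification(
    n_samples=299, n_features=12, weights=[0.7, 0.3], random_state=0
)

# Scale the features, then fit a regularised logistic regression.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hypothetical grid; the report only says that C = 0.0001 was selected.
param_grid = {"logisticregression__C": [0.0001, 0.001, 0.01, 0.1, 1, 10]}

# return_train_score=True lets us compare train and CV scores per candidate.
search = GridSearchCV(pipe, param_grid, cv=5, return_train_score=True)
search.fit(X, y)

best = search.best_index_
best_C = search.best_params_["logisticregression__C"]
mean_train = search.cv_results_["mean_train_score"][best]
mean_cv = search.cv_results_["mean_test_score"][best]
print(f"best C={best_C}, train={mean_train:.2f}, cv={mean_cv:.2f}")
```

A small gap between `mean_train` and `mean_cv` is the kind of evidence the sentence above relies on when it says the model generalises well.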
-
+
-The best features to train our model are show in Table 4:
+Logistic Regression performs better than Decision Tree and KNN on the cross-validation data, so we selected it as our final model.
+The best features to train our model are shown in Table 5:
-Table 4: Top features for trainig the model.
+Table 5: Top features for training the model.
-
+
@@ -3547,19 +3562,19 @@ Confusion Matrix
-Table 5: Confusion matrix for the final model on the test dataset.
+Table 6: Confusion matrix for the final model on the test dataset.
-
+
-
+
Predicted
-0
-1
+0
+1
Actual
@@ -3569,14 +3584,14 @@ Confusion Matrix
-0
-35
-6
+0
+35
+6
-1
-5
-14
+1
+5
+14
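As a quick check on the confusion matrix above, the standard metrics can be recomputed from its four counts. The sketch assumes class 1 (`DEATH_EVENT` = 1) is the positive class and that rows are actual labels and columns are predicted labels, as the table headers indicate.

```python
# Counts read from the confusion matrix (rows: actual, columns: predicted).
tn, fp = 35, 6   # actual 0: predicted 0, predicted 1
fn, tp = 5, 14   # actual 1: predicted 0, predicted 1

total = tn + fp + fn + tp

accuracy = (tp + tn) / total          # share of all predictions that are correct
precision = tp / (tp + fp)            # share of predicted deaths that were real
recall = tp / (tp + fn)               # share of real deaths the model caught
f1 = 2 * precision * recall / (precision + recall)

print(
    f"accuracy={accuracy:.2f} precision={precision:.2f} "
    f"recall={recall:.2f} f1={f1:.2f}"
)
# prints: accuracy=0.82 precision=0.70 recall=0.74 f1=0.72
```

The recall of 14 / 19 ≈ 0.74 agrees with the recall score quoted in the results section; the 5 false negatives are the high-risk patients the model misses.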
@@ -3589,10 +3604,10 @@ Confusion Matrix
-Table 6: Evaluation metrics for the final model.
+Table 7: Evaluation metrics for the final model.
-
+
@@ -3638,7 +3653,7 @@ Confusion Matrix
Results and Conclusion
-The analysis revealed that platelets
and ejection_fraction
are the most important features (see Table 4) in predicting the risk of patient mortality. These features significantly impact the model’s ability to assess patient risk, which is crucial for early intervention. Our model achieved a recall score of 0.74 (see Table 6), which is a good start, but there is room for improvement, particularly in reducing the number of high risk patients the model might miss, i.e., maximising recall by minimising False Negatives.
+The analysis revealed that platelets
and ejection_fraction
are the most important features (see Table 5) in predicting the risk of patient mortality. These features significantly impact the model’s ability to assess patient risk, which is crucial for early intervention. Our model achieved a recall score of 0.74 (see Table 7), which is a good start, but there is room for improvement, particularly in reducing the number of high-risk patients the model might miss, i.e., maximising recall by minimising False Negatives.
The main challenges in this project stem from class imbalance and limited data availability. With more diverse and comprehensive datasets, performance could be further enhanced. We would also like to explore other machine learning models to improve the overall accuracy.
In conclusion, while the current model shows potential, there is significant opportunity to enhance its effectiveness. With improvements in data quality and model optimization, this tool could become a crucial asset in predicting patient risk and saving lives.
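One possible way to trade precision for recall, as the results above suggest, is to lower the decision threshold applied to the predicted probabilities. The sketch below is only an illustration of that idea: the labels and probabilities are made up, `recall_precision` is a hypothetical helper, and the report does not say that threshold tuning was used.

```python
# Hypothetical true labels and predicted probabilities of DEATH_EVENT = 1.
# These numbers are illustrative and are not taken from the report.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.65, 0.45, 0.30, 0.55, 0.40, 0.20, 0.15, 0.10, 0.05]


def recall_precision(y_true, y_prob, threshold):
    """Return (recall, precision) for class 1 at the given threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# Lowering the threshold flags more patients as high risk.
for threshold in (0.5, 0.4, 0.3):
    r, p = recall_precision(y_true, y_prob, threshold)
    print(f"threshold={threshold}: recall={r:.2f}, precision={p:.2f}")
```

On this toy data, recall rises from 0.50 to 1.00 as the threshold drops from 0.5 to 0.3, at the cost of more false positives; whether that trade-off is acceptable depends on the clinical setting.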
diff --git a/reports/heart-failure-analysis.ipynb b/reports/heart-failure-analysis.ipynb
index 73c6cc9..f3e7ebf 100644
--- a/reports/heart-failure-analysis.ipynb
+++ b/reports/heart-failure-analysis.ipynb
@@ -62,7 +62,7 @@
},
{
"cell_type": "code",
- "execution_count": 1,
+ "execution_count": 44,
"metadata": {},
"outputs": [
{
@@ -70,74 +70,74 @@
"text/html": [
"\n",
- "\n",
+ "\n",
" \n",
" \n",
- " Column Name \n",
- " Description \n",
+ " Column Name \n",
+ " Description \n",
" \n",
" \n",
" \n",
" \n",
- " age \n",
- " Patient's age \n",
+ " age \n",
+ " Patient's age \n",
" \n",
" \n",
- " anaemia \n",
- " Decrease of red blood cells or hemoglobin \n",
+ " anaemia \n",
+ " Decrease of red blood cells or hemoglobin \n",
" \n",
" \n",
- " creatinine_phosphokinase \n",
- " Level of the CPK enzyme in the blood \n",
+ " creatinine_phosphokinase \n",
+ " Level of the CPK enzyme in the blood \n",
" \n",
" \n",
- " diabetes \n",
- " If the patient has diabetes \n",
+ " diabetes \n",
+ " If the patient has diabetes \n",
" \n",
" \n",
- " ejection_fraction \n",
- " Percentage of blood leaving the heart at each contraction \n",
+ " ejection_fraction \n",
+ " Percentage of blood leaving the heart at each contraction \n",
" \n",
" \n",
- " high_blood_pressure \n",
- " If the patient has hypertension \n",
+ " high_blood_pressure \n",
+ " If the patient has hypertension \n",
" \n",
" \n",
- " platelets \n",
- " Platelets in the blood \n",
+ " platelets \n",
+ " Platelets in the blood \n",
" \n",
" \n",
- " serum_creatinine \n",
- " Level of serum creatinine in the blood \n",
+ " serum_creatinine \n",
+ " Level of serum creatinine in the blood \n",
" \n",
" \n",
- " serum_sodium \n",
- " Level of serum sodium in the blood \n",
+ " serum_sodium \n",
+ " Level of serum sodium in the blood \n",
" \n",
" \n",
- " sex \n",
- " Woman or man \n",
+ " sex \n",
+ " Woman or man \n",
" \n",
" \n",
- " smoking \n",
- " If the patient smokes or not \n",
+ " smoking \n",
+ " If the patient smokes or not \n",
" \n",
" \n",
- " time \n",
- " Follow-up period \n",
+ " time \n",
+ " Follow-up period \n",
" \n",
" \n",
- " DEATH_EVENT \n",
- " Whether the patient died or not (target variable) \n",
+ " DEATH_EVENT \n",
+ " Whether the patient died or not (target variable) \n",
" \n",
" \n",
"
\n"
],
"text/plain": [
- ""
+ ""
]
},
- "execution_count": 1,
+ "execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
@@ -170,7 +170,7 @@
"Markdown(table_df.to_markdown(index=False))\n",
"\n",
"# Save the table as a CSV\n",
- "table_df.to_csv('tables/patient_table.csv', index=False)\n",
+ "table_df.to_csv('../results/tables/patient_table.csv', index=False)\n",
"\n",
"# Display the DataFrame without the index\n",
"table_df.style.hide(axis=\"index\")\n"
@@ -224,17 +224,9 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": 45,
"metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "deepchecks - WARNING - You are using deepchecks version 0.18.1, however a newer version is available. Deepchecks is frequently updated with major improvements. You should consider upgrading via the \"python -m pip install --upgrade deepchecks\" command.\n"
- ]
- }
- ],
+ "outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
@@ -265,12 +257,12 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"# Load the dataset\n",
- "file_path = 'data/heart_failure_clinical_records_dataset.csv'\n",
+ "file_path = '../data/raw/heart_failure_clinical_records_dataset.csv'\n",
"heart_failure_data = pd.read_csv(file_path)\n",
"\n",
"# List of binary columns\n",
@@ -289,7 +281,7 @@
},
{
"cell_type": "code",
- "execution_count": 4,
+ "execution_count": 47,
"metadata": {},
"outputs": [
{
@@ -553,7 +545,7 @@
"[299 rows x 13 columns]"
]
},
- "execution_count": 4,
+ "execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
@@ -587,7 +579,7 @@
},
{
"cell_type": "code",
- "execution_count": 5,
+ "execution_count": 48,
"metadata": {},
"outputs": [
{
@@ -596,7 +588,7 @@
"(299, 13)"
]
},
- "execution_count": 5,
+ "execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
@@ -607,7 +599,7 @@
},
{
"cell_type": "code",
- "execution_count": 6,
+ "execution_count": 49,
"metadata": {},
"outputs": [
{
@@ -643,7 +635,7 @@
},
{
"cell_type": "code",
- "execution_count": 7,
+ "execution_count": 50,
"metadata": {},
"outputs": [
{
@@ -655,7 +647,7 @@
"Name: count, dtype: int64"
]
},
- "execution_count": 7,
+ "execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
@@ -674,7 +666,7 @@
},
{
"cell_type": "code",
- "execution_count": 8,
+ "execution_count": 51,
"metadata": {},
"outputs": [
{
@@ -821,7 +813,7 @@
"max 9.40000 148.000000 285.000000 "
]
},
- "execution_count": 8,
+ "execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
@@ -834,7 +826,7 @@
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": 52,
"metadata": {},
"outputs": [
{
@@ -949,7 +941,7 @@
"12 DEATH_EVENT 0"
]
},
- "execution_count": 9,
+ "execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
@@ -977,7 +969,7 @@
},
{
"cell_type": "code",
- "execution_count": 10,
+ "execution_count": 53,
"metadata": {},
"outputs": [
{
@@ -985,23 +977,23 @@
"text/html": [
"\n",
"\n",
- "\n",
+ "\n",
""
],
"text/plain": [
"alt.VConcatChart(...)"
]
},
- "execution_count": 12,
+ "execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
@@ -1250,7 +1242,7 @@
},
{
"cell_type": "code",
- "execution_count": 13,
+ "execution_count": 56,
"metadata": {},
"outputs": [
{
@@ -1258,23 +1250,23 @@
"text/html": [
"\n",
"\n",
- "\n",
+ "\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
- "execution_count": 14,
+ "execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
@@ -1426,7 +1418,7 @@
},
{
"cell_type": "code",
- "execution_count": 15,
+ "execution_count": 58,
"metadata": {},
"outputs": [
{
@@ -1434,23 +1426,23 @@
"text/html": [
"\n",
"\n",
- "\n",
+ "\n",
"