fix(curriculum): medical data visualizer missing instructions (#56779)

2026-05-14 19:01:05 -04:00 · 2024-10-29 20:27:49 +01:00
parent 7ee5ac0328
commit a580118e90
1 changed file with 14 additions and 11 deletions
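For orientation, here is a minimal sketch of the data-preparation steps that the instructions in this diff describe (steps 2, 3, 5, 6, 11, 12, and 13). It is not the project's reference solution. The rows are made up rather than read from `medical_examination.csv`, the `height` column is assumed to be in centimeters, and the count column name `total` is a name chosen here for illustration. The plotting calls are left out so that the sketch needs only `pandas` and `numpy`.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; NOT taken from medical_examination.csv.
# Assumption: `height` is in centimeters and `weight` is in kilograms.
df = pd.DataFrame({
    "height": [150, 165, 170, 175, 180, 160],
    "weight": [50.0, 80.0, 60.0, 95.0, 70.0, 62.0],
    "ap_hi": [120, 130, 110, 140, 80, 125],
    "ap_lo": [80, 85, 70, 90, 120, 82],
    "cholesterol": [1, 2, 1, 3, 1, 1],
    "gluc": [1, 1, 2, 3, 1, 1],
    "smoke": [0, 1, 0, 0, 1, 0],
    "alco": [0, 0, 0, 1, 0, 0],
    "active": [1, 1, 0, 1, 1, 1],
    "cardio": [0, 1, 0, 1, 1, 0],
})

# Step 2: BMI = weight (kg) / height (m) squared; 1 if BMI > 25, else 0.
bmi = df["weight"] / (df["height"] / 100) ** 2
df["overweight"] = (bmi > 25).astype(int)

# Step 3: a value of 1 becomes 0 (good); anything above 1 becomes 1 (bad).
for col in ["cholesterol", "gluc"]:
    df[col] = (df[col] > 1).astype(int)

# Step 5: reshape to long format, one row per (patient, feature) pair.
features = ["cholesterol", "gluc", "smoke", "alco", "active", "overweight"]
df_cat = pd.melt(df, id_vars=["cardio"], value_vars=features)

# Step 6: count each feature value separately for cardio=0 and cardio=1.
# The count column is named "total" here (a hypothetical choice).
df_cat = (
    df_cat.groupby(["cardio", "variable", "value"])
    .size()
    .reset_index(name="total")
)

# Step 11: keep only rows that pass all five plausibility checks.
df_heat = df[
    (df["ap_lo"] <= df["ap_hi"])
    & (df["height"] >= df["height"].quantile(0.025))
    & (df["height"] <= df["height"].quantile(0.975))
    & (df["weight"] >= df["weight"].quantile(0.025))
    & (df["weight"] <= df["weight"].quantile(0.975))
]

# Step 12: pairwise correlation of the remaining columns.
corr = df_heat.corr()

# Step 13: boolean mask that is True on and above the diagonal.
mask = np.triu(np.ones(corr.shape, dtype=bool))
```

In the project itself, `df_cat` would then be passed to `sns.catplot()` and `corr` with `mask` to `sns.heatmap()`, as steps 7 and 15 state. With so few rows, some columns in `df_heat` are constant, so parts of `corr` come out as `NaN`.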
--- a/curriculum/challenges/english/08-data-analysis-with-python/data-analysis-with-python-projects/medical-data-visualizer.md
+++ b/curriculum/challenges/english/08-data-analysis-with-python/data-analysis-with-python-projects/medical-data-visualizer.md
@@ -43,29 +43,32 @@ File name: medical_examination.csv


 ## Instructions
+
+Create a chart similar to `examples/Figure_1.png`, where we show the counts of good and bad outcomes for the `cholesterol`, `gluc`, `alco`, `active`, and `smoke` variables for patients with `cardio=1` and `cardio=0` in different panels.
+
 By each number in the `medical_data_visualizer.py` file, add the code from the associated instruction number below.

-1. Import the data from `medical_examination.csv` and assign it to the `df` variable
+1. Import the data from `medical_examination.csv` and assign it to the `df` variable.
 2. Add an `overweight` column to the data. To determine if a person is overweight, first calculate their BMI by dividing their weight in kilograms by the square of their height in meters. If that value is > 25 then the person is overweight. Use the value `0` for NOT overweight and the value `1` for overweight.
 3. Normalize data by making `0` always good and `1` always bad. If the value of `cholesterol` or `gluc` is 1, set the value to `0`. If the value is more than `1`, set the value to `1`.
-4. Draw the Categorical Plot in the `draw_cat_plot` function
+4. Draw the Categorical Plot in the `draw_cat_plot` function.
 5. Create a DataFrame for the cat plot using `pd.melt` with values from `cholesterol`, `gluc`, `smoke`, `alco`, `active`, and `overweight` in the `df_cat` variable.
 6. Group and reformat the data in `df_cat` to split it by `cardio`. Show the counts of each feature. You will have to rename one of the columns for the `catplot` to work correctly.
-7. Convert the data into `long` format and create a chart that shows the value counts of the categorical features using the following method provided by the seaborn library import : `sns.catplot()`
-8. Get the figure for the output and store it in the `fig` variable 
-9. Do not modify the next two lines
-10. Draw the Heat Map in the `draw_heat_map` function 
+7. Convert the data into `long` format and create a chart that shows the value counts of the categorical features using the following method provided by the seaborn library import: `sns.catplot()`.
+8. Get the figure for the output and store it in the `fig` variable.
+9. Do not modify the next two lines.
+10. Draw the Heat Map in the `draw_heat_map` function.
 11. Clean the data in the `df_heat` variable by filtering out the following patient segments that represent incorrect data:
    - diastolic pressure is higher than systolic (Keep the correct data with `(df['ap_lo'] <= df['ap_hi'])`)
    - height is less than the 2.5th percentile (Keep the correct data with `(df['height'] >= df['height'].quantile(0.025))`)
    - height is more than the 97.5th percentile
    - weight is less than the 2.5th percentile
    - weight is more than the 97.5th percentile
-12. Calculate the correlation matrix and store it in the `corr` variable
-13. Generate a mask for the upper triangle and store it in the `mask` variable
-14. Set up the `matplotlib` figure 
-15. Plot the correlation matrix using the method provided by the `seaborn` library import: `sns.heatmap()`
-16. Do not modify the next two lines
+12. Calculate the correlation matrix and store it in the `corr` variable.
+13. Generate a mask for the upper triangle and store it in the `mask` variable.
+14. Set up the `matplotlib` figure.
+15. Plot the correlation matrix using the method provided by the `seaborn` library import: `sns.heatmap()`.
+16. Do not modify the next two lines.

 ## Development