MediPulse is preparing a patient-risk dataset for a hospital readmission model. Before any imputation or modeling, the analytics team wants to understand whether missing values are random, concentrated in certain fields, or associated with patient outcomes.
You are given summary statistics from a 5,000-row dataset. Your task is to quantify the missingness pattern, recommend the most informative visualizations, and test at the stated significance level whether missingness in Income is associated with 30-day readmission.
| Variable | Missing Count | Missing Rate |
|---|---|---|
| Age | 50 | 1.0% |
| Income | 600 | 12.0% |
| Blood Pressure | 425 | 8.5% |
| Cholesterol | 900 | 18.0% |
| Smoking Status | 300 | 6.0% |
| Readmitted in 30 Days | 0 | 0.0% |
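
As a quick sanity check on the table above, the missing rates can be recomputed from the counts and ranked. This is a minimal sketch that assumes only the summary counts are available (no row-level data); the names `missing_counts`, `missing_rates`, and `ranked` are illustrative and do not come from the dataset.

```python
# Missing counts copied from the summary table above.
# All variable names here are illustrative assumptions.
TOTAL_ROWS = 5000

missing_counts = {
    "Age": 50,
    "Income": 600,
    "Blood Pressure": 425,
    "Cholesterol": 900,
    "Smoking Status": 300,
    "Readmitted in 30 Days": 0,
}

# Percentage of rows missing for each variable.
missing_rates = {
    name: 100 * count / TOTAL_ROWS for name, count in missing_counts.items()
}

# Rank variables from most to least missing.
ranked = sorted(missing_rates.items(), key=lambda item: item[1], reverse=True)

# Print a plain-text bar chart: one '#' per rounded percentage point.
for name, rate in ranked:
    bar = "#" * round(rate)
    print(f"{name:<22} {rate:5.1f}%  {bar}")
```

The ranking puts Cholesterol and Income at the top, which is one reason a per-variable bar chart of missing rates is a natural first visualization here.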

Additional joint missingness and outcome data:

| Metric | Value |
|---|---|
| Total rows | 5000 |
| Rows with both Income and Cholesterol missing | 420 |
| Readmitted among Income missing | 138 of 600 |
| Readmitted among Income observed | 690 of 4400 |
| Significance level | 0.05 |
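
The figures above support two checks, sketched below with the standard library only. The first compares the observed count of rows missing both Income and Cholesterol with the count expected if the two were missing independently. The second compares readmission rates between Income-missing and Income-observed rows. The source does not name a test, so the pooled two-proportion z-test used here is an assumption; a chi-square test on the 2×2 table would be an equivalent choice. All variable names are illustrative.

```python
import math

# Figures copied from the two summary tables; names are illustrative.
N = 5000
income_missing = 600
chol_missing = 900
both_missing = 420
readmit_missing, n_missing = 138, 600      # readmitted among Income missing
readmit_observed, n_observed = 690, 4400   # readmitted among Income observed
ALPHA = 0.05

# --- Joint missingness: observed vs. expected under independence ---
expected_both = N * (income_missing / N) * (chol_missing / N)
p_chol_given_income_missing = both_missing / income_missing
p_chol_given_income_observed = (chol_missing - both_missing) / (N - income_missing)

# --- Two-proportion z-test (pooled standard error; assumed test choice) ---
p1 = readmit_missing / n_missing
p2 = readmit_observed / n_observed
pooled = (readmit_missing + readmit_observed) / (n_missing + n_observed)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_missing + 1 / n_observed))
z = (p1 - p2) / se
# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))
reject_null = p_value < ALPHA

print(f"Expected rows with both missing if independent: {expected_both:.0f}")
print(f"Observed rows with both missing: {both_missing}")
print(f"P(Cholesterol missing | Income missing):  {p_chol_given_income_missing:.3f}")
print(f"P(Cholesterol missing | Income observed): {p_chol_given_income_observed:.3f}")
print(f"Readmission rate, Income missing:  {p1:.3f}")
print(f"Readmission rate, Income observed: {p2:.3f}")
print(f"z = {z:.2f}, p = {p_value:.2e}, reject H0 at alpha={ALPHA}: {reject_null}")
```

With these inputs, the observed overlap (420) is far above the roughly 108 rows expected under independence, and the readmission rates differ (23.0% vs. about 15.7%) with z near 4.5, so the null hypothesis of no association is rejected at the 0.05 level. That is evidence against Income being missing completely at random, though it does not by itself identify the mechanism.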