You’re the on-call data scientist supporting a high-throughput clinical diagnostics lab at a healthcare company processing ~80,000 patient samples/day. A routine immunoassay (used to quantify a biomarker that gates downstream clinical review) is monitored using daily quality-control (QC) runs. A “fail” triggers a production stop because biased results could lead to incorrect clinical decisions and regulatory exposure.
Yesterday, the assay failed QC: the control material (a standardized sample) produced unusually low readings. The lab team suspects either (a) a true shift in the assay calibration (systemic issue), or (b) a one-off problem like a bad reagent lot or pipetting error (localized issue). You have access to historical QC data and the last two days of runs.
Using hypothesis testing and probability, determine whether the failure is more consistent with a systemic mean shift in the assay or a one-day anomaly, and quantify how surprising the observed QC result is under “business as usual.”
The assay reports a continuous concentration value (units: ng/mL). Historically, the QC control is stable.
| Item | Value |
|---|---|
| Historical period | 60 days |
| QC replicates per day (historical) | 20 |
| Total historical QC replicates | 1,200 |
| Historical mean (μ₀) | 100.0 |
| Historical SD (σ) | 4.0 |
| Yesterday’s QC replicates (n) | 20 |
| Yesterday’s sample mean (\bar{x}) | 96.8 |
| Yesterday’s sample SD (s) | 4.2 |
| Significance level (α) | 0.05 |
Operational detail: yesterday morning the lab switched to a new reagent lot on 6 of the 20 replicates (the other 14 used the old lot). You also have the replicate-level means by lot:
| Lot | Replicates | Mean |
|---|---|---|
| Old lot | 14 | 98.1 |
| New lot | 6 | 93.7 |