Mitigate Bias in Ads CTR Models

Difficulty: Hard
Category: Machine Learning
Topics: Supervised Learning, Cross-Validation, Feature Engineering
Asked at: Google

Problem

Business Context

Google Ads uses click-through-rate (CTR) prediction models to rank and price sponsored results. You are asked to improve a binary classifier that predicts whether an impression will receive a click while reducing measurable bias against underrepresented advertiser and user segments.

Dataset

You are given a historical training set built from ad impression logs over 90 days.

Feature Group                     | Count | Examples
Ad features                       | 12    | campaign_type, creative_format, bid_amount, ad_quality_score
Query/context                     | 10    | query_length, device_type, country, hour_of_day
User behavior aggregates          | 8     | prior_ctr_7d, sessions_30d, conversion_rate_30d
Advertiser/account                | 9     | vertical, account_age_days, spend_tier, region
Sensitive / audit-only attributes | 4     | user_gender, user_age_bucket, advertiser_size_bucket, market_tier

  • Size: 24M ad impressions (rows); 39 model features + 4 audit-only attributes (43 columns in total)
  • Target: clicked (1 if clicked, 0 otherwise)
  • Class balance: 6.1% positive, 93.9% negative
  • Missing data: 9% missing in user aggregates for cold-start users, 4% missing in advertiser metadata, sparse long-tail categories in region and vertical
  • Known issue: training data over-represents large advertisers in Tier-1 markets and Android mobile traffic
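
One common way to counter the segment skew described above is inverse-frequency reweighting, so that each segment contributes equal total weight to the training loss. The following is a minimal sketch, not a prescribed solution: the segment labels (`large_tier1`, `small_tier3`) are hypothetical values assumed to be built from `advertiser_size_bucket` and `market_tier`, and the function name is illustrative.

```python
from collections import Counter

def inverse_frequency_weights(groups):
    """Return one weight per row so that every group carries equal total weight.

    Weights are normalized to a mean of 1.0, so the overall scale of the
    training loss does not change.
    """
    counts = Counter(groups)
    n_rows, n_groups = len(groups), len(counts)
    return [n_rows / (n_groups * counts[g]) for g in groups]

# Toy data (hypothetical labels): the large Tier-1 segment is over-represented 3:1.
segments = ["large_tier1"] * 6 + ["small_tier3"] * 2
weights = inverse_frequency_weights(segments)

# Each over-represented row is down-weighted and each rare row is up-weighted.
per_group_total = Counter()
for seg, w in zip(segments, weights):
    per_group_total[seg] += w
print(dict(per_group_total))
```

The weights would typically be passed to the trainer as per-example sample weights. Because the sensitive attributes are used only to compute weights offline, this approach does not require them at inference.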

Success Criteria

A good solution should improve fairness without causing unacceptable ranking degradation:

  • PR-AUC drop must be less than 2% relative to the current production baseline
  • Worst-group false negative rate gap across audit groups must be reduced by at least 30%
  • Calibration error for each major segment must remain below 0.03
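
The criteria above can be checked with a few small helpers. This is a sketch under stated assumptions: false negative rate requires binary decisions, so the predictions are assumed to be thresholded already; "calibration error" is interpreted here as expected calibration error (ECE) with equal-width bins, which the problem statement does not specify; and the 30% reduction is assumed to be relative to the baseline gap. All numbers are toy values.

```python
from collections import defaultdict

def fnr_by_group(y_true, y_pred, groups):
    """False negative rate per group: missed clicks / actual clicks."""
    positives, misses = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            positives[g] += 1
            if p == 0:
                misses[g] += 1
    # Groups without any positives are skipped: FNR is undefined for them.
    return {g: misses[g] / n for g, n in positives.items()}

def fnr_gap(rates):
    """Gap between the worst and best group FNR."""
    return max(rates.values()) - min(rates.values())

def relative_change(baseline, candidate):
    """Fractional reduction from baseline to candidate."""
    return (baseline - candidate) / baseline

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted mean of |mean predicted probability - observed click rate| per bin."""
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, y_prob):
        bins[min(int(p * n_bins), n_bins - 1)].append((t, p))
    total = len(y_true)
    ece = 0.0
    for b in bins:
        if b:
            mean_prob = sum(p for _, p in b) / len(b)
            click_rate = sum(t for t, _ in b) / len(b)
            ece += len(b) / total * abs(mean_prob - click_rate)
    return ece

# Toy audit data: group A misses 1 of 4 clicks, group B misses 1 of 2.
y_true = [1, 1, 1, 1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0]
groups = ["A"] * 5 + ["B"] * 4
rates = fnr_by_group(y_true, y_pred, groups)
baseline_gap = fnr_gap(rates)

# Hypothetical post-mitigation gap and PR-AUC values, for illustration only.
gap_ok = relative_change(baseline_gap, 0.15) >= 0.30
pr_auc_ok = relative_change(0.200, 0.197) < 0.02
print(rates, baseline_gap, gap_ok, pr_auc_ok)
```

In practice `expected_calibration_error` would be computed separately for each major segment and compared against the 0.03 limit.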

Constraints

  • Batch retraining on Vertex AI once per day; online inference latency under 20 ms at p95
  • Sensitive attributes may be used for offline auditing and bias mitigation analysis, but not served directly at inference
  • The model must support explainability for policy and ads quality reviews
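
The audit-only constraint can be enforced mechanically by separating sensitive columns from model features before training and by validating the serving schema. The sketch below uses the four audit-only attribute names from the dataset table; the helper names `split_row` and `assert_serving_schema_clean` are hypothetical.

```python
AUDIT_ONLY = frozenset(
    {"user_gender", "user_age_bucket", "advertiser_size_bucket", "market_tier"}
)

def split_row(row):
    """Split one logged row into (model features, audit-only attributes)."""
    features = {k: v for k, v in row.items() if k not in AUDIT_ONLY}
    audit = {k: v for k, v in row.items() if k in AUDIT_ONLY}
    return features, audit

def assert_serving_schema_clean(serving_columns):
    """Raise if any audit-only attribute appears in the serving schema."""
    leaked = AUDIT_ONLY.intersection(serving_columns)
    if leaked:
        raise ValueError(f"audit-only attributes in serving schema: {sorted(leaked)}")

# Toy row: the audit attribute is kept for offline evaluation only.
row = {"bid_amount": 1.2, "device_type": "android", "user_gender": "f"}
features, audit = split_row(row)
assert_serving_schema_clean(features.keys())
print(features, audit)
```

A check like this does not address proxy features: model inputs such as `spend_tier` or `region` may still correlate with the audit attributes, so per-group auditing remains necessary.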

Deliverables

  1. Define what bias means in this CTR system and how you would measure it.
  2. Build a training pipeline that handles imbalance, missingness, and segment skew.
  3. Compare at least two mitigation strategies (for example: reweighting, thresholding, constrained training, or post-hoc calibration).
  4. Report overall and per-group performance on a held-out time-based test set.
  5. Recommend a production rollout and monitoring plan in Google Cloud.
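
For the held-out time-based test set in deliverable 4, rows are split by time rather than at random, so the model is always evaluated on impressions logged after its training data. A minimal sketch, assuming each row carries a hypothetical integer `day` field (0 to 89 for the 90-day window) and an illustrative 70/10/10-day split:

```python
def time_based_split(rows, day_key="day", train_end=70, valid_end=80):
    """Split rows chronologically into train, validation, and test sets.

    Days before train_end go to train, days in [train_end, valid_end) go to
    validation, and the remaining days go to test.
    """
    train = [r for r in rows if r[day_key] < train_end]
    valid = [r for r in rows if train_end <= r[day_key] < valid_end]
    test = [r for r in rows if r[day_key] >= valid_end]
    return train, valid, test

# Toy log: one impression per day over the 90-day window.
rows = [{"day": d, "clicked": int(d % 16 == 0)} for d in range(90)]
train, valid, test = time_based_split(rows)
print(len(train), len(valid), len(test))
```

User behavior aggregates such as `prior_ctr_7d` should be computed only from data preceding each impression, otherwise the chronological split still leaks future information.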
