Diagnose Bias-Variance in Ads CTR

Medium
Machine Learning
Cross-Validation, Hyperparameter Tuning, Bias-Variance Tradeoff

Problem

Business Context

Google Ads wants to predict whether a search ad impression will receive a click so bidding and ranking systems can use calibrated click-through-rate estimates. You are given an offline training dataset and asked to evaluate whether the current model suffers more from high bias or high variance, then recommend changes.

Dataset

Feature Group | Count | Examples
Query and ad text features | 12 | query_length, ad_title_length, keyword_match_type, semantic_similarity_score
Auction context | 9 | device_type, country, hour_of_day, ad_position, page_type
Historical performance | 8 | advertiser_ctr_7d, campaign_ctr_30d, quality_score, conversion_rate_30d
User and session signals | 6 | returning_user, prior_searches_24h, signed_in, browser_family
Engineered interaction features | 5 | query_x_device, position_x_quality_score, hour_bucket
  • Size: 2.4M ad impressions, 40 features
  • Target: Binary label indicating whether the impression was clicked
  • Class balance: 11.6% clicked, 88.4% not clicked
  • Missing data: ~7% missing in historical advertiser features for new campaigns; <2% missing in some session fields
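
The class balance above gives a reference point for the log loss target: a model that always predicts the base click rate of 11.6% scores the entropy of the label distribution. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the problem statement.

```python
import math

def constant_predictor_log_loss(positive_rate: float) -> float:
    """Log loss of always predicting `positive_rate` on data with that click rate."""
    p = positive_rate
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

# 11.6% clicked, as stated in the dataset description
baseline = constant_predictor_log_loss(0.116)
print(round(baseline, 4))  # about 0.3589
```

Against this reference, a log loss of 0.31 is an improvement of roughly 0.05 nats per impression over predicting the base rate for every impression.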

Success Criteria

A strong solution should:

  • correctly diagnose bias vs variance using train/validation/test behavior,
  • compare at least two model families or complexity settings,
  • use cross-validation and learning curves rather than a single split,
  • recommend concrete actions that improve generalization,
  • achieve log loss < 0.31 and AUC-ROC > 0.78 on the held-out test set.
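
One way to turn the first and third criteria into a repeatable check is to average per-fold training and validation log loss and compare the level and the gap. The sketch below is a heuristic under stated assumptions: the function name, the `gap_tol` threshold of 0.02, and the per-fold numbers in the usage lines are hypothetical and chosen only for illustration, not taken from the dataset.

```python
from statistics import mean

def diagnose(train_losses, val_losses, target_loss=0.31, gap_tol=0.02):
    """Heuristic read of per-fold log losses (thresholds are assumptions)."""
    train = mean(train_losses)
    val = mean(val_losses)
    # A large train/validation gap suggests the model memorizes the training folds.
    high_variance = (val - train) > gap_tol
    # Missing the target even on seen data suggests the model is too simple.
    high_bias = train > target_loss
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_variance:
        return "high variance"
    if high_bias:
        return "high bias"
    return "balanced"

# Hypothetical per-fold losses, for illustration only
print(diagnose([0.352, 0.350, 0.351], [0.355, 0.353, 0.356]))  # high bias
print(diagnose([0.21, 0.22, 0.20], [0.33, 0.34, 0.32]))        # high variance
```

A learning curve extends the same comparison across training-set sizes: two curves that converge at a loss above the target point toward bias, while a gap that persists or narrows slowly as data grows points toward variance.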

Constraints

  • Inference latency must stay under 15 ms at p95 in a Google Ads batch scoring service.
  • Model should be explainable enough to justify major feature or regularization changes.
  • Retraining happens daily, so tuning must be computationally practical.
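
The latency constraint is a tail constraint, so an average is not enough to check it. The sketch below computes a nearest-rank 95th percentile from measured per-request latencies; the helper names and the sample timings are hypothetical, and a real service may define p95 with a different interpolation rule.

```python
def p95_ms(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies in ms."""
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) computed in integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def within_budget(latencies_ms, budget_ms=15.0):
    """True when the p95 latency is strictly below the budget."""
    return p95_ms(latencies_ms) < budget_ms

# Hypothetical timings: 94 fast requests and 6 slow ones
samples = [8.0] * 94 + [40.0] * 6
print(sum(samples) / len(samples))  # mean is 9.92 ms
print(p95_ms(samples))              # 40.0
print(within_budget(samples))       # False
```

In this illustration the mean sits well under 15 ms while the p95 does not, which is why a higher-capacity model should be timed on its tail latency before it is compared on accuracy.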

Deliverables

  1. Build a baseline and at least one higher-capacity model.
  2. Use training/validation curves to determine whether errors are caused by underfitting or overfitting.
  3. Quantify the impact of regularization, feature engineering, and model complexity.
  4. Report final offline metrics and explain the chosen operating point.
  5. Recommend production changes for improving the bias-variance trade-off.
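
For the offline metrics in deliverable 4, the two quantities named in the success criteria can be computed directly from labels and predicted probabilities. The following is a small dependency-free sketch, assuming binary 0/1 labels; the function names are illustrative, and the pairwise AUC loop is quadratic, so at 2.4M impressions a rank-based or library implementation would be the practical choice.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def auc_roc(y_true, y_prob):
    """Chance that a random positive outscores a random negative (ties count 1/2)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def meets_targets(y_true, y_prob, max_log_loss=0.31, min_auc=0.78):
    """Check the two held-out thresholds from the success criteria."""
    return log_loss(y_true, y_prob) < max_log_loss and auc_roc(y_true, y_prob) > min_auc

# Toy labels and scores, for illustration only
y = [1, 0, 0, 1, 0]
p = [0.9, 0.2, 0.4, 0.35, 0.1]
print(auc_roc(y, p))  # 5 of 6 positive/negative pairs are ordered correctly
```

Log loss is sensitive to calibration while AUC-ROC depends only on ranking, so reporting both shows whether a change in regularization or capacity improved the probability estimates, the ordering, or both.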
