Business Context
You’re interviewing for an ML role on the Pricing team at RideNow, a ride-hailing marketplace operating in 35 North American cities with ~4.5M weekly active riders and ~900K active drivers. Pricing is currently set by a rules-based “surge” system using coarse geofences and simple demand/supply ratios. Leadership believes the system leaves 3–6% of weekly gross bookings on the table: it underprices in some micro-areas during events (leading to long ETAs and cancellations) and overprices in others (hurting conversion).
Your task is to design a model that optimizes price multipliers based on local demand at a fine spatiotemporal resolution, while respecting marketplace constraints (driver supply, rider experience, and fairness).
Dataset
You are given 12 months of historical marketplace logs aggregated to (city, zone_id, 15-minute bucket).
| Feature Group | Examples | Notes |
|---|---|---|
| Spatiotemporal | city, zone_id, day_of_week, hour, holiday_flag | zone_id is a hex grid (~500–2,000 zones per city) |
| Demand signals | ride_requests, unique_requesters, search_sessions, airport_queue_views | Some are leading indicators |
| Supply signals | available_drivers, driver_accept_rate, driver_eta_p50 | Supply is endogenous to price |
| Marketplace outcomes | completed_rides, cancellations, rider_eta_p50 | Outcomes depend on both demand & supply |
| Price & incentives | current_multiplier, driver_bonus_per_trip | current_multiplier is the policy that generated the data |
| Context | weather_temp, precipitation, event_score, traffic_index | event_score from a third-party feed |
Target decision: choose a price multiplier m for each (zone, time) for the next 15 minutes.
Success Criteria
You will be evaluated offline and via an online A/B test plan:
- Offline objective: maximize expected gross profit per 15-min bucket (revenue − driver incentives − support costs from cancellations).
- Guardrails (must not regress):
  - Rider conversion rate drop ≤ 0.5% relative
  - Cancellation rate increase ≤ 0.2 pp
  - ETA p90 increase ≤ 30 seconds
- Business target: demonstrate an estimated +1.5% to +3.0% lift in gross profit in backtests on holdout months.
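The profit definition and guardrail thresholds above can be expressed as a small check, sketched below. This is a minimal illustration rather than RideNow's evaluation code: `ArmMetrics`, `gross_profit`, and `guardrail_violations` are hypothetical names, rates are assumed to be fractions in [0, 1], and ETA is assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArmMetrics:
    """Aggregate metrics for one experiment arm (hypothetical container)."""
    conversion_rate: float    # fraction of sessions that book, in [0, 1]
    cancellation_rate: float  # fraction of requests cancelled, in [0, 1]
    eta_p90_seconds: float    # 90th-percentile rider ETA, in seconds


def gross_profit(revenue: float, driver_incentives: float,
                 cancellation_support_cost: float) -> float:
    """Gross profit per bucket: revenue - driver incentives - support costs."""
    return revenue - driver_incentives - cancellation_support_cost


def guardrail_violations(control: ArmMetrics, treatment: ArmMetrics) -> List[str]:
    """Return the names of guardrails that the treatment arm regresses."""
    violations = []

    # Conversion: relative drop must be <= 0.5% (assumes control rate > 0).
    relative_drop = (control.conversion_rate - treatment.conversion_rate) / control.conversion_rate
    if relative_drop > 0.005:
        violations.append("conversion_rate")

    # Cancellations: absolute increase must be <= 0.2 percentage points.
    increase_pp = (treatment.cancellation_rate - control.cancellation_rate) * 100
    if increase_pp > 0.2:
        violations.append("cancellation_rate")

    # ETA p90: increase must be <= 30 seconds.
    if treatment.eta_p90_seconds - control.eta_p90_seconds > 30:
        violations.append("eta_p90")

    return violations
```

A real test would compare confidence intervals against these thresholds rather than point estimates; the sketch only fixes the units and the direction of each comparison.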
Constraints
- Latency: real-time inference under 50 ms p95 per zone request; batch scoring for all zones every 5 minutes is acceptable.
- Interpretability: pricing ops needs to understand major drivers (events/weather/commute patterns) and have override controls.
- Data leakage: must avoid using post-decision outcomes (e.g., completed_rides) when predicting next bucket.
- Policy bias: historical data was generated by the current surge algorithm; the model must handle counterfactual pricing concerns.
- Safety/fairness: avoid systematically higher multipliers in protected neighborhoods without demand justification; require monitoring by neighborhood clusters.
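One way to respect the data-leakage constraint is to expose outcome columns only as lagged features, so that a row for bucket t never carries an outcome measured in bucket t. The sketch below is an illustration under stated assumptions: `add_lagged_feature` is a hypothetical helper, `bucket` is assumed to be an integer index of consecutive 15-minute buckets, and outcomes from the previous bucket are assumed to be fully logged by decision time.

```python
from typing import Any, Dict, List


def add_lagged_feature(rows: List[Dict[str, Any]], column: str,
                       lag: int = 1) -> List[Dict[str, Any]]:
    """Attach `column` from `lag` buckets earlier in the same (city, zone_id).

    The lagged value is None when the earlier bucket is missing, for example
    at the start of the history or for a newly created zone.
    """
    # Index every observed value by its (city, zone, bucket) key.
    value_at = {(r["city"], r["zone_id"], r["bucket"]): r[column] for r in rows}

    lagged_name = f"{column}_lag{lag}"
    result = []
    for r in rows:
        new_row = dict(r)  # copy so the input rows are left untouched
        earlier_key = (r["city"], r["zone_id"], r["bucket"] - lag)
        new_row[lagged_name] = value_at.get(earlier_key)
        result.append(new_row)
    return result
```

After adding the lagged column, the same-bucket outcome (here `completed_rides`) would be dropped from the feature set and kept only as a label or evaluation quantity.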
Deliverables
- Formulate the problem mathematically: what do you predict (demand, elasticity, or profit) and how does that map to an optimal multiplier?
- Propose a modeling approach that can learn local price-response from logged data (and explain assumptions).
- Describe feature engineering for spatiotemporal demand, including cold-start zones and event spikes.
- Provide an evaluation plan: offline metrics, backtesting protocol, and an online A/B test design with guardrails.
- Outline a production deployment: data pipeline, inference architecture, retraining cadence, and monitoring (drift, fairness, and KPI regressions).
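For the first deliverable, one possible mapping from a predicted price response to a multiplier is a grid search over candidate multipliers, sketched below. Everything here is an assumption made for illustration and not a prescribed answer: the constant-elasticity demand curve, the hard supply cap, and the parameter names `base_fare`, `take_rate`, and `supply_cap` are all hypothetical, and support costs from cancellations are omitted for brevity.

```python
from typing import Sequence


def expected_profit(multiplier: float, base_demand: float, elasticity: float,
                    base_fare: float, take_rate: float,
                    bonus_per_trip: float, supply_cap: float) -> float:
    """Expected profit for one (zone, bucket) at a given multiplier.

    Assumes demand(m) = base_demand * m ** (-elasticity), and that completed
    rides cannot exceed the number of trips available drivers can serve.
    """
    demand = base_demand * multiplier ** (-elasticity)
    rides = min(demand, supply_cap)
    margin_per_ride = multiplier * base_fare * take_rate - bonus_per_trip
    return rides * margin_per_ride


def best_multiplier(candidates: Sequence[float], **zone_params: float) -> float:
    """Pick the candidate multiplier with the highest expected profit."""
    return max(candidates, key=lambda m: expected_profit(m, **zone_params))


# Illustrative inputs only; none of these numbers come from RideNow data.
choice = best_multiplier(
    [1.0, 1.25, 1.5, 1.75, 2.0],
    base_demand=100.0, elasticity=1.5, base_fare=10.0,
    take_rate=0.25, bonus_per_trip=0.5, supply_cap=60.0,
)
```

A small discrete grid keeps the decision easy to audit and to cap for pricing ops overrides. The guardrails and fairness checks would still need to be applied as constraints on the candidate set, and the elasticity estimate itself inherits the policy-bias concern noted under Constraints.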