Task
You are given rider signup data and Lyft ride activity. Write a SQL query to compare month-1 retention across monthly signup cohorts. Define a rider's cohort as the month of their signup_date, and define a retained rider as someone who completed at least one ride in the calendar month immediately after their signup month. Return one row per cohort month with the cohort size, retained riders, and retention rate, ordered by cohort month.
Use only completed rides for retention. Riders with no rides should still be included in the cohort size.
Schema
riders
| column | type | description |
|---|
| rider_id | INT | Unique rider ID |
| rider_name | VARCHAR(100) | Rider name |
| signup_date | DATE | Date the rider signed up |
| city | VARCHAR(50) | Primary city |
rides
| column | type | description |
|---|
| ride_id | INT | Unique ride ID |
| rider_id | INT | Rider who requested the ride |
| ride_date | DATE | Ride date |
| status | VARCHAR(20) | Ride status |
| fare_amount | NUMERIC(8,2) | Fare amount |
Sample data
riders
| rider_id | rider_name | signup_date | city |
|---|
| 101 | Maya Chen | 2024-01-05 | San Francisco |
| 102 | Jordan Lee | 2024-01-20 | Los Angeles |
| 103 | Priya Patel | 2024-02-02 | Chicago |
rides
| ride_id | rider_id | ride_date | status | fare_amount |
|---|
| 1001 | 101 | 2024-02-10 | completed | 18.50 |
| 1002 | 101 | 2024-03-01 | completed | 22.00 |
| 1003 | 102 | 2024-02-15 | canceled | 0.00 |
Expected output
| cohort_month | cohort_size | retained_riders | retention_rate |
|---|
| 2024-01 | 3 | 2 | 0.6667 |
| 2024-02 | 3 | 2 | 0.6667 |
| 2024-03 | 3 | 1 | 0.3333 |