HomeValue, a residential pricing platform serving 200K monthly property searches, wants a transparent baseline model for estimating sale prices from listing attributes. The pricing team needs a model trained from scratch with gradient descent so they can validate optimization behavior before moving to more complex methods.
You are given a historical housing dataset built from MLS listings and closed sales. Its 18 features fall into four groups:

| Feature Group | Count | Examples |
|---|---|---|
| Numerical property features | 9 | square_feet, lot_size, bedrooms, bathrooms, year_built, hoa_fee |
| Categorical location features | 4 | neighborhood, zip_code, property_type, condition_rating |
| Temporal features | 2 | listing_month, days_on_market |
| Derived market features | 3 | price_per_sqft_neighborhood_avg, school_score, distance_to_downtown |
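
Gradient descent is sensitive to feature scale, and columns such as square_feet and bedrooms differ by orders of magnitude, so some preprocessing is usually needed before training. The following is a minimal sketch, assuming NumPy is available: the helper names (`fit_standardizer`, `standardize`, `one_hot`) are hypothetical and the toy rows are made-up values for illustration, not part of the dataset. It standardizes numeric columns with training-split statistics and one-hot encodes one categorical column.

```python
import numpy as np

def fit_standardizer(X_train):
    """Return per-column mean and std computed on the training split only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0.0] = 1.0  # guard against constant columns
    return mean, std

def standardize(X, mean, std):
    """Scale columns to zero mean and unit variance using the given statistics."""
    return (X - mean) / std

def one_hot(values, categories):
    """Encode labels as 0/1 columns, one per known category.

    Labels missing from `categories` become an all-zero row.
    """
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    for row, v in enumerate(values):
        col = index.get(v)
        if col is not None:
            out[row, col] = 1.0
    return out

# Toy rows (made-up values, for illustration only).
numeric = np.array([[1500.0, 3.0], [2400.0, 4.0], [900.0, 2.0]])  # square_feet, bedrooms
property_type = ["condo", "house", "condo"]

mean, std = fit_standardizer(numeric)
X_num = standardize(numeric, mean, std)
X_cat = one_hot(property_type, categories=["condo", "house"])
X = np.hstack([X_num, X_cat])  # design matrix passed to the optimizer
```

Fitting the statistics on the training split and reusing them for validation and test data keeps information from the held-out set out of the model. High-cardinality columns such as zip_code may need a different encoding, which this sketch does not cover.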

A good solution should achieve RMSE below $42K and MAE below $28K on a held-out test set, with a training loss that converges stably, meaning it decreases without oscillating or diverging. The team also expects a clear explanation of the learning rate choice, the regularization, and the stopping criteria.
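
One way to meet these requirements can be sketched as batch gradient descent on mean squared error with an L2 penalty and patience-based early stopping. This is an illustrative sketch under stated assumptions, not a reference solution: the function names and every hyperparameter value (`lr`, `l2`, `patience`, `tol`) are placeholders, and the data is synthetic and unit-scaled rather than drawn from the housing dataset.

```python
import numpy as np

def train_ridge_gd(X_tr, y_tr, X_val, y_val, lr=0.05, l2=1e-3,
                   max_epochs=5000, patience=20, tol=1e-6):
    """Batch gradient descent on MSE + L2 penalty, early-stopped on validation MSE."""
    n, d = X_tr.shape
    w, b = np.zeros(d), 0.0
    best = (np.inf, w.copy(), b)   # (validation MSE, weights, bias)
    stale = 0                      # epochs since the last validation improvement
    history = []                   # training MSE per epoch, for convergence plots
    for _ in range(max_epochs):
        err = X_tr @ w + b - y_tr
        history.append(float(np.mean(err ** 2)))
        grad_w = 2.0 * X_tr.T @ err / n + 2.0 * l2 * w  # bias is not penalized
        grad_b = 2.0 * err.mean()
        w -= lr * grad_w
        b -= lr * grad_b
        val_mse = float(np.mean((X_val @ w + b - y_val) ** 2))
        if val_mse < best[0] - tol:
            best = (val_mse, w.copy(), b)
            stale = 0
        else:
            stale += 1
            if stale >= patience:  # stopping criterion: no improvement for `patience` epochs
                break
    return best[1], best[2], history

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

# Synthetic stand-in data (assumption: features already standardized).
rng = np.random.default_rng(0)
true_w = np.array([3.0, -2.0, 0.5])
X = rng.normal(size=(300, 3))
y = X @ true_w + 1.0 + rng.normal(scale=0.1, size=300)
X_tr, y_tr = X[:200], y[:200]
X_val, y_val = X[200:250], y[200:250]
X_te, y_te = X[250:], y[250:]

w, b, history = train_ridge_gd(X_tr, y_tr, X_val, y_val)
y_hat = X_te @ w + b
print(f"epochs={len(history)} test RMSE={rmse(y_te, y_hat):.3f} MAE={mae(y_te, y_hat):.3f}")
```

The three knobs map onto the three explanations the team asks for. If `history` rises or oscillates, the learning rate is too large for the feature scaling and can be reduced, for example by halving it. The `l2` strength is best chosen by comparing validation error across a few values. Early stopping returns the weights with the lowest validation MSE, so the test split is used only once, for the final RMSE and MAE.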