PixShare, a consumer photo-sharing platform, receives about 1.8M image uploads per day. The search and discovery team wants an automated tag prediction system that suggests relevant tags (for example: beach, dog, sunset, indoor, food) immediately after upload to improve search recall and downstream recommendations.
You are given a historical training set of user-uploaded images with moderator-approved tags. This is a multi-label image classification problem: each image can have zero, one, or multiple valid tags.
| Dataset Component | Details |
|---|---|
| Images | 420K RGB images, resized to 224x224 for training |
| Labels | 120 possible tags; average 2.7 tags per image |
| Metadata | upload_device, country, hour_of_day, image_width, image_height |
| Class balance | Highly imbalanced; the top 10 tags account for 61% of label assignments, while long-tail tags each appear in <1% of images |
| Data quality | ~6% noisy labels from weak user annotations; ~3% corrupted images removed during preprocessing |
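
To make the multi-label framing and the class imbalance in the table concrete, the sketch below encodes each image's tag set as a multi-hot vector and derives a per-tag positive weight (negatives divided by positives), which is one common way to upweight rare tags in a per-tag binary loss. This is a minimal illustration, not a prescribed approach: the three-tag vocabulary, the sample tag sets, the function names, and the weight cap of 50 are all hypothetical.

```python
from typing import Iterable, List, Sequence


def multi_hot(tags: Iterable[str], vocab: Sequence[str]) -> List[int]:
    """Encode a tag set as a 0/1 vector aligned with `vocab`.

    An image with no tags maps to the all-zero vector, which is valid
    in a multi-label setting.
    """
    index = {tag: i for i, tag in enumerate(vocab)}
    vec = [0] * len(vocab)
    for tag in tags:
        vec[index[tag]] = 1  # raises KeyError for a tag outside the vocabulary
    return vec


def positive_weights(labels: Sequence[Sequence[int]], cap: float = 50.0) -> List[float]:
    """Per-tag weight = (#negatives / #positives), clipped at `cap`.

    The cap (an assumed value) keeps extremely rare tags from
    dominating the loss; tags with no positives receive the cap.
    """
    n_images = len(labels)
    n_tags = len(labels[0])
    weights = []
    for j in range(n_tags):
        pos = sum(row[j] for row in labels)
        neg = n_images - pos
        raw = neg / pos if pos > 0 else cap
        weights.append(min(raw, cap))
    return weights


# Hypothetical toy data; the real vocabulary would have 120 tags.
VOCAB = ["beach", "dog", "sunset"]
SAMPLES = [{"beach", "sunset"}, {"dog"}, set(), {"beach"}]

LABELS = [multi_hot(s, VOCAB) for s in SAMPLES]
WEIGHTS = positive_weights(LABELS)

print(LABELS)   # [[1, 0, 1], [0, 1, 0], [0, 0, 0], [1, 0, 0]]
print(WEIGHTS)  # [1.0, 3.0, 3.0]
```

Weights of this form could be passed to a weighted binary cross-entropy loss, assuming the model emits one independent sigmoid score per tag.
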
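A solution is considered good enough if it ranks common tags well while maintaining usable recall on tail tags. The targets, measured on a held-out test set, are micro-F1 >= 0.68, macro-F1 >= 0.42, and Precision@3 >= 0.78. Micro-F1 pools decisions across all tags and is therefore dominated by the frequent ones, whereas macro-F1 averages per-tag F1 over all 120 tags equally and so reflects tail-tag performance.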
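
The three target metrics can be sketched in plain Python as follows. The conventions here are assumptions, since the problem statement does not define them: a tag with no true or predicted positives contributes an F1 of 0 to the macro average, Precision@3 is the fraction of the three highest-scored tags that are correct, averaged over images, and binary predictions come from a 0.5 threshold. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float]:
    """Return (micro-F1, macro-F1) for multi-hot label matrices."""
    n_tags = len(y_true[0])
    tp, fp, fn = [0] * n_tags, [0] * n_tags, [0] * n_tags
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tp[j] += 1
            elif p:
                fp[j] += 1
            elif t:
                fn[j] += 1
    micro = f1_from_counts(sum(tp), sum(fp), sum(fn))  # pool counts over tags
    macro = sum(f1_from_counts(tp[j], fp[j], fn[j]) for j in range(n_tags)) / n_tags
    return micro, macro


def precision_at_k(
    y_true: Sequence[Sequence[int]], scores: Sequence[Sequence[float]], k: int = 3
) -> float:
    """Mean fraction of the k highest-scored tags that are true tags."""
    total = 0.0
    for t_row, s_row in zip(y_true, scores):
        top = sorted(range(len(s_row)), key=lambda j: s_row[j], reverse=True)[:k]
        total += sum(t_row[j] for j in top) / k
    return total / len(y_true)


# Hypothetical toy example: 2 images, 4 tags.
Y_TRUE = [[1, 1, 0, 0], [0, 0, 1, 0]]
SCORES = [[0.9, 0.2, 0.6, 0.1], [0.3, 0.1, 0.8, 0.7]]
Y_PRED = [[1 if s >= 0.5 else 0 for s in row] for row in SCORES]

MICRO, MACRO = micro_macro_f1(Y_TRUE, Y_PRED)
P_AT_3 = precision_at_k(Y_TRUE, SCORES, k=3)
print(round(MICRO, 4), round(MACRO, 4), round(P_AT_3, 4))  # 0.5714 0.4167 0.5
```

Under this definition of Precision@3, an image with fewer than three true tags cannot score 1.0. With an average of 2.7 tags per image, that ceiling may be worth checking before interpreting the 0.78 target.
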