Business Context
NovaBank is hiring an ML engineer to support fraud, credit risk, and personalization models across products serving millions of users. Before discussing advanced systems, the interview focuses on whether you can clearly explain the deep learning foundations needed to train and debug neural networks in production.
Dataset
Use a realistic supervised learning dataset for demonstration: a customer event classification table built from mobile banking sessions. The goal is not to build the most complex model, but to explain core deep learning concepts through a practical binary classification task.
| Feature Group | Count | Examples |
|---|---|---|
| Numerical behavior features | 18 | session_length_sec, transactions_7d, avg_amount, failed_logins_30d |
| Categorical account features | 9 | device_type, region, account_tier, acquisition_channel |
| Temporal features | 6 | hour_of_day, day_of_week, days_since_last_login |
| Engineered ratios | 5 | failed_login_rate, txn_per_session, amount_volatility |
- Size: 120K sessions, 38 input features
- Target: Binary label indicating whether a session ends in a high-risk event
- Class balance: 11% positive, 89% negative
- Missing data: 8% in behavioral features, 3% in categorical fields
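The class balance above matters when choosing metrics. As a minimal sketch in plain Python (the helper name `confusion_metrics` and the "always predict negative" model are illustrative assumptions, not part of the task), the counts implied by 120K sessions at 11% positive show why accuracy alone is a weak yardstick:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

n_sessions = 120_000
n_pos = round(0.11 * n_sessions)  # sessions ending in a high-risk event
n_neg = n_sessions - n_pos

# Hypothetical degenerate model: predicts "negative" for every session.
acc, prec, rec = confusion_metrics(tp=0, fp=0, fn=n_pos, tn=n_neg)

# One common (assumed) way to weight the positive class in the loss.
pos_weight = n_neg / n_pos

print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
print(f"positive-class weight ~ {pos_weight:.2f}")
```

Under these assumptions the degenerate model reaches 89% accuracy while catching no high-risk sessions. Precision, recall, or PR-AUC are therefore more informative than accuracy for this task.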
Success Criteria
A strong answer should demonstrate that you can explain the following clearly and correctly:
- forward propagation and how layers transform inputs
- activation functions and why nonlinearity matters
- loss functions for classification
- backpropagation and gradient descent
- overfitting, regularization, and validation strategy
- how to interpret training vs validation curves
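Several of the concepts above fit in one short example. The following is a sketch only, not a production model: a single hidden ReLU layer with a sigmoid output, binary cross-entropy loss, and one gradient descent step on a single made-up example. The layer sizes, learning rate, input values, and function names are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Forward propagation: linear layer -> ReLU -> linear layer -> sigmoid."""
    W1, b1, w2, b2 = params
    z1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h = [max(0.0, z) for z in z1]  # ReLU supplies the nonlinearity
    p = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)
    return z1, h, p

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one example, clipped for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def backward(x, y, params):
    """Backpropagation: apply the chain rule from the loss back to each parameter."""
    W1, b1, w2, b2 = params
    z1, h, p = forward(x, params)
    dz2 = p - y  # gradient of BCE w.r.t. the output logit (sigmoid + BCE)
    dw2 = [dz2 * hi for hi in h]
    db2 = dz2
    # ReLU passes gradient only where its input was positive.
    dz1 = [dz2 * w * (1.0 if z > 0 else 0.0) for w, z in zip(w2, z1)]
    dW1 = [[d * xi for xi in x] for d in dz1]
    db1 = dz1
    return dW1, db1, dw2, db2

def sgd_step(params, grads, lr):
    """Gradient descent: move each parameter against its gradient."""
    W1, b1, w2, b2 = params
    dW1, db1, dw2, db2 = grads
    W1 = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W1, dW1)]
    b1 = [b - lr * g for b, g in zip(b1, db1)]
    w2 = [w - lr * g for w, g in zip(w2, dw2)]
    return W1, b1, w2, b2 - lr * db2

rng = random.Random(0)
n_in, n_hidden = 4, 3  # tiny sizes for readability; the real table has 38 inputs
params = (
    [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
    [0.1] * n_hidden,
    [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
    0.0,
)
x, y = [0.2, -1.0, 0.5, 1.5], 1  # one made-up, already-scaled positive example

loss_before = bce(forward(x, params)[2], y)
params_after = sgd_step(params, backward(x, y, params), lr=0.1)
loss_after = bce(forward(x, params_after)[2], y)
print(f"loss before: {loss_before:.4f}, after one step: {loss_after:.4f}")
```

The `dz2 = p - y` line is the part worth explaining to stakeholders: the error signal at the output is simply the predicted probability minus the label, and every earlier gradient is that signal passed backward through the layers.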
Constraints
- Explanations must be technically correct but understandable to non-research stakeholders
- The baseline model should train in minutes on a laptop CPU
- The solution should be simple enough to support debugging and feature inspection
Deliverables
- Train a baseline neural network for binary classification.
- Explain each core deep learning concept using this model as the example.
- Compare the neural network against a logistic regression baseline.
- Show how regularization changes generalization performance.
- Report evaluation metrics and discuss common failure modes in training.
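For the regularization and failure-mode deliverables, one simple regularizer that can be read directly off the training and validation curves is early stopping. The sketch below assumes a hypothetical helper `best_epoch` and uses made-up loss values purely for illustration. They are not results from this dataset.

```python
def best_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best epoch index, epoch index where training would stop).

    Training stops once validation loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_idx, best = 0, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best_idx, best = i, loss
        elif i - best_idx >= patience:
            return best_idx, i
    return best_idx, len(val_losses) - 1

# Made-up curves: training loss keeps falling while validation loss turns upward.
train_loss = [0.60, 0.45, 0.38, 0.33, 0.29, 0.26, 0.23, 0.21]
val_loss = [0.62, 0.48, 0.42, 0.40, 0.41, 0.43, 0.46, 0.50]

best, stopped = best_epoch(val_loss, patience=3)
gap_at_stop = val_loss[stopped] - train_loss[stopped]
print(f"best epoch: {best}, stopped at: {stopped}, gap at stop: {gap_at_stop:.2f}")
```

In these illustrative curves, validation loss bottoms out at epoch 3 while training loss keeps improving. That widening gap is the usual signature of overfitting. Restoring the weights from the best epoch, or adding L2 weight decay or dropout, are common ways to narrow it.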