Assess Graph ML Role Readiness

Business Context

NovaGraph is hiring an ML engineer for a graph-focused recommendation and risk modeling team. As part of the interview, the hiring panel wants a practical screening model that predicts whether a candidate demonstrates the core technical competencies required for Graph ML work.

Dataset

You are given a structured hiring dataset built from 18 months of interview loops and take-home evaluations. Each row represents one candidate. The target is whether the candidate was rated "Graph-ML ready" by the final hiring committee.

Feature Group	Count	Examples
ML fundamentals	8	classification_score, regression_score, bias_variance_score, model_eval_score
Graph concepts	7	graph_algorithms_score, message_passing_score, link_prediction_score, node_classification_score
Engineering signals	6	python_score, system_design_score, feature_pipeline_score, deployment_score
Experience & background	5	years_experience, prior_graph_project_count, degree_level, domain_focus
Interview process metadata	4	interview_stage_count, referral_flag, takehome_completed, panel_variance

Size: 12,400 candidates, 30 features
Target: Binary — Graph-ML ready (1) vs not ready (0)
Class balance: 28% positive, 72% negative
Missing data: 12% missing in prior project history, 6% missing in take-home-derived rubric features, and sparse missingness in some interviewer scores
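
The class balance and missingness figures above point to two common preprocessing choices: balanced class weights, and median imputation paired with a "was missing" flag. The following is a minimal sketch in plain Python, not a prescribed solution. The helper names balanced_class_weights and impute_with_indicator are hypothetical, and the sample values are made up for illustration.

```python
from statistics import median


def balanced_class_weights(positive_rate):
    """Weights inversely proportional to class frequency: n / (2 * n_class)."""
    return {
        1: 1.0 / (2.0 * positive_rate),
        0: 1.0 / (2.0 * (1.0 - positive_rate)),
    }


def impute_with_indicator(values):
    """Fill None with the median of observed values; also return a 0/1 missing flag."""
    observed = [v for v in values if v is not None]
    fill = median(observed)  # in practice, fit this on the training split only
    filled = [fill if v is None else v for v in values]
    missing = [1 if v is None else 0 for v in values]
    return filled, missing


# 28% positive class, as stated in the dataset description
weights = balanced_class_weights(0.28)

# Illustrative (made-up) values for prior_graph_project_count
filled, missing = impute_with_indicator([2, None, 0, 5, None, 1])
```

Keeping the missing flag as its own feature lets the model learn whether missing project history is itself informative, rather than hiding that signal behind the imputed value.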

Success Criteria

A strong solution should identify qualified candidates with ROC-AUC ≥ 0.84, F1 ≥ 0.70, and recall ≥ 0.75 on the positive class. The model should also provide interpretable evidence about which competencies matter most.
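
One way to reconcile the F1 and recall targets is to choose the decision threshold that maximizes F1 among thresholds that still meet the recall floor. The sketch below assumes this selection rule, which the problem statement does not prescribe. The function names and the toy labels and scores are illustrative only, and the threshold would normally be selected on a validation split rather than the test set.

```python
def precision_recall_f1(y_true, y_score, threshold):
    """Compute precision, recall and F1 for the positive class at one threshold."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def pick_threshold(y_true, y_score, min_recall=0.75):
    """Return (threshold, precision, recall, f1) with the best F1 given recall >= min_recall."""
    best = None
    for t in sorted(set(y_score)):
        p, r, f1 = precision_recall_f1(y_true, y_score, t)
        if r >= min_recall and (best is None or f1 > best[3]):
            best = (t, p, r, f1)
    return best


# Toy validation labels and model scores (made up for illustration)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]
best = pick_threshold(y_true, y_score)
```

If no threshold satisfies the recall floor, pick_threshold returns None, which is a useful signal that the model itself, not the threshold, needs work.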

Constraints

The recruiting team needs interpretable outputs for hiring calibration
Batch inference on new applicants should complete in under 200 ms per candidate
The model should be retrained quarterly as hiring rubrics evolve
Avoid leakage from final committee notes or post-decision artifacts
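
The leakage constraint can be enforced mechanically before training by separating out any column that records information produced after the committee decision. The sketch below assumes such columns can be recognized by name prefix. The prefixes and the two leaky column names are hypothetical, since the problem statement does not list the actual post-decision fields.

```python
# Hypothetical prefixes marking post-decision artifacts; the real column
# names are not given in the problem statement.
POST_DECISION_PREFIXES = ("committee_", "offer_", "post_decision_")


def split_leaky_columns(columns, prefixes=POST_DECISION_PREFIXES):
    """Partition column names into (safe, leaky) based on name prefix."""
    safe, leaky = [], []
    for name in columns:
        if name.startswith(prefixes):  # str.startswith accepts a tuple of prefixes
            leaky.append(name)
        else:
            safe.append(name)
    return safe, leaky


columns = [
    "python_score",
    "message_passing_score",
    "committee_notes_sentiment",  # hypothetical leaky column
    "offer_extended_flag",        # hypothetical leaky column
    "panel_variance",
]
safe, leaky = split_leaky_columns(columns)
```

A name-based filter is only a first pass. Features such as interview_stage_count may also encode the outcome indirectly and deserve a manual review of when each value is recorded.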

Deliverables

Build a binary classification model to predict Graph-ML readiness
Explain which technical competencies are most predictive
Design preprocessing for mixed numeric/categorical data with missing values
Evaluate the model with appropriate classification metrics and threshold selection
Recommend how the model should be used in production without replacing human judgment
