Backend Coding and Low-Level Design
Backend implementation skill matters because Atlassian ships ML through services that must be correct, maintainable, and observable. Interviewers evaluate how you translate ambiguous requirements into clean APIs, choose appropriate data structures, and test incrementally. Strong performance looks like thoughtful decomposition, clear invariants, and production-quality code.
Be ready to go over:
- Object-oriented design and data structures – Maps/sets/queues, graphs, heaps, and when to choose each.
- Concurrency and rate control – Designing thread-safe components and rate limiters.
- API ergonomics and testability – Input validation, edge cases, and unit/integration tests.
Advanced concepts (less common):
- Efficient caching strategies, eviction policies, and consistency trade-offs.
- Backpressure, circuit breakers, and resilience patterns.
- Log-structured storage choices and read/write amplification considerations.
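
To make the caching and eviction topic above concrete, here is a minimal sketch of a least-recently-used (LRU) cache. LRU is only one of several eviction policies you might be asked about, and the class name `LRUCache` and its `get`/`put` methods are illustrative choices rather than anything prescribed by the interview.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        # Insertion order doubles as recency order: oldest first, newest last.
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

This version is not thread-safe and returns `None` on a miss, so it cannot distinguish a missing key from a stored `None`. Both are reasonable points to raise as follow-ups.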
Example questions or scenarios:
- “Implement a simplified parking system with entry/exit, capacity, and pricing extensions.”
- “Build a Snakes and Ladders game engine that supports variable board configurations.”
- “Design and code a token bucket rate limiter with burst handling and fairness.”
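
As one possible starting point for the rate limiter question above, the following sketch implements a per-key token bucket. It assumes that "fairness" means each caller gets its own bucket, which is only one interpretation. The class name `TokenBucketLimiter`, the `allow` method, and the injectable `clock` parameter are hypothetical names chosen for illustration.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-key token bucket: bursts up to `capacity`, refilled at `rate` tokens/second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self._rate = rate
        self._capacity = capacity
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        # key -> (tokens remaining, time of last refill)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str, cost: float = 1.0) -> bool:
        """Return True and consume `cost` tokens if the key has enough budget."""
        now = self._clock()
        with self._lock:
            # A key seen for the first time starts with a full bucket.
            tokens, last = self._buckets.get(key, (self._capacity, now))
            # Lazy refill: credit tokens for the elapsed time, capped at capacity.
            tokens = min(self._capacity, tokens + (now - last) * self._rate)
            allowed = tokens >= cost
            if allowed:
                tokens -= cost
            self._buckets[key] = (tokens, now)
            return allowed
```

A single lock keeps the example short. Under heavy contention you might discuss per-key locks or sharding, and in a long-running service you would also need to evict idle keys.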
ML System Design
ML system design assesses your ability to deliver end-to-end impact for Jira/Confluence use cases. You will frame the problem, define success metrics, architect data and model pipelines, plan experimentation, and think about rollout, safety, and observability. Strong performance ties model choices to user experience, cost, latency, and reliability.
Be ready to go over:
- Problem framing and metrics – Choosing a task definition and guardrail metrics aligned with product goals.
- Data architecture – Batch/stream ingestion, feature stores, labeling, and feedback loops.
- Deployment and monitoring – Canary, progressive rollout, drift detection, bias/safety checks.
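
For the drift detection item above, one commonly used signal is the Population Stability Index (PSI), which compares the binned distribution of a feature in a reference sample against a recent sample. The sketch below is an assumption about how you might compute it, not a method the guide prescribes. The function name, equal-width binning, and the `1e-6` floor for empty bins are illustrative choices.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample (`expected`) and a recent sample (`actual`)."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference sample

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each fraction so that empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical samples give a PSI of zero, and larger values indicate a larger shift. Alert thresholds are a judgment call and should be tuned per feature rather than taken from a rule of thumb.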
Advanced concepts (less common):
- Retrieval-augmented generation, grounding, and prompt/adapter strategies for LLMs.
- Human-in-the-loop labeling, active learning, and offline/online metric alignment.
- Guardrails for responsible AI: output moderation, PII handling, and evaluation harnesses.
Example questions or scenarios:
- “Design a Confluence Q&A assistant that answers from internal pages with citations.”
- “Build Jira issue recommendations (assignees or related issues) with online feedback.”
- “Architect an LLM inference service that balances latency, cost, and quality across tenants.”
Applied ML Craft and Project Deep Dive
You will walk through a prior project to demonstrate practical ML judgment, experimentation discipline, and impact. Interviewers probe the reasoning behind your choices, how you handled data issues, what baselines you used, and how you measured success. Strong performance shows clear hypotheses, sensible baselines, and sharp error analysis that leads to measurable gains.
Be ready to go over:
- Baseline-first thinking – Simple heuristics vs. complex models and why.
- Experimentation – A/B testing, power analysis, and confidence in results.
- Error analysis – Segment performance, fairness, and iteration prioritization.
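
To ground the power analysis item above, here is a sketch of a per-arm sample size calculation for an A/B test on a conversion rate. It assumes a two-sided two-proportion z-test with equal allocation, which is only one of several valid setups. The function name and the default `alpha` and `power` values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(
    p_control: float, p_treatment: float, alpha: float = 0.05, power: float = 0.8
) -> int:
    """Approximate users needed per arm to detect p_control -> p_treatment."""
    if p_control == p_treatment:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2
    # Standard deviation terms under the null (pooled) and the alternative.
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = math.sqrt(
        p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    )
    numerator = (z_alpha * sd_null + z_beta * sd_alt) ** 2
    return math.ceil(numerator / (p_treatment - p_control) ** 2)
```

For example, `sample_size_per_arm(0.10, 0.12)` sizes a test that should detect a lift from 10% to 12%. Halving the detectable effect roughly quadruples the required sample, which is often the key trade-off to explain when negotiating test duration.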
Advanced concepts (less common):
- Counterfactual evaluation and off-policy estimators for recommenders.
- Calibration, uplift modeling, and long-horizon reward alignment.
- Causal inference to de-bias experiments.
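
For the off-policy estimators mentioned above, a minimal example is inverse propensity scoring (IPS), which reweights logged rewards by how much more or less likely the new policy is to take the logged action. This sketch assumes you logged the propensity of each shown action. The function name and the optional `clip` parameter are hypothetical.

```python
from typing import Optional, Sequence


def ips_estimate(
    rewards: Sequence[float],
    logging_probs: Sequence[float],
    target_probs: Sequence[float],
    clip: Optional[float] = None,
) -> float:
    """Estimate the target policy's mean reward from logged interactions.

    logging_probs[i]: probability the logging policy gave to the logged action.
    target_probs[i]:  probability the target policy gives to that same action.
    """
    if not rewards or not (len(rewards) == len(logging_probs) == len(target_probs)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for reward, p_log, p_tgt in zip(rewards, logging_probs, target_probs):
        if p_log <= 0:
            raise ValueError("logging propensities must be positive")
        weight = p_tgt / p_log
        if clip is not None:
            weight = min(weight, clip)  # clipping lowers variance but adds bias
        total += weight * reward
    return total / len(rewards)
```

Unclipped IPS is unbiased only when the logging policy gives non-zero probability to every action the target policy can take. Small propensities inflate variance, which is the usual motivation for clipping or for self-normalized and doubly robust variants.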
Example questions or scenarios:
- “Walk us through an end-to-end recommender you built: objective, features, offline metrics, and A/B results.”
- “How did you debug a model whose offline AUC was high but online CTR did not improve?”
- “What trade-offs did you make to reduce inference latency without harming quality?”
Data Platform and MLOps on AWS
Many Atlassian roles require fluency in AWS and production ML tooling. Interviewers assess how you build, deploy, and operate pipelines and services with CI/CD, monitoring, and cost controls. Strong candidates navigate S3, SageMaker, microservices, and structured observability confidently.
Be ready to go over:
- Training and serving pipelines – Artifact versioning, feature lineage, and reproducibility.
- Runtime engineering – REST/gRPC services, autoscaling, caching, and circuit breaking.
- Observability – Metrics, logs, traces, and automated alerts for models and services.
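
The circuit breaking item above can be illustrated with a small state machine wrapped around a downstream call. This is a simplified sketch, not a production pattern from any specific library. The names `CircuitBreaker` and `CircuitOpenError`, the consecutive-failure threshold, and the single-trial half-open behavior are all assumptions made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected without running."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests can control time
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The sketch is not thread-safe and treats every exception as a failure. In an interview it is worth discussing which errors should count, how to emit metrics on state changes, and how the breaker interacts with retries and timeouts.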
Advanced concepts (less common):
- Multi-LLM routing, distillation, KV cache management, and inference optimization.
- Batch vs. streaming features and exactly-once semantics.
- Security, tenancy, and data governance in enterprise environments.
Example questions or scenarios:
- “Sketch a SageMaker training pipeline with data validation, model registry, and staged deployment.”
- “Design a multi-tenant inference service with per-tenant quotas and guardrails.”
- “How would you track and roll back a bad model using CI/CD and monitoring signals?”
Collaboration, Values, and Stakeholder Communication
You will be evaluated on how you handle ambiguity, align stakeholders, and demonstrate Atlassian's values. Strong performance sounds like crisp problem statements, proactive risk surfacing, and collaborative decision-making. Expect scenario-based prompts where you negotiate scope, metrics, or timelines.
Be ready to go over:
- Communicating trade-offs between accuracy, latency, and cost.
- Aligning with PMs, design, and partner teams on requirements and success metrics.
- Handling conflicting feedback and deciding when to ship vs. iterate.
Advanced concepts (less common):
- Ethical considerations and responsible AI escalation paths.
- Running cross-team RFCs and design reviews efficiently.
- Mentoring peers on ML design and production standards.
Example questions or scenarios:
- “You’re asked to add a new feature one week before launch—how do you decide?”
- “A stakeholder wants a complex model; you believe a heuristic is sufficient—what do you do?”
- “How do you handle an ambiguous request for ‘AI in search’ without clear goals?”