Context
NovaIDE is building an AI IntelliSense feature for its mobile coding app. Users expect inline code completions, short explanations, and next-action suggestions even on unreliable cellular networks where round-trip latency is often high.
Constraints
- p95 time to first suggestion: 250ms for local suggestions, 1,500ms when cloud fallback is used
- Cost ceiling: $0.80 per 1,000 sessions, with 2.5 suggestions per session on average
- Acceptance rate of top suggestion: at least 28%
- Hallucination ceiling: fewer than 2% of accepted suggestions should introduce syntax errors or reference unavailable symbols
- Must degrade gracefully offline or on 3G-like networks
- Must not leak source code from private repos to unauthorized services
- Prompt injection risk exists through comments, README files, and retrieved project docs
Available Resources
- 20M anonymized IDE completion events from desktop and mobile
- 200K permissively licensed code files for pretraining or distillation
- On-device budget: up to 1.2GB model size, 6GB RAM phones at p50, intermittent CPU/GPU acceleration
- Cloud access to a GPT-4-class or Claude-class model for fallback requests
- Project-local context: current file, open tabs, symbol table, recent edits, and optional repo docs
- Telemetry for suggestion shown, accepted, edited-after-accept, and latency by network tier
Task
- Design the end-to-end IntelliSense architecture, including what runs on-device vs. in the cloud and how you handle degraded connectivity.
- Specify the prompting strategy for cloud fallback, including how you ground on local context and reduce hallucinated APIs or symbols.
- Define an evaluation plan before implementation: offline datasets, acceptance-quality metrics, hallucination checks, and online experiment design.
- Explain whether you would use prompt design only, distillation, or fine-tuning for the on-device model, and justify the choice with cost/latency tradeoffs.
- Identify the main failure modes, including prompt injection and privacy leakage, and propose mitigations.