Optimize E-commerce Prompts for Support

Business Context

ShopFlow uses a large language model to draft customer-support replies for order, refund, and shipping questions. The team wants a prompt-engineering solution that improves answer quality and consistency without fine-tuning a new model.

Data

You have 120,000 historical support conversations, 18,000 human-written gold responses, and a prompt test set of 5,000 recent tickets. Messages are primarily in English, with 8% in Spanish routed through translation. Ticket length ranges from 20 to 900 tokens, with a median of 140 tokens. Common intents include order status (35%), returns/refunds (25%), shipping delays (20%), account issues (10%), and policy questions (10%). Evaluation labels include intent, factual correctness, tone compliance, and resolution status.
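To judge whether per-intent metrics will rest on enough tickets, it can help to estimate how many test tickets fall under each intent. The sketch below assumes, purely for illustration, that the 5,000-ticket test set follows the same intent mix as the percentages quoted above; the source does not state this, and the dictionary keys and function name are hypothetical.

```python
# Intent mix in whole percentages, taken from the figures quoted above.
INTENT_MIX_PCT = {
    "order_status": 35,
    "returns_refunds": 25,
    "shipping_delays": 20,
    "account_issues": 10,
    "policy_questions": 10,
}


def expected_counts(total_tickets: int, mix_pct: dict[str, int]) -> dict[str, int]:
    """Estimate tickets per intent, ASSUMING the set follows the given mix."""
    if sum(mix_pct.values()) != 100:
        raise ValueError("intent percentages must sum to 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return {intent: total_tickets * pct // 100 for intent, pct in mix_pct.items()}


counts = expected_counts(5000, INTENT_MIX_PCT)
for intent, n in counts.items():
    print(f"{intent}: {n}")
```

If the real test set is skewed differently, the actual label counts should replace these estimates before any per-intent comparison is trusted.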

Success Criteria

A good solution should increase factual correctness and policy compliance while reducing hallucinated refund promises. The target is at least a 15% improvement in human preference score over the current baseline prompt, with p95 latency under 2 seconds and stable behavior across the major ticket types.
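These targets can be turned into a small acceptance check. The sketch below is one possible reading, not a definitive one: it assumes the 15% figure is a relative improvement over the baseline score (the text does not say whether it is relative or absolute), uses a nearest-rank p95, and all function names are hypothetical.

```python
def p95_nearest_rank(latencies_s: list[float]) -> float:
    """Return the 95th percentile using the nearest-rank method."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_s)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def relative_improvement(baseline: float, candidate: float) -> float:
    """Relative gain of the candidate over the baseline (ASSUMED interpretation)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (candidate - baseline) / baseline


def meets_targets(
    baseline_pref: float,
    candidate_pref: float,
    latencies_s: list[float],
    min_gain: float = 0.15,   # 15% improvement target from the text
    max_p95_s: float = 2.0,   # p95 latency target from the text
) -> bool:
    """True when both the preference gain and the latency target are met."""
    gain_ok = relative_improvement(baseline_pref, candidate_pref) >= min_gain
    latency_ok = p95_nearest_rank(latencies_s) < max_p95_s
    return gain_ok and latency_ok


if __name__ == "__main__":
    sample = [0.5] * 95 + [3.0] * 5  # illustrative latencies, not real data
    print(meets_targets(0.50, 0.60, sample))
```

Running the same check separately per intent is one way to approximate the "stable behavior across the major ticket types" requirement.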

Constraints

No model fine-tuning; only prompt design, retrieval context, and lightweight preprocessing are allowed
Must run with a hosted LLM API and a limited token budget
Responses must follow company policy and avoid unsupported claims

Requirements

Design a prompt-engineering workflow for customer-support response generation
Define preprocessing for ticket cleanup, intent extraction, and policy-context retrieval
Propose a modern Python implementation using an LLM API, prompt templates, and evaluation scripts
Explain how you would compare zero-shot, few-shot, and retrieval-augmented prompts
Specify guardrails for tone, structure, and factual grounding
Describe how you would measure quality, latency, and failure modes before deployment
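The preprocessing and prompt-comparison requirements above could be prototyped along the following lines. This is a minimal sketch under stated assumptions: the keyword rules, the policy snippets, and every function name are hypothetical placeholders (a real system would more likely use a classifier and a retrieval index), and no actual LLM API is called; the functions only build the prompt strings for the three variants.

```python
import re

# HYPOTHETICAL keyword rules for intent extraction; checked in this order.
INTENT_KEYWORDS = {
    "returns_refunds": ("refund", "return", "money back"),
    "shipping_delay": ("late", "delay", "not arrived"),
    "order_status": ("where is my order", "order status", "tracking"),
    "account_issue": ("password", "login", "log in", "account"),
}

# PLACEHOLDER policy text standing in for a real retrieval index.
POLICY_SNIPPETS = {
    "returns_refunds": "PLACEHOLDER: refund policy text goes here.",
    "shipping_delay": "PLACEHOLDER: shipping delay policy text goes here.",
    "order_status": "PLACEHOLDER: order status policy text goes here.",
    "account_issue": "PLACEHOLDER: account policy text goes here.",
}

# Guardrail instructions covering tone and factual grounding.
SYSTEM_RULES = (
    "You are a ShopFlow support agent. Be polite and concise. "
    "Only state facts found in the ticket or the policy context. "
    "Never promise a refund unless the policy context supports it."
)


def clean_ticket(text: str, max_chars: int = 4000) -> str:
    """Strip HTML-like tags, collapse whitespace, and cap the length."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_chars]


def extract_intent(ticket: str) -> str:
    """Return the first intent whose keyword appears, else a fallback."""
    lowered = ticket.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "policy_question"


def build_prompt(ticket: str, mode: str = "zero_shot", examples=()) -> str:
    """Build a prompt for one of: 'zero_shot', 'few_shot', 'rag'."""
    if mode not in {"zero_shot", "few_shot", "rag"}:
        raise ValueError(f"unknown mode: {mode}")
    ticket = clean_ticket(ticket)
    parts = [SYSTEM_RULES]
    if mode == "rag":
        snippet = POLICY_SNIPPETS.get(extract_intent(ticket), "none available")
        parts.append("Policy context: " + snippet)
    if mode == "few_shot":
        for question, answer in examples:
            parts.append(f"Example ticket: {question}\nExample reply: {answer}")
    parts.append(f"Ticket: {ticket}\nReply:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("<p>My package is   late</p>", mode="rag"))
```

Building all three variants from the same cleaned ticket keeps the comparison fair: the only difference between the zero-shot, few-shot, and retrieval-augmented prompts is the added examples or policy context.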
