
You are working on an LLM-powered feature and notice that the prompt has grown over time with repeated instructions, long examples, and extra context. Input token usage is now higher than it needs to be, and you want to reduce it without making the model's output worse.
How would you optimize a prompt to reduce input token usage by 30% without degrading the quality of the model's output?
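One way to make the 30% target concrete is to measure the prompt before and after each trimming step. The sketch below is a minimal illustration, not a definitive method. The helpers `estimate_tokens`, `drop_repeated_lines`, and `reduction` are hypothetical names. The token count is a rough word-and-punctuation proxy, since the model's real tokenizer will give different numbers. The sample prompt is invented for the example.

```python
import re


def estimate_tokens(text: str) -> int:
    """Rough proxy (assumption): count words and punctuation marks.

    A real tokenizer will produce different counts; swap it in for real work.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def drop_repeated_lines(prompt: str) -> str:
    """Keep the first occurrence of each non-empty line.

    Lines are compared case-insensitively with whitespace collapsed.
    Blank lines are always kept.
    """
    seen = set()
    kept = []
    for line in prompt.splitlines():
        key = " ".join(line.lower().split())
        if key and key in seen:
            continue
        seen.add(key)
        kept.append(line)
    return "\n".join(kept)


def reduction(before: str, after: str) -> float:
    """Fractional drop in estimated tokens from `before` to `after`."""
    base = estimate_tokens(before)
    return (base - estimate_tokens(after)) / base if base else 0.0


# Invented sample prompt with repeated instructions.
original = "\n".join([
    "Answer in JSON.",
    "Be concise.",
    "Answer in JSON.",
    "be concise.",
    "Summarize the ticket below.",
])

trimmed = drop_repeated_lines(original)
saved = reduction(original, trimmed)

print(f"estimated tokens: {estimate_tokens(original)} -> {estimate_tokens(trimmed)}")
print(f"saved: {saved:.0%}, meets 30% target: {saved >= 0.30}")
```

Removing duplicates only covers the repeated-instructions part of the problem; long examples and extra context would need their own trimming passes. A token measurement says nothing about output quality, so each trimmed version would still need to be compared against the original prompt on a fixed set of test inputs.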