





You have a general-purpose language model, but your product needs more consistent behavior on a specific task, such as customer-support response drafting, dispute-reason classification, or policy-grounded answer generation. The base model works reasonably well with prompting, but quality is uneven, and you are considering fine-tuning.
Can you explain how you would approach fine-tuning a large language model for a specific task?