You trained a transformer model after starting with a simpler NLP baseline, and the transformer performed better on the same task. Now you need to explain that improvement in a way that is technically sound, not just descriptive.
How would you explain why a transformer improved your baseline model?
Understanding of transformer advantages over lexical baselines
Ability to compare fine-tuned models on the same task
Use of F1 and per-class analysis rather than accuracy alone
Evidence-based explanation through error analysis
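One way to ground such an explanation is to compute per-class F1 for both models on the same test set, then pull out the examples the baseline got wrong and the transformer got right. The sketch below is illustrative only: the function names, the four example sentences, and both prediction lists are made up for demonstration and are not results from any real experiment.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def fixed_by_transformer(texts, y_true, baseline_pred, transformer_pred):
    """Examples the baseline misclassified and the transformer got right."""
    return [
        text
        for text, truth, base, trans in zip(
            texts, y_true, baseline_pred, transformer_pred
        )
        if base != truth and trans == truth
    ]


# Hypothetical toy data (assumption, for illustration only).
texts = ["not bad at all", "great film", "terrible plot", "not good"]
y_true = ["pos", "pos", "neg", "neg"]
baseline_pred = ["neg", "pos", "neg", "pos"]
transformer_pred = ["pos", "pos", "neg", "neg"]

baseline_scores = per_class_f1(y_true, baseline_pred)
transformer_scores = per_class_f1(y_true, transformer_pred)
fixed = fixed_by_transformer(texts, y_true, baseline_pred, transformer_pred)

print("baseline per-class F1:", baseline_scores)
print("transformer per-class F1:", transformer_scores)
print("fixed by transformer:", fixed)
```

In this toy setup the fixed examples both involve negation, which is the kind of pattern a bag-of-words baseline tends to miss because it ignores word order. On real data, reading through the fixed examples and grouping them by pattern is what turns "the score went up" into an explanation backed by evidence.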