
You are building a chat assistant for an internal workflow tool. A conversation can run long enough that earlier messages no longer fit in the model's context window, yet the assistant still needs to answer correctly using the most relevant prior state.
What happens when a prompt exceeds the context window mid-conversation? How do you handle it?
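
To make the scenario concrete, here is a minimal sketch of one common baseline: check the conversation against a token budget and drop the oldest non-system messages until it fits. Everything in it is an assumption for illustration, not part of the original scenario: the `Message` type, the `fit_to_budget` helper, and `estimate_tokens`, which is a crude word-count stand-in for whatever tokenizer the real model uses.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    # Hypothetical stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())


def fit_to_budget(messages: list[Message], budget: int) -> list[Message]:
    """Keep all system messages, then as many of the newest messages as fit.

    System messages are moved to the front of the result; the remaining
    messages keep their original relative order.
    """
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]

    used = sum(estimate_tokens(m.content) for m in system)
    kept: list[Message] = []

    # Walk backwards from the newest message and stop at the first one
    # that would push the total over the budget.
    for m in reversed(rest):
        cost = estimate_tokens(m.content)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost

    kept.reverse()  # restore chronological order
    return system + kept


history = [
    Message("system", "You are a workflow assistant."),
    Message("user", "open ticket 12 for the billing team"),
    Message("assistant", "ticket 12 opened"),
    Message("user", "what is its status"),
]
trimmed = fit_to_budget(history, budget=12)
```

With this budget the oldest user message is dropped, so the assistant no longer sees which team the ticket was opened for. That is the weakness of pure recency-based truncation: it keeps the newest state, which is not necessarily the most relevant state, so it is usually only a starting point.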