Session Memory Pruning

1. Memory pruning protects both quality and cost

Long sessions accumulate irrelevant details, repeated acknowledgments, and outdated assumptions. Without pruning, each turn becomes slower and more expensive to process, and the model is more likely to attend to stale or irrelevant context.

2. Keep working memory and durable memory separate

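A session should distinguish immediate task context (working memory) from long-term user facts and policy preferences (durable memory). Because durable facts are stored separately, working memory can be trimmed aggressively without risk of losing them.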
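One way to picture the separation is a container with two layers, where pruning can only reach the working layer. This is a minimal sketch, not a reference design: the class name SessionMemory, its fields, and the keep predicate are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SessionMemory:
    # Hypothetical durable layer: long-term user facts and policy preferences.
    durable: dict[str, str] = field(default_factory=dict)
    # Hypothetical working layer: immediate task context for this session.
    working: list[str] = field(default_factory=list)

    def prune_working(self, keep: Callable[[str], bool]) -> int:
        """Drop working notes that fail `keep`; return how many were removed.

        The durable layer is never touched here.
        """
        before = len(self.working)
        self.working = [note for note in self.working if keep(note)]
        return before - len(self.working)


memory = SessionMemory()
memory.durable["preferred_language"] = "English"
memory.working.append("User said thanks.")
memory.working.append("Current task: draft the release notes.")

# Illustrative rule only: keep notes that describe the current task.
removed = memory.prune_working(keep=lambda note: note.startswith("Current task"))
```

Because prune_working has no access path to durable, a pruning rule that is too aggressive can lose task context but not long-term facts.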

3. Prune by value, not by age alone

Some old details still matter because they constrain the task. Some recent details do not matter at all. Good pruning rules consider task relevance, recency, confidence, and whether the fact already exists in durable memory.
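The four signals named above can be combined into a single keep score. The sketch below is one possible weighting, not a recommended one: the weights, the 0.5 threshold, the halving for facts already in durable memory, and the sample items are all assumptions made up for illustration, and in practice the relevance and confidence values would have to come from some upstream estimator.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    relevance: float   # 0..1, how strongly the item constrains the current task
    turns_ago: int     # 0 means the most recent turn
    confidence: float  # 0..1, how sure we are the item is still true
    in_durable: bool   # True if durable memory already stores this fact


def keep_score(item: MemoryItem) -> float:
    # Recency decays with age but is only one input among several.
    recency = 1.0 / (1.0 + item.turns_ago)
    # Assumed weights: task relevance dominates, recency and confidence assist.
    score = 0.6 * item.relevance + 0.2 * recency + 0.2 * item.confidence
    # Facts already held in durable memory are cheaper to drop from working memory.
    if item.in_durable:
        score *= 0.5
    return score


def prune(items: list[MemoryItem], threshold: float = 0.5) -> list[MemoryItem]:
    return [item for item in items if keep_score(item) >= threshold]


items = [
    MemoryItem("Budget cap is $500", relevance=0.9, turns_ago=40,
               confidence=0.9, in_durable=False),
    MemoryItem("User said 'ok, thanks'", relevance=0.05, turns_ago=0,
               confidence=1.0, in_durable=False),
    MemoryItem("User's name is Ana", relevance=0.6, turns_ago=30,
               confidence=1.0, in_durable=True),
]
kept = prune(items)
```

With these sample values, the old budget constraint is kept, while the fresh acknowledgment and the fact already stored in durable memory are dropped, which is the behavior an age-only rule would get backwards.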

4. Summaries should preserve decisions and unresolved items

Compression works best when it keeps commitments, active goals, constraints, and open questions. That is the context the next turn truly needs.
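A summary with that shape can be sketched as a small structure with one slot per category. The example assumes each note has already been tagged with a kind by some upstream step (a classifier or the model itself); the Note type, the kind labels, and the compress function are hypothetical names, not part of any library.

```python
from dataclasses import dataclass

# The four categories the summary is meant to preserve.
KEEP_KINDS = ("commitment", "goal", "constraint", "open_question")


@dataclass
class Note:
    kind: str  # one of KEEP_KINDS, or "chatter" for everything else
    text: str


def compress(notes: list[Note]) -> dict[str, list[str]]:
    """Keep only notes in KEEP_KINDS, grouped by kind, without duplicates."""
    summary: dict[str, list[str]] = {kind: [] for kind in KEEP_KINDS}
    for note in notes:
        if note.kind in summary and note.text not in summary[note.kind]:
            summary[note.kind].append(note.text)
    return summary


notes = [
    Note("goal", "Ship the migration guide this week"),
    Note("chatter", "Sounds good!"),
    Note("constraint", "No breaking API changes"),
    Note("commitment", "Assistant will draft section 2 next"),
    Note("open_question", "Which database version is in production?"),
    Note("chatter", "Thanks"),
    Note("constraint", "No breaking API changes"),
]
summary = compress(notes)
```

Keeping the categories as explicit slots makes it easy to check that a summary still carries every open question and constraint before the raw turns are discarded.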

5. Review pruning failures explicitly

If the system forgets critical context or keeps too much noise, those failures should feed back into the pruning rubric and memory-layer design.
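A lightweight way to make that feedback loop concrete is to log each failure with its type and look at which type dominates. The sketch below is an assumption-heavy illustration: the two failure kinds mirror the sentence above, while the PruningFailure type, the suggest_adjustment function, and its "loosen" / "tighten" / "hold" outputs are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PruningFailure:
    kind: str    # "forgot_critical" or "kept_noise"
    detail: str  # free-text description from the reviewer


def suggest_adjustment(failures: list[PruningFailure]) -> str:
    """Return a coarse hint for the pruning rubric based on failure counts."""
    counts = Counter(failure.kind for failure in failures)
    forgot = counts["forgot_critical"]
    noise = counts["kept_noise"]
    if forgot > noise:
        return "loosen"   # pruning is too aggressive
    if noise > forgot:
        return "tighten"  # pruning is too permissive
    return "hold"


failures = [
    PruningFailure("forgot_critical", "Dropped the agreed deadline"),
    PruningFailure("forgot_critical", "Lost the user's file format constraint"),
    PruningFailure("kept_noise", "Kept three repeated acknowledgments"),
]
hint = suggest_adjustment(failures)
```

A count-based hint like this only points in a direction; the detail field is what a reviewer would actually read when revising the rubric or the memory-layer design.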

Practical Checklist

Separate working memory from durable memory.
Prune by task value and relevance, not by age alone.
Preserve decisions, constraints, and open questions in summaries.
Review pruning failures and feed them back into the pruning rubric.
