Date: March 12, 2026
Status: Defined — Pending Implementation
Classification: aiConnected OS — Memory Architecture Layer 1
What It Is
The Rotating Context Window is an intra-conversation memory architecture that eliminates the hard tradeoff between RAG’s information loss and long-context’s cost inefficiency. Rather than treating the context window as a passive container and RAG as a separate retrieval pipeline, the Rotating Context Window unifies them into a single active memory surface that manages itself in real time. It is designed specifically as an integration workaround for platforms where the developer does not control the context ceiling — such as Claude, GPT, or other third-party model APIs. On those platforms, a token limit is imposed externally. The Rotating Context Window is how aiConnected operates intelligently within that imposed constraint.
The Problem It Solves
Existing approaches present a forced tradeoff:
- RAG — Chunks documents for efficient retrieval but loses information at chunk boundaries, severs semantic continuity, and retrieves fragments that may be slightly off-target or misleading.
- Long Context — Preserves full document fidelity by loading everything into the context window but forces the model to re-read the entire document on every conversation turn, making it economically unviable at scale.
How It Works
Window Division
The total available context window is divided into two zones:

| Zone | Size | Purpose |
|---|---|---|
| Live Window | 50% of total context | Active conversation memory — always in context, no retrieval needed |
| RAG Layer | Unlimited | Conversation history that has been chunked, enriched, and stored — retrieved on demand |
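The two-zone split can be sketched as a small configuration structure. The 200,000-token ceiling below is a hypothetical example of an externally imposed limit; the names are illustrative, not part of any defined API.

```python
# Minimal sketch of the two-zone window division, assuming a hypothetical
# 200,000-token context ceiling imposed by a third-party model API.
TOTAL_CONTEXT_TOKENS = 200_000

zones = {
    "live_window": {
        "capacity": TOTAL_CONTEXT_TOKENS // 2,  # 50% of total context
        "purpose": "active conversation memory, always in context",
    },
    "rag_layer": {
        "capacity": None,  # unlimited: chunked, enriched history on demand
        "purpose": "stored conversation history, retrieved every turn",
    },
}
```

Note that only the live window is bounded by the model's context ceiling; the RAG layer lives in external storage, which is why its capacity is effectively unlimited.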
The Chunking Threshold
Content does not get pushed to RAG reactively. Chunking begins proactively at 80% of the live window capacity — giving the system enough runway to:
- Complete the current conversation turn without interruption
- Chunk clean, complete exchanges rather than cutting mid-thought
- Run the entire process as a background operation with no conversation pause
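The trigger condition above reduces to a single comparison. A minimal sketch, assuming a hypothetical 100,000-token live window (50% of a 200k ceiling):

```python
LIVE_WINDOW_CAPACITY = 100_000  # hypothetical token budget for the live window
CHUNKING_THRESHOLD = 0.8        # proactive trigger at 80% of capacity

def crossed_chunking_threshold(tokens_in_live_window: int) -> bool:
    """True once the live window reaches 80% of its capacity, leaving
    runway to finish the current turn and chunk in the background."""
    return tokens_in_live_window >= LIVE_WINDOW_CAPACITY * CHUNKING_THRESHOLD
```

Because the check fires well before the window is full, chunking never has to cut a turn mid-thought.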
Background Chunking Process
When content crosses the chunking threshold, a background process:
- Segments the oldest content into clean chunks at natural turn boundaries
- Enriches each chunk with keywords, a short summary, and a timestamp
- Stores enriched chunks in the conversation’s micro-database
- Frees the live window space for the continuing conversation
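The four steps above can be sketched as a single eviction pass. Everything here is illustrative: the turn and chunk structures are assumptions, and the keyword extractor is a naive stand-in for real enrichment.

```python
import time

def extract_keywords(text: str) -> list[str]:
    # Naive stand-in: longest distinct words. Real enrichment would be smarter.
    return sorted(set(text.lower().split()), key=len, reverse=True)[:5]

def chunk_and_evict(turns: list[dict], tokens_to_free: int, db: list) -> list[dict]:
    """Segment the oldest complete turns into enriched chunks, store them in
    the conversation's micro-database, and return the remaining live turns."""
    freed = 0
    remaining = list(turns)
    while remaining and freed < tokens_to_free:
        turn = remaining.pop(0)  # oldest complete exchange first: clean boundary
        db.append({
            "text": turn["text"],
            "keywords": extract_keywords(turn["text"]),
            "summary": turn["text"][:80],  # placeholder for a real summary
            "timestamp": turn.get("timestamp", time.time()),
        })
        freed += turn["tokens"]
    return remaining
```

Because eviction works on whole turns from the oldest end, chunks never cut mid-thought, and the freed space goes straight back to the live window.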
Retrieval — Every Turn
On every conversation turn, a lightweight semantic search runs against the RAG layer automatically. This is not triggered by the user referencing something old — it runs regardless, because:
- The model doesn’t always know what it doesn’t know
- Relevant stored context may connect to the current turn in ways that aren’t linguistically obvious
- Waiting for an explicit reference means sometimes missing relevant context entirely
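A minimal sketch of the every-turn retrieval loop, using keyword overlap as a cheap stand-in for semantic search (the chunk structure matches the enrichment sketch; all names are illustrative):

```python
def retrieve_every_turn(current_turn: str, db: list, top_k: int = 3) -> list:
    """Run retrieval on every turn, whether or not the user references
    older content. Keyword overlap stands in for semantic similarity."""
    query_words = set(current_turn.lower().split())
    scored = []
    for chunk in db:
        overlap = len(query_words & set(chunk["keywords"]))
        if overlap:
            scored.append((overlap, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```

The key design point is that this runs unconditionally: the cost of one lightweight search per turn is traded for never silently missing stored context.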
Conflict Resolution — Version History
When retrieved content conflicts with something already in the live window (e.g. an earlier design decision surfacing against a newer one):
- Timestamps resolve priority automatically — newer content takes precedence by default
- Both versions are preserved — nothing is deleted
- Conflicts are surfaced to the user when relevant — “I have two versions of this, here’s the current one and here’s the prior one”
- This mechanism creates implicit version history as a natural byproduct of the architecture
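The resolution rule reduces to a timestamp comparison that keeps both sides. A minimal sketch, with illustrative version structures:

```python
def resolve_conflict(live_version: dict, retrieved_version: dict) -> dict:
    """Newer content wins by default; both versions are preserved, so the
    result carries implicit version history as a byproduct."""
    newer, older = sorted(
        (live_version, retrieved_version),
        key=lambda v: v["timestamp"],
        reverse=True,
    )
    return {
        "current": newer,
        "prior": older,           # nothing is deleted
        "surface_to_user": True,  # "here's the current one, here's the prior one"
    }
```

No semantic judgment call is involved: time alone decides precedence, which keeps the mechanism simple and its behavior predictable.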
Micro-Database Model
Every conversation is its own isolated micro-database. The entire history of a conversation lives in that database. Starting a new conversation means a clean micro-database with its own fresh Rotating Context Window. In project or collaborative contexts, permissions govern whether a conversation’s RAG layer can search across sibling conversation databases. This is a permissions decision, not an architectural one. The search logic remains the same — it simply has authorized access to a broader pool.
Relationship to Neurigraph
The Rotating Context Window is Layer 1 of the aiConnected memory architecture — intra-conversation memory. Neurigraph is Layer 2 — inter-conversation, cross-project, long-term memory organized as a hierarchical 3D knowledge graph. Over time, conversation micro-databases feed into Neurigraph as knowledge matures. The Rotating Context Window does not need to know anything about Neurigraph. It manages its own micro-database and passes upward. The boundary is clean.
Key Principles
- No hard stop — chunking is a background stream, never an interruption
- Every turn is searched — retrieval is constant, not reactive
- Time and tokens govern everything — no complex intent-detection or semantic scoring pipelines making judgment calls
- Nothing is deleted — version history is implicit and automatic
- The live window stays at 50% — always working with the model at full capacity
- RAG storage is unlimited — storage is cheap; there is no reason to cap it
- Chunks are enriched — keywords, summaries, and timestamps travel with every chunk
What This Is Not
- This is not a replacement for Neurigraph — it feeds it
- This is not the final vision — it is an integration workaround for platforms with imposed token limits
- The final vision is the Infinite Context Window — documented separately
Originated by Bob Hunter, March 12, 2026. Developed through iterative conversation with Claude (Anthropic). All conceptual authorship belongs to Bob Hunter.