🧩 Middleware Architecture for Long-Term Memory in LLM Agents

This document describes a modular, plugin-driven middleware system for building cognitive memory infrastructure in LLM agents, using graph-based storage and context-aware pipelines.


1. Overview

This architecture enables LLM agents to leverage graph-based knowledge storage (Neo4j or Memgraph), LLM-powered semantic enrichment, and a flexible plugin-based middleware system inspired by ASP.NET Core.


2. From State Machine to Contextual Middleware

  • The flow is not fixed, as it would be in a traditional state machine.
  • Execution depends on the message type, the available services, and the evolving context.
  • Each plugin performs a task, modifies the context, and decides whether to continue or halt the pipeline (see the sketch after this list).
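
A minimal sketch of this loop, assuming plugins are plain callables that take and return a dict-shaped context (all names here are illustrative, not the actual implementation):

from typing import Any, Callable, Dict, List

Plugin = Callable[[Dict[str, Any]], Dict[str, Any]]

def run_pipeline(plugins: List[Plugin], context: Dict[str, Any]) -> Dict[str, Any]:
    # Invoke plugins in order; each one mutates the context and may halt the run.
    for plugin in plugins:
        context = plugin(context)
        # A plugin signals a halt by setting metadata.status to "exit" or "error".
        if context["metadata"]["status"] in ("exit", "error"):
            break
    return context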

3. Message Structure

{
  "type": "conversation" | "batch_conversation" | "mood" | "preference_save" | ...,
  "payload": { ... },
  "user_id": "abc123",
  "metadata": {
    "timestamp": "...",
    "status": "default" | "error" | "exit" | "on_exit"
  }
}
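
A hypothetical "conversation" message in this shape (the payload text and timestamp are made up for illustration):

message = {
    "type": "conversation",
    "payload": {"text": "I switched to a standing desk last week."},
    "user_id": "abc123",
    "metadata": {
        "timestamp": "2024-05-01T12:00:00Z",
        "status": "default",
    },
}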

4. Context Structure

Context = {
  user_id: str,
  request: {...},
  response: {...},
  persistible_data: Dict[type, Any],
  services: Dict[interface, implementation],
  metadata: {
    status: "default" | "error" | "exit" | "on_exit",
    trace: [...],
    errors: [...]
  }
}
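
One way to realize this structure is a small Python dataclass; this is a sketch under that assumption, not the project's actual class, and the defaults are illustrative:

from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Context:
    # Shared, evolving state passed through the plugin pipeline.
    user_id: str
    request: Dict[str, Any] = field(default_factory=dict)
    response: Dict[str, Any] = field(default_factory=dict)
    persistible_data: Dict[type, Any] = field(default_factory=dict)  # e.g., GraphData, ConversationSummary
    services: Dict[str, Any] = field(default_factory=dict)           # interface name -> implementation
    metadata: Dict[str, Any] = field(default_factory=lambda: {
        "status": "default", "trace": [], "errors": []
    })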

5. Middleware Plugins

Plugins are modular, stateless functions or classes that:

  • Receive and mutate the Context
  • Request services (e.g., iUserProvider, iApiKeyProvider)
  • Optionally read/write persistible_data (e.g., GraphData, ConversationSummary)
  • Handle their own errors or propagate status changes

Example plugin types (one is sketched after the list):

  • ValidateUserId
  • EnrichWithMemory
  • LLMExtractEntities
  • DeduplicateGraphNodes
  • MergeToGraph
  • SimpleStorage
  • DomainRuleProcessor
  • ObservabilityWrapper
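
A minimal sketch of one such plugin, ValidateUserId, assuming a dict-shaped context and an iUserProvider service exposing a hypothetical user_exists(user_id) method:

from typing import Any, Dict

def validate_user_id(context: Dict[str, Any]) -> Dict[str, Any]:
    # Resolve the user service from the context (service-locator style, see section 8).
    user_provider = context["services"].get("iUserProvider")
    user_id = context.get("user_id")

    if not user_id or (user_provider and not user_provider.user_exists(user_id)):
        # Record the failure and halt the pipeline.
        context["metadata"]["status"] = "error"
        context["metadata"]["errors"].append(f"unknown user_id: {user_id!r}")
    else:
        context["metadata"]["trace"].append("ValidateUserId: ok")
    return context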

6. Plugin Control & Orchestration

  • A PluginRunner resolves which plugins to invoke based on message type and context (a sketch follows this list)
  • Plugins declare what interfaces or context keys they depend on
  • Middleware manager tracks:
    • execution order
    • errors thrown or handled
    • final context state
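
A sketch of such a runner, assuming a simple registry that maps message types to ordered plugin chains (the registry shape and names are illustrative):

from typing import Any, Callable, Dict, List

Plugin = Callable[[Dict[str, Any]], Dict[str, Any]]

class PluginRunner:
    def __init__(self, registry: Dict[str, List[Plugin]]):
        # registry: message type -> ordered plugin chain for that type
        self.registry = registry

    def run(self, message: Dict[str, Any], context: Dict[str, Any]) -> Dict[str, Any]:
        chain = self.registry.get(message["type"], [])
        for plugin in chain:
            context["metadata"]["trace"].append(plugin.__name__)
            context = plugin(context)
            if context["metadata"]["status"] in ("error", "exit"):
                break
        return context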

7. Flow Control

Plugins can:

  • Exit pipeline on error or invalid state
  • Continue on recoverable conditions
  • Trigger alternate plugin paths (e.g., fallback, retry, error resolution chains)

This provides graceful degradation and partial processing when complete processing is not possible.
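
As an illustration, retry and fallback behavior can be expressed as wrappers around individual plugins; this is a sketch under the same dict-shaped context assumption, not the project's actual flow-control API:

from typing import Any, Callable, Dict

Plugin = Callable[[Dict[str, Any]], Dict[str, Any]]

def with_retry(plugin: Plugin, attempts: int = 3) -> Plugin:
    # Re-run a plugin a few times when it leaves the context in a recoverable error state.
    def wrapped(context: Dict[str, Any]) -> Dict[str, Any]:
        for _ in range(attempts):
            context = plugin(context)
            if context["metadata"]["status"] != "error":
                return context
            context["metadata"]["status"] = "default"  # recover and try again
        context["metadata"]["status"] = "error"        # attempts exhausted
        return context
    return wrapped

def with_fallback(primary: Plugin, fallback: Plugin) -> Plugin:
    # Degrade gracefully: switch to an alternate plugin when the primary one fails.
    def wrapped(context: Dict[str, Any]) -> Dict[str, Any]:
        context = primary(context)
        if context["metadata"]["status"] == "error":
            context["metadata"]["status"] = "default"
            context = fallback(context)
        return context
    return wrapped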


8. Service & Dependency Model

Context serves as a Service Locator. Plugins can request services such as:

  • iUserProvider
  • iApiKeyProvider
  • iContextLogger
  • iConfigService

This enables separation of concerns, runtime configurability, and testability.
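
A sketch of how a context might be assembled with registered services and how a plugin resolves one at runtime; the concrete provider and logger classes here are stand-ins, not real project components:

from typing import Any, Dict

class InMemoryUserProvider:
    # Stand-in iUserProvider implementation for tests or local runs.
    def __init__(self, known_ids=("abc123",)):
        self._known = set(known_ids)

    def user_exists(self, user_id: str) -> bool:
        return user_id in self._known

class PrintLogger:
    # Stand-in iContextLogger implementation.
    def log(self, message: str) -> None:
        print(message)

def build_context(user_id: str) -> Dict[str, Any]:
    return {
        "user_id": user_id,
        "request": {},
        "response": {},
        "persistible_data": {},
        "services": {
            "iUserProvider": InMemoryUserProvider(),
            "iContextLogger": PrintLogger(),
        },
        "metadata": {"status": "default", "trace": [], "errors": []},
    }

def log_request(context: Dict[str, Any]) -> Dict[str, Any]:
    # A plugin resolving a service from the context rather than importing it directly.
    context["services"]["iContextLogger"].log(f"handling request for {context['user_id']}")
    return context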


9. Observability & Metrics

  • Wrap all plugin calls with timing, tracing, and exception tracking (a wrapper sketch follows this list)
  • Persist or stream trace logs to a monitoring dashboard
  • Support per-plugin execution metadata (duration, success/failure, retries)
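
A sketch of such a wrapper, recording per-plugin duration and outcome in the context trace (the trace-entry fields are illustrative):

import time
from typing import Any, Callable, Dict

Plugin = Callable[[Dict[str, Any]], Dict[str, Any]]

def observed(plugin: Plugin) -> Plugin:
    def wrapped(context: Dict[str, Any]) -> Dict[str, Any]:
        started = time.perf_counter()
        try:
            context = plugin(context)
            success = context["metadata"]["status"] != "error"
        except Exception as exc:
            # Convert unhandled exceptions into a recorded error status.
            context["metadata"]["status"] = "error"
            context["metadata"]["errors"].append(repr(exc))
            success = False
        context["metadata"]["trace"].append({
            "plugin": plugin.__name__,
            "duration_ms": round((time.perf_counter() - started) * 1000, 2),
            "success": success,
        })
        return context
    return wrapped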

10. Memory Infrastructure

The memory layer remains graph-based, with:

  • Entity and relation extraction from LLMs
  • Node deduplication via embedding similarity (sketched below)
  • Flexible insertion paths (Neo4j or Memgraph)
  • Weekly summarization support for memory compression
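
As an illustration of the deduplication step, nodes whose embeddings are too similar to an already-kept node can be dropped; the cosine criterion and the 0.92 threshold here are arbitrary example choices:

import math
from typing import Any, Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def deduplicate_nodes(nodes: List[Dict[str, Any]], threshold: float = 0.92) -> List[Dict[str, Any]]:
    # Keep a node only if no already-kept node is semantically too similar.
    kept: List[Dict[str, Any]] = []
    for node in nodes:
        if all(cosine(node["embedding"], k["embedding"]) < threshold for k in kept):
            kept.append(node)
    return kept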

11. Advanced Features (Planned)

  • Plugin Dependency Graph: resolve ordering based on declared needs
  • Plugin Config: per-type and per-user plugin chains
  • Parallel Plugin Execution: fork-join models for batch processing
  • Adaptive Pipelines: dynamically adjust execution path based on real-time context state
  • Introspection: expose context and plugin state as GraphQL or REST for UI observability

12. Summary

This architecture builds a modular, pluggable, and context-aware middleware system for agent memory and task processing. It is not a rigid state machine, but a distributed, adaptive pipeline where plugins operate over a shared, evolving context, allowing scalable, personalized, and reflective AI agents to emerge.


See Also