Self-Optimization

Mnemexa continuously evaluates the health of your workspace’s memory store. Stale, duplicate, overlong, and never-retrieved memories are detected automatically and surfaced as recommendations — visible from the dashboard, queryable via optimize.health.

The five signals

Each signal is computed over a rolling window (default 30 days). They’re surfaced as both raw counts and rates (count ÷ total) in the optimize.health response.
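
As a rough illustration of how a rate relates to its count, the sketch below divides each signal count by the total number of memories. The SignalCounts container and its field names are hypothetical and chosen for this example; they are not the optimize.health response schema.

```python
from dataclasses import dataclass

SIGNALS = ("stale", "never_retrieved", "duplicate", "overlong")


@dataclass
class SignalCounts:
    # Hypothetical container for illustration; not the real API schema.
    total: int
    stale: int
    never_retrieved: int
    duplicate: int
    overlong: int


def signal_rates(counts: SignalCounts) -> dict:
    """Return count / total for each signal; an empty workspace yields 0.0."""
    if counts.total == 0:
        return {name: 0.0 for name in SIGNALS}
    return {name: getattr(counts, name) / counts.total for name in SIGNALS}


counts = SignalCounts(total=200, stale=50, never_retrieved=20, duplicate=10, overlong=4)
rates = signal_rates(counts)
print(rates)
```

Reading the two side by side is useful: 50 stale memories is a large share of a 200-memory workspace and a negligible one in a workspace of 50,000.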

stale

Memories that haven’t been retrieved within the window. A high stale_rate means the workspace is accumulating context that nothing is reading — either the memories are obsolete, or the queries hitting the workspace don’t match them.

stale doesn’t mean “wrong” — historical facts (project timelines, prior decisions) stay valuable even unread. It’s a signal, not a verdict. The optimization recommendation usually offers compression or archival, not deletion.

never_retrieved

A stricter subset of stale — memories that have never been retrieved since they were stored, not just unread within the window. A high never_retrieved_rate is the clearest sign of write-time noise: your agent is storing things no query is ever asking for.
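
The relationship between the two signals can be sketched as follows, reading the definitions above literally: a memory with no retrieval on record counts as both stale and never_retrieved. The retrieval_flags helper and its last_retrieved_at argument are assumptions for illustration; how Mnemexa treats very recently stored memories isn't specified here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

WINDOW = timedelta(days=30)  # the documented default window


def retrieval_flags(last_retrieved_at: Optional[datetime], now: datetime) -> set:
    """Return the retrieval-related signals that apply to one memory."""
    if last_retrieved_at is None:
        # Never retrieved since it was stored: the stricter subset of stale.
        return {"stale", "never_retrieved"}
    if now - last_retrieved_at > WINDOW:
        # Retrieved at some point, but not within the window.
        return {"stale"}
    return set()


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(sorted(retrieval_flags(None, now)))
print(sorted(retrieval_flags(now - timedelta(days=45), now)))
print(sorted(retrieval_flags(now - timedelta(days=3), now)))
```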

duplicate

Near-duplicate clusters that slipped past the inline dedup pipeline. The inline pipeline catches duplicates at write time, but as memories accumulate, two independently stored memories can converge on the same content while their similarity stays below the 0.70 LLM threshold. The optimization sweep catches these after the fact.
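
How the sweep works internally isn't documented. One common way to find near-duplicates after the fact is to compare embeddings pairwise with cosine similarity, which the sketch below does in plain Python. The function names, the toy two-dimensional vectors, and the 0.95 threshold are all illustrative assumptions, not Mnemexa's values.

```python
import math
from itertools import combinations


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def near_duplicate_pairs(embeddings: dict, threshold: float) -> list:
    """Return every pair of memory ids whose similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if cosine(embeddings[a], embeddings[b]) >= threshold
    ]


# Toy embeddings: m1 and m2 point almost the same way, m3 is orthogonal to m1.
vectors = {"m1": [1.0, 0.0], "m2": [0.99, 0.05], "m3": [0.0, 1.0]}
pairs = near_duplicate_pairs(vectors, threshold=0.95)  # illustrative threshold
print(pairs)
```

The pairwise comparison is quadratic in the number of memories, so a real sweep over a large workspace would more plausibly rely on an approximate nearest-neighbour index.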

overlong

Memories whose text exceeds the recommended length budget. Long memories cost more to embed, more to retrieve, and tend to bundle multiple distinct facts that should have been stored separately. Recommendations usually offer to split or summarize.

optimized_count / optimized_rate

A counter, not a problem signal — memories that have already been auto-improved (compressed, merged, archived) by previous optimization passes. Tracks how much curation work has happened on the workspace.

The quality score

quality_score = 100 − Σ (penalty × rate), summed over the signals

The result is mapped onto a 0–100 scale, where:

  • 90–100 — clean workspace: little noise, fresh memories.
  • 70–89 — some stale or duplicate accumulation. Worth a sweep.
  • 50–69 — material amount of underused memory. Optimization recommended.
  • < 50 — the workspace needs curation work. The dashboard surfaces specific actions.

The exact penalty weights aren’t documented publicly because they’re tuned over time. What’s stable: lower is worse, the signal counts tell you why, and the dashboard shows you actionable recommendations.
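
To make the shape of the calculation concrete, here is a minimal sketch with made-up weights. The ASSUMED_PENALTIES values are placeholders invented for this example, since the real weights are unpublished; only the overall form (100 minus a weighted sum of rates) and the score bands come from this page.

```python
# Placeholder weights for illustration only; the real ones are not published.
ASSUMED_PENALTIES = {
    "stale": 40.0,
    "never_retrieved": 30.0,
    "duplicate": 20.0,
    "overlong": 10.0,
}


def quality_score(rates: dict, penalties: dict = ASSUMED_PENALTIES) -> float:
    """100 minus the weighted sum of signal rates, clamped to the 0-100 range."""
    penalty = sum(weight * rates.get(name, 0.0) for name, weight in penalties.items())
    return max(0.0, min(100.0, 100.0 - penalty))


def band(score: float) -> str:
    """Map a score onto the bands listed above."""
    if score >= 90:
        return "clean"
    if score >= 70:
        return "worth a sweep"
    if score >= 50:
        return "optimization recommended"
    return "needs curation"


example_rates = {"stale": 0.25, "never_retrieved": 0.10, "duplicate": 0.05, "overlong": 0.02}
score = quality_score(example_rates)
print(round(score, 1), band(score))
```

Under these placeholder weights the example workspace loses about 14 points, most of them to stale memories, which is the kind of breakdown the signal counts give you for the real score.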

Recommendations

Beyond the raw signals, Mnemexa produces concrete optimization recommendations — actionable cards in the dashboard like “merge these 3 near-duplicate memories”, “archive memories from project X that’s no longer active”, “split this overlong memory into 4 distinct facts”.

The recommendation count surfaces on the API response as open_recommendations. The recommendations themselves are dashboard-only at the moment — there isn’t a public endpoint for fetching or actioning them. Apply them from the workspace dashboard at app.mnemexa.com.

Recommendations are suggestions, not auto-applied actions. The system never deletes or modifies a memory without you accepting the recommendation. Self-optimization observes; you decide.

Automatic decay

A separate background process applies temporal decay — not deletion, but ranking weight reduction:

  • temporal memories whose valid_until has passed have their effective importance reduced over time.
  • persistent memories that haven’t been accessed in a long time also gradually decay, but more slowly.

Decayed memories aren’t removed. They remain retrievable; they just rank lower in the hybrid score so fresher, more relevant memories win head-to-head.
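
The actual decay schedule isn't documented. As one plausible model, the sketch below applies exponential decay with a short half-life to temporal memories once valid_until has passed, and a much longer half-life to persistent memories based on time since last access. The half-life values, the effective_importance function, and the last_accessed_at parameter are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed half-lives, in days; the real schedule is tuned by Mnemexa.
TEMPORAL_HALF_LIFE_DAYS = 14.0
PERSISTENT_HALF_LIFE_DAYS = 180.0


def effective_importance(
    importance: float,
    kind: str,
    now: datetime,
    valid_until: Optional[datetime] = None,
    last_accessed_at: Optional[datetime] = None,
) -> float:
    """Return a decayed ranking weight; the memory itself is never removed."""
    if kind == "temporal":
        if valid_until is None or now <= valid_until:
            return importance  # still valid, so no decay yet
        age = now - valid_until
        half_life = TEMPORAL_HALF_LIFE_DAYS
    else:
        if last_accessed_at is None or now <= last_accessed_at:
            return importance
        age = now - last_accessed_at
        half_life = PERSISTENT_HALF_LIFE_DAYS
    age_days = age.total_seconds() / 86400
    return importance * 0.5 ** (age_days / half_life)


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
two_weeks_ago = now - timedelta(days=14)
temporal = effective_importance(0.8, "temporal", now, valid_until=two_weeks_ago)
persistent = effective_importance(0.8, "persistent", now, last_accessed_at=two_weeks_ago)
print(temporal, persistent)
```

With the same 14-day gap, the expired temporal memory drops to half its weight while the persistent one barely moves, which mirrors the "more slowly" behaviour described above.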

There’s no manual decay knob on the public API. Decay applies automatically as part of the normal operation of the workspace.

How to use this in practice

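import mnemexa

client = mnemexa.Client()

# Fetch the current health snapshot for the workspace.
h = client.optimize.health()

# Below 70 the score bands start recommending optimization.
if h.quality_score < 70:
    print(f"Workspace needs attention: {h.quality_score}/100")
    # The issue types explain why the score is low.
    for issue in h.top_issue_types:
        print(f"  • {issue.type}: {issue.count}")
    print(f"  Open recommendations: {h.open_recommendations}")
    # Recommendations can only be applied from the dashboard.
    print("  Curate at https://app.mnemexa.com")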

Run this on a schedule (e.g. nightly) and you’ll catch memory-quality drift before it affects retrieval.

What this is not

  • Not a hard quota. A low quality score doesn’t block writes or retrieves. It’s a diagnostic.
  • Not a deletion mechanism. Decay reduces rank; it doesn’t remove memory.
  • Not configurable from the API. The decay window, signal weights, and recommendation rules are tuned centrally.