// Evidence

Long-Context Validation

Public summary of how SPM performs under increasing history size, including the ultra-long validation surface.

Layered long-context buckets

The public layered set holds task type roughly constant while baseline history size grows across five token buckets: 10k, 50k, 100k, 300k, and 800k+. The set covers these task types:

infrastructure recall
configuration value recall
environment separation
planning target recall
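As a rough illustration of the bucket layering, the sketch below labels a history by the largest nominal bucket size it reaches. The bucket sizes come from the text above. The assignment rule and the names `BUCKET_SIZES`, `BUCKET_LABELS`, and `bucket_label` are assumptions made for this example, since the page does not state how histories are assigned to buckets.

```python
import bisect
from typing import Optional

# Nominal bucket sizes from the text, in tokens (ascending order).
BUCKET_SIZES = [10_000, 50_000, 100_000, 300_000, 800_000]
BUCKET_LABELS = ["10k", "50k", "100k", "300k", "800k+"]


def bucket_label(history_tokens: int) -> Optional[str]:
    """Return the label of the largest bucket whose nominal size the
    history reaches, or None if it is smaller than the first bucket.

    Assumed rule for illustration only; not taken from the source.
    """
    # bisect_right counts how many nominal sizes are <= history_tokens.
    idx = bisect.bisect_right(BUCKET_SIZES, history_tokens)
    return BUCKET_LABELS[idx - 1] if idx > 0 else None


if __name__ == "__main__":
    for tokens in (9_999, 10_000, 120_000, 2_000_000):
        print(tokens, bucket_label(tokens))
```

Under this assumed rule, a 120,000-token history falls in the 100k bucket, and anything at or above 800,000 tokens falls in the open-ended 800k+ bucket.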

Ultra-long validation

The ultra-long set is the narrowest and most extreme public surface. It addresses one question: can SPM still compress aggressively, without a visible collapse in factual retention, once the baseline history becomes operationally extreme?

Cases

Baseline size: 0.8M+ tokens

Reported final prompt: about 1.5k–1.6k tokens

Compression ratio: >500×
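The reported ratio can be checked with simple arithmetic. The sketch below assumes the compression ratio is defined as baseline tokens divided by final prompt tokens, which the page does not state explicitly. It uses 800,000 tokens as the lower bound of the "0.8M+" baseline. The function name `compression_ratio` is hypothetical.

```python
def compression_ratio(baseline_tokens: int, final_prompt_tokens: int) -> float:
    """Ratio of baseline history size to final prompt size.

    Assumed definition for illustration; not confirmed by the source.
    """
    if baseline_tokens <= 0 or final_prompt_tokens <= 0:
        raise ValueError("token counts must be positive")
    return baseline_tokens / final_prompt_tokens


# Lower bound of the "0.8M+" baseline reported above.
BASELINE_TOKENS = 800_000

if __name__ == "__main__":
    # The two ends of the reported "about 1.5k-1.6k" final prompt range.
    for final_tokens in (1_500, 1_600):
        ratio = compression_ratio(BASELINE_TOKENS, final_tokens)
        print(f"{final_tokens} tokens -> {ratio:.0f}x")
```

At exactly 800,000 baseline tokens and a 1,600-token final prompt the ratio is 500, so any baseline above 0.8M tokens, or any shorter final prompt, yields a ratio above 500×, consistent with the reported figure.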

Public takeaway

The long-context story on this site is simple: SPM is described not only through small benchmark cases but also against very large baseline histories, where factual usefulness must survive extreme context length.