NoRag.
RAG without vectors.

Ask your docs anything. Get an answer you can trace — full sections, exact citations. No vector DB, no embedding API, no recurring cost.

Why not RAG.

Vectors are opaque, chunks are arbitrary, and you never stop paying for ingestion. NoRag swaps the whole stack for Markdown the LLM can read directly.

| | RAG | NoRag |
| --- | --- | --- |
| Infrastructure | Vector DB + embedding model | Plain Markdown files |
| Cost of adding a doc | Recurring (re-embed + storage) | One-shot archivist pass |
| Context given to LLM | Arbitrary chunks | Complete sections |
| Auditability | Opaque vectors | Git-diffable Markdown |
| Citations | Approximate | Precise [doc_id, section] |
| Who fixes the index? | Data scientist | Any dev who reads MD |

L1 — two calls, done.

Call 1: a small model reads the question, the document catalog, and the agent catalog. It picks an agent and the relevant sections. Call 2: the chosen agent reads those sections and answers with citations.

Question → Router (SLM) → agent + docs → Answer (LLM) with [doc_id, section] citations

Call 1 returns a routing decision like this:
{
  "agent_id": "juriste_conformite",
  "documents": [
    { "doc_id": "contrat_acme", "sections": ["art_7", "annexe_A"] }
  ],
  "reasoning": "Contract retention question → juriste + SLA clauses"
}
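
Tying the two calls together in code: the sketch below is a minimal illustration, assuming an OpenAI-compatible client. Model names, prompt wording, and the `load_sections` / `system_prompt_for` helpers are hypothetical, not NoRag's actual implementation.

```python
import json
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def l1_answer(question: str) -> str:
    index = Path("data/index.md").read_text()
    agents = Path("data/index_system_prompt.md").read_text()

    # Call 1 (SLM router): pick one agent and the relevant sections.
    route = json.loads(client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any small model that emits JSON works
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": f"Pick one agent and the relevant document sections.\n\n{agents}\n\n{index}"},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content)

    # Load only the sections the router selected; load_sections() is hypothetical.
    context = load_sections(route["documents"])

    # Call 2 (LLM agent): answer from those sections, citing [doc_id, section].
    return client.chat.completions.create(
        model="gpt-4o",  # illustrative
        messages=[
            {"role": "system", "content": system_prompt_for(route["agent_id"])},  # hypothetical
            {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
        ],
    ).choices[0].message.content
```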

Multi_L — parallel, then synthesized.

A Planner fans out N L1 layers — different agents, sub-questions, or corpora. The Aggregator names contradictions and writes the synthesis.

Planner (SLM) emits N layer plans:
- Layer 1: agent juriste_conformite → L1 → answer + citations
- Layer 2: agent analyste_technique → L1 → answer + citations
- Layer 3: agent analyste_finance → L1 → answer + citations
Aggregator (LLM): synthesis · all citations preserved · contradictions named
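
The fan-out itself is plain concurrency. A minimal sketch, reusing the hypothetical `l1_answer` from above; the plan shape and the `aggregate` call are assumptions, not the real Planner/Aggregator interface.

```python
import asyncio

async def run_layer(plan: dict) -> str:
    # Each layer is an independent L1 run with its own agent / sub-question / scope.
    return await asyncio.to_thread(l1_answer, plan["question"])

async def multi_l(question: str, plans: list[dict]) -> str:
    # Fan out N layers in parallel, then hand everything to the Aggregator.
    answers = await asyncio.gather(*(run_layer(p) for p in plans))
    return aggregate(question, answers)  # aggregate(): hypothetical Aggregator LLM call
```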

Four presets. Same engine.

Configure Multi_L for your use case by picking a preset — or let the Planner decide automatically.

A. Multi-Agent

Same question, different agents. Crossed perspectives in one response.

Layer 1: juriste_conformite
Layer 2: analyste_technique
Layer 3: analyste_finance
B. Decomposition

Split the question into sub-questions routed independently.

L1: "AWS cloud strategy 2020-2024"
L2: "Azure cloud strategy 2020-2024"
C. Multi-Corpus

Same question, different agents, different document scopes.

L1: juriste, scope=contrats
L2: analyste_technique, scope=doc_technique
D. Hybrid / Auto

Planner freely combines agents, sub-questions, and index scopes.

Let the Planner decide.
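
For concreteness, a Decomposition (preset B) plan emitted by the Planner might look like the JSON below; the field names mirror the router output above but are illustrative, not the actual schema.

```json
{
  "preset": "decomposition",
  "layers": [
    { "agent_id": "analyste_technique", "question": "AWS cloud strategy 2020-2024" },
    { "agent_id": "analyste_technique", "question": "Azure cloud strategy 2020-2024" }
  ]
}
```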

Under the hood.

Two Markdown files. That’s the entire “database”. Git-diffable, human-readable, zero infra.

data/index.md
## contrat_saas_acme
- **Title**: SaaS Contract — Acme Technologies
- **Summary**: B2B SaaS agreement covering SLA, data retention, and security.
- **Sections**:
  - `art_7` — Data retention — keywords: retention, GDPR, purge, 90 days
  - `annexe_A` — SLA and availability — keywords: SLA, uptime, 99.9%, credit
data/index_system_prompt.md
## juriste_conformite
**Description**: B2B legal expert (contracts, GDPR, SLA).
**When to use**: clauses, retention, DPA, SLA.
**System prompt**:
> You are a senior legal counsel. You always cite [doc_id, section].
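
The "one-shot archivist pass" from the comparison table is just one more LLM call that appends an entry to data/index.md. A sketch under the same assumptions as above (OpenAI-compatible client, illustrative prompt and model):

```python
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def archive(doc_id: str, doc_text: str) -> None:
    # One LLM call summarizes the new document as a catalog entry.
    entry = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": (
                "Write a data/index.md entry for this document: a '## <doc_id>' "
                "header, a title, a summary, and one line per section with its "
                "section id and keywords."
            )},
            {"role": "user", "content": f"doc_id: {doc_id}\n\n{doc_text}"},
        ],
    ).choices[0].message.content
    # Appending to the Markdown index is the entire ingestion step;
    # nothing is re-embedded when other docs change.
    with Path("data/index.md").open("a", encoding="utf-8") as f:
        f.write("\n" + entry + "\n")
```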

Get started.

API

Full L1 + Multi_L via FastAPI. Any client, any language.

uvicorn api.main:app --reload
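
Once the server is up, any HTTP client works. The route and payload below are assumptions for illustration; check the interactive docs FastAPI serves at /docs for the real schema.

```python
import requests

# Hypothetical endpoint and payload; the actual routes live in api/main.py.
resp = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What is the data retention period in the Acme contract?"},
)
print(resp.json())  # expected: an answer plus [doc_id, section] citations
```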
Web chat

Copy a plugin prompt into ChatGPT, Claude, Gemini, or Grok. L1 only.

norag/plugins/<provider>.md
Claude Code skill

Use /norag directly in your terminal. L1 + Multi_L, reads local files.

/norag <question>