Docs

Get a memory.

Neruva Memory exposes a Pinecone-compatible REST API at https://api.neruva.io/v1. If you can call Pinecone, you can call us.

Authenticate

Issue an API key from the dashboard. Send it with every request as either an Api-Key header or a bearer token.

curl https://api.neruva.io/v1/health \
  -H "Api-Key: nv_..."

Create an index

POST /v1/indexes
{
  "name": "agent-memory",
  "dimension": 1024,
  "metric": "cosine",
  "spec": {
    "serverless": {"cloud": "gcp", "region": "us-central1"}
  }
}

Upsert vectors

Submit float vectors. They are normalized, 1-bit-encoded, and written to an append-only WAL. The index updates asynchronously; new vectors are queryable within milliseconds.

POST /v1/indexes/agent-memory/vectors/upsert
{
  "namespace": "agent_42",
  "vectors": [
    {
      "id": "mem_001",
      "values": [0.1, -0.3, ...],
      "metadata": {"role": "assistant", "ts": 1715533200}
    }
  ]
}

Query

POST /v1/indexes/agent-memory/query
{
  "namespace": "agent_42",
  "vector": [0.1, -0.3, ...],
  "topK": 8,
  "includeMetadata": true,
  "filter": {
    "role": {"$eq": "assistant"},
    "ts":   {"$gte": 1715000000}
  }
}

Supported operators: $eq, $ne, $in, $nin, $gt, $gte, $lt, $lte.
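
Operators compose per field. For example, set membership plus a half-open time range (a sketch; the field names follow the query example above):

"filter": {
  "role": {"$in": ["assistant", "tool"]},
  "ts":   {"$gte": 1715000000, "$lt": 1716000000}
}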

Drop-in Pinecone client

# Existing Pinecone code:
from pinecone import Pinecone
pc = Pinecone(api_key="pcsk_...")

# Switch to Neruva (zero changes below this line):
from neruva import Pinecone
pc = Pinecone(api_key="nv_...")

index = pc.Index("agent-memory")
index.upsert([("mem-1", vec, {"agent": "coder"})])
index.query(vector=vec, top_k=8)

HD substrate -- operations on memory

The substrate reasons.

Every endpoint above operates on vectors by similarity. The endpoints below operate on the vectors' algebra. Triples bind, queries unbind, analogies parallelogram, interventions substitute, plans minimize Expected Free Energy -- all in the substrate, none of them touching an LLM.

All HD endpoints accept the same Api-Key header. JSON in, JSON out. Sub-millisecond per call.
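
A minimal calling convention, sketched with the requests library (the helper name hd_post is illustrative, not part of any SDK):

import requests

API_BASE = "https://api.neruva.io/v1"
API_KEY = "nv_..."   # issued from the dashboard

def hd_post(path, body):
    # POST a JSON body to an HD endpoint and return the parsed JSON reply.
    resp = requests.post(API_BASE + path, json=body, headers={"Api-Key": API_KEY})
    resp.raise_for_status()
    return resp.json()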

Knowledge graphs

Bind (subject, relation, object) triples into a single ~32KB vector per relation shard. Query by (subject, relation) -- unbind returns the most likely object with a calibrated cosine-based confidence. Thousands of facts per shard. No materialized triple table.

POST /v1/hd/kg/people/facts
{
  "facts": [
    {"subject": "alice", "relation": "lives_in", "object": "toronto"},
    {"subject": "bob",   "relation": "lives_in", "object": "vancouver"},
    {"subject": "alice", "relation": "works_at", "object": "acme"}
  ]
}
-> {"added": 3, "relations": 2}

POST /v1/hd/kg/people/query
{"subject": "alice", "relation": "lives_in"}
-> {"object": "toronto", "confidence": 0.71}

GET    /v1/hd/kg/people/stats
DELETE /v1/hd/kg/people
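
The same round trip from Python, sketched with the requests library against the endpoints above (the shard name and facts are the ones from the example; this is not an official SDK):

import requests

BASE = "https://api.neruva.io/v1"
HEADERS = {"Api-Key": "nv_..."}

# Bind three triples into the "people" knowledge graph.
requests.post(f"{BASE}/hd/kg/people/facts", headers=HEADERS, json={
    "facts": [
        {"subject": "alice", "relation": "lives_in", "object": "toronto"},
        {"subject": "bob",   "relation": "lives_in", "object": "vancouver"},
        {"subject": "alice", "relation": "works_at", "object": "acme"},
    ],
}).raise_for_status()

# Unbind: where does alice live?
answer = requests.post(f"{BASE}/hd/kg/people/query", headers=HEADERS, json={
    "subject": "alice", "relation": "lives_in",
}).json()
print(answer["object"], answer["confidence"])   # e.g. toronto 0.71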

Analogy by algebra

Parallelogram completion: A:B::C:?. The substrate computes the answer D = C xor (A xor B) over factored binary items. Stateless -- the codebook is deterministic in (n_feat, seed).

POST /v1/hd/analogy
{"n_feat": 6, "a": 0, "b": 1, "c": 2, "seed": 4301}
-> {
     "candidate": 3,
     "candidate_bits": [1,1,0,0,0,0],
     "cosine": 0.999,
     "runner_up": 0.83,
     "ambiguity": 0.83,
     "confidence": 0.17
   }
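
To make the algebra concrete, here is a self-contained toy in Python/NumPy: item indices are factored into n_feat bits, each bit position becomes a (position xor value) binding, and completion is a plain XOR. It mirrors the idea above, not Neruva's exact internal encoding:

import numpy as np

# Toy parallelogram completion over factored binary items.
n_feat, dim, seed = 6, 4096, 4301
rng = np.random.default_rng(seed)
pos = rng.integers(0, 2, size=(n_feat, dim), dtype=np.uint8)   # position codes
val = rng.integers(0, 2, size=(2, dim), dtype=np.uint8)        # bit-value codes

def bits(i):
    return [(i >> f) & 1 for f in range(n_feat)]

def encode(i):
    # XOR together one (position xor value) binding per feature bit.
    v = np.zeros(dim, dtype=np.uint8)
    for f, b in enumerate(bits(i)):
        v ^= pos[f] ^ val[b]
    return v

a, b, c = 0, 1, 2
d_vec = encode(c) ^ encode(a) ^ encode(b)          # D = C xor (A xor B)

# Decode by nearest item under Hamming similarity.
items = np.array([encode(i) for i in range(2 ** n_feat)])
sims = 1 - np.mean(items != d_vec, axis=1)
best = int(np.argmax(sims))
print(best, bits(best))                            # -> 3, [1, 1, 0, 0, 0, 0]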

Causal do-operator

Upload worlds (rows of categorical variables). Then query either observation (conditional probability) or intervention (Pearl's do-operator -- forced assignment that cuts the confounder path). Same logged data, two arithmetically distinct queries.

POST /v1/hd/causal/scm1/worlds
{
  "n_vars": 3,
  "vocab_per_var": [2, 2, 2],
  "worlds": [[0,1,1], [1,1,0], ...],   # rows of int category indices
  "seed": 4401
}

# What did we observe? P(Y=1 | X=1)
POST /v1/hd/causal/scm1/query
{
  "query_type": "observation",
  "condition_var": 1, "condition_value": 1,
  "query_var": 2,     "query_value": 1
}

# What WOULD happen if we forced X=1? P(Y=1 | do(X=1))
POST /v1/hd/causal/scm1/query
{"query_type": "intervention", ...}

DELETE /v1/hd/causal/scm1
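
Asked from Python with the requests library, the two queries differ only in query_type (a sketch; it assumes the intervention body takes the same condition/query fields as the observation body above):

import requests

BASE = "https://api.neruva.io/v1"
HEADERS = {"Api-Key": "nv_..."}

def ask(query_type):
    # Same logged worlds in "scm1", same condition, different query semantics.
    return requests.post(f"{BASE}/hd/causal/scm1/query", headers=HEADERS, json={
        "query_type": query_type,
        "condition_var": 1, "condition_value": 1,   # X = 1
        "query_var": 2,     "query_value": 1,       # Y = 1
    }).json()

observed = ask("observation")    # P(Y=1 | X=1): correlation, confounders included
forced   = ask("intervention")   # P(Y=1 | do(X=1)): confounder path cut
print(observed, forced)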

Active-Inference planning

Expected Free Energy planner over a discrete action space. Provide vocabulary size, action count, initial state attrs, goal attrs, plan depth, and candidate count -- get the best-scoring action sequence plus a KL-to-goal that doubles as a confidence signal. No prompt, no LLM, no temperature.

POST /v1/hd/plan
{
  "V": 100,
  "n_actions": 8,
  "init_state": [0, 1, 2],
  "goal_attrs": [10, 11, 12],
  "depth": 4,
  "n_candidates": 20,
  "seed": 6001
}
-> {
     "best_plan": [6, 2, 6, 0],
     "kl_divergence": 33.5,
     "confidence": 0.029
   }
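
From Python, the response can be gated on its confidence before the plan is acted on (a sketch with the requests library; the threshold is illustrative, not a recommendation):

import requests

BASE = "https://api.neruva.io/v1"
HEADERS = {"Api-Key": "nv_..."}

plan = requests.post(f"{BASE}/hd/plan", headers=HEADERS, json={
    "V": 100, "n_actions": 8,
    "init_state": [0, 1, 2], "goal_attrs": [10, 11, 12],
    "depth": 4, "n_candidates": 20, "seed": 6001,
}).json()

print(plan["best_plan"])                  # e.g. [6, 2, 6, 0]
if plan["confidence"] < 0.1:              # illustrative threshold
    print("weak plan: raise depth or n_candidates and re-plan")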

Endpoint reference

Method   Path                                        Purpose
GET      /v1/health                                  Liveness
POST     /v1/indexes                                 Create index
GET      /v1/indexes                                 List indexes
GET      /v1/indexes/{name}                          Describe
DELETE   /v1/indexes/{name}                          Delete index
POST     /v1/indexes/{name}/vectors/upsert           Write vectors
POST     /v1/indexes/{name}/query                    Top-K query
POST     /v1/indexes/{name}/vectors/delete           Delete by id / filter
GET      /v1/indexes/{name}/vectors/fetch             Fetch by IDs
POST     /v1/indexes/{name}/vectors/update           Patch metadata
GET      /v1/indexes/{name}/describe_index_stats     Per-namespace counts
POST     /v1/hd/kg/{name}/facts                      HD KG -- bind triples
POST     /v1/hd/kg/{name}/query                      HD KG -- unbind (s,r) -> (o, conf)
GET      /v1/hd/kg/{name}/stats                      HD KG -- shard stats
DELETE   /v1/hd/kg/{name}                            HD KG -- drop
POST     /v1/hd/analogy                              HD parallelogram analogy
POST     /v1/hd/causal/{name}/worlds                 HD causal -- add SCM worlds
POST     /v1/hd/causal/{name}/query                  HD causal -- observe vs intervene
DELETE   /v1/hd/causal/{name}                        HD causal -- drop
POST     /v1/hd/plan                                 HD planner -- EFE over action space