How do I avoid ingesting the same content twice?

Use sourceRef to make conversation ingest idempotent — re-ingesting the same sourceRef is a no-op.

My worker retries failed ingests. How do I prevent the same conversation from being stored multiple times?

Pass a sourceRef — a stable, unique string identifying the source record. If you ingest with the same sourceRef twice, the second call is a no-op: the job returns successfully but no duplicate content is written.

How to use it

curl -X POST https://brain.unisonlabs.ai/v1/brain/ingest \
  -H "Authorization: Bearer $UNISON_TOKEN" \
  -H 'content-type: application/json' \
  -d '{
    "items": [
      {
        "type": "conversation",
        "turns": [...],
        "sourceRef": "chat-session-7f3a9b2c",
        "visibility": "private"
      }
    ]
  }'

The sourceRef can be any stable string: a database row ID, a session UUID, a content hash, or a webhook event ID.
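For a worker that retries, the key point is that every attempt sends the same sourceRef. The Python sketch below illustrates this under some assumptions: the function names, the chat-session- naming scheme, the retry count, and the backoff are all hypothetical, and the shape of turns is left to the caller. The endpoint, headers, and body fields are taken from the curl example above.

```python
import json
import time
import urllib.request

INGEST_URL = "https://brain.unisonlabs.ai/v1/brain/ingest"


def build_payload(session_id, turns):
    """Build the ingest body; sourceRef depends only on the session ID."""
    return {
        "items": [
            {
                "type": "conversation",
                "turns": turns,
                # Hypothetical naming scheme: any stable, unique string works.
                "sourceRef": f"chat-session-{session_id}",
                "visibility": "private",
            }
        ]
    }


def post_json(url, body, token):
    """POST a JSON body with a bearer token and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "content-type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def ingest_with_retry(session_id, turns, token, send=post_json, attempts=3, delay=1.0):
    """Retry the ingest; every attempt carries the same sourceRef."""
    body = build_payload(session_id, turns)
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send(INGEST_URL, body, token)
        except OSError as error:
            # URLError, HTTPError, and socket timeouts all subclass OSError.
            # A real worker would likely skip retries for 4xx responses.
            last_error = error
            if attempt < attempts:
                time.sleep(delay * attempt)  # simple linear backoff
    raise last_error
```

Because sourceRef is derived only from the session ID, a retry that follows a timed-out request which actually succeeded on the server should be a no-op rather than a duplicate write.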

What makes a good sourceRef

  • Stable: The same logical record always produces the same sourceRef.
  • Unique: Two different records never share a sourceRef.
  • Opaque to the brain: Unison doesn't parse or index it — it's purely for deduplication.
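If records come from more than one system, their native IDs may collide (row 42 in one table and row 42 in another). One possible way to keep sourceRef values unique is to prefix each ID with a label for its origin. The helper below is a hypothetical Python sketch; the label:id format is an illustrative convention, not something Unison requires, since the brain treats the value as opaque.

```python
def make_source_ref(system, record_id):
    """Combine a source-system label and a record ID into one sourceRef."""
    if not system or ":" in system:
        # Reject labels that would make the prefix ambiguous.
        raise ValueError("system must be a non-empty label without ':'")
    return f"{system}:{record_id}"


# The same record always maps to the same string (stable),
# and the same ID from two systems maps to different strings (unique).
crm_ref = make_source_ref("crm", 42)
helpdesk_ref = make_source_ref("helpdesk", 42)
```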

A content hash (sha256(rawText)) is a reliable choice when you don't have a natural record ID, provided the raw text is byte-for-byte identical on every retry.
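As one way to derive such a hash, here is a minimal Python sketch using the standard library's hashlib. The function name source_ref_from_text is hypothetical, and encoding the text as UTF-8 before hashing is an assumption; whichever encoding you pick, keep it fixed so that retries hash the same bytes.

```python
import hashlib


def source_ref_from_text(raw_text):
    """Return the hex SHA-256 digest of the text, for use as a sourceRef."""
    return hashlib.sha256(raw_text.encode("utf-8")).hexdigest()


# Hashing the same text twice yields the same sourceRef.
first = source_ref_from_text("user: hello\nassistant: hi there")
second = source_ref_from_text("user: hello\nassistant: hi there")
```

Note that two distinct records with identical text would hash to the same sourceRef. If that can happen in your data, a natural record ID is likely the safer choice.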

Direct page writes are naturally idempotent

If you're writing documents with PUT /v1/brain/doc (path in the body), idempotency is built in: writing to the same path twice overwrites the first write, and the brain tracks versions. You don't need sourceRef for document writes.
