Hindsight: My Self-Hosted Knowledge Graph for Personal Knowledge Management

What is Hindsight?

Hindsight is an open-source knowledge graph from Vectorize.io that functions as a personal knowledge management system. At its core, Hindsight stores information as linked facts in a graph, enriches them with embeddings, and makes them searchable via an LLM-powered API. You can think of it as a “Second Brain” – a system that absorbs knowledge, links it, and makes it retrievable when needed.

What particularly appealed to me about Hindsight: it can be run completely self-hosted, includes an MCP tool (sync_retain) for integration with AI coding agents, and the data remains entirely under your own control.

My Setup

My Hindsight deployment runs on a Hetzner server and is completely orchestrated via Docker Compose. The architecture consists of four containers:

Architecture Overview

┌─────────────────────────────────────────────────┐
│                   Internet                       │
└──────────────────────┬──────────────────────────┘
                       │
              ┌────────▼────────┐
              │  Traefik v2.11  │
              │  (Reverse Proxy)│
              │  Let's Encrypt  │
              └───┬─────────┬───┘
                  │         │
     ┌────────────▼──┐  ┌──▼────────────┐
     │  Hindsight    │  │    Auth        │
     │  v0.5.0       │  │  (ForwardAuth) │
     │  Port 9999/   │  └───────────────┘
     │       8888    │
     └───────┬───────┘
             │
     ┌───────▼───────┐
     │  PostgreSQL   │
     │  pgvector:pg15│
     │  (ext. Volume)│
     └───────────────┘

The Components in Detail

Component       Details
Hindsight       ghcr.io/vectorize-io/hindsight:0.5.0 (pinned image tag)
Database        pgvector/pgvector:pg15 with external volume pgdata
Reverse Proxy   Traefik v2.11 with automatic Let’s Encrypt TLS certificates
Authentication  Basic Auth for the control plane (port 9999), Bearer Token ForwardAuth for the API (port 8888)

LLM & Retrieval Configuration

For the LLM connection, I use Gemini 2.5 Flash – a good compromise between speed and quality. The embeddings and reranker run completely locally, which reduces dependency on external APIs and increases data sovereignty.

The Graph Retriever is set to link_expansion. As of version 0.5.0 this is the only available option, because BFS and MPFP were removed. In practice, link expansion gives good results when traversing linked facts.

Update Experience: v0.4.22 → v0.5.0

On April 9, 2026, I performed the update from v0.4.22 to v0.5.0. My update process looks like this:

  1. Adjust Image Tag – from :latest to the pinned tag ghcr.io/vectorize-io/hindsight:0.5.0
  2. Pull new image
  3. Create database backup – as a precaution (in my case, a 1.4 MB backup_pre_v0.5.0_.sql.gz)
  4. Copy docker-compose.yml to the server
  5. Restart only the Hindsight container
  6. Check logs
  7. Perform health check via the API

The result: No errors. The database migration ran automatically upon container start, and the health check was green immediately. Total downtime was about 30 seconds.
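
The Docker-related steps above could be scripted roughly as follows. This is a minimal sketch, not a record of my exact commands: the Compose service names hindsight and db and the database user and name hindsight are assumptions for illustration, and it presumes the edited docker-compose.yml is already on the server. The function only builds the commands so they can be reviewed before anything is executed.

```python
import shlex


def update_commands(service="hindsight", db_service="db",
                    db_user="hindsight", db_name="hindsight"):
    """Build the docker compose commands for an in-place update.

    All default names are placeholders, not values from a real deployment.
    """
    compose = ["docker", "compose"]
    return [
        # Pull the image referenced by the pinned tag in docker-compose.yml.
        compose + ["pull", service],
        # Dump the database; -T disables the pseudo-TTY so the caller
        # can redirect the output into a backup file.
        compose + ["exec", "-T", db_service, "pg_dump", "-U", db_user, db_name],
        # Recreate only the Hindsight container, leaving its
        # dependencies (database, proxy) untouched.
        compose + ["up", "-d", "--no-deps", service],
        # Show the most recent log lines to check the migration.
        compose + ["logs", "--tail", "50", service],
    ]


# Print the plan as copy-pasteable shell lines instead of running it.
for command in update_commands():
    print(shlex.join(command))
```

Keeping the plan as data makes it easy to log or dry-run an update before touching the running containers.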

What v0.5.0 Brings

The new version adds several features:

  • 3-Phase Retain Pipeline – improved processing when saving new facts
  • Constellation View – a new visualization of the knowledge graph
  • Bank Template Import/Export – templates for recurring knowledge structures
  • sync_retain MCP Tool – integration with AI coding agents like Pi
  • OpenRouter Support – additional LLM provider options
  • Google Embeddings/Reranker Support – native support for Google models

At the same time, some things were cleaned up: the BFS and MPFP graph retrieval methods were removed, leaving Link Expansion as the only remaining method and therefore the default, and the Hindsight Hermes integration was dropped.

Tips for Your Own Setup

A few lessons from my experience:

Pin Image Tags

I recommend always pinning the image tag in docker-compose.yml instead of using :latest. This gives you full control over updates and allows you to easily roll back if there are issues.
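
To catch unpinned images before a deployment, a small check like the following could help. It is a sketch under simplifying assumptions: it scans the file line by line for image: entries rather than parsing YAML properly, and the sample Compose text, including the untagged traefik entry, is hypothetical and not my actual configuration.

```python
import re

# Matches lines such as "    image: name:tag", with optional quotes.
IMAGE_LINE = re.compile(r"^\s*image:\s*['\"]?([^'\"\s#]+)")


def unpinned_images(compose_text):
    """Return image references that have no tag or use :latest."""
    flagged = []
    for line in compose_text.splitlines():
        match = IMAGE_LINE.match(line)
        if not match:
            continue
        image = match.group(1)
        # Digest references (name@sha256:...) are already pinned.
        if "@" in image:
            continue
        # Only a colon after the last slash marks a tag; an earlier one
        # belongs to a registry port such as localhost:5000/app.
        name = image.rsplit("/", 1)[-1]
        tag = name.partition(":")[2]
        if tag in ("", "latest"):
            flagged.append(image)
    return flagged


# Hypothetical sample for illustration only.
sample = """
services:
  hindsight:
    image: ghcr.io/vectorize-io/hindsight:0.5.0
  db:
    image: pgvector/pgvector:pg15
  proxy:
    image: traefik
"""
print(unpinned_images(sample))  # → ['traefik']
```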

Backups Before Updates

A pg_dump before every update takes only seconds and can save a lot of trouble in an emergency. In my case, the backup is stored directly in the deployment directory.
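
A compressed dump can be produced by streaming the output of pg_dump straight into a gzip file. The helper below is a sketch of that idea; the Compose service name db, the database user and name hindsight, and the output filename are assumptions that need to be adapted to your own setup.

```python
import gzip
import shutil
import subprocess


def dump_to_gzip(command, target_path):
    """Run command and stream its stdout into a gzip file at target_path."""
    with subprocess.Popen(command, stdout=subprocess.PIPE) as process:
        with gzip.open(target_path, "wb") as archive:
            shutil.copyfileobj(process.stdout, archive)
    # Leaving the Popen context waits for the process, so the exit
    # code is available here. A failed dump may leave a partial file.
    if process.returncode != 0:
        raise RuntimeError(f"dump failed with exit code {process.returncode}")
    return target_path


# Hypothetical service, user, and database names.
PG_DUMP = ["docker", "compose", "exec", "-T", "db",
           "pg_dump", "-U", "hindsight", "hindsight"]

# Example call (not executed here):
# dump_to_gzip(PG_DUMP, "backup_pre_update.sql.gz")
```

Checking the exit code matters because a gzip file is created even when pg_dump fails, and an empty archive is easy to mistake for a valid backup.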

Secure Authentication

The two-stage auth concept with Basic Auth for the admin interface and Bearer Token ForwardAuth for the API has proven effective. The control plane should definitely not be accessible without protection.
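
From the client side, a call through the Bearer Token ForwardAuth layer might look like the sketch below. The hostname, the token, and the /health path are placeholders I made up for illustration; check the Hindsight API documentation for the real endpoint paths.

```python
import urllib.request


def build_api_request(base_url, token, path="/health"):
    """Create a GET request that carries the bearer token for ForwardAuth."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder host and token, not real values.
request = build_api_request("https://hindsight.example.com", "my-secret-token")
print(request.full_url)
print(request.get_header("Authorization"))

# To actually send the request:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

A request without the header should be rejected by the proxy before it ever reaches the Hindsight container, which is a quick way to verify that the ForwardAuth middleware is wired up.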

Local Embeddings

If you prefer to keep your data under your own control, you should run embeddings and reranker locally. The performance is perfectly sufficient on a dedicated server.

Conclusion

Hindsight has established itself for me as a solid tool for personal knowledge management. Self-hosting via Docker Compose is straightforward, updates run cleanly, and the integration with AI tools via MCP makes the system practical for everyday use. If you’re looking for an alternative to commercial “Second Brain” solutions and want to host your data yourself, you should definitely take a look at Hindsight.

