
Entity Extraction, Observation Clusters, Multi-View Embeddings

TL;DR

This release introduces three foundational Neural Memory capabilities. Entity Extraction automatically identifies engineers, projects, endpoints, and services mentioned in your development activity using both pattern matching and LLM-powered semantic extraction. Observation Clusters groups related events by topic using embedding similarity, entity overlap, and actor involvement. Multi-View Embeddings generates three specialized vectors per observation (title, content, summary) optimized for different query types, improving search relevance across broad and specific searches.



Entity Extraction

Neural Memory now automatically identifies and tracks meaningful references in your development activity. The hybrid extraction pipeline combines fast regex patterns with LLM-powered semantic extraction to capture entities that would otherwise be missed.

What's included:

  • Seven entity categories: engineers (@mentions), projects (#issues, ENG-123), API endpoints, environment variables, file paths, external services, and generic references

  • Dual extraction paths: Regex patterns run inline during capture (0.70-0.95 confidence); LLM extraction runs async for content >200 characters

  • Automatic deduplication: Entities are tracked by workspace with occurrence counts and "last seen" timestamps

  • Search integration: Entity mentions boost search results via the four-path retrieval system

Example entities extracted:

```text
@sarah-dev       → engineer   (0.90 confidence)
#authentication  → project    (0.95 confidence)
POST /api/users  → endpoint   (0.95 confidence)
DATABASE_URL     → config     (0.85 confidence)
src/lib/auth.ts  → definition (0.80 confidence)
```
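
For a sense of how the inline path works, here is a minimal sketch of regex-based extraction covering five of the seven categories. The pattern set, helper names (`INLINE_PATTERNS`, `extractInline`), and exact expressions are illustrative assumptions, not the shipped pipeline; the async LLM path for longer content is not shown.

```typescript
// Illustrative sketch of the inline regex path. Pattern details are
// assumptions; the real pipeline also covers file paths, external
// services, and generic references, plus an async LLM path for
// content over 200 characters.
interface ExtractedEntity {
  value: string;
  category: "engineer" | "project" | "endpoint" | "config" | "definition";
  confidence: number;
}

const INLINE_PATTERNS: Array<{
  regex: RegExp;
  category: ExtractedEntity["category"];
  confidence: number;
}> = [
  { regex: /@[a-z0-9-]+/gi, category: "engineer", confidence: 0.9 },
  { regex: /#[a-z0-9-]+|\b[A-Z]+-\d+\b/g, category: "project", confidence: 0.95 },
  // Endpoint detection requires an HTTP verb prefix (see Limitations)
  { regex: /\b(?:GET|POST|PUT|PATCH|DELETE)\s+\/[\w\/-]+/g, category: "endpoint", confidence: 0.95 },
  { regex: /\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b/g, category: "config", confidence: 0.85 },
  { regex: /\b[\w.\/-]+\.(?:ts|js|py|go|rs)\b/g, category: "definition", confidence: 0.8 },
];

function extractInline(text: string): ExtractedEntity[] {
  const entities: ExtractedEntity[] = [];
  for (const { regex, category, confidence } of INLINE_PATTERNS) {
    for (const match of text.matchAll(regex)) {
      entities.push({ value: match[0], category, confidence });
    }
  }
  // Apply the 0.65 confidence threshold (a no-op for these fixed scores)
  return entities.filter((e) => e.confidence >= 0.65);
}
```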

Limitations:

  • LLM extraction requires content >200 characters

  • Confidence threshold of 0.65 filters low-confidence extractions

  • Patterns optimized for English text

  • API endpoint detection requires HTTP verb prefix (GET, POST, etc.)


Observation Clusters

Related development events are now automatically grouped into topic clusters. Each observation is assigned to the most semantically similar cluster, or seeds a new topic group if no good match exists.

What's included:

  • Four-signal affinity scoring: Embedding similarity (40pts), entity overlap (30pts), actor overlap (20pts), temporal proximity (10pts)

  • 60-point threshold: Observations scoring 60+ join an existing cluster; anything below seeds a new one

  • Cluster context in search: Topic labels and keywords are returned as context in search results

  • Automatic tracking: Primary entities, actors, observation counts, and temporal bounds

Affinity calculation:

```typescript
// Maximum score: 100 points
const affinity =
  embeddingSimilarity * 40 +  // Semantic relatedness
  entityOverlap * 30 +        // Shared @mentions, #issues
  actorMatch * 20 +           // Same contributor
  temporalProximity * 10;     // Recent activity (decays over 10 hours)
```
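
Putting the signals together, a minimal sketch of the scorer and threshold check might look like the following. It assumes each signal is pre-normalized to [0, 1]; the linear shape of the 10-hour decay and all names here are illustrative assumptions, not the shipped code.

```typescript
// Illustrative affinity scorer. Assumes each signal is pre-normalized
// to [0, 1]; signal computation and the decay shape are assumptions.
interface AffinitySignals {
  embeddingSimilarity: number;    // cosine similarity of embeddings
  entityOverlap: number;          // overlap of extracted entities
  actorMatch: number;             // 1 if same contributor, else 0
  hoursSinceLastActivity: number; // age of the cluster's latest event
}

const CLUSTER_THRESHOLD = 60; // 60+ joins an existing cluster

function affinityScore(s: AffinitySignals): number {
  // Temporal proximity decays to zero over 10 hours
  const temporalProximity = Math.max(0, 1 - s.hoursSinceLastActivity / 10);
  return (
    s.embeddingSimilarity * 40 +
    s.entityOverlap * 30 +
    s.actorMatch * 20 +
    temporalProximity * 10
  );
}

function shouldJoinCluster(s: AffinitySignals): boolean {
  return affinityScore(s) >= CLUSTER_THRESHOLD;
}
```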

Current status:

Observation Clusters is in beta. Cluster assignment and search context are fully operational. LLM-generated cluster summaries are not yet available—observations are grouped but the summary generation pipeline requires a schema migration (Phase 5) to link observations to their assigned clusters.

Why we built it this way: We chose a multi-signal approach over pure embedding similarity because development context matters. A PR from the same author about the same feature should cluster together even if the semantic content differs slightly.


Multi-View Embeddings

Every observation now generates three specialized embedding vectors, each optimized for different query types. This improves search relevance by matching the right content perspective to your search intent.

The three views:

| View | Text | Purpose |
| --- | --- | --- |
| Title | Event headline (≤120 chars) | Broad topic discovery |
| Content | Full body text | Detailed, specific queries |
| Summary | Title + first 1000 chars | Balanced retrieval |
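
Constructing the three view texts is straightforward; the following sketch mirrors the table above (the `buildViews` helper and return shape are illustrative assumptions).

```typescript
// Illustrative construction of the three view texts; the buildViews
// helper and return shape are assumptions based on the table above.
interface ObservationViews {
  title: string;   // event headline, capped at 120 chars
  content: string; // full body text
  summary: string; // title plus the first 1000 chars of the body
}

function buildViews(title: string, body: string): ObservationViews {
  const headline = title.slice(0, 120);
  return {
    title: headline,
    content: body,
    summary: `${headline}\n${body.slice(0, 1000)}`,
  };
}
```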

What's included:

  • Cohere embed-english-v3.0: 1024-dimensional vectors with input type optimization

  • Batch generation: All three embeddings are generated in a single API call (see the sketch after this list)

  • Smart deduplication: Search queries all views; results deduplicated by max score

  • Cluster assignment: Uses content embedding for best semantic matching
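
The batched call might look like the following sketch, which uses the public cohere-ai TypeScript client; exact response handling may differ in practice, and `embedViews` is an illustrative name.

```typescript
import { CohereClient } from "cohere-ai";

// Sketch of the batched embedding call using the public cohere-ai
// TypeScript client; error handling omitted.
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

async function embedViews(
  title: string,
  content: string,
  summary: string,
): Promise<number[][]> {
  const response = await cohere.embed({
    model: "embed-english-v3.0",
    // All three views go out in a single API call
    texts: [title, content, summary],
    // "search_document" optimizes stored vectors for later retrieval;
    // query-time embeddings would use "search_query" instead
    inputType: "search_document",
  });
  // Three 1024-dimensional vectors, in the same order as `texts`
  return response.embeddings as number[][];
}
```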

Search behavior:

```typescript
// All views are queried in parallel
const results = await pinecone.query({
  vector: queryEmbedding,
  filter: { layer: "observations" },
  topK: 50,
});

// Deduplicate by observation, keeping the max score:
// if the title view matches at 0.85 and the content view at 0.72,
// the observation appears once with score 0.85.
```
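
A minimal sketch of that max-score deduplication step follows; the match shape and `observationId` field are assumptions about how per-view results are keyed.

```typescript
// Illustrative max-score dedup: collapse per-view matches to one entry
// per observation, keeping the highest score.
interface ViewMatch {
  observationId: string;
  view: "title" | "content" | "summary";
  score: number;
}

function dedupeByMaxScore(matches: ViewMatch[]): ViewMatch[] {
  const best = new Map<string, ViewMatch>();
  for (const match of matches) {
    const current = best.get(match.observationId);
    if (!current || match.score > current.score) {
      best.set(match.observationId, match);
    }
  }
  // One entry per observation, highest score first
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```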

Limitations:

  • Cohere provider only (no OpenAI or custom models)

  • English language only

  • Fixed 1024 dimensions (no dimension reduction for cost optimization)

  • 3x vector storage per observation

