Privacy & Security

How Lightfast ensures data isolation, access control, and compliance through tenant isolation and audit trails

Lightfast is built with privacy and security as core principles. Your team's data stays isolated, access controls are enforced, and all activity is auditable.

Tenant Isolation

Every workspace operates in complete isolation from other organizations. Your data never mixes with content from other teams.

Complete Workspace Separation

Database isolation:

Each workspace has its own tables in the database layer
Queries are scoped by workspace ID at the database level
No cross-workspace joins or queries possible
Row-level security enforces isolation
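
As a rough illustration of workspace-scoped queries, the sketch below uses an in-memory SQLite database. The `documents` table, its columns, and the `fetch_documents` helper are assumptions made for this example; they are not Lightfast's actual schema or API.

```python
import sqlite3


def fetch_documents(conn: sqlite3.Connection, workspace_id: str) -> list[str]:
    """Return document titles for one workspace only (hypothetical schema)."""
    # The workspace ID is a bound parameter on every query, so no code path
    # reads rows without a workspace filter.
    rows = conn.execute(
        "SELECT title FROM documents WHERE workspace_id = ? ORDER BY title",
        (workspace_id,),
    ).fetchall()
    return [title for (title,) in rows]


# Demo data: two workspaces sharing one illustrative table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (workspace_id TEXT NOT NULL, title TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("ws_acme", "auth.md"), ("ws_acme", "deploy.md"), ("ws_other", "secrets.md")],
)
```

A query for `ws_acme` returns only that workspace's rows, and an unknown workspace ID returns an empty list rather than falling back to unscoped data.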

Namespaced embeddings:

Vector indexes are partitioned by workspace ID in Pinecone
Separate namespaces prevent cross-contamination
Queries never search across workspace boundaries
Index isolation at the infrastructure level

Isolated caches:

Redis caches are prefixed by workspace ID
Cache keys include workspace identifier
No shared cache entries between workspaces
Cache eviction is workspace-scoped
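
One way to picture workspace-prefixed cache keys is the in-memory stand-in below. The `WorkspaceCache` class and the `ws:<workspace_id>:<key>` layout are hypothetical, chosen only to show how a prefix keeps entries and eviction scoped to one workspace.

```python
from typing import Optional


class WorkspaceCache:
    """In-memory stand-in for a cache whose keys are prefixed by workspace ID."""

    def __init__(self) -> None:
        self._store: dict = {}

    @staticmethod
    def _key(workspace_id: str, key: str) -> str:
        # Hypothetical key layout: "ws:<workspace_id>:<key>".
        # Assumes workspace IDs never contain ":".
        return f"ws:{workspace_id}:{key}"

    def set(self, workspace_id: str, key: str, value: str) -> None:
        self._store[self._key(workspace_id, key)] = value

    def get(self, workspace_id: str, key: str) -> Optional[str]:
        return self._store.get(self._key(workspace_id, key))

    def evict_workspace(self, workspace_id: str) -> int:
        """Remove every entry for one workspace and return how many were removed."""
        prefix = self._key(workspace_id, "")
        doomed = [k for k in self._store if k.startswith(prefix)]
        for k in doomed:
            del self._store[k]
        return len(doomed)
```

Because the prefix ends with a separator, evicting `ws_1` cannot touch entries that belong to `ws_10`.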

Independent calibration:

Ranking weights are tuned per workspace
Quality metrics measured separately
Feedback loops operate in isolation
No cross-workspace learning or model sharing

Why This Matters

Security:

Vulnerabilities affect one workspace, not all
Breaches cannot leak data across organizations
Attack surface limited to single tenant

Privacy:

Your usage patterns never influence other software teams
Query data stays completely private
No cross-organization analytics or reporting

Compliance:

Meet data residency requirements per workspace
Separate retention policies per organization
Audit trail isolation for compliance reviews

Performance:

No "noisy neighbor" problems
Resource allocation per workspace
Predictable query performance

Zero trust architecture. Even internal Lightfast systems authenticate and authorize every workspace operation. There is no implicit trust between components: every request is validated.

Access Control

Lightfast respects the access controls of your connected repositories and tools. If you can't see it in GitHub, you can't find it in Lightfast.

Permission Inheritance

GitHub permissions:

Only content you can view in GitHub is searchable for you
Private repos require appropriate access
Organization membership determines visibility
Team-based access controls are honored

Per-user filtering:

Search results are filtered based on individual permissions
No user sees content they lack access to
Permissions checked at query time, not index time
Real-time updates when access changes

Dynamic access checks:

If permissions change (a repo is made private, or access is revoked), the affected content drops out of results as soon as the change syncs, typically within minutes
No stale results from previously accessible content
Webhooks and sync jobs keep permissions current

Software team visibility:

Software team members only see content from repositories they have access to
Workspace admins see aggregated metrics, not restricted content
Invitation-based access to workspace features

Permission Example

Alice has access to repo-a and repo-b. Bob only has access to repo-a.

Scenario: Both search for "authentication"

Alice sees results from:

repo-a/docs/auth.md
repo-a/src/auth/
repo-b/auth-service/
PR discussions from both repos

Bob sees results from:

repo-a/docs/auth.md
repo-a/src/auth/
(No results from repo-b)

Alice and Bob are in the same workspace, but see different results based on their individual GitHub permissions.
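
The scenario above can be sketched as a per-user filter. The `ACCESS` map, `INDEX` list, and `visible_hits` function are illustrative names for this example only; in practice the access data would come from GitHub rather than a hard-coded dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    repo: str
    path: str


# Hypothetical data mirroring the Alice and Bob example.
ACCESS = {"alice": {"repo-a", "repo-b"}, "bob": {"repo-a"}}
INDEX = [
    Hit("repo-a", "docs/auth.md"),
    Hit("repo-a", "src/auth/"),
    Hit("repo-b", "auth-service/"),
]


def visible_hits(user: str, hits: list[Hit]) -> list[Hit]:
    """Keep only hits from repositories the user can access."""
    allowed = ACCESS.get(user, set())  # unknown users see nothing
    return [hit for hit in hits if hit.repo in allowed]
```

Calling `visible_hits("bob", INDEX)` drops the `repo-b` entry, while the same call for Alice keeps all three.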

Permission Enforcement

At indexing:

Content is tagged with required permissions
Access control lists stored with each document
Organization and team membership recorded

At query time:

User's GitHub token validated
Permissions fetched from GitHub API
Results filtered before ranking
Only accessible content enters result set

On permission changes:

GitHub webhooks notify Lightfast of access changes
Content re-indexed or removed as needed
User sessions invalidated when access revoked
Results update within minutes
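
A minimal sketch of how an access change could flow through to results, assuming a simplified event shape. The `AccessIndex` class, the `{"action", "user", "repo"}` event dictionary, and the `search` function are hypothetical and do not reflect the real GitHub webhook payload format.

```python
class AccessIndex:
    """Tracks which repositories each user may read (in-memory stand-in)."""

    def __init__(self) -> None:
        self._grants: dict = {}

    def can_read(self, user: str, repo: str) -> bool:
        return repo in self._grants.get(user, set())

    def apply(self, event: dict) -> None:
        """Apply a hypothetical event:
        {"action": "granted" | "revoked", "user": ..., "repo": ...}."""
        repos = self._grants.setdefault(event["user"], set())
        if event["action"] == "granted":
            repos.add(event["repo"])
        elif event["action"] == "revoked":
            repos.discard(event["repo"])
        else:
            raise ValueError(f"unknown action: {event['action']!r}")


def search(index: AccessIndex, user: str, scored_hits: list) -> list:
    """Filter by current access first, then rank the survivors by score."""
    allowed = [(repo, score) for repo, score in scored_hits if index.can_read(user, repo)]
    return [repo for repo, _ in sorted(allowed, key=lambda h: h[1], reverse=True)]
```

Filtering before ranking means a revoked repository never enters the candidate set, so it cannot surface even if it would have scored highest.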

Data Security

Encryption

At rest:

Database encrypted with AES-256
S3 document storage encrypted
Pinecone indexes encrypted
Redis cache encrypted at rest (and in transit)

In transit:

All API traffic uses TLS 1.3
Internal service communication encrypted
Database connections use SSL/TLS
No plaintext data transmission

Authentication

API authentication:

API keys with workspace-scoped access
Keys can be rotated without downtime
Rate limiting per key to prevent abuse
Automatic key expiration for security
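
Rotation without downtime is commonly done by letting the old and new keys overlap for a short grace period. The sketch below shows that pattern under stated assumptions: the `ApiKey` record, `is_valid`, `rotate`, and the grace window are hypothetical, and the source does not say how Lightfast implements rotation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ApiKey:
    key_id: str
    workspace_id: str
    expires_at: datetime


def is_valid(key: ApiKey, workspace_id: str, now: datetime) -> bool:
    """A key works only for its own workspace and only until it expires."""
    return key.workspace_id == workspace_id and now < key.expires_at


def rotate(old: ApiKey, new_key_id: str, now: datetime,
           lifetime: timedelta, grace: timedelta) -> ApiKey:
    """Issue a replacement key and cap the old key's life at a grace window."""
    # Both keys are valid during the grace window, so clients can switch
    # over without a gap in service.
    old.expires_at = min(old.expires_at, now + grace)
    return ApiKey(new_key_id, old.workspace_id, now + lifetime)
```

During the grace window both keys authenticate; after it, only the new key does.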

User authentication:

OAuth via GitHub (or other providers)
No passwords stored in Lightfast systems
Multi-factor authentication supported
Session tokens with configurable TTL

Service authentication:

Mutual TLS for internal services
Service accounts with minimal permissions
Credential rotation enforced
Audit logs for all service access

Network Security

Infrastructure:

VPC isolation for production services
Private subnets for databases
Security groups restrict traffic
WAF protects API endpoints

DDoS protection:

Cloudflare or similar CDN
Rate limiting at multiple layers
Auto-scaling handles traffic spikes
Graceful degradation under load

Audit & Compliance

Full query audit trail enables compliance and security reviews.

Query Logs

What's logged:

Every search, retrieval, and answer request
User ID and workspace ID
Timestamp and query parameters
Results returned and citations shown
API key used (if applicable)
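
To make the logged fields concrete, here is one possible shape for a query log record, serialized as a JSON line for export. The `QueryLogEntry` class and every field name are assumptions for illustration, not the actual Lightfast log schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class QueryLogEntry:
    user_id: str
    workspace_id: str
    timestamp: str        # ISO 8601, UTC
    request_type: str     # "search", "retrieval", or "answer"
    query_params: dict
    result_ids: list
    citation_ids: list
    api_key_id: Optional[str] = None  # absent for session-based requests


def to_json_line(entry: QueryLogEntry) -> str:
    """Serialize one entry as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)
```

One JSON object per line is a format many log pipelines accept, which is why it is used in this sketch.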

Retention:

Configurable per workspace (30-365 days)
Longer retention available for enterprise
Automatic archival to cold storage
Deletion after retention period
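
The retention rules above can be sketched as a bounds check plus an expiry test. The function names are hypothetical, and the 30 to 365 day range is taken from the list above (enterprise limits may differ).

```python
from datetime import datetime, timedelta

MIN_RETENTION_DAYS = 30   # lower bound stated above
MAX_RETENTION_DAYS = 365  # upper bound stated above


def validate_retention(days: int) -> int:
    """Reject retention settings outside the configurable range."""
    if not MIN_RETENTION_DAYS <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention must be {MIN_RETENTION_DAYS}-{MAX_RETENTION_DAYS} days, got {days}"
        )
    return days


def is_past_retention(logged_at: datetime, retention_days: int, now: datetime) -> bool:
    """True once a log entry is older than the workspace's retention period."""
    return now - logged_at > timedelta(days=validate_retention(retention_days))
```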

Access:

Workspace admins can download logs
API for programmatic access
Filtering and search capabilities
Export to SIEM tools

Access Logs

What's tracked:

Which users accessed which documents
Timestamp and access method
Source IP and user agent
Results clicked and citations followed

Use cases:

Security investigations
Compliance audits
Usage analytics
Anomaly detection

Compliance Features

SOC 2 Type II:

Annual audits by independent firms
Evidence collection automated
Security controls documented
Continuous monitoring

GDPR compliance:

Data deletion on request
Export user data in portable format
Consent management for analytics
Privacy-by-design architecture

Data residency:

Workspace data stays in specified region
No cross-region replication without consent
Regional deployment options (US, EU)

Retention policies:

Configure document retention per workspace
Automatic deletion after configured period
Exceptions for legal hold
Audit trail of deletions

SOC 2 Type II compliance. Lightfast undergoes regular security audits and maintains SOC 2 Type II certification. Enterprise customers can request compliance documentation and security questionnaires.

Data Handling

What We Store

Metadata:

Document titles, URLs, timestamps
User IDs, workspace IDs
Repository names, file paths
Relationships and graph edges

Content:

Document bodies in S3
Vector embeddings in Pinecone
Chunks and summaries
Observations and highlights

Analytics:

Query logs and access logs
Usage metrics and performance data
Calibration weights and thresholds
Quality evaluation results

What We Don't Store

Credentials:

No passwords or API keys from source systems
GitHub OAuth tokens are ephemeral
No long-term storage of secrets

Sensitive data (unless explicitly indexed):

Environment variables
Configuration secrets
API keys in code (if .gitignored)
Personally identifiable information (filtered)

Data Retention

Active data:

Documents: Until removed from source or manually deleted
Embeddings: Until document deleted or re-indexed
Metadata: Lifetime of workspace

Logs and analytics:

Query logs: 90 days default (configurable)
Access logs: 90 days default (configurable)
Metrics: 1 year rolling window

Deleted data:

Soft delete with 30-day recovery window
Permanent deletion after recovery period
Embeddings purged immediately
Logs retained per retention policy
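
The soft-delete lifecycle for documents can be summarized as a small state function. This is a sketch only: `deletion_state` and its state names are hypothetical, and it models document recovery, not the immediate purge of embeddings.

```python
from datetime import datetime, timedelta
from typing import Optional

RECOVERY_WINDOW = timedelta(days=30)  # recovery window stated above


def deletion_state(deleted_at: Optional[datetime], now: datetime) -> str:
    """Classify a document as 'active', 'recoverable', or 'purged'."""
    if deleted_at is None:
        return "active"
    if now - deleted_at < RECOVERY_WINDOW:
        return "recoverable"
    return "purged"
```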

Data Residency

Choose where your workspace data is stored:

US region (default):

Pinecone: us-east-1
PlanetScale: AWS us-east-1
S3: us-east-1
Redis: us-east-1

EU region (enterprise):

Pinecone: eu-west-1
PlanetScale: AWS eu-west-1
S3: eu-west-1
Redis: eu-west-1

Multi-region (enterprise):

High availability across regions
Active-active or active-passive
Latency-based routing
Regional failover
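
The US and EU mappings above could be represented as a lookup table like the one below. The `REGION_MAP` structure and `storage_regions` helper are illustrative; only the region names come from this page.

```python
# Region per backing service, as listed above for each workspace region.
REGION_MAP = {
    "us": {
        "pinecone": "us-east-1",
        "planetscale": "us-east-1",
        "s3": "us-east-1",
        "redis": "us-east-1",
    },
    "eu": {
        "pinecone": "eu-west-1",
        "planetscale": "eu-west-1",
        "s3": "eu-west-1",
        "redis": "eu-west-1",
    },
}


def storage_regions(workspace_region: str = "us") -> dict[str, str]:
    """Return the region for each service; 'us' is the default."""
    try:
        return dict(REGION_MAP[workspace_region])  # copy so callers cannot mutate the table
    except KeyError:
        raise ValueError(f"unsupported region: {workspace_region!r}") from None
```

Keeping every service for a workspace in one region is what makes a residency guarantee checkable: a single lookup shows where each data store lives.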

Security Best Practices

For Workspace Administrators

Access management:

Review workspace members regularly
Remove former employees promptly
Use GitHub organization for SSO
Enforce MFA for all software team members

API key hygiene:

Rotate keys quarterly
Use separate keys per environment (dev/staging/prod)
Revoke unused keys
Monitor key usage in audit logs

Compliance:

Review audit logs monthly
Configure retention to meet requirements
Document data handling in privacy policy
Test incident response procedures

For Developers

Secure integration:

Store API keys in environment variables, not code
Use HTTPS for all API requests
Implement proper error handling
Respect rate limits

Data handling:

Don't log query results containing sensitive data
Sanitize user inputs before querying
Implement client-side rate limiting
Cache responses appropriately (with TTL)
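
Two of the practices above, reading the key from the environment and caching with a TTL, might look like this on the client side. The `LIGHTFAST_API_KEY` variable name, `load_api_key`, and `TTLCache` are assumptions for this sketch and not part of any official SDK.

```python
import os
import time
from typing import Callable, Mapping, Optional


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = env.get("LIGHTFAST_API_KEY")  # hypothetical variable name
    if not key:
        raise RuntimeError("LIGHTFAST_API_KEY is not set")
    return key


class TTLCache:
    """Tiny response cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict = {}

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop stale entries on read
            return None
        return value
```

A short TTL matters here because cached results reflect the permissions in force when they were fetched; a long-lived cache could keep showing content after access is revoked.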

Incident Response

Lightfast has documented procedures for security incidents:

Detection: Automated monitoring and alerts
Triage: Security team assesses severity
Containment: Isolate affected workspaces
Investigation: Determine root cause and impact
Remediation: Fix vulnerability and restore service
Communication: Notify affected customers
Post-mortem: Document lessons learned

Affected customers are notified within 24 hours of a confirmed data breach.

Security Reporting

Report vulnerabilities:

Email: security@lightfast.ai
PGP key available on request
Bug bounty program for critical findings

Response SLA:

Critical: 4 hours
High: 24 hours
Medium: 7 days
Low: 30 days

Next Steps

Authentication — How to authenticate API requests
Search & Retrieval — How permissions affect search
Architecture — Security architecture details
API Reference — Security headers and best practices
