langchain-cockroachdb¶
LangChain integration for CockroachDB with native vector support
Overview¶
This package provides LangChain abstractions backed by CockroachDB, leveraging CockroachDB's native VECTOR type and C-SPANN (CockroachDB SPANN) indexes for vector search at scale in a distributed, horizontally scalable database.
Key Features¶
🎯 Vector Store¶
- Native Vector Support: Uses CockroachDB's native VECTOR type (not pgvector)
- C-SPANN Indexes: CockroachDB's vector index algorithm optimized for distributed systems
- Advanced Filtering: Rich metadata filtering with $and/$or/$gt/$in operators
- Hybrid Search: Combine full-text search (TSVECTOR) with vector similarity
- Query-Time Tuning: Adjust beam size for accuracy/speed tradeoff
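To make the filtering operators listed above concrete, here is a minimal, self-contained sketch of how a nested $and/$or/$gt/$in filter could be evaluated against one document's metadata. It is not the library's implementation (a real store would presumably translate the filter into SQL), and the `matches` helper, the field names, and the example filter are all hypothetical. How a filter is passed to the vector store is not shown on this page, so that part is deliberately left out.

```python
from typing import Any, Mapping


def matches(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if `metadata` satisfies the filter `flt`.

    Hypothetical helper for illustration only; it supports just the four
    operators named in the feature list: $and, $or, $gt, $in.
    """
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter must match.
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            # At least one sub-filter must match.
            if not any(matches(metadata, sub) for sub in cond):
                return False
        else:
            # `key` is a metadata field; `cond` maps operators to operands.
            value = metadata.get(key)
            for op, operand in cond.items():
                if op == "$gt":
                    if value is None or not value > operand:
                        return False
                elif op == "$in":
                    if value not in operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
    return True


# Hypothetical filter: published after 2020 AND (from docs/blog OR rated above 4).
example_filter = {
    "$and": [
        {"year": {"$gt": 2020}},
        {"$or": [
            {"source": {"$in": ["docs", "blog"]}},
            {"rating": {"$gt": 4}},
        ]},
    ]
}
```

For example, `matches({"year": 2023, "source": "docs"}, example_filter)` evaluates to True, while a document with `year` 2019 is rejected by the first clause regardless of its source.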
🏗️ Reliability Features¶
- Automatic Retry Logic: Handles 40001 serialization errors with exponential backoff
- Isolation Level Support: Works with both SERIALIZABLE (default, recommended) and READ COMMITTED
- Multi-Tenancy: Namespace-based isolation with C-SPANN prefix columns
- Connection Pooling: Configurable connection pools with health checks
- Horizontal Scalability: Designed for distributed deployments
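The retry behavior described above can be sketched generically: run a transaction, and if it fails with SQLSTATE 40001, wait and try again with an exponentially growing delay. The package handles this automatically, so the following is only an illustration of the pattern, not its actual code. `SerializationFailure`, `run_with_retry`, and all parameter names and defaults are hypothetical; adding jitter to the delay is likewise an assumption.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""

    sqlstate = "40001"


def run_with_retry(
    txn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `txn`, retrying on SQLSTATE 40001 with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return txn()
        except Exception as exc:
            retryable = getattr(exc, "sqlstate", None) == "40001"
            if not retryable or attempt == max_retries:
                # Non-retryable error, or retries exhausted: propagate.
                raise
            # Delay doubles each attempt, capped at max_delay, with jitter
            # so concurrent clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the backoff can be observed or stubbed out in tests; application code would normally leave it at its default.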
💬 Chat History¶
- Persistent Storage: Store conversation history in CockroachDB
- Session Management: Organize by session/thread ID
- LangChain Integration: Drop-in replacement for other chat history implementations
🧠 LangGraph Checkpointer¶
- Short-Term Memory: Persist agent state across conversation turns
- Human-in-the-Loop: Interrupt and resume workflows with durable state
- Fault Tolerance: Recover from process restarts without losing progress
- Sync & Async: Both CockroachDBSaver and AsyncCockroachDBSaver
🔄 Async & Sync APIs¶
- Async-First: High-performance async operations for I/O concurrency
- Sync Wrapper: Simple synchronous API for scripts and batch jobs
- Connection Pooling: Efficient connection reuse across async operations
Quick Example¶
import asyncio

from langchain_cockroachdb import AsyncCockroachDBVectorStore, CockroachDBEngine
from langchain_openai import OpenAIEmbeddings


async def main():
    # Initialize engine
    engine = CockroachDBEngine.from_connection_string(
        "cockroachdb://user:pass@host:26257/db?sslmode=verify-full"
    )

    # Create table
    await engine.ainit_vectorstore_table(
        table_name="documents",
        vector_dimension=1536,
    )

    # Initialize vector store
    vectorstore = AsyncCockroachDBVectorStore(
        engine=engine,
        embeddings=OpenAIEmbeddings(),
        collection_name="documents",
    )

    # Add documents
    await vectorstore.aadd_texts([
        "CockroachDB is a distributed SQL database",
        "LangChain makes it easy to build LLM applications",
    ])

    # Search
    results = await vectorstore.asimilarity_search(
        "Tell me about databases",
        k=2,
    )
    print(results)

    # Release pooled connections
    await engine.aclose()


asyncio.run(main())
Why CockroachDB?¶
- Distributed by Design: Scale horizontally across regions
- Native Vector Support: First-class VECTOR type and C-SPANN indexes
- Strong Consistency: SERIALIZABLE by default, READ COMMITTED also supported
- Cloud Native: Deploy anywhere (AWS, GCP, Azure, on-prem)
- PostgreSQL Compatible: Familiar SQL with distributed superpowers
Getting Started¶
Choose your path:
- Quick Start: Get up and running in 5 minutes
- Guides: Learn key concepts and patterns
- API Reference: Detailed API documentation
- Examples: Working code examples
LangChain Official Integration Docs¶
Community & Support¶
- GitHub: cockroachdb/langchain-cockroachdb
- Issues: Report bugs or request features
- Discussions: Ask questions
Contributing¶
Contributions welcome! This is an open-source project built for the community.
License¶
Apache License 2.0 - see LICENSE for details.
Built with ❤️ for the CockroachDB and LangChain communities