
Configuration

Learn how to configure langchain-cockroachdb for different workloads.

Engine Configuration

Basic Configuration

from langchain_cockroachdb import CockroachDBEngine

engine = CockroachDBEngine.from_connection_string(
    "cockroachdb://user:pass@host:26257/db",
    pool_size=10,                    # Base connections
    max_overflow=20,                 # Additional connections
    pool_pre_ping=True,              # Health check before use
    pool_recycle=3600,               # Recycle after 1 hour
    pool_timeout=30.0,               # Wait 30s for connection
)

Retry Configuration

All operations automatically retry on transient errors (SQLSTATE 40001 serialization failures and connection failures):

engine = CockroachDBEngine.from_connection_string(
    connection_string,
    retry_max_attempts=5,            # Max retries (default: 5)
    retry_initial_backoff=0.1,       # Initial delay in seconds
    retry_max_backoff=10.0,          # Max delay in seconds
    retry_backoff_multiplier=2.0,    # Exponential multiplier
    retry_jitter=True,               # Add randomization
)
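To get a feel for how these parameters interact, the sketch below computes a capped exponential backoff schedule. It is an illustration only: `backoff_delays` is a hypothetical helper, and the jitter formula (scaling each delay by a random factor between 0.5 and 1.0) is an assumption, since the library's internal retry algorithm may differ.

```python
import random


def backoff_delays(max_attempts=5, initial_backoff=0.1, max_backoff=10.0,
                   multiplier=2.0, jitter=True, rng=None):
    """Return the delay (in seconds) before each retry attempt.

    Hypothetical helper: mirrors the parameter names above, but is not
    the library's own implementation.
    """
    rng = rng or random.Random()
    delays = []
    delay = initial_backoff
    for _ in range(max_attempts):
        # Never wait longer than max_backoff, however large the delay grows.
        capped = min(delay, max_backoff)
        # Assumed jitter scheme: scale by a random factor in [0.5, 1.0]
        # so that concurrent clients do not retry in lockstep.
        delays.append(capped * rng.uniform(0.5, 1.0) if jitter else capped)
        delay *= multiplier
    return delays


print(backoff_delays(jitter=False))  # [0.1, 0.2, 0.4, 0.8, 1.6]
```

With the defaults shown above, the cap of 10 seconds is only reached from the eighth attempt onward, so `retry_max_backoff` matters mostly when `retry_max_attempts` is raised.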

Vector Store Configuration

Basic Settings

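from langchain_cockroachdb import AsyncCockroachDBVectorStore
from langchain_cockroachdb import DistanceStrategy  # needed for distance_strategy below; import path assumed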

vectorstore = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="documents",

    # Column names
    content_column="content",
    embedding_column="embedding",
    metadata_column="metadata",
    id_column="id",

    # Distance metric
    distance_strategy=DistanceStrategy.COSINE,  # or L2, IP

    # Batch size for inserts
    batch_size=100,
)

Retry Settings (Per-Batch)

vectorstore = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="documents",
    retry_max_attempts=3,            # Per-batch retries
    retry_initial_backoff=0.1,
    retry_max_backoff=5.0,
)

Configuration Presets

Development

# Optimized for local development
engine = CockroachDBEngine.from_connection_string(
    connection_string,
    pool_size=5,
    max_overflow=10,
    retry_max_attempts=3,
    retry_max_backoff=1.0,
)

vectorstore = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="docs",
    batch_size=100,
)

Web Applications

# High concurrency, aggressive retries
engine = CockroachDBEngine.from_connection_string(
    connection_string,
    pool_size=20,
    max_overflow=40,
    retry_max_attempts=10,
    retry_max_backoff=30.0,
)

vectorstore = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="docs",
    batch_size=100,
    retry_max_attempts=5,
)

Batch Jobs

# Resilient, patient retries
engine = CockroachDBEngine.from_connection_string(
    connection_string,
    pool_size=3,
    max_overflow=5,
    retry_max_attempts=20,
    retry_max_backoff=60.0,
)

vectorstore = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="docs",
    batch_size=500,  # Larger batches for throughput
    retry_max_attempts=10,
)

Multi-Region

# High latency tolerance
engine = CockroachDBEngine.from_connection_string(
    connection_string,
    pool_size=15,
    max_overflow=30,
    retry_max_attempts=15,
    retry_max_backoff=60.0,
    pool_timeout=60.0,  # Longer timeout
)

vectorstore = AsyncCockroachDBVectorStore(
    engine=engine,
    embeddings=embeddings,
    collection_name="docs",
    batch_size=200,  # Moderate batches
    retry_max_attempts=8,
)

Batch Size Guidelines

Choose batch size based on embedding dimensions and workload:

Embedding Size     Recommended Batch Size
< 512 dims         200-500
512-1536 dims      100-200
> 1536 dims        50-100

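CockroachDB generally works best with smaller batches than single-node databases, because large transactions in a distributed cluster are more likely to hit contention and be retried.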
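The guidelines above can be turned into a small helper. This is a minimal sketch: `recommended_batch_size` and `batched` are hypothetical names, and the helper deliberately returns the conservative (lower) end of each recommended range as a starting point for tuning.

```python
from itertools import islice


def recommended_batch_size(dims):
    """Return the lower bound of the recommended range for `dims`.

    Hypothetical helper based on the batch size guidelines; start here
    and increase the value if inserts are stable.
    """
    if dims < 512:
        return 200
    if dims <= 1536:
        return 100
    return 50


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


texts = [f"doc {i}" for i in range(250)]
size = recommended_batch_size(1536)  # e.g. a 1536-dimensional embedding model
print([len(chunk) for chunk in batched(texts, size)])  # [100, 100, 50]
```

The resulting value can be passed as `batch_size` when constructing the vector store.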

Connection Pool Sizing

General guidelines:

Workload Type         pool_size    max_overflow
Development           5            10
Web (low traffic)     10           20
Web (high traffic)    20-50        40-100
Batch jobs            3-5          5-10
Analytics             10-20        20-40

Formula: max_connections = pool_size + max_overflow
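The formula gives the ceiling for a single pool. As a rough sketch, and assuming each worker process creates its own engine (and therefore its own pool), the total number of connections the cluster may see is that ceiling multiplied by the number of processes. `max_connections` below is a hypothetical helper for this arithmetic.

```python
def max_connections(pool_size, max_overflow, processes=1):
    """Upper bound on open connections across `processes` workers.

    Assumes one engine (one pool) per process; this is an assumption
    about the deployment, not something the library enforces.
    """
    return (pool_size + max_overflow) * processes


print(max_connections(20, 40))               # 60
print(max_connections(20, 40, processes=4))  # 240
```

Comparing this number with the connection limit of the cluster helps catch pools that are sized too generously for a multi-worker deployment.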

Monitoring Configuration

Enable SQL Logging

import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

Enable Pool Logging

engine = CockroachDBEngine.from_connection_string(
    connection_string,
    echo=True,        # Log all SQL
    echo_pool=True,   # Log pool events
)

Environment Variables

Store sensitive configuration in environment variables:

export COCKROACHDB_URL="cockroachdb://user:pass@host:26257/db?sslmode=verify-full"
export COCKROACHDB_POOL_SIZE="20"
export COCKROACHDB_MAX_OVERFLOW="40"

import os

engine = CockroachDBEngine.from_connection_string(
    os.environ["COCKROACHDB_URL"],  # Required: raises KeyError if unset
    pool_size=int(os.getenv("COCKROACHDB_POOL_SIZE", "10")),
    max_overflow=int(os.getenv("COCKROACHDB_MAX_OVERFLOW", "20")),
)
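A bare `int(...)` fails with an unhelpful message when a variable is set to something like `"twenty"`. The sketch below shows one way to validate numeric settings up front; `int_from_env` is a hypothetical helper, not part of the library, and the optional `env` argument exists only to make it easy to test.

```python
import os


def int_from_env(name, default, env=None):
    """Read a non-negative integer setting, falling back to `default`.

    Hypothetical helper; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value < 0:
        raise ValueError(f"{name} must not be negative, got {value}")
    return value


settings = {"COCKROACHDB_POOL_SIZE": "20"}
print(int_from_env("COCKROACHDB_POOL_SIZE", 10, env=settings))     # 20
print(int_from_env("COCKROACHDB_MAX_OVERFLOW", 20, env=settings))  # 20
```

The returned values can then be passed as `pool_size` and `max_overflow` as in the example above.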

SSL Configuration

CockroachDB Cloud

connection_string = (
    "cockroachdb://user:pass@cluster.cloud:26257/db"
    "?sslmode=verify-full"
    "&sslrootcert=/path/to/root.crt"
)

Self-Hosted with Custom CA

connection_string = (
    "cockroachdb://user@host:26257/db"
    "?sslmode=verify-full"
    "&sslcert=/path/to/client.crt"
    "&sslkey=/path/to/client.key"
    "&sslrootcert=/path/to/ca.crt"
)

Insecure (Development Only)

connection_string = "cockroachdb://root@localhost:26257/db?sslmode=disable"
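Assembling these URLs by hand is error-prone when a password contains characters such as `@` or `:`. The sketch below builds the same kind of connection string with the standard library; `build_connection_string` is a hypothetical helper, and the parameter names simply mirror the query options used in the examples above.

```python
from urllib.parse import quote, urlencode


def build_connection_string(user, host, database, port=26257,
                            password=None, **ssl_params):
    """Build a cockroachdb:// URL with percent-encoded credentials.

    Hypothetical helper; ssl_params become query options such as
    sslmode or sslrootcert.
    """
    auth = quote(user, safe="")
    if password is not None:
        auth += ":" + quote(password, safe="")
    # Keep "/" readable in certificate paths; escape everything else.
    query = urlencode(ssl_params, safe="/")
    url = f"cockroachdb://{auth}@{host}:{port}/{database}"
    return f"{url}?{query}" if query else url


print(build_connection_string("root", "localhost", "db", sslmode="disable"))
# cockroachdb://root@localhost:26257/db?sslmode=disable
```

For a cloud cluster, the same call would take `password=...`, `sslmode="verify-full"`, and `sslrootcert="/path/to/root.crt"`.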

Next Steps