Core Concepts

Understanding the fundamental concepts behind vector search and RAG (Retrieval Augmented Generation) will help you build more effective AI applications with Vectorize.

🔄 RAG (Retrieval Augmented Generation)

RAG is a technique that enhances large language models by retrieving relevant information from external knowledge sources before generating a response. This approach allows AI systems to provide more accurate, up-to-date, and contextually relevant answers.

How RAG Works:

  1. Retrieval: Search for relevant documents or passages based on the user's query
  2. Augmentation: Combine the retrieved context with the original query
  3. Generation: Use the enhanced prompt to generate a more informed response
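
The three steps above can be sketched as a single function. Here, `searchIndex` and `generateAnswer` are hypothetical stand-ins for your vector store's search call and your LLM client; the prompt template is illustrative, not a Vectorize API.

```javascript
// Sketch of a RAG pipeline: retrieve, augment, generate.
// `searchIndex` and `generateAnswer` are placeholder functions.
async function answerWithRag(query, searchIndex, generateAnswer) {
  // 1. Retrieval: fetch the most relevant passages for the query
  const passages = await searchIndex(query, { topK: 5 });

  // 2. Augmentation: combine the retrieved context with the original query
  const prompt = [
    "Answer the question using only the context below.",
    "Context:",
    ...passages.map((p, i) => `[${i + 1}] ${p.text}`),
    `Question: ${query}`,
  ].join("\n");

  // 3. Generation: produce a response grounded in the retrieved context
  return generateAnswer(prompt);
}
```

Keeping each step behind its own function makes it easy to swap retrieval backends or models without touching the rest of the pipeline.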

Why Use RAG?

  • Provides access to current, domain-specific information
  • Reduces hallucinations and improves accuracy
  • Allows customization without retraining models
  • Enables source attribution and fact-checking

🧮 Vector Embeddings

Vector embeddings are numerical representations of text, images, or other data types in high-dimensional space. Similar content is represented by vectors that are close together in this space, enabling semantic search and similarity matching.

Key Properties:

  • Semantic Meaning: Captures the meaning and context of content
  • Dimensionality: Typically 384 to 1536 dimensions
  • Distance Metrics: Cosine similarity, Euclidean distance, dot product
  • Model Dependent: Different embedding models produce different representations

// Example: text-to-vector embedding (embed() stands in for your
// embedding provider's client call)
const text = "Vectorize helps build AI applications";
const embedding = await embed(text);
// Result: [0.1, -0.3, 0.7, 0.2, ...] (dimensionality depends on the model;
// e.g., 1536 for OpenAI's text-embedding-ada-002)
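
Of the distance metrics listed above, cosine similarity is the most common choice for text embeddings. A minimal implementation over plain number arrays:

```javascript
// Cosine similarity between two embedding vectors of equal length:
// 1 means same direction (maximally similar), 0 means orthogonal
// (unrelated), -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In practice the vector database computes this for you at query time; this sketch just shows what "close together in vector space" means numerically.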

✂️ Chunking Strategies

Chunking is the process of breaking down large documents into smaller, manageable pieces that can be effectively embedded and retrieved. The right chunking strategy is crucial for optimal RAG performance.

Common Chunking Methods:

  • Fixed Size: Split by character or token count (e.g., 512 tokens)
  • Semantic: Split by meaning, preserving context
  • Sentence-based: Split at sentence boundaries
  • Recursive: Try larger chunks first, then break down if needed
  • Document Structure: Split by headers, paragraphs, or sections
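
The simplest of these, fixed-size chunking with overlap, can be sketched in a few lines. This version counts characters for simplicity; production chunkers typically count tokens instead.

```javascript
// Fixed-size chunker: each chunk is up to `chunkSize` characters,
// and consecutive chunks share `overlap` characters of context.
function chunkFixedSize(text, chunkSize, overlap) {
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap means a sentence split across a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated storage.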

Chunking Considerations:

  • Chunk Size: Balance between context and specificity
  • Overlap: Include overlapping content to maintain context
  • Metadata: Preserve document structure and source information
  • Content Type: Different strategies for code, tables, lists

// Example chunking configuration
{
  "strategy": "recursive",
  "chunkSize": 1000,
  "chunkOverlap": 200,
  "separators": ["\n\n", "\n", " ", ""],
  "preserveMetadata": true
}
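
The recursive strategy in the configuration above can be sketched as follows. This is a simplified illustration, not the Vectorize implementation: it tries the coarsest separator first and retries oversized pieces with progressively finer separators, but omits the merge-back step real splitters use to pack small pieces up to the chunk size.

```javascript
// Recursive splitter sketch: split on the first separator; any piece
// still larger than chunkSize is re-split with the remaining separators.
function recursiveSplit(text, chunkSize, separators) {
  if (text.length <= chunkSize) return [text];
  const [sep, ...rest] = separators;
  if (sep === undefined) {
    // No separators left: fall back to a hard fixed-size split
    const out = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      out.push(text.slice(i, i + chunkSize));
    }
    return out;
  }
  const parts = sep === "" ? [...text] : text.split(sep);
  return parts.flatMap((part) =>
    part.length > chunkSize ? recursiveSplit(part, chunkSize, rest) : [part]
  );
}
```

Because paragraph breaks are tried before word breaks, semantically coherent units survive intact whenever they fit.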

Advanced Concepts

Embedding Model Selection

Different embedding models excel in different domains. Vectorize automatically evaluates models to find the best fit for your data:

  • OpenAI text-embedding-ada-002: General purpose, good performance
  • Voyage AI: Specialized for retrieval and search tasks
  • Sentence Transformers: Open source, domain-specific options

Vector Database Integration

Vectorize supports multiple vector databases, each with unique strengths:

  • Pinecone: Managed, high-performance, easy scaling
  • Weaviate: Open source, GraphQL interface, hybrid search
  • Qdrant: High performance, advanced filtering
  • Milvus: Open source, enterprise features

Next Steps

Now that you understand the core concepts, you're ready to: