Core Concepts
Understanding the fundamental concepts behind vector search and RAG (Retrieval Augmented Generation) will help you build more effective AI applications with Vectorize.
RAG (Retrieval Augmented Generation)
RAG is a technique that enhances large language models by retrieving relevant information from external knowledge sources before generating a response. This approach allows AI systems to provide more accurate, up-to-date, and contextually relevant answers.
How RAG Works:
- Retrieval: Search for relevant documents or passages based on the user's query
- Augmentation: Combine the retrieved context with the original query
- Generation: Use the enhanced prompt to generate a more informed response
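The three steps above can be sketched end to end. This is an illustrative outline only, not Vectorize's API: retrieval is stubbed with naive word overlap in place of real vector search, and the final prompt would be passed to an LLM for the generation step.

```javascript
// Minimal RAG flow over a tiny in-memory corpus.
const corpus = [
  "Vectorize evaluates embedding models against your data.",
  "RAG retrieves context before the model generates an answer.",
  "Chunk overlap helps preserve context across chunk boundaries.",
];

// Retrieval: rank documents by overlap with the query's words
// (a stand-in for similarity search over embeddings).
function retrieve(query, k = 2) {
  const qWords = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  return corpus
    .map((doc) => ({
      doc,
      score: doc.toLowerCase().split(/\W+/).filter((w) => qWords.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((r) => r.doc);
}

// Augmentation: combine retrieved context with the original query.
function augment(query, contextDocs) {
  return `Answer using only this context:\n${contextDocs.join("\n")}\n\nQuestion: ${query}`;
}

const query = "How does RAG use context?";
const prompt = augment(query, retrieve(query));
console.log(prompt); // the enhanced prompt handed to the LLM for generation
```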
Why Use RAG?
- Provides access to current, domain-specific information
- Reduces hallucinations and improves accuracy
- Allows customization without retraining models
- Enables source attribution and fact-checking
Vector Embeddings
Vector embeddings are numerical representations of text, images, or other data types in high-dimensional space. Similar content is represented by vectors that are close together in this space, enabling semantic search and similarity matching.
Key Properties:
- Semantic Meaning: Captures the meaning and context of content
- Dimensionality: Typically 384 to 1536 dimensions
- Distance Metrics: Cosine similarity, Euclidean distance, dot product
- Model Dependent: Different embedding models produce different representations
// Example: Text to vector embedding
// (embed() stands in for a call to your chosen embedding model)
const text = "Vectorize helps build AI applications";
const embedding = await embed(text);
// Result: [0.1, -0.3, 0.7, 0.2, ...] (1536 dimensions)
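"Close together" is made precise by one of the distance metrics listed above. A minimal sketch of cosine similarity over plain number arrays (no particular embedding model assumed):

```javascript
// Cosine similarity: the dot product of two vectors divided by the
// product of their magnitudes. 1 means same direction, 0 means
// orthogonal (unrelated), -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([0.1, -0.3, 0.7], [0.1, -0.3, 0.7])); // ~1 (identical)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```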
Similarity Search
Similarity search finds the most relevant content by comparing vector embeddings. Instead of exact keyword matching, it finds semantically similar content even when different words are used.
Search Types:
- k-NN Search: Find the k nearest neighbors to a query vector
- Range Search: Find all vectors within a distance threshold
- Filtered Search: Combine similarity with metadata filters
- Hybrid Search: Combine vector search with keyword search
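The first and third search types can be sketched together: a brute-force k-NN over an in-memory list, with an optional metadata filter applied before ranking. This is a toy illustration, not how a production vector database executes queries; dot product is used as the similarity score, which assumes the vectors are normalized.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

// k-NN with an optional metadata filter: drop non-matching items,
// score the rest against the query vector, return the top k.
function knn(queryVec, items, k, filter = () => true) {
  return items
    .filter((item) => filter(item.metadata))
    .map((item) => ({ ...item, score: dot(queryVec, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const items = [
  { id: "a", vector: [0.9, 0.1], metadata: { lang: "en" } },
  { id: "b", vector: [0.1, 0.9], metadata: { lang: "en" } },
  { id: "c", vector: [0.8, 0.2], metadata: { lang: "de" } },
];

// Nearest English-language neighbor to [1, 0]:
const top = knn([1, 0], items, 1, (m) => m.lang === "en");
console.log(top[0].id); // "a"
```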
Example Search Results
Query: "machine learning algorithms"
- Score: 0.95 - "Introduction to ML classification methods"
- Score: 0.89 - "Deep learning and neural networks"
- Score: 0.82 - "Statistical models for prediction"
Chunking Strategies
Chunking is the process of breaking down large documents into smaller, manageable pieces that can be effectively embedded and retrieved. The right chunking strategy is crucial for optimal RAG performance.
Common Chunking Methods:
- Fixed Size: Split by character or token count (e.g., 512 tokens)
- Semantic: Split by meaning, preserving context
- Sentence-based: Split at sentence boundaries
- Recursive: Try larger chunks first, then break down if needed
- Document Structure: Split by headers, paragraphs, or sections
Chunking Considerations:
- Chunk Size: Balance between context and specificity
- Overlap: Include overlapping content to maintain context
- Metadata: Preserve document structure and source information
- Content Type: Different strategies for code, tables, lists
// Example chunking configuration
{
  "strategy": "recursive",
  "chunkSize": 1000,
  "chunkOverlap": 200,
  "separators": ["\n\n", "\n", " ", ""],
  "preserveMetadata": true
}
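A simplified version of the recursive strategy behind that configuration can be sketched as follows. This is illustrative only (not Vectorize's actual splitter), and it omits chunk overlap for brevity: split on the coarsest separator first, and re-split any piece still longer than the chunk size with the next, finer separator.

```javascript
// Recursive splitting sketch: try separators from coarse ("\n\n") to
// fine (""); pieces longer than chunkSize are re-split recursively.
function recursiveSplit(text, chunkSize, separators) {
  if (text.length <= chunkSize || separators.length === 0) return [text];
  const [sep, ...rest] = separators;
  const pieces = sep === "" ? text.split("") : text.split(sep);
  const chunks = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current ? current + sep + piece : piece;
    if (candidate.length <= chunkSize) {
      // Piece still fits: keep accumulating into the current chunk.
      current = candidate;
    } else {
      if (current) chunks.push(current);
      current = "";
      if (piece.length > chunkSize) {
        // Piece alone is too big: recurse with the finer separators.
        chunks.push(...recursiveSplit(piece, chunkSize, rest));
      } else {
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const doc = "First paragraph.\n\nSecond paragraph that is a bit longer.";
console.log(recursiveSplit(doc, 30, ["\n\n", "\n", " ", ""]));
```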
Advanced Concepts
Embedding Model Selection
Different embedding models excel in different domains. Vectorize automatically evaluates models to find the best fit for your data:
- OpenAI text-embedding-ada-002: General purpose, good performance
- Voyage AI: Specialized for retrieval and search tasks
- Sentence Transformers: Open source, domain-specific options
Vector Database Integration
Vectorize supports multiple vector databases, each with unique strengths:
- Pinecone: Managed, high-performance, easy scaling
- Weaviate: Open source, GraphQL interface, hybrid search
- Qdrant: High performance, advanced filtering
- Milvus: Open source, enterprise features
Next Steps
Now that you understand the core concepts, you're ready to:
- Learn how to implement these concepts in practice
- Build & Deploy your first RAG pipeline
- Explore real-world use cases