Core Concepts
Understanding the fundamental concepts behind vector search and RAG (Retrieval Augmented Generation) will help you build more effective AI applications with Vectorize.
RAG (Retrieval Augmented Generation)
RAG is a technique that enhances large language models by retrieving relevant information from external knowledge sources before generating a response. This approach allows AI systems to provide more accurate, up-to-date, and contextually relevant answers.
How RAG Works:
- Retrieval: Search for relevant documents or passages based on the user's query
- Augmentation: Combine the retrieved context with the original query
- Generation: Use the enhanced prompt to generate a more informed response
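The three steps above can be sketched end to end. This is an illustrative outline only, not Vectorize's API: retrieval is stubbed with naive word overlap in place of real vector search, and the final prompt would be passed to an LLM for the generation step.

```javascript
// Minimal RAG flow over a tiny in-memory corpus.
const corpus = [
  "Vectorize evaluates embedding models against your data.",
  "RAG retrieves context before the model generates an answer.",
  "Chunk overlap helps preserve context across chunk boundaries.",
];

// Retrieval: rank documents by overlap with the query's words
// (a stand-in for similarity search over embeddings).
function retrieve(query, k = 2) {
  const qWords = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  return corpus
    .map((doc) => ({
      doc,
      score: doc.toLowerCase().split(/\W+/).filter((w) => qWords.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((r) => r.doc);
}

// Augmentation: combine retrieved context with the original query.
function augment(query, contextDocs) {
  return `Answer using only this context:\n${contextDocs.join("\n")}\n\nQuestion: ${query}`;
}

const query = "How does RAG use context?";
const prompt = augment(query, retrieve(query));
console.log(prompt); // the enhanced prompt handed to the LLM for generation
```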
Why Use RAG?
- Provides access to current, domain-specific information
- Reduces hallucinations and improves accuracy
- Allows customization without retraining models
- Enables source attribution and fact-checking
Vector Embeddings
Vector embeddings are numerical representations of text, images, or other data types in high-dimensional space. Similar content is represented by vectors that are close together in this space, enabling semantic search and similarity matching.
Key Properties:
- Semantic Meaning: Captures the meaning and context of content
- Dimensionality: Typically 384 to 1536 dimensions
- Distance Metrics: Cosine similarity, Euclidean distance, dot product
- Model Dependent: Different embedding models produce different representations
// Example: Text to vector embedding
// (embed() stands in for a call to your chosen embedding model)
const text = "Vectorize helps build AI applications";
const embedding = await embed(text);
// Result: [0.1, -0.3, 0.7, 0.2, ...] (1536 dimensions)
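"Close together" is made precise by one of the distance metrics listed above. A minimal sketch of cosine similarity over plain number arrays (no particular embedding model assumed):

```javascript
// Cosine similarity: the dot product of two vectors divided by the
// product of their magnitudes. 1 means same direction, 0 means
// orthogonal (unrelated), -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([0.1, -0.3, 0.7], [0.1, -0.3, 0.7])); // ~1 (identical)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```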
Similarity Search
Similarity search finds the most relevant content by comparing vector embeddings. Instead of exact keyword matching, it finds semantically similar content even when different words are used.
Search Types:
- k-NN Search: Find the k nearest neighbors to a query vector
- Range Search: Find all vectors within a distance threshold
- Filtered Search: Combine similarity with metadata filters
- Hybrid Search: Combine vector search with keyword search
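The first and third search types can be sketched together: a brute-force k-NN over an in-memory list, with an optional metadata filter applied before ranking. This is a toy illustration, not how a production vector database executes queries; dot product is used as the similarity score, which assumes the vectors are normalized.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

// k-NN with an optional metadata filter: drop non-matching items,
// score the rest against the query vector, return the top k.
function knn(queryVec, items, k, filter = () => true) {
  return items
    .filter((item) => filter(item.metadata))
    .map((item) => ({ ...item, score: dot(queryVec, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const items = [
  { id: "a", vector: [0.9, 0.1], metadata: { lang: "en" } },
  { id: "b", vector: [0.1, 0.9], metadata: { lang: "en" } },
  { id: "c", vector: [0.8, 0.2], metadata: { lang: "de" } },
];

// Nearest English-language neighbor to [1, 0]:
const top = knn([1, 0], items, 1, (m) => m.lang === "en");
console.log(top[0].id); // "a"
```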
Example Search Results
Query: "machine learning algorithms"
- Score: 0.95 - "Introduction to ML classification methods"
- Score: 0.89 - "Deep learning and neural networks"
- Score: 0.82 - "Statistical models for prediction"
Chunking Strategies
Chunking is the process of breaking down large documents into smaller, manageable pieces that can be effectively embedded and retrieved. The right chunking strategy is crucial for optimal RAG performance.
Common Chunking Methods:
- Fixed Size: Split by character or token count (e.g., 512 tokens)
- Semantic: Split by meaning, preserving context
- Sentence-based: Split at sentence boundaries
- Recursive: Try larger chunks first, then break down if needed
- Document Structure: Split by headers, paragraphs, or sections
Chunking Considerations:
- Chunk Size: Balance between context and specificity
- Overlap: Include overlapping content to maintain context
- Metadata: Preserve document structure and source information
- Content Type: Different strategies for code, tables, lists
// Example chunking configuration
{
  "strategy": "recursive",
  "chunkSize": 1000,
  "chunkOverlap": 200,
  "separators": ["\n\n", "\n", " ", ""],
  "preserveMetadata": true
}
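A simplified version of the recursive strategy behind that configuration can be sketched as follows. This is illustrative only (not Vectorize's actual splitter), and it omits chunk overlap for brevity: split on the coarsest separator first, and re-split any piece still longer than the chunk size with the next, finer separator.

```javascript
// Recursive splitting sketch: try separators from coarse ("\n\n") to
// fine (""); pieces longer than chunkSize are re-split recursively.
function recursiveSplit(text, chunkSize, separators) {
  if (text.length <= chunkSize || separators.length === 0) return [text];
  const [sep, ...rest] = separators;
  const pieces = sep === "" ? text.split("") : text.split(sep);
  const chunks = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current ? current + sep + piece : piece;
    if (candidate.length <= chunkSize) {
      // Piece still fits: keep accumulating into the current chunk.
      current = candidate;
    } else {
      if (current) chunks.push(current);
      current = "";
      if (piece.length > chunkSize) {
        // Piece alone is too big: recurse with the finer separators.
        chunks.push(...recursiveSplit(piece, chunkSize, rest));
      } else {
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const doc = "First paragraph.\n\nSecond paragraph that is a bit longer.";
console.log(recursiveSplit(doc, 30, ["\n\n", "\n", " ", ""]));
```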
Advanced Concepts
Embedding Model Selection
Different embedding models excel in different domains. Vectorize automatically evaluates models to find the best fit for your data:
- OpenAI text-embedding-ada-002: General purpose, good performance
- Voyage AI: Specialized for retrieval and search tasks
- Sentence Transformers: Open source, domain-specific options
Vector Database Integration
Vectorize supports multiple vector databases, each with unique strengths:
- Pinecone: Managed, high-performance, easy scaling
- Weaviate: Open source, GraphQL interface, hybrid search
- Qdrant: High performance, advanced filtering
- Milvus: Open source, enterprise features
Next Steps
Now that you understand the core concepts, you're ready to:
- Learn how to implement these concepts in practice
- Build & Deploy your first RAG pipeline
- Explore real-world use cases