Skip to main content

File Search & RAG

Estimated Time: 4 minutes

File Search, also known as Retrieval-Augmented Generation (RAG), enables your AI agents to find and reference information from uploaded documents. This allows your agents to provide accurate, context-specific answers based on your own content.

How File Search Works

  1. Document Processing - Your files are broken into smaller chunks and converted to vector embeddings
  2. Vector Storage - These embeddings are stored in a searchable vector database
  3. Semantic Search - When users ask questions, the system finds relevant chunks using semantic similarity
  4. Response Generation - The AI incorporates the retrieved information into its response

Primary Use Cases

Knowledge Base Queries

Turn your documentation into an intelligent help system.

Examples:

  • Company policies and procedures
  • Product documentation and FAQs
  • Training materials and guides
  • Technical specifications

Document Analysis

Extract insights from large document collections.

Examples:

  • Research paper analysis
  • Legal document review
  • Financial report summaries
  • Medical record insights

Contextual Support

Provide personalized assistance based on specific content.

Examples:

  • Customer-specific troubleshooting guides
  • Personalized onboarding materials
  • Role-specific documentation
  • Project-related information

Supported File Types

Common formats supported across providers:

  • PDF - Reports, manuals, academic papers
  • Text - Plain text files, documentation
  • Word Documents - DOCX files with rich formatting
  • Markdown - Technical documentation, README files

Note: Specific supported formats may vary by provider

Key Concepts

Vector Embeddings

Mathematical representations of text that capture semantic meaning, allowing the system to find conceptually similar content even when exact keywords don't match.

Chunks and Chunking

Documents are split into smaller pieces (chunks) to improve search relevance and manage token limits.

Chunk Size Considerations:

  • Smaller chunks (400-600 tokens) - More precise, specific information
  • Larger chunks (800-1200 tokens) - Better context and relationships

Chunk Overlap

Overlapping text between chunks ensures important context isn't lost at chunk boundaries.

Overlap Benefits:

  • Maintains context across chunk boundaries
  • Reduces information fragmentation
  • Improves search result relevance

Document Preparation Best Practices

Structure Your Content

  • Use clear headings and sections
  • Include relevant keywords naturally
  • Maintain consistent formatting
  • Remove unnecessary formatting artifacts
  • Write descriptive section titles
  • Include summary paragraphs
  • Use bullet points for key information
  • Avoid excessive technical jargon without context

Content Organization

  • Group related information together
  • Avoid duplicate content across files
  • Use consistent terminology throughout
  • Include context for acronyms and abbreviations

Performance Optimization

Chunk Configuration

Start with default settings and adjust based on your content:

  • Technical Documentation: 800-1200 token chunks with 400-500 token overlap
  • FAQ Content: 400-600 token chunks with 200-300 token overlap
  • Long-form Content: 1000-1500 token chunks with 500-600 token overlap

Search Result Tuning

Maximum Chunks Returned:

  • 1-3 chunks - Very focused responses, faster processing
  • 5-7 chunks - Balanced relevance and context
  • 10+ chunks - Comprehensive context, higher token usage

Cost Considerations

  • More chunks returned = higher token usage per query
  • Larger chunk sizes = more tokens per search result
  • Monitor usage patterns and adjust accordingly

Quality Assurance

Testing Your Knowledge Base

  1. Coverage Testing - Ask questions about different sections of your content
  2. Edge Case Testing - Try queries that span multiple documents
  3. Accuracy Verification - Ensure responses match source material
  4. Relevance Testing - Check that returned chunks are actually relevant

Monitoring Performance

  • Review search result quality in session logs
  • Track user satisfaction with document-based answers
  • Monitor token usage and costs
  • Identify gaps in your knowledge base coverage

Common Challenges & Solutions

Poor Search Results

Problem: Irrelevant or incomplete answers Solutions:

  • Improve document structure and headings
  • Adjust chunk size and overlap settings
  • Add more comprehensive content
  • Include FAQ-style content for common queries

Information Gaps

Problem: Agent can't find information that exists in documents Solutions:

  • Check document processing status
  • Verify content is properly structured
  • Add alternative phrasings for key concepts
  • Ensure technical terms are defined in context

Inconsistent Responses

Problem: Different answers to similar questions Solutions:

  • Standardize terminology across documents
  • Remove or consolidate duplicate content
  • Create comprehensive, authoritative sources
  • Use consistent formatting and structure

Provider-Specific Implementation

OpenAI

File search is fully available with Predictable Dialogs managing the vector stores.

Features:

  • Direct file upload through Predictable Dialogs
  • Configurable chunk settings
  • Multiple vector store management

Learn more: OpenAI File Search Setup →

OpenAI Assistant

File search is supported using OpenAI's vector store system.

Features:

  • Integration with OpenAI's file management
  • Assistant-level file search configuration
  • Automatic chunking and processing

Learn more: OpenAI Assistant Documentation

Advanced Strategies

Hybrid Approaches

Combine file search with function calling for comprehensive solutions:

  1. Search Internal Docs - Find relevant internal information
  2. Call External APIs - Fetch real-time or personalized data
  3. Synthesize Response - Combine both sources for complete answers

Content Versioning

Manage evolving documentation:

  • Update vector stores when documents change
  • Version control your knowledge base files
  • Test changes before deploying to production
  • Maintain backup copies of vector stores

Multi-Language Support

Handle documents in different languages:

  • Group documents by language
  • Use language-specific vector stores
  • Consider translation for key documents
  • Test search quality across languages

Next Steps