How can I use OpenAI's file search to let my assistant answer customer questions using our internal documents?

Enable the `file_search` tool in your Assistant, upload your internal documents to a Vector Store, and link that store to the Assistant. Your AI assistant will automatically retrieve relevant answers from these uploaded files when responding to customer queries.

Is there a limit to the file size and number of tokens that can be included per file?

Yes. Each file can be up to 512 MB in size and can contain a maximum of 5 million tokens.

How does the File Search feature handle document indexing and retrieval—does it support semantic search?

Yes. File Search automatically indexes files using both keyword and semantic search, ensuring relevant document retrieval. It breaks documents into chunks, creates embeddings, and ranks results by relevance.

Can I attach multiple files to my AI assistant at once, and if so, what's the most efficient method?

Yes. You can efficiently upload multiple files simultaneously using the batch upload method, adding them directly to a Vector Store via a single API call or through the SDK’s batching and polling features.

What are the costs associated with using the File Search feature, and how can I manage or minimize them?

Your first GB of vector storage is free. Afterward, it’s $0.10 per GB per day. You can manage costs by setting file expiration policies—default expiration is typically 7 days after last use, but you can customize this.

How long does OpenAI retain my documents in vector stores, and can I control this retention policy?

By default, files are retained in vector stores associated with threads for 7 days after last activity. You have complete control over retention policies and can adjust or override them at any time.

Is it possible to customize how my files are chunked and ranked when retrieved by the AI assistant?

Yes. You can customize chunk sizes (default: 800 tokens) and overlap (default: 400 tokens), as well as ranking thresholds to optimize relevance and retrieval performance.

What happens if the files I'm using haven't finished processing before my assistant tries to access them?

OpenAI includes a built-in wait (up to 60 seconds) for processing to complete before the assistant accesses new files in a thread. However, it’s recommended to confirm file readiness beforehand to ensure optimal responses.

Can my assistant search across multiple sources (e.g., my uploaded company docs and files uploaded directly by customers)?

Yes. Your assistant can simultaneously search across multiple sources: the vector store linked to your assistant and an additional vector store created when customers directly upload files in a message thread.

OpenAI File Upload

Q: What file formats are supported by the File Search tool for uploading company documents?

Supported formats include PDF (`.pdf`), Word documents (`.doc`, `.docx`), plain text (`.txt`), markdown (`.md`), JSON (`.json`), HTML, PowerPoint (`.pptx`), and various code files (e.g., `.py`, `.js`, `.cpp`).

OpenAI File Upload transforms your chatbot into a knowledgeable partner, instantly familiar with your product catalogs, company policies, training documents, and content archives. Imagine your AI assistant proactively delivering accurate, contextually relevant answers—this capability is now easily accessible.

By uploading internal files like PDFs, CSVs, Word documents, PowerPoints, or even code files, your chatbot gains direct, intelligent access to specific information tailored exactly to your business.

In this post, we will look at the Retrieval tool, also called as file search. This is useful to make your Chabot respond from the information in your files.

Haven't created your openAI assistant yet? Please read this post → OpenAI Chatbot.

Using the file search tool in OpenAI assistant.

The File Search tool is a feature within the OpenAI platform that allows assistants to access and retrieve information from external files. This capability enhances the assistant's knowledge base by incorporating proprietary documents or user-provided data, making it more versatile and accurate in responding to queries.

How It Works

Understanding how it works is not necessary for you to use it, but could help when things don't work perfectly.

File Upload and Vector Store Creation: Users upload their files, which go into a vector store, which is an object that stores file embeddings. These embeddings are created automatically when files are added, enabling efficient semantic search.
Search Mechanism: The tool uses both keyword and vector (semantic) searches across these stored embeddings to find relevant content based on user queries.
Query Optimization: The file search tool optimizes user queries for better search results by rewriting them and breaking down complex queries into parallel searches.
Result Reranking: It reranks search results based on relevance before generating responses, ensuring that the most pertinent information is used.

Steps to Use File Search

Enable File Search

Create an assistant with file_search enabled in its tools parameter.
Then there are two indicators one is a gear icon and the other is a button with “+ Files”:
- On clicking the gear icon, we see a modal with "Max num results". This "Max num results" setting controls how many search results are returned from your files. This value is significant because it determines the number of relevant documents or chunks of text that will be considered when answering a user's query. More on this below.
- On clicking the "+ Files" button we see a modal with “Attach files to file search” and then it allows to upload files. It says information in attached files will be available to this assistant. Then at the bottom of the modal there is a button “Select vector store”. When using the OpenAI platform to create an assistant with file search enabled, you have two main options for managing files: attaching files directly or creating and attaching a vector store. More on this below.

Significance of "Max num results"

Default Values: The default is set to 5 for models like get-3.5-turbo and 20 for models like gpt-4o. This means that if you're using a model like gpt-4o, up to 20 relevant chunks or documents can be retrieved and used to generate a response.
Impact on Performance: Increasing this number allows more information to be considered, potentially improving the accuracy and relevance of responses. However, it may also increase processing time and costs associated with file search operations.

Should You Change "Max num results" from the default?

Whether you should change this value depends on your specific use case:

Information Density: If your queries often require detailed information from multiple sources within your files, increasing this value might help ensure all relevant data is considered.
Response Complexity: If you need concise answers based on fewer but highly relevant sources, keeping or reducing this value could streamline responses.
Cost Considerations: File search has additional charges beyond token-based fees. Increasing the number of results could lead to higher costs if not carefully managed.

To decide whether to change it:

Test both default and adjusted settings with typical queries for your application.
Evaluate response quality (accuracy and relevance) versus cost implications.

If most queries are well-served by considering up to 20 chunks/documents (the default for GPT4o), there may be no need to adjust this setting unless specific scenarios benefit from more extensive retrieval capabilities.

Attaching Files To OpenAI Assistant

Upload Files: When you click the "+ Files" button, you can upload your files directly to the assistant. This method allows OpenAI to automatically process and make these files available for search.
Automatic Vector Store Creation: If you don't manually create a vector store, OpenAI might automatically handle this step in some cases (e.g., when attaching files through messages). However, it's generally recommended to manage your vector stores explicitly.

Creating and Attaching a Vector Store

Create a Vector Store: You can manually create a vector store before uploading your files. This gives you more control over how your data is organized and accessed across different assistants or threads.
Attach Vector Store: Once created, you can attach this vector store to your assistant by selecting it from the "Attach vector store" modal.
Upload Files to Vector Store: After attaching the vector store, upload your files into it from the dashboard.

File Upload Vs Vector Store

Should you be attaching files or creating a vector store?

If you're just starting out or have few documents, uploading them directly via "+ Files" might be sufficient.
For more complex setups with many documents or multiple assistants sharing resources:
- Create separate vector stores for better organization.
- Attach these stores to relevant assistants.

The vector store approach provides flexibility in managing large datasets across multiple assistants efficiently.

Use Cases

Use Cases for File Search

Business Type	Use Case	Description
Solo Entrepreneurs	Instant Customer Support	Upload FAQs, pricing sheets, or detailed product info, allowing your chatbot to handle common inquiries effortlessly, freeing you up for strategic growth.
	Personalized Engagement	Share custom service descriptions or coaching program details, boosting customer interactions without additional workload.
Small Businesses	Streamlined Operations	Integrate employee handbooks, inventory data, or booking schedules. Your chatbot answers queries swiftly, simplifying day-to-day operations.
	Enhanced Sales	Upload comprehensive product descriptions and engaging case studies. Your chatbot becomes a persuasive sales assistant, guiding customers seamlessly through the purchasing journey.
Medium-Sized Businesses	Efficient Knowledge Base	Embed detailed documentation and training materials into your chatbot, significantly cutting down onboarding and training times.
	Automated Customer Success	Provide your chatbot with success stories, testimonials, and troubleshooting guides to deliver instant support, improving customer satisfaction and retention.
	Community Building	Share community guidelines or event schedules. Your chatbot proactively moderates interactions, fostering a vibrant, engaging community.
Large Businesses	Financial Analysis	Load financial reports like annual statements (e.g., 10-K filings) into a vector store, allowing assistants to answer specific questions about company performance directly from those documents.
	Product Information Retrieval	For customer service applications, load product manuals or technical specifications into a vector store so assistants can provide detailed product information based on customer inquiries.
	Research Assistance	Researchers can use this feature by uploading academic papers or research articles into a vector store, enabling assistants to summarize findings or extract specific data points efficiently.

Technical Insights (FAQs)

Supported File Formats: Your chatbot can access PDFs, Word documents (.doc, .docx), plain text (.txt), markdown (.md), JSON (.json), HTML, PowerPoint (.pptx), and various code files (e.g., .py, .js, .cpp).

File Size and Token Limits: Each file can be up to 512 MB with a maximum of 5 million tokens, ensuring extensive document handling. Ref: OpenAI

File Size Limits: The total size of all the files uploaded in your project cannot exceed 100 GB. Ref: OpenAI

Total Number Of Files: You can attach a maximum of 20 files to code_interpreter and 10,000 files to file_search. Ref: OpenAI

Document Indexing & Semantic Search: OpenAI File Search automatically indexes your documents, combining keyword and semantic search for precise, relevant retrieval. Documents are intelligently chunked, embedded, and ranked by relevance.

Batch File Uploading: Easily upload multiple files at once through batch uploads, directly to a Vector Store via a single API call or the SDK, streamlining setup and maintenance.

Cost Management: Your first GB of vector storage is free; subsequent storage costs $0.10 per GB per day. Control your expenses effectively by customizing file expiration policies.

Retention Policy: Files are retained by default for 7 days after the last activity, but you have complete flexibility to adjust retention timelines according to your needs.

Customization Options: Tailor document chunking (default 800 tokens) and overlaps (default 400 tokens) to optimize relevance and retrieval performance according to your preferences.

File Readiness and Processing: OpenAI ensures documents are fully processed before your chatbot accesses them, incorporating a built-in waiting period of up to 60 seconds for optimal responses.

Multi-source Searching: Your assistant seamlessly searches across multiple sources—both your internal vector store and customer-uploaded files—enhancing versatility and responsiveness.

Getting Started is Easier Than You Think

With Predictable Dialogs and OpenAI Assistants, customizing your chatbot is simple:

Gather Your Files: PDFs, CSVs, Word docs, or any files your chatbot needs.
Upload Directly to OpenAI Assistant: Easily drag and drop or batch-upload files.
Instant Customization: Your chatbot now accesses your files seamlessly during conversations.

Ready to Elevate Your Business?

Leveraging generative AI with OpenAI File Upload enhances every interaction, providing unparalleled customer experiences while significantly reducing operational overhead. It's more than technology—it's your next business advantage.

Experience this firsthand by creating your chatbot agent, and see how OpenAI transforms your interactions today.