
Build AI Applications with MongoDB: A Complete RAG Workshop

4 min read
Michael Lynn
Developer Advocate @ MongoDB

Since releasing MongoDB-RAG earlier this year, I've received a steady stream of questions from developers about best practices for building production-ready AI applications. While the library makes RAG implementation much simpler, many developers are looking for end-to-end guidance on the entire development journey.

That's why I'm excited to announce our new MongoDB-RAG Workshop - a comprehensive, hands-on guide to building intelligent applications with MongoDB Atlas Vector Search.

🧠 Why We Created This Workshop

Building modern AI applications isn't just about connecting to an LLM API. It requires:

  • Understanding vector embeddings and semantic search
  • Organizing and storing your knowledge base efficiently
  • Implementing retrieval mechanisms that deliver relevant context
  • Creating a scalable architecture that performs well in production

This workshop addresses all these challenges, providing a clear path from concept to production.

📚 What You'll Learn

Our new workshop walks you through the complete process of building a production-ready RAG application:

  1. Understanding RAG Fundamentals
    Before diving into code, we explore how vector search works, why embeddings matter, and the core RAG architecture patterns.

  2. Setting Up MongoDB Atlas
Learn how to create and configure a MongoDB Atlas cluster with Vector Search capabilities - the foundation of your AI application (an example index definition follows this list).

  3. Creating Vector Embeddings
    Master techniques for generating and managing vector embeddings from various text sources, including handling different providers (OpenAI, Ollama, and more).

  4. Building a Complete RAG Application
    Develop a full-featured application that ingests documents, performs semantic search, and generates contextually relevant responses.

  5. Advanced Techniques
    Take your application to the next level with hybrid search, re-ranking, query expansion, and other advanced retrieval strategies.

  6. Production Deployment
    Learn best practices for scaling, monitoring, and optimizing your RAG application in production.
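
To make step 2 concrete: an Atlas Vector Search index is defined by a small JSON document. Here's a minimal sketch using the Node.js driver (the field name embedding and the 1536 dimensions are assumptions matching OpenAI's text-embedding-3-small; adjust them to your schema and model):

// Create an Atlas Vector Search index (MongoDB Node.js driver 6.6+)
await collection.createSearchIndex({
  name: 'vector_index',
  type: 'vectorSearch',
  definition: {
    fields: [
      {
        type: 'vector',
        path: 'embedding',     // document field that stores the vectors
        numDimensions: 1536,   // must match your embedding model's output size
        similarity: 'cosine'
      }
    ]
  }
});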

💡 Who Should Take This Workshop?

This workshop is perfect for:

  • Backend Developers looking to add AI capabilities to existing applications
  • AI Engineers who want to build more robust retrieval systems
  • Technical Leaders evaluating RAG architecture patterns
  • Full-Stack Developers building end-to-end AI applications

No prior experience with vector databases is required, though basic familiarity with MongoDB and Node.js will help you get the most out of the material.

🚀 A Hands-On Approach

What makes this workshop special is its hands-on nature. You won't just read about concepts - you'll implement them step-by-step:

// By the end of the workshop, you'll be writing code like this
async function advancedRAGPipeline(query) {
  // Step 1: Expand query with variations
  const expandedQueries = await expandQuery(query);

  // Step 2: Retrieve from multiple collections
  const initialResults = await retrieveFromMultipleSources(expandedQueries);

  // Step 3: Rerank results
  const rerankedResults = await rerankResults(initialResults, query);

  // Step 4: Generate response with the LLM
  const response = await generateResponse(query, rerankedResults);

  return {
    answer: response,
    sources: rerankedResults.map(r => ({
      document: r.documentId,
      source: r.metadata?.source,
      score: r.score
    }))
  };
}

You'll build real components that solve common challenges:

  • Document chunking strategies for optimal retrieval
  • Caching mechanisms for performance optimization (see the sketch after this list)
  • Hybrid search implementations
  • Microservice architectures for production deployment
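
To give a flavor of the caching component: embedding the same text twice wastes both latency and API spend, so one simple strategy is to memoize embedding calls. A minimal sketch, assuming a generic embedText function rather than any specific library API:

// Naive in-memory embedding cache - swap the Map for Redis in production
const embeddingCache = new Map();

async function cachedEmbed(text, embedText) {
  const key = text.trim().toLowerCase();
  if (embeddingCache.has(key)) {
    return embeddingCache.get(key); // cache hit: no API call needed
  }
  const vector = await embedText(text);
  embeddingCache.set(key, vector);
  return vector;
}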

📈 Real-World Applications

The workshop focuses on practical applications that solve real business problems:

  • Customer Support Systems that retrieve accurate information from knowledge bases
  • Research Assistants that can analyze and retrieve information from scientific literature
  • Content Recommendation Engines powered by semantic similarity
  • Intelligent Document Search across enterprise content

🛠️ Getting Started

The workshop is available now in our documentation. To begin:

  1. Make sure you have a MongoDB Atlas account
  2. Install Node.js on your development machine
  3. Head over to our Workshop Introduction

🔮 Looking Ahead

This workshop represents the beginning of our commitment to helping developers build sophisticated AI applications. In the coming months, we'll be expanding the content with:

  • Multi-modal RAG implementations (text + images)
  • Enterprise-scale architectures
  • Performance optimization techniques
  • Integration with popular AI frameworks

🤔 Your Feedback Matters

As you work through the workshop, we'd love to hear your feedback. What challenges are you facing? What additional topics would you like to see covered? Your input will help shape future content.

Building AI applications doesn't have to be complicated. With MongoDB-RAG and this workshop, you have everything you need to create intelligent, context-aware applications that deliver real value.

Happy building!

Building an Intelligent Documentation Assistant with MongoDB-RAG

4 min read
Michael Lynn
Developer Advocate @ MongoDB

📖 TL;DR

Ever wished your documentation could just answer questions directly instead of forcing users to sift through endless pages? That’s exactly what we built with the MongoDB-RAG Documentation Assistant. In this article, I’ll walk you through the why, what, and how of building a chatbot that retrieves precise, relevant information from MongoDB-RAG’s own documentation.

🤔 Why Build a Documentation Assistant?

Traditional documentation search is useful, but it often leaves users with more questions than answers. Developers need to read through entire pages, infer context, and piece together solutions. Instead, we wanted something:

  • Conversational – Answers questions in natural language
  • Context-aware – Finds relevant documentation snippets instead of just keywords
  • Fast & Accurate – Uses vector search to surface precise answers
  • Transparent – Links to original sources so users can verify answers
  • Scalable – Handles multiple LLM providers, including OpenAI and Ollama

Our solution? A chatbot powered by MongoDB-RAG, showcasing exactly what our tool was built for: retrieval-augmented generation (RAG) using MongoDB Atlas Vector Search.


🛠️ How We Built It

We structured the assistant around four core components:

1️⃣ Document Ingestion

To make documentation searchable, we need to process it into vector embeddings. We use semantic chunking to break long docs into meaningful pieces before ingestion.

const chunker = new Chunker({
  strategy: 'semantic',
  maxChunkSize: 500,
  overlap: 50
});

const documents = await loadMarkdownFiles('./docs');
const chunks = await Promise.all(
  documents.map(doc => chunker.chunkDocument(doc))
);

await rag.ingestBatch(chunks.flat());

📝 Why Semantic Chunking? Instead of blindly splitting text, we preserve contextual integrity by overlapping related sections.


2️⃣ Vector Search with MongoDB Atlas

Once ingested, we use MongoDB Atlas Vector Search to find the most relevant documentation snippets based on a user’s query.

const searchResults = await rag.search(query, {
  maxResults: 6,
  filter: { 'metadata.type': 'documentation' }
});

MongoDB’s $vectorSearch operator ensures we retrieve the closest matching content, ranked by relevance.
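
If you're curious what that looks like at the aggregation level, here's a hedged sketch of a $vectorSearch stage (the index and field names are illustrative, not the library's internals):

// Illustrative $vectorSearch aggregation - names are assumptions
const results = await collection.aggregate([
  {
    $vectorSearch: {
      index: 'vector_index',        // Atlas Vector Search index name
      path: 'embedding',            // field holding the stored vectors
      queryVector: queryEmbedding,  // embedding of the user's query
      numCandidates: 100,           // pool considered before final ranking
      limit: 6,
      filter: { 'metadata.type': 'documentation' }
    }
  },
  { $project: { content: 1, metadata: 1, score: { $meta: 'vectorSearchScore' } } }
]).toArray();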


3️⃣ Streaming Responses for a Real Chat Experience

To improve user experience, we stream responses incrementally as they’re generated.

router.post('/chat', async (req, res) => {
  const { query, history = [], stream = true } = req.body;

  const context = await ragService.search(query);

  if (stream) {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    });

    await llmService.generateResponse(query, context, history, res);
  } else {
    const answer = await llmService.generateResponse(query, context, history);
    res.json({ answer, sources: context });
  }
});

With this approach:

  • Responses appear in real-time instead of waiting for full generation 🚀
  • Developers can get partial answers quickly while longer responses load
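
On the generation side, streaming comes down to writing Server-Sent Events as tokens arrive. Here's a minimal sketch of what an implementation might do with the res handle passed into generateResponse (the token stream is an assumption - any streaming LLM client works):

// Hypothetical helper: forward LLM tokens to the client as SSE events
async function streamToClient(tokenStream, res) {
  for await (const token of tokenStream) {
    // Each SSE message is a "data:" line followed by a blank line
    res.write(`data: ${JSON.stringify({ token })}\n\n`);
  }
  res.write('data: [DONE]\n\n'); // sentinel so the client knows to stop
  res.end();
}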

4️⃣ Multi-Provider LLM Support

The assistant supports multiple embedding providers, including OpenAI and self-hosted Ollama.

const config = {
  embedding: {
    provider: process.env.EMBEDDING_PROVIDER || 'openai',
    model: process.env.EMBEDDING_MODEL || 'text-embedding-3-small',
    baseUrl: process.env.OLLAMA_BASE_URL // For local deployment
  }
};

This allows users to switch providers easily, optimizing for performance, privacy, or cost.
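
For example, pointing the assistant at a local Ollama instance is just an environment change (a sketch - the provider key and model name below are common choices, so check the values your setup supports):

EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=nomic-embed-text
OLLAMA_BASE_URL=http://localhost:11434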


💡 Key Features

🔍 Real-time Context Retrieval

Instead of guessing, the chatbot searches first and then generates answers.

🔗 Source Attribution

Each response includes a link to the documentation, letting users verify answers.

⚡ Streaming Responses

No waiting! Answers generate in real-time, improving responsiveness.

⚙️ Multi-Provider LLM Support

Deploy with OpenAI for scale or Ollama for private, local hosting.

🤖 Fallback Handling

If documentation doesn’t contain an answer, the chatbot transparently explains the limitation instead of fabricating responses.
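
The gist of that fallback can be expressed in a few lines. A minimal sketch (the score threshold and wording are illustrative, not the assistant's exact implementation):

// If retrieval comes back empty or weak, admit it rather than guess
function selectContext(results, minScore = 0.75) {
  const relevant = results.filter(r => r.score >= minScore);
  if (relevant.length === 0) {
    return { fallback: "I couldn't find this in the MongoDB-RAG docs - try rephrasing your question." };
  }
  return { context: relevant };
}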


🚀 Try It Yourself

Want to build a MongoDB-RAG-powered assistant? Here’s how to get started:

1️⃣ Install MongoDB-RAG

npm install mongodb-rag

2️⃣ Configure Your Environment

MONGODB_URI=your_atlas_connection_string
EMBEDDING_PROVIDER=openai
EMBEDDING_API_KEY=your_api_key
EMBEDDING_MODEL=text-embedding-3-small

3️⃣ Initialize the Chatbot

import { MongoRAG } from 'mongodb-rag';
import express from 'express';

const rag = new MongoRAG(config); // config as shown in the provider section above
const app = express();
app.use(express.json()); // needed so req.body is populated

app.post('/api/chat', async (req, res) => {
  const { query } = req.body;
  const results = await rag.search(query);
  res.json({ answer: results }); // raw search hits; add an LLM call to generate prose answers
});

app.listen(3000);

🌩️ Production Considerations

Where to Host?

We deployed our assistant on Vercel for:

  • Serverless scalability
  • Fast global CDN
  • Easy Git-based deployments

Which LLM for Production?

  • OpenAI – Best for reliability & speed
  • Ollama – Best for privacy-first self-hosted setups

🔮 What’s Next?

Future improvements include:

  • Better query reformulation to improve retrieval accuracy
  • User feedback integration to refine responses over time
  • Conversation memory for context-aware follow-ups

🎬 Conclusion

By combining MongoDB Atlas Vector Search with modern LLMs, we built an assistant that transforms documentation into an interactive experience.

Try it out in our docs, and let us know what you think! 🚀

🔗 Resources

📘 MongoDB-RAG Docs
🔗 GitHub Repository
📦 NPM Package