
Build AI Applications with MongoDB: A Complete RAG Workshop

4 min read
Michael Lynn
Developer Advocate @ MongoDB

Since releasing MongoDB-RAG earlier this year, I've received a steady stream of questions from developers about best practices for building production-ready AI applications. While the library makes RAG implementation much simpler, many developers are looking for end-to-end guidance on the entire development journey.

That's why I'm excited to announce our new MongoDB-RAG Workshop - a comprehensive, hands-on guide to building intelligent applications with MongoDB Atlas Vector Search.

🧠 Why We Created This Workshop

Building modern AI applications isn't just about connecting to an LLM API. It requires:

  • Understanding vector embeddings and semantic search
  • Organizing and storing your knowledge base efficiently
  • Implementing retrieval mechanisms that deliver relevant context
  • Creating a scalable architecture that performs well in production

This workshop addresses all these challenges, providing a clear path from concept to production.

📚 What You'll Learn

Our new workshop walks you through the complete process of building a production-ready RAG application:

  1. Understanding RAG Fundamentals
    Before diving into code, we explore how vector search works, why embeddings matter, and the core RAG architecture patterns.

  2. Setting Up MongoDB Atlas
Learn how to create and configure a MongoDB Atlas cluster with Vector Search capabilities - the foundation of your AI application (an example index definition follows this list).

  3. Creating Vector Embeddings
    Master techniques for generating and managing vector embeddings from various text sources, including handling different providers (OpenAI, Ollama, and more).

  4. Building a Complete RAG Application
    Develop a full-featured application that ingests documents, performs semantic search, and generates contextually relevant responses.

  5. Advanced Techniques
    Take your application to the next level with hybrid search, re-ranking, query expansion, and other advanced retrieval strategies.

  6. Production Deployment
    Learn best practices for scaling, monitoring, and optimizing your RAG application in production.
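
To make step 2 concrete: an Atlas Vector Search index is defined by a small JSON document. Here's a minimal sketch using the Node.js driver (the field name embedding and the 1536 dimensions are assumptions matching OpenAI's text-embedding-3-small; adjust them to your schema and model):

// Create an Atlas Vector Search index (MongoDB Node.js driver 6.6+)
await collection.createSearchIndex({
  name: 'vector_index',
  type: 'vectorSearch',
  definition: {
    fields: [
      {
        type: 'vector',
        path: 'embedding',     // document field that stores the vectors
        numDimensions: 1536,   // must match your embedding model's output size
        similarity: 'cosine'
      }
    ]
  }
});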

💡 Who Should Take This Workshop?

This workshop is perfect for:

  • Backend Developers looking to add AI capabilities to existing applications
  • AI Engineers who want to build more robust retrieval systems
  • Technical Leaders evaluating RAG architecture patterns
  • Full-Stack Developers building end-to-end AI applications

No prior experience with vector databases is required, though basic familiarity with MongoDB and Node.js will help you get the most out of the material.

🚀 A Hands-On Approach

What makes this workshop special is its hands-on nature. You won't just read about concepts - you'll implement them step-by-step:

// By the end of the workshop, you'll be writing code like this
async function advancedRAGPipeline(query) {
  // Step 1: Expand query with variations
  const expandedQueries = await expandQuery(query);

  // Step 2: Retrieve from multiple collections
  const initialResults = await retrieveFromMultipleSources(expandedQueries);

  // Step 3: Rerank results
  const rerankedResults = await rerankResults(initialResults, query);

  // Step 4: Generate response with the LLM
  const response = await generateResponse(query, rerankedResults);

  return {
    answer: response,
    sources: rerankedResults.map(r => ({
      document: r.documentId,
      source: r.metadata?.source,
      score: r.score
    }))
  };
}

You'll build real components that solve common challenges:

  • Document chunking strategies for optimal retrieval
  • Caching mechanisms for performance optimization (see the sketch after this list)
  • Hybrid search implementations
  • Microservice architectures for production deployment
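
To give a flavor of the caching component: embedding the same text twice wastes both latency and API spend, so one simple strategy is to memoize embedding calls. A minimal sketch, assuming a generic embedText function rather than any specific library API:

// Naive in-memory embedding cache - swap the Map for Redis in production
const embeddingCache = new Map();

async function cachedEmbed(text, embedText) {
  const key = text.trim().toLowerCase();
  if (embeddingCache.has(key)) {
    return embeddingCache.get(key); // cache hit: no API call needed
  }
  const vector = await embedText(text);
  embeddingCache.set(key, vector);
  return vector;
}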

📈 Real-World Applications

The workshop focuses on practical applications that solve real business problems:

  • Customer Support Systems that retrieve accurate information from knowledge bases
  • Research Assistants that can analyze and retrieve information from scientific literature
  • Content Recommendation Engines powered by semantic similarity
  • Intelligent Document Search across enterprise content

🛠️ Getting Started

The workshop is available now in our documentation. To begin:

  1. Make sure you have a MongoDB Atlas account
  2. Install Node.js on your development machine
  3. Head over to our Workshop Introduction

🔮 Looking Ahead

This workshop represents the beginning of our commitment to helping developers build sophisticated AI applications. In the coming months, we'll be expanding the content with:

  • Multi-modal RAG implementations (text + images)
  • Enterprise-scale architectures
  • Performance optimization techniques
  • Integration with popular AI frameworks

🤔 Your Feedback Matters

As you work through the workshop, we'd love to hear your feedback. What challenges are you facing? What additional topics would you like to see covered? Your input will help shape future content.

Building AI applications doesn't have to be complicated. With MongoDB-RAG and this workshop, you have everything you need to create intelligent, context-aware applications that deliver real value.

Happy building!

Building an Intelligent Documentation Assistant with MongoDB-RAG

4 min read
Michael Lynn
Developer Advocate @ MongoDB

📖 TL;DR

Ever wished your documentation could just answer questions directly instead of forcing users to sift through endless pages? That’s exactly what we built with the MongoDB-RAG Documentation Assistant. In this article, I’ll walk you through the why, what, and how of building a chatbot that retrieves precise, relevant information from MongoDB-RAG’s own documentation.

🤔 Why Build a Documentation Assistant?

Traditional documentation search is useful, but it often leaves users with more questions than answers. Developers need to read through entire pages, infer context, and piece together solutions. Instead, we wanted something:

  • Conversational – Answers questions in natural language
  • Context-aware – Finds relevant documentation snippets instead of just keywords
  • Fast & Accurate – Uses vector search to surface precise answers
  • Transparent – Links to original sources so users can verify answers
  • Scalable – Handles multiple LLM providers, including OpenAI and Ollama

Our solution? A chatbot powered by MongoDB-RAG, showcasing exactly what our tool was built for: retrieval-augmented generation (RAG) using MongoDB Atlas Vector Search.


🛠️ How We Built It

We structured the assistant around four core components:

1️⃣ Document Ingestion

To make documentation searchable, we need to process it into vector embeddings. We use semantic chunking to break long docs into meaningful pieces before ingestion.

const chunker = new Chunker({
  strategy: 'semantic',
  maxChunkSize: 500,
  overlap: 50
});

const documents = await loadMarkdownFiles('./docs');
const chunks = await Promise.all(
  documents.map(doc => chunker.chunkDocument(doc))
);

await rag.ingestBatch(chunks.flat());

📝 Why Semantic Chunking? Instead of blindly splitting text, we preserve contextual integrity by overlapping related sections.


2️⃣ Vector Search with MongoDB Atlas

Once ingested, we use MongoDB Atlas Vector Search to find the most relevant documentation snippets based on a user’s query.

const searchResults = await rag.search(query, {
  maxResults: 6,
  filter: { 'metadata.type': 'documentation' }
});

MongoDB’s $vectorSearch operator ensures we retrieve the closest matching content, ranked by relevance.
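
If you're curious what that looks like at the aggregation level, here's a hedged sketch of a $vectorSearch stage (the index and field names are illustrative, not the library's internals):

// Illustrative $vectorSearch aggregation - names are assumptions
const results = await collection.aggregate([
  {
    $vectorSearch: {
      index: 'vector_index',        // Atlas Vector Search index name
      path: 'embedding',            // field holding the stored vectors
      queryVector: queryEmbedding,  // embedding of the user's query
      numCandidates: 100,           // pool considered before final ranking
      limit: 6,
      filter: { 'metadata.type': 'documentation' }
    }
  },
  { $project: { content: 1, metadata: 1, score: { $meta: 'vectorSearchScore' } } }
]).toArray();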


3️⃣ Streaming Responses for a Real Chat Experience

To improve user experience, we stream responses incrementally as they’re generated.

router.post('/chat', async (req, res) => {
  const { query, history = [], stream = true } = req.body;

  const context = await ragService.search(query);

  if (stream) {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    });

    await llmService.generateResponse(query, context, history, res);
  } else {
    const answer = await llmService.generateResponse(query, context, history);
    res.json({ answer, sources: context });
  }
});

With this approach:

  • Responses appear in real-time instead of waiting for full generation 🚀
  • Developers can get partial answers quickly while longer responses load
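
On the generation side, streaming comes down to writing Server-Sent Events as tokens arrive. Here's a minimal sketch of what an implementation might do with the res handle passed into generateResponse (the token stream is an assumption - any streaming LLM client works):

// Hypothetical helper: forward LLM tokens to the client as SSE events
async function streamToClient(tokenStream, res) {
  for await (const token of tokenStream) {
    // Each SSE message is a "data:" line followed by a blank line
    res.write(`data: ${JSON.stringify({ token })}\n\n`);
  }
  res.write('data: [DONE]\n\n'); // sentinel so the client knows to stop
  res.end();
}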

4️⃣ Multi-Provider LLM Support

The assistant supports multiple embedding providers, including OpenAI and self-hosted Ollama.

const config = {
  embedding: {
    provider: process.env.EMBEDDING_PROVIDER || 'openai',
    model: process.env.EMBEDDING_MODEL || 'text-embedding-3-small',
    baseUrl: process.env.OLLAMA_BASE_URL // For local deployment
  }
};

This allows users to switch providers easily, optimizing for performance, privacy, or cost.
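
For example, pointing the assistant at a local Ollama instance is just an environment change (a sketch - the provider key and model name below are common choices, so check the values your setup supports):

EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=nomic-embed-text
OLLAMA_BASE_URL=http://localhost:11434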


💡 Key Features

🔍 Real-time Context Retrieval

Instead of guessing, the chatbot searches first and then generates answers.

🔗 Source Attribution

Each response includes a link to the documentation, letting users verify answers.

⚡ Streaming Responses

No waiting! Answers generate in real-time, improving responsiveness.

⚙️ Multi-Provider LLM Support

Deploy with OpenAI for scale or Ollama for private, local hosting.

🤖 Fallback Handling

If documentation doesn’t contain an answer, the chatbot transparently explains the limitation instead of fabricating responses.
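
The gist of that fallback can be expressed in a few lines. A minimal sketch (the score threshold and wording are illustrative, not the assistant's exact implementation):

// If retrieval comes back empty or weak, admit it rather than guess
function selectContext(results, minScore = 0.75) {
  const relevant = results.filter(r => r.score >= minScore);
  if (relevant.length === 0) {
    return { fallback: "I couldn't find this in the MongoDB-RAG docs - try rephrasing your question." };
  }
  return { context: relevant };
}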


🚀 Try It Yourself

Want to build a MongoDB-RAG-powered assistant? Here’s how to get started:

1️⃣ Install MongoDB-RAG

npm install mongodb-rag

2️⃣ Configure Your Environment

MONGODB_URI=your_atlas_connection_string
EMBEDDING_PROVIDER=openai
EMBEDDING_API_KEY=your_api_key
EMBEDDING_MODEL=text-embedding-3-small

3️⃣ Initialize the Chatbot

import { MongoRAG } from 'mongodb-rag';
import express from 'express';

const rag = new MongoRAG(config); // config as shown in the provider section above
const app = express();
app.use(express.json()); // needed so req.body is populated

app.post('/api/chat', async (req, res) => {
  const { query } = req.body;
  const results = await rag.search(query);
  res.json({ answer: results }); // raw search hits; add an LLM call to generate prose answers
});

app.listen(3000);

🌩️ Production Considerations

Where to Host?

We deployed our assistant on Vercel for:

  • Serverless scalability
  • Fast global CDN
  • Easy Git-based deployments

Which LLM for Production?

  • OpenAI – Best for reliability & speed
  • Ollama – Best for privacy-first self-hosted setups

🔮 What’s Next?

Future improvements include:

  • Better query reformulation to improve retrieval accuracy
  • User feedback integration to refine responses over time
  • Conversation memory for context-aware follow-ups

🎬 Conclusion

By combining MongoDB Atlas Vector Search with modern LLMs, we built an assistant that transforms documentation into an interactive experience.

Try it out in our docs, and let us know what you think! 🚀

🔗 Resources

📘 MongoDB-RAG Docs
🔗 GitHub Repository
📦 NPM Package