Building a Complete RAG Application
In this section, you'll build a complete RAG application that:
- Ingests documents from multiple sources
- Creates and stores vector embeddings
- Performs semantic search
- Integrates with an LLM to generate responses
Project Structure
Let's start by creating a proper project structure:
rag-workshop/
├── .env                  # Environment variables
├── package.json          # Dependencies
├── data/                 # Sample documents
│   ├── articles/         # Text documents
│   └── qa-pairs.json     # Sample Q&A pairs for testing
├── src/
│   ├── config.js         # Configuration
│   ├── ingest.js         # Document ingestion
│   ├── search.js         # Vector search
│   ├── generate.js       # LLM response generation
│   ├── utils/            # Utility functions
│   │   ├── chunker.js    # Document chunking
│   │   └── formatter.js  # Response formatting
│   └── index.js          # Main application
└── tests/                # Tests
    └── app.test.js       # Test suite
Create these directories:
mkdir -p data/articles src/utils tests
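The project structure above includes a .env file for your credentials. A minimal sketch of that file is shown here; the variable names are the ones src/config.js and src/index.js read later in this section, and the values are placeholders you'll replace with your own:
MONGODB_URI=mongodb+srv://<username>:<password>@<cluster>.mongodb.net/
EMBEDDING_PROVIDER=openai
EMBEDDING_API_KEY=your-embedding-api-key
EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=your-openai-api-key
LLM_MODEL=gpt-3.5-turbo
PORT=3000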
Setting Up Dependencies
Update your `package.json` to include all necessary dependencies:
{
"name": "rag-workshop",
"version": "1.0.0",
"description": "RAG application workshop with MongoDB Atlas",
"main": "src/index.js",
"scripts": {
"start": "node src/index.js",
"ingest": "node src/ingest.js",
"search": "node src/search.js",
"test": "jest"
},
"dependencies": {
"dotenv": "^16.3.1",
"express": "^4.18.2",
"mongodb-rag": "^0.53.0",
"openai": "^4.10.0",
"fs-extra": "^11.1.1"
},
"devDependencies": {
"jest": "^29.6.2"
}
}
Install the dependencies:
npm install
Creating Sample Data
Let's create some sample data to work with. Create a file at `data/articles/mongodb-atlas.md`:
# MongoDB Atlas: The Cloud Database Service
MongoDB Atlas is a fully-managed cloud database service developed by MongoDB. It handles all the complexity of deploying, managing, and healing your deployments on the cloud service provider of your choice (AWS, Azure, and GCP).
## Key Features
### Automated Deployment and Management
MongoDB Atlas automates deployment, maintenance, and scaling. You can deploy a cluster with a few clicks or API calls.
### Security
MongoDB Atlas provides multiple layers of security for your database:
- Network isolation with VPC peering
- IP whitelisting
- Advanced authentication
- Field-level encryption
- RBAC (Role-Based Access Control)
- LDAP integration
### Monitoring and Alerts
The service includes built-in monitoring tools and customizable alerts based on over a dozen different metrics.
### Automated Backups
Continuous backups with point-in-time recovery ensure your data is protected.
### Scaling Options
Scale up or down without application downtime. Auto-scaling provisions storage capacity automatically.
### Global Clusters
Create globally distributed clusters that route data to the closest available region to minimize latency.
## Integrations
MongoDB Atlas integrates with popular services and tools:
- AWS services (Lambda, SageMaker, etc.)
- Google Cloud (Firebase, DataFlow, etc.)
- Microsoft Azure services
- Kafka
- Kubernetes
## Atlas Vector Search
MongoDB Atlas Vector Search enables you to build vector search applications by storing embeddings and performing k-nearest neighbor (k-NN) search.
Key capabilities include:
- Store vector embeddings along with your operational data
- Build semantic search applications
- Power recommendation engines
- Implement AI-powered applications
Create another file at `data/articles/rag-overview.md`:
# Retrieval-Augmented Generation (RAG): An Overview
Retrieval-Augmented Generation (RAG) is an AI framework that enhances large language models (LLMs) by retrieving relevant information from external knowledge sources to ground the model's responses in factual, up-to-date information.
## How RAG Works
The RAG process typically consists of three main stages:
1. **Retrieval**: The system queries a knowledge base to find information relevant to the input prompt
2. **Augmentation**: Retrieved information is added to the context provided to the LLM
3. **Generation**: The LLM generates a response based on both the prompt and the retrieved information
## Benefits of RAG
Retrieval-Augmented Generation offers several advantages:
### Reduced Hallucinations
By grounding responses in retrieved facts, RAG significantly reduces the tendency of LLMs to generate plausible-sounding but incorrect information.
### Up-to-date Information
RAG systems can access recent information beyond the LLM's training cutoff date, keeping responses current.
### Domain Specialization
RAG enables general-purpose LLMs to provide expert-level responses in specialized domains by retrieving domain-specific information.
### Transparency and Attribution
Information sources can be tracked and cited, improving transparency and trustworthiness.
### Cost Efficiency
Retrieving information can be more efficient than training ever-larger models to memorize more facts.
## Implementation Considerations
When implementing RAG, several factors must be considered:
### Knowledge Base Design
The structure, format, and organization of the knowledge base significantly impact retrieval effectiveness.
### Embedding Strategy
How documents are converted to vector embeddings affects search quality.
### Chunking Approach
The method used to divide documents into chunks can impact retrieval precision.
### Retrieval Algorithms
Different retrieval methods (BM25, vector search, hybrid approaches) have varying effectiveness depending on the use case.
### Context Window Management
Efficiently using the LLM's context window is essential for complex queries requiring multiple retrieved documents.
## Common Challenges
RAG implementations often face several challenges:
- Balancing retrieval precision and recall
- Handling contradictory information from multiple sources
- Managing context window limitations
- Addressing retrieval latency in real-time applications
Create a sample Q&A pairs file at `data/qa-pairs.json`:
[
{
"question": "What is MongoDB Atlas?",
"expected_source": "mongodb-atlas.md"
},
{
"question": "What security features does MongoDB Atlas offer?",
"expected_source": "mongodb-atlas.md"
},
{
"question": "How does RAG reduce hallucinations?",
"expected_source": "rag-overview.md"
},
{
"question": "What are the three main stages of RAG?",
"expected_source": "rag-overview.md"
},
{
"question": "What is Atlas Vector Search used for?",
"expected_source": "mongodb-atlas.md"
}
]
Configuration File
Create a configuration file at `src/config.js`:
require('dotenv').config();
module.exports = {
mongodb: {
uri: process.env.MONGODB_URI,
database: 'rag_workshop',
collection: 'documents'
},
embedding: {
provider: process.env.EMBEDDING_PROVIDER || 'openai',
apiKey: process.env.EMBEDDING_API_KEY,
model: process.env.EMBEDDING_MODEL || 'text-embedding-3-small',
dimensions: 1536
},
llm: {
provider: 'openai',
apiKey: process.env.OPENAI_API_KEY,
model: process.env.LLM_MODEL || 'gpt-3.5-turbo'
},
chunking: {
strategy: 'semantic',
maxChunkSize: 500,
overlap: 50
},
search: {
maxResults: 5,
minScore: 0.7,
returnSources: true
}
};
Document Ingestion
Create the document ingestion script at `src/ingest.js`:
const fs = require('fs-extra');
const path = require('path');
const { MongoRAG, Chunker } = require('mongodb-rag');
const config = require('./config');
// Initialize MongoRAG
const rag = new MongoRAG({
mongoUrl: config.mongodb.uri,
database: config.mongodb.database,
collection: config.mongodb.collection,
embedding: config.embedding
});
// Create a chunker
const chunker = new Chunker({
strategy: config.chunking.strategy,
maxChunkSize: config.chunking.maxChunkSize,
overlap: config.chunking.overlap
});
// Function to read and process markdown files
async function ingestMarkdownFiles(directory) {
try {
// Get all markdown files
const files = await fs.readdir(directory);
const markdownFiles = files.filter(file => file.endsWith('.md'));
console.log(`Found ${markdownFiles.length} markdown files to process`);
// Process each file
for (const filename of markdownFiles) {
const filePath = path.join(directory, filename);
const content = await fs.readFile(filePath, 'utf-8');
// Create document object
const document = {
id: path.basename(filename, '.md'),
content: content,
metadata: {
source: filename,
type: 'markdown',
created: new Date().toISOString(),
filename: filename
}
};
console.log(`Processing ${filename}...`);
// Chunk the document
const chunks = await chunker.chunkDocument(document);
console.log(`Created ${chunks.length} chunks from ${filename}`);
// Ingest the chunks
const result = await rag.ingestBatch(chunks);
console.log(`Ingested ${result.processed} chunks from ${filename}`);
}
console.log('Document ingestion complete!');
} catch (error) {
console.error('Error ingesting documents:', error);
} finally {
await rag.close();
}
}
// Main function
async function main() {
const articlesDir = path.join(__dirname, '../data/articles');
console.log('Starting document ingestion...');
console.log(`Using ${config.chunking.strategy} chunking strategy`);
console.log(`Max chunk size: ${config.chunking.maxChunkSize} characters`);
console.log(`Chunk overlap: ${config.chunking.overlap} characters`);
await rag.connect();
await ingestMarkdownFiles(articlesDir);
}
// Run the ingestion process
main().catch(console.error);
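Before these chunks can be searched, the collection needs an Atlas Vector Search index. Depending on how you set things up earlier in the workshop, mongodb-rag may already manage the index for you; if it doesn't, an index definition along the following lines can be created in the Atlas UI or with the Atlas CLI. Treat the field path "embedding" as an assumption: match it to wherever your version of mongodb-rag actually stores its vectors, and keep numDimensions in sync with the embedding model you configured:
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}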
Vector Search
Create the search functionality at `src/search.js`:
const { MongoRAG } = require('mongodb-rag');
const config = require('./config');
// Initialize MongoRAG
const rag = new MongoRAG({
mongoUrl: config.mongodb.uri,
database: config.mongodb.database,
collection: config.mongodb.collection,
embedding: config.embedding,
search: {
maxResults: config.search.maxResults,
minScore: config.search.minScore
}
});
/**
* Search for documents relevant to a query
* @param {string} query - The search query
* @param {Object} options - Search options
* @returns {Promise<Array>} - Search results
*/
async function searchDocuments(query, options = {}) {
try {
await rag.connect();
console.log(`Searching for: "${query}"`);
const results = await rag.search(query, {
maxResults: options.maxResults || config.search.maxResults,
filter: options.filter || {}
});
// Post-process results if needed
const processedResults = results.map(result => {
// Add source document if returnSources is enabled
if (config.search.returnSources) {
return {
...result,
source: result.metadata?.filename || 'unknown'
};
}
return result;
});
return processedResults;
} catch (error) {
console.error('Search error:', error);
throw error;
} finally {
await rag.close();
}
}
// Export for use in other modules
module.exports = {
searchDocuments
};
// If run directly, perform a test search
if (require.main === module) {
const testQuery = process.argv[2] || 'What is MongoDB Atlas?';
searchDocuments(testQuery)
.then(results => {
console.log(`Found ${results.length} results:`);
results.forEach((result, i) => {
console.log(`\nResult ${i+1} (score: ${result.score.toFixed(4)}):`);
console.log(`Source: ${result.source}`);
console.log(`Content: ${result.content.substring(0, 150)}...`);
});
})
.catch(console.error);
}
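Because searchDocuments forwards an optional filter to rag.search, you can scope a query to a subset of your documents. The snippet below is a hedged example of how that might look, assuming mongodb-rag accepts MongoDB-style filters over the chunk metadata written during ingestion (metadata.filename, metadata.type, and so on):
// Hypothetical usage: restrict results to chunks from a single source file.
// The exact filter shape depends on how mongodb-rag stores chunk metadata.
const { searchDocuments } = require('./search');

searchDocuments('How do global clusters reduce latency?', {
  maxResults: 3,
  filter: { 'metadata.filename': 'mongodb-atlas.md' }
})
  .then(results => {
    results.forEach(r => console.log(`${r.source}: ${r.content.substring(0, 80)}...`));
  })
  .catch(console.error);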
LLM Response Generation
Create the response generation module at `src/generate.js`:
const { OpenAI } = require('openai');
const config = require('./config');
const { searchDocuments } = require('./search');
// Initialize OpenAI client
const openai = new OpenAI({
apiKey: config.llm.apiKey
});
/**
* Generate a response using RAG
* @param {string} query - User query
* @param {Object} options - Generation options
* @returns {Promise<Object>} - Generated response and metadata
*/
async function generateResponse(query, options = {}) {
try {
// Step 1: Retrieve relevant documents
const searchResults = await searchDocuments(query, {
maxResults: options.maxResults || 3
});
if (searchResults.length === 0) {
return {
answer: "I couldn't find any relevant information to answer your question.",
sources: []
};
}
// Step 2: Format context from retrieved documents
const context = searchResults
.map(result => `Source: ${result.source}\nContent: ${result.content}`)
.join('\n\n');
// Step 3: Create prompt with context
const messages = [
{
role: 'system',
content: `You are a helpful assistant. Answer the user's question based ONLY on the provided context.
If the context doesn't contain relevant information, say "I don't have enough information to answer that question."
Always cite your sources at the end of your answer.`
},
{
role: 'user',
content: `Context:\n${context}\n\nQuestion: ${query}`
}
];
// Step 4: Generate response using LLM
const completion = await openai.chat.completions.create({
model: config.llm.model,
messages: messages,
temperature: options.temperature || 0.3,
max_tokens: options.maxTokens || 500
});
// Step 5: Return formatted response with sources
return {
answer: completion.choices[0].message.content,
sources: searchResults.map(result => ({
source: result.source,
score: result.score
}))
};
} catch (error) {
console.error('Error generating response:', error);
throw error;
}
}
// Export for use in other modules
module.exports = {
generateResponse
};
// If run directly, perform a test generation
if (require.main === module) {
const testQuery = process.argv[2] || 'What are the security features of MongoDB Atlas?';
generateResponse(testQuery)
.then(response => {
console.log('\n🤖 Generated Answer:');
console.log(response.answer);
console.log('\n📚 Sources:');
response.sources.forEach((source, i) => {
console.log(`${i+1}. ${source.source} (relevance: ${source.score.toFixed(4)})`);
});
})
.catch(console.error);
}
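The options object gives you per-request control over retrieval and generation. As a quick usage example (every option name here comes straight from the function above), you could tighten retrieval to two chunks and allow a longer answer:
const { generateResponse } = require('./generate');

generateResponse('What scaling options does MongoDB Atlas provide?', {
  maxResults: 2,    // retrieve fewer, more focused chunks
  temperature: 0.1, // keep the answer close to the retrieved context
  maxTokens: 800    // allow a longer response than the 500-token default
})
  .then(({ answer, sources }) => {
    console.log(answer);
    console.log('Cited sources:', sources.map(s => s.source).join(', '));
  })
  .catch(console.error);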
Main Application
Create the main application file at `src/index.js`:
const express = require('express');
const fs = require('fs-extra');
const path = require('path');
const { generateResponse } = require('./generate');
const { searchDocuments } = require('./search');
const config = require('./config');
// Initialize Express app
const app = express();
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
// Health check endpoint
app.get('/healthz', (req, res) => {
res.status(200).send('OK');
});
// Search endpoint
app.get('/api/search', async (req, res) => {
try {
const { query, maxResults } = req.query;
if (!query) {
return res.status(400).json({
error: 'Missing required parameter: query'
});
}
const results = await searchDocuments(query, {
maxResults: maxResults ? parseInt(maxResults, 10) : undefined
});
res.json({
query,
results
});
} catch (error) {
console.error('Search API error:', error);
res.status(500).json({
error: 'An error occurred during search',
message: error.message
});
}
});
// RAG endpoint
app.post('/api/rag', async (req, res) => {
try {
const { query, options } = req.body;
if (!query) {
return res.status(400).json({
error: 'Missing required parameter: query'
});
}
const response = await generateResponse(query, options);
res.json(response);
} catch (error) {
console.error('RAG API error:', error);
res.status(500).json({
error: 'An error occurred during response generation',
message: error.message
});
}
});
// Load and answer test questions
app.get('/api/test', async (req, res) => {
try {
const qaFilePath = path.join(__dirname, '../data/qa-pairs.json');
const qaData = await fs.readJSON(qaFilePath);
const results = [];
for (const qa of qaData) {
const response = await generateResponse(qa.question);
// Check if the expected source appears in the sources
const foundExpectedSource = response.sources.some(
source => source.source.includes(qa.expected_source)
);
results.push({
question: qa.question,
answer: response.answer,
expected_source: qa.expected_source,
found_expected_source: foundExpectedSource,
sources: response.sources
});
}
res.json({
total: qaData.length,
correct_sources: results.filter(r => r.found_expected_source).length,
results
});
} catch (error) {
console.error('Test API error:', error);
res.status(500).json({
error: 'An error occurred during test',
message: error.message
});
}
});
// Start the server
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
console.log(`🚀 RAG application server running on port ${PORT}`);
console.log(`📝 API Documentation:`);
console.log(` - GET /healthz Health check`);
console.log(` - GET /api/search?query=X Vector search`);
console.log(` - POST /api/rag RAG response generation`);
console.log(` - GET /api/test Run test suite`);
});
Running the Application
Now it's time to run your application:
1. First, ingest the documents:
   npm run ingest
2. Test the search functionality:
   node src/search.js "What security features does MongoDB Atlas provide?"
3. Test the RAG response generation:
   node src/generate.js "What are the main benefits of using RAG?"
4. Start the complete application:
   npm start
Testing the API
Once your application is running, you can test the API endpoints:
Search Endpoint
curl "http://localhost:3000/api/search?query=What%20is%20MongoDB%20Atlas?"
RAG Endpoint
curl -X POST http://localhost:3000/api/rag \
-H "Content-Type: application/json" \
-d '{"query":"How does RAG reduce hallucinations?"}'
Test Suite
curl http://localhost:3000/api/test
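If you'd rather exercise the endpoints from Node than from curl, a small client sketch using the global fetch available in Node 18+ looks like this (the URL and payload mirror the curl commands above):
// Minimal API client sketch (requires Node 18+ for the global fetch).
async function askRag(question) {
  const res = await fetch('http://localhost:3000/api/rag', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: question })
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}

askRag('How does RAG reduce hallucinations?')
  .then(({ answer, sources }) => {
    console.log(answer);
    console.log('Sources:', sources.map(s => s.source).join(', '));
  })
  .catch(console.error);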
Evaluating Your RAG Application
To evaluate the effectiveness of your RAG application, check:
- Retrieval Precision: Are the correct documents being retrieved?
- Response Accuracy: Are the generated answers correct and based on the retrieved information?
- Source Attribution: Does the system correctly cite its sources?
- Handling Edge Cases: Does it properly handle questions outside its knowledge base?
The `/api/test` endpoint helps evaluate these aspects automatically.
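The project structure also reserves tests/app.test.js, and npm test is already wired to Jest. A minimal sketch of what that file could contain is shown below: it replays the Q&A pairs through generateResponse and asserts that the expected source is cited. Treat it as a starting point rather than a finished suite, since it calls the live embedding and LLM APIs (hence the generous per-test timeout):
// tests/app.test.js - minimal sketch; calls real APIs, so run sparingly.
const path = require('path');
const fs = require('fs-extra');
const { generateResponse } = require('../src/generate');

const qaPairs = fs.readJSONSync(path.join(__dirname, '../data/qa-pairs.json'));

describe('RAG source attribution', () => {
  test.each(qaPairs)(
    'cites $expected_source for "$question"',
    async ({ question, expected_source }) => {
      const response = await generateResponse(question);
      const cited = response.sources.map(s => s.source);
      expect(cited).toEqual(
        expect.arrayContaining([expect.stringContaining(expected_source)])
      );
    },
    30000 // generous timeout for network-bound calls
  );
});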
Next Steps
Congratulations! You've built a complete RAG application with MongoDB Atlas. In the next section, we'll explore advanced techniques to enhance your application's capabilities.