Quick Recap
So far in this series, we’ve learned what MCP is, built servers, connected them to Claude, covered security, and explored production deployment. But one big question remains: how do you connect MCP to a database?
Think about it. An AI model that can query your database directly. Not you copying and pasting results manually, but the AI going in, reading, and analyzing on its own. It’s incredibly tempting — but also comes with serious risks.
This episode is exactly about that: building an MCP Server that connects your database to AI — securely and under control.
Why MCP + Database?
Let me share a few real scenarios that show why this combination is powerful:
Scenario 1: Sales analysis. The sales manager wants to know “How did last month’s sales compare to the same period last year?” Instead of someone writing SQL, the manager asks Claude directly. Claude queries through the MCP Server and analyzes the results.
Scenario 2: Debugging. A developer wants to understand why a specific user is having issues. Instead of manually running several queries, they ask AI to “check the status of user account #4521.” The AI runs multiple queries and produces a comprehensive report.
Scenario 3: Reporting. Weekly performance reports need to be generated. The AI runs the queries, analyzes results, and prepares the report automatically.
Secure Architecture
Before anything else, let me establish a critical principle: a database MCP Server should be read-only. Unless you really know what you’re doing and have sufficient security layers in place.
Why? Because AI models can make mistakes. If one has write access and fires off an incorrect DELETE or UPDATE, it’s a disaster. So, rule number one: read-only by default.
Security Layers
A well-built database MCP Server has multiple layers of protection:
Layer 1 — Restricted user: Create a database user that can only run SELECT on specified tables. No INSERT, no UPDATE, no DELETE (see the SQL sketch after this list).
Layer 2 — Query filtering: Before executing any query, verify it’s a SELECT. Reject everything else.
Layer 3 — Result limits: Cap the maximum number of returned rows (e.g., 1000). If someone tries to dump the entire database, they can’t.
Layer 4 — Timeouts: If a query runs longer than, say, 30 seconds, cancel it. Heavy queries shouldn’t bring down the server.
Layer 5 — Complete logging: Log every query that runs. Who executed it, what they ran, and what came back.
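Here’s what Layer 1 might look like in PostgreSQL, together with a database-side version of Layer 4. A minimal sketch; readonly_user, mydb, and the table names are placeholders for your own:

-- Layer 1: a user that can only SELECT from the whitelisted tables
CREATE USER readonly_user WITH PASSWORD 'change_me';
GRANT CONNECT ON DATABASE mydb TO readonly_user;
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON orders, products, customers, categories TO readonly_user;

-- Layer 4, enforced by the database itself: cancel any statement over 30 seconds
ALTER ROLE readonly_user SET statement_timeout = '30s';

Enforcing the timeout at the role level means even a bug in the MCP Server’s own checks can’t keep a runaway query alive.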
Building a Database MCP Server
Now let’s get practical. We’ll build an MCP Server that connects to PostgreSQL and provides three tools:
- query: Execute safe SELECT queries
- list_tables: List accessible tables
- describe_table: Show the structure of a specific table
Why these three? Because the AI first needs to understand what tables exist and what their structure is before writing queries. Without schema information, AI writes blind queries and gets poor results.
The list_tables Tool
The simplest tool. It returns the list of tables the AI can access. The important detail: it doesn’t show all tables — only those on a whitelist.
// Allowed tables - only these are visible to AI
const ALLOWED_TABLES = [
  'orders', 'products', 'customers', 'categories'
];

// Tool: list_tables
{
  name: "list_tables",
  description: "List all accessible database tables",
  handler: async () => {
    return ALLOWED_TABLES.map(t => ({ name: t }));
  }
}
The describe_table Tool
Returns a table’s structure — column names, types, and which one is the primary key. This information helps the AI build correct queries.
// Tool: describe_table
{
  name: "describe_table",
  description: "Show the columns and types of a table",
  inputSchema: {
    type: "object",
    properties: {
      table_name: { type: "string" }
    },
    required: ["table_name"]
  },
  handler: async ({ table_name }) => {
    // Whitelist check - never describe tables outside ALLOWED_TABLES
    if (!ALLOWED_TABLES.includes(table_name)) {
      throw new Error("Access denied");
    }
    const result = await db.query(`
      SELECT column_name, data_type, is_nullable
      FROM information_schema.columns
      WHERE table_name = $1
        AND table_schema = 'public'
      ORDER BY ordinal_position
    `, [table_name]);
    return result.rows;
  }
}
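For the orders table, the result might look something like this (illustrative output only; your columns will differ):

[
  { "column_name": "id",         "data_type": "integer",                     "is_nullable": "NO" },
  { "column_name": "product_id", "data_type": "integer",                     "is_nullable": "NO" },
  { "column_name": "total",      "data_type": "numeric",                     "is_nullable": "NO" },
  { "column_name": "created_at", "data_type": "timestamp without time zone", "is_nullable": "NO" }
]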
The query Tool
The most important and sensitive tool. It has multiple verification layers:
// Tool: query
{
  name: "query",
  description: "Run a read-only SELECT query",
  inputSchema: {
    type: "object",
    properties: {
      sql: { type: "string", description: "SELECT query to run" }
    },
    required: ["sql"]
  },
  handler: async ({ sql }) => {
    const normalized = sql.trim().toLowerCase();

    // Layer 1: Only SELECT allowed
    if (!normalized.startsWith('select')) {
      throw new Error("Only SELECT queries are allowed");
    }

    // Layer 2: Block dangerous keywords as whole words
    // (a plain substring check would wrongly reject columns like created_at)
    const blocked = /\b(insert|update|delete|drop|alter|create|truncate|grant)\b/;
    if (blocked.test(normalized)) {
      throw new Error("Query contains blocked keywords");
    }

    // Layer 3: Add LIMIT if missing (strip any trailing semicolon first)
    if (!/\blimit\b/.test(normalized)) {
      sql = sql.replace(/;\s*$/, '') + ' LIMIT 1000';
    }

    // Layer 4: Execute with a 30-second timeout
    // (assumes a db wrapper with per-query timeouts,
    //  e.g. backed by PostgreSQL's statement_timeout)
    const result = await db.query(sql, [], { timeout: 30000 });
    return result.rows;
  }
}
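A quick way to sanity-check the layers is to call the handler directly with queries that should pass and fail. A rough sketch, assuming the tool object above is stored in a variable named queryTool:

// Allowed: a plain SELECT (LIMIT 1000 gets appended automatically)
await queryTool.handler({ sql: "SELECT name, total FROM orders" });

// Rejected by Layer 1: not a SELECT
await queryTool.handler({ sql: "DELETE FROM orders" });
// -> Error: Only SELECT queries are allowed

// Rejected by Layer 2: blocked keyword hidden inside a SELECT
await queryTool.handler({ sql: "SELECT 1; DROP TABLE orders" });
// -> Error: Query contains blocked keywords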
Practical Example: Sales Analysis
Let’s see how this works in practice. Suppose you ask Claude:
“Compare Q1 sales this year with last year. Which product category had the highest growth?”
Claude first calls list_tables to see what’s available. Then it calls describe_table for orders, products, and categories to understand their structure. Then it builds an appropriate query:
SELECT t.category, t.sales_2026, t.sales_2025
FROM (
  SELECT c.name AS category,
         SUM(CASE WHEN o.created_at >= '2026-01-01'
                   AND o.created_at <  '2026-04-01'
                  THEN o.total ELSE 0 END) AS sales_2026,
         SUM(CASE WHEN o.created_at >= '2025-01-01'
                   AND o.created_at <  '2025-04-01'
                  THEN o.total ELSE 0 END) AS sales_2025
  FROM orders o
  JOIN products p ON o.product_id = p.id
  JOIN categories c ON p.category_id = c.id
  WHERE o.created_at >= '2025-01-01'
  GROUP BY c.name
) t
ORDER BY (t.sales_2026 - t.sales_2025) DESC
It retrieves the results and delivers a complete analysis — without you writing a single line of SQL.
Different Databases
I’ve been using PostgreSQL as an example, but a database MCP Server can connect to any database:
PostgreSQL: Best choice for getting started. Excellent JSON support, full-text search, and complex queries.
MySQL/MariaDB: If your project already uses MySQL, it works perfectly. The difference is mainly in the connection driver and some query syntax.
SQLite: Great for small, local projects. File-based and requires no installation. But not suitable for multiple concurrent users.
MongoDB: If your data is NoSQL, you can build an MCP Server that uses the aggregation pipeline. However, security filtering becomes more complex.
Ready-Made Servers
The good news is you don’t have to build from scratch. Several pre-built MCP Servers for databases exist:
- @modelcontextprotocol/server-postgres: Official Anthropic server for PostgreSQL
- @modelcontextprotocol/server-sqlite: Official server for SQLite
- Community servers: For MySQL, MongoDB, Redis, and many more
These are a great starting point. You can use them directly or fork and customize them.
Setting Up the Official PostgreSQL Server
The easiest way to start is with the official server. Just add it to your Claude Desktop config:
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://readonly_user:password@localhost/mydb"
      ]
    }
  }
}
Always use a read-only user and put the password in an environment variable, not directly in the config.
Performance Optimization
When AI starts querying, optimization becomes important:
Schema Caching: Cache table structure information. Don’t query the database for it every time; table schemas rarely change. (See the sketch after this list.)
Query Analysis: Before running a query, check with EXPLAIN how heavy it is. If it’s doing a full table scan on a million-row table, raise a warning.
Result Summarization: When results are very large, instead of returning all rows, provide a summary. For example, “54,321 rows found. Showing the first 10.”
Connection Pooling: Use a connection pool so you don’t create a new connection for every query. We covered this in more detail in the previous episode.
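As an example of the first point, here’s a minimal schema cache sketch, assuming the db and ALLOWED_TABLES from earlier. Entries expire after an hour so occasional schema changes still show up:

// Cache of table_name -> column info, with a 1-hour TTL
const schemaCache = new Map();
const SCHEMA_TTL_MS = 60 * 60 * 1000;

async function getSchema(tableName) {
  const cached = schemaCache.get(tableName);
  if (cached && Date.now() - cached.at < SCHEMA_TTL_MS) {
    return cached.rows; // serve from cache, no database round-trip
  }
  const result = await db.query(`
    SELECT column_name, data_type, is_nullable
    FROM information_schema.columns
    WHERE table_name = $1
      AND table_schema = 'public'
    ORDER BY ordinal_position
  `, [tableName]);
  schemaCache.set(tableName, { rows: result.rows, at: Date.now() });
  return result.rows;
}

The describe_table handler can then call getSchema instead of querying information_schema on every request.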
Data Privacy and Sensitivity
This section is critical, and most people overlook it.
When AI has database access, data gets sent to the AI model. If you’re using the Claude API, Anthropic has stated they don’t use data for training — but you should still be aware of these points:
1. Filter sensitive data: Remove columns like passwords, credit card numbers, and medical information from results. Even if the AI doesn’t need them, it’s better that it never sees them. (A sketch for this and the next point follows after the list.)
2. Masking: Some data like phone numbers and emails can be masked. For example, “555-***-4567” instead of the full number.
3. Row-Level Security: PostgreSQL supports Row-Level Security (RLS). You can configure it so the read-only user only sees specific rows.
4. Audit Trail: Log every query with user details. If something goes wrong, you need to be able to trace it back.
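Points 1 and 2 can be implemented as a post-processing step on query results. A rough sketch; the column lists here are hypothetical and should match your own schema:

// Columns that must never reach the AI, and columns to mask
const SENSITIVE_COLUMNS = ['password', 'card_number', 'ssn'];
const MASKED_COLUMNS = ['email', 'phone'];

function sanitizeRows(rows) {
  return rows.map(row => {
    const clean = {};
    for (const [key, value] of Object.entries(row)) {
      if (SENSITIVE_COLUMNS.includes(key)) continue; // drop entirely
      clean[key] = MASKED_COLUMNS.includes(key) ? mask(String(value)) : value;
    }
    return clean;
  });
}

// Keep only the first two and last two characters, e.g. "55***67"
function mask(value) {
  if (value.length <= 4) return '***';
  return value.slice(0, 2) + '***' + value.slice(-2);
}

The query tool’s handler would call sanitizeRows(result.rows) before returning anything to the model.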
Common Mistakes
Let me call out the mistakes people make most often:
Mistake 1: Granting write access. “But I just need one simple UPDATE!” — No. If you truly need write access, build a separate tool with human approval (a sketch follows after this list).
Mistake 2: No schema discovery. Without knowing table structures, AI writes bad queries and gets bad results.
Mistake 3: Running without timeouts. A heavy query can lock your database. Always set timeouts.
Mistake 4: Ignoring result volume. If a query returns 1 million rows, both memory and AI token costs explode. Always set a LIMIT.
Mistake 5: Connecting directly to production. First test on a database copy (replica). Make sure queries don’t put excessive load on the primary server.
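And if you genuinely do need writes, the Mistake 1 pattern (a separate tool gated on human approval) might look roughly like this. requestHumanApproval and rwDb are hypothetical stand-ins for your own approval flow and a separate read-write connection:

// Tool: update_order_status (write access, gated on human approval)
{
  name: "update_order_status",
  inputSchema: {
    type: "object",
    properties: {
      order_id: { type: "number" },
      status:   { type: "string" }
    },
    required: ["order_id", "status"]
  },
  handler: async ({ order_id, status }) => {
    // Hypothetical helper: blocks until a human approves or rejects
    const approved = await requestHumanApproval(
      `Set order ${order_id} status to "${status}"?`
    );
    if (!approved) {
      throw new Error("Change rejected by human reviewer");
    }
    // Parameterized query, executed by a separate read-write user
    await rwDb.query(
      'UPDATE orders SET status = $1 WHERE id = $2',
      [status, order_id]
    );
    return { ok: true };
  }
}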
Wrapping Up
Connecting MCP to a database is one of the most powerful — and most sensitive — things you can do. The keys to success are:
- Keep it read-only — write operations only with human approval
- Layer your security — restricted user + query filtering + result limits
- Include schema discovery — the AI needs to understand the structure
- Filter sensitive data — passwords, credit cards, personal information
- Log everything — for tracking and debugging