MongoDB内存使用就像一个"智慧缓存",但过高的内存占用往往是系统健康的预警信号。理解内存使用机制并掌握调优技巧,是每个MongoDB DBA的必修课。
一、MongoDB内存使用机制深度解析
1.1 WiredTiger存储引擎内存架构
现代MongoDB(3.2+)使用WiredTiger存储引擎,其内存架构采用多层次缓存设计:
MongoDB内存架构层次:
┌─────────────────────────────────────────┐
│ 应用层连接内存 │
│ - 连接池缓存 │
│ - 排序/聚合中间结果 │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ WiredTiger缓存层 │
│ ┌──────────────────────────────────┐ │
│ │ 内部缓存(Internal Cache) │ │
│ │ • 数据页缓存 │ │
│ │ • 索引页缓存 │ │
│ └──────────────────────────────────┘ │
│ ┌──────────────────────────────────┐ │
│ │ 文件系统缓存(OS Cache) │ │
│ │ • 压缩数据缓存 │ │
│ │ • 日志缓冲区 │ │
│ └──────────────────────────────────┘ │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ 操作系统内存管理 │
│ • 虚拟内存管理 │
│ • 页面交换(Swap) │
│ • 透明大页(Transparent HugePages) │
└─────────────────────────────────────────┘
1.2 内存使用计算公式
MongoDB的总内存使用量可以通过以下公式估算:
// MongoDB内存使用估算公式
function estimateMemoryUsage(config) {
// 1. WiredTiger缓存
const wiredTigerCache = config.wiredTigerCacheSizeGB * 1024 * 1024 * 1024;
// 2. 连接内存(每个连接约1MB)
const connectionMemory = config.maxConnections * 1024 * 1024;
// 3. 聚合/排序内存(默认100MB)
const sortAggMemory = 100 * 1024 * 1024;
// 4. 其他开销(约10%)
const otherOverhead = (wiredTigerCache + connectionMemory + sortAggMemory) * 0.1;
const totalMemory = wiredTigerCache + connectionMemory + sortAggMemory + otherOverhead;
return {
wiredTigerCacheMB: wiredTigerCache / (1024 * 1024),
connectionMemoryMB: connectionMemory / (1024 * 1024),
sortAggMemoryMB: sortAggMemory / (1024 * 1024),
otherOverheadMB: otherOverhead / (1024 * 1024),
totalMemoryMB: totalMemory / (1024 * 1024),
totalMemoryGB: totalMemory / (1024 * 1024 * 1024)
};
}
// 示例:默认配置估算
const defaultConfig = {
wiredTigerCacheSizeGB: 1, // 默认1GB或50%可用内存
maxConnections: 100
};
console.log(estimateMemoryUsage(defaultConfig));
二、内存占用诊断工具箱
2.1 实时内存监控脚本
// 实时内存监控与诊断系统
class MemoryDiagnosticSystem {
constructor(thresholds = {
cacheUsage: 0.9, // 缓存使用率90%
dirtyCache: 0.2, // 脏缓存比例20%
connectionRatio: 0.8, // 连接使用率80%
swapUsage: 0.1 // Swap使用率10%
}) {
this.thresholds = thresholds;
this.history = [];
this.MAX_HISTORY = 1000;
}
// 获取完整内存诊断报告
async getFullMemoryDiagnosis() {
const report = {
timestamp: new Date(),
wiredTiger: {},
connections: {},
operations: {},
system: {},
alerts: [],
recommendations: []
};
try {
// 1. 获取服务器状态
const serverStatus = db.runCommand({ serverStatus: 1 });
// 2. 分析WiredTiger缓存
report.wiredTiger = this.analyzeWiredTigerCache(serverStatus);
// 3. 分析连接内存
report.connections = this.analyzeConnections(serverStatus);
// 4. 分析操作内存使用
report.operations = this.analyzeOperationsMemory(serverStatus);
// 5. 获取系统级内存信息
report.system = await this.getSystemMemoryInfo();
// 6. 生成告警
report.alerts = this.generateMemoryAlerts(report);
// 7. 生成优化建议
report.recommendations = this.generateRecommendations(report);
// 保存历史记录
this.saveToHistory(report);
return report;
} catch (error) {
console.error(`内存诊断失败: ${error.message}`);
return null;
}
}
// 分析WiredTiger缓存
analyzeWiredTigerCache(serverStatus) {
const wt = serverStatus.wiredTiger?.cache || {};
const cacheUsage = wt['bytes currently in the cache'] / wt['maximum bytes configured'];
const dirtyRatio = wt['tracked dirty bytes in the cache'] / wt['bytes currently in the cache'];
return {
maxConfigured: this.formatBytes(wt['maximum bytes configured']),
currentlyUsed: this.formatBytes(wt['bytes currently in the cache']),
dirtyBytes: this.formatBytes(wt['tracked dirty bytes in the cache']),
cacheUsage: cacheUsage,
dirtyRatio: dirtyRatio,
evictionStats: {
pagesEvicted: wt['pages evicted from cache'],
checkpointBlocked: wt['pages evicted by application threads'],
evictionWalkPasses: wt['eviction walk passes']
}
};
}
// 分析连接内存
analyzeConnections(serverStatus) {
const conn = serverStatus.connections || {};
const usage = conn.current / conn.available;
return {
current: conn.current,
available: conn.available,
usage: usage,
memoryPerConnection: 1024 * 1024, // 估算1MB/连接
totalMemory: conn.current * 1024 * 1024
};
}
// 分析操作内存使用
analyzeOperationsMemory(serverStatus) {
const mem = serverStatus.mem || {};
const tcmalloc = serverStatus.tcmalloc || {};
return {
resident: this.formatBytes(mem.resident * 1024 * 1024),
virtual: this.formatBytes(mem.virtual * 1024 * 1024),
mapped: this.formatBytes(mem.mapped * 1024 * 1024),
mappedWithJournal: this.formatBytes(mem.mappedWithJournal * 1024 * 1024),
tcmalloc: {
heapSize: this.formatBytes(tcmalloc.generic?.heap_size || 0),
freeBytes: this.formatBytes(tcmalloc.generic?.current_allocated_bytes || 0)
}
};
}
// 获取系统内存信息(需要系统命令支持)
async getSystemMemoryInfo() {
// 这里简化处理,实际中可以通过执行系统命令获取
return {
total: '16 GB', // 示例值
used: '12 GB',
free: '4 GB',
swapUsed: '512 MB',
swapTotal: '2 GB'
};
}
// 生成内存告警
generateMemoryAlerts(report) {
const alerts = [];
// WiredTiger缓存使用率过高
if (report.wiredTiger.cacheUsage > this.thresholds.cacheUsage) {
alerts.push({
level: 'WARNING',
type: 'CACHE_USAGE_HIGH',
message: `WiredTiger缓存使用率 ${(report.wiredTiger.cacheUsage * 100).toFixed(1)}%`,
details: `当前使用: ${report.wiredTiger.currentlyUsed}, 最大配置: ${report.wiredTiger.maxConfigured}`
});
}
// 脏缓存比例过高
if (report.wiredTiger.dirtyRatio > this.thresholds.dirtyCache) {
alerts.push({
level: 'WARNING',
type: 'DIRTY_CACHE_HIGH',
message: `脏缓存比例 ${(report.wiredTiger.dirtyRatio * 100).toFixed(1)}%`,
details: '可能导致写入性能下降'
});
}
// 连接使用率过高
if (report.connections.usage > this.thresholds.connectionRatio) {
alerts.push({
level: 'WARNING',
type: 'CONNECTION_USAGE_HIGH',
message: `连接使用率 ${(report.connections.usage * 100).toFixed(1)}%`,
details: `当前连接: ${report.connections.current}/${report.connections.available}`
});
}
return alerts;
}
// 生成优化建议
generateRecommendations(report) {
const recommendations = [];
// 缓存相关建议
if (report.wiredTiger.cacheUsage > 0.8) {
recommendations.push({
priority: 'HIGH',
category: 'CACHE',
action: '考虑增加WiredTiger缓存大小',
command: 'mongod --wiredTigerCacheSizeGB=X',
details: '当前缓存使用率较高,增加缓存可提高性能'
});
}
if (report.wiredTiger.dirtyRatio > 0.15) {
recommendations.push({
priority: 'MEDIUM',
category: 'CACHE',
action: '检查写入负载,考虑调整checkpoint间隔',
command: 'db.adminCommand({setParameter:1, wiredTigerEngineRuntimeConfig:"checkpoint=(wait=60)"})',
details: '脏缓存比例较高,可能影响写入性能'
});
}
// 连接相关建议
if (report.connections.usage > 0.7) {
recommendations.push({
priority: 'MEDIUM',
category: 'CONNECTIONS',
action: '优化应用连接池配置',
details: '减少空闲连接,设置合理的连接超时'
});
}
return recommendations;
}
// 格式化字节显示
formatBytes(bytes) {
if (bytes === 0) return '0 Bytes';
const k = 1024;
const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
const i = Math.floor(Math.log(bytes) / Math.log(k));
return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];
}
// 保存历史记录
saveToHistory(report) {
this.history.push(report);
if (this.history.length > this.MAX_HISTORY) {
this.history.shift();
}
}
// 显示诊断报告
displayDiagnosisReport(report) {
if (!report) {
console.log('无法生成诊断报告');
return;
}
console.log('\n' + '='.repeat(80));
console.log('MongoDB内存诊断报告');
console.log('='.repeat(80));
console.log(`时间: ${report.timestamp.toLocaleString()}`);
// WiredTiger缓存信息
console.log('\n🔵 WiredTiger缓存分析:');
console.log(` 最大配置: ${report.wiredTiger.maxConfigured}`);
console.log(` 当前使用: ${report.wiredTiger.currentlyUsed}`);
console.log(` 使用率: ${(report.wiredTiger.cacheUsage * 100).toFixed(1)}%`);
console.log(` 脏缓存: ${report.wiredTiger.dirtyBytes}`);
console.log(` 脏缓存比例: ${(report.wiredTiger.dirtyRatio * 100).toFixed(1)}%`);
// 连接信息
console.log('\n🔗 连接分析:');
console.log(` 当前连接: ${report.connections.current}`);
console.log(` 可用连接: ${report.connections.available}`);
console.log(` 使用率: ${(report.connections.usage * 100).toFixed(1)}%`);
console.log(` 估算内存: ${this.formatBytes(report.connections.totalMemory)}`);
// 操作内存
console.log('\n💾 操作内存分析:');
console.log(` 常驻内存: ${report.operations.resident}`);
console.log(` 虚拟内存: ${report.operations.virtual}`);
console.log(` 映射内存: ${report.operations.mapped}`);
// 系统内存
console.log('\n🖥️ 系统内存:');
console.log(` 总内存: ${report.system.total}`);
console.log(` 已使用: ${report.system.used}`);
console.log(` 空闲: ${report.system.free}`);
console.log(` Swap使用: ${report.system.swapUsed}/${report.system.swapTotal}`);
// 告警
if (report.alerts.length > 0) {
console.log('\n🚨 内存告警:');
report.alerts.forEach(alert => {
const icon = {
'WARNING': '⚠️',
'CRITICAL': '🚨'
}[alert.level] || 'ℹ️';
console.log(` ${icon} [${alert.level}] ${alert.message}`);
console.log(` 详情: ${alert.details}`);
});
}
// 优化建议
if (report.recommendations.length > 0) {
console.log('\n💡 优化建议:');
report.recommendations.forEach((rec, idx) => {
const priorityIcon = {
'HIGH': '🔴',
'MEDIUM': '🟡',
'LOW': '🟢'
}[rec.priority] || '⚪';
console.log(` ${priorityIcon} ${rec.category}: ${rec.action}`);
if (rec.command) {
console.log(` 命令: ${rec.command}`);
}
console.log(` 说明: ${rec.details}`);
});
}
}
// 持续监控模式
startContinuousMonitoring(intervalSeconds = 30) {
console.log(`启动内存持续监控,间隔: ${intervalSeconds}秒`);
console.log('按 Ctrl+C 停止监控\n');
let iteration = 0;
const interval = setInterval(async () => {
iteration++;
console.log(`\n📊 内存检查 #${iteration} - ${new Date().toLocaleTimeString()}`);
const report = await this.getFullMemoryDiagnosis();
this.displayDiagnosisReport(report);
// 如果发现严重问题,立即告警
if (report && report.alerts.some(a => a.level === 'CRITICAL')) {
console.log('\n🚨 检测到严重问题,建议立即处理!');
}
}, intervalSeconds * 1000);
// 处理退出
process.on('SIGINT', () => {
clearInterval(interval);
console.log('\n内存监控已停止');
this.displaySummary();
process.exit(0);
});
}
// 显示监控摘要
displaySummary() {
if (this.history.length === 0) {
console.log('没有收集到监控数据');
return;
}
console.log('\n' + '='.repeat(80));
console.log('内存监控总结报告');
console.log('='.repeat(80));
// 分析历史趋势
const cacheUsageHistory = this.history.map(h => h.wiredTiger.cacheUsage);
const avgCacheUsage = cacheUsageHistory.reduce((a, b) => a + b, 0) / cacheUsageHistory.length;
const maxCacheUsage = Math.max(...cacheUsageHistory);
console.log(`监控次数: ${this.history.length}`);
console.log(`平均缓存使用率: ${(avgCacheUsage * 100).toFixed(1)}%`);
console.log(`最大缓存使用率: ${(maxCacheUsage * 100).toFixed(1)}%`);
// 统计告警类型
const alertCounts = {};
this.history.forEach(h => {
h.alerts.forEach(alert => {
alertCounts[alert.type] = (alertCounts[alert.type] || 0) + 1;
});
});
if (Object.keys(alertCounts).length > 0) {
console.log('\n📊 告警统计:');
Object.entries(alertCounts).forEach(([type, count]) => {
const percentage = (count / this.history.length * 100).toFixed(1);
console.log(` ${type}: ${count}次 (${percentage}%)`);
});
}
}
}
// 使用示例
const memoryDiagnostics = new MemoryDiagnosticSystem();
// 运行一次诊断
memoryDiagnostics.getFullMemoryDiagnosis().then(report => {
memoryDiagnostics.displayDiagnosisReport(report);
});
// 或者启动持续监控
// memoryDiagnostics.startContinuousMonitoring(60);
2.2 深度内存分析工具
// 深度内存分析工具包
class DeepMemoryAnalyzer {
constructor() {
this.collectionStats = new Map();
this.indexStats = new Map();
}
// 分析所有集合的内存使用
async analyzeCollectionMemoryUsage() {
const collections = db.getCollectionNames();
const analysis = [];
console.log(`分析 ${collections.length} 个集合的内存使用情况...`);
for (const collName of collections) {
try {
console.log(`正在分析: ${collName}`);
const stats = db[collName].stats({
scale: 1024 * 1024, // MB
indexDetails: true
});
// 获取索引统计
const indexDetails = await this.analyzeIndexes(collName);
// 估算内存使用
const memoryEstimate = this.estimateCollectionMemory(stats, indexDetails);
analysis.push({
name: collName,
sizeMB: stats.size,
storageMB: stats.storageSize,
count: stats.count,
avgObjSize: stats.avgObjSize,
indexes: stats.nindexes,
indexSizeMB: stats.totalIndexSize,
memoryEstimate: memoryEstimate,
hotScore: this.calculateHotScore(stats, indexDetails)
});
// 保存详细统计
this.collectionStats.set(collName, stats);
} catch (error) {
console.log(`无法分析集合 ${collName}: ${error.message}`);
}
}
// 排序并生成报告
this.generateCollectionMemoryReport(analysis);
return analysis;
}
// 分析索引内存使用
async analyzeIndexes(collectionName) {
const collection = db.getCollection(collectionName);
const indexes = [];
try {
// 获取索引统计
const indexStats = collection.aggregate([{ $indexStats: {} }]).toArray();
// 获取索引大小信息
const collStats = db[collectionName].stats();
if (collStats.indexSizes) {
Object.entries(collStats.indexSizes).forEach(([indexName, sizeBytes]) => {
const statsEntry = indexStats.find(s => s.name === indexName);
indexes.push({
name: indexName,
sizeMB: sizeBytes / (1024 * 1024),
accesses: statsEntry ? statsEntry.accesses.ops : 0,
since: statsEntry ? statsEntry.accesses.since : null,
spec: statsEntry ? statsEntry.spec : {}
});
});
}
this.indexStats.set(collectionName, indexes);
} catch (error) {
console.log(`无法分析索引 ${collectionName}: ${error.message}`);
}
return indexes;
}
// 估算集合内存使用
estimateCollectionMemory(stats, indexes) {
// 1. 数据缓存估算(假设30%的数据在缓存中)
const dataCache = stats.size * 0.3;
// 2. 索引缓存估算(假设50%的索引在缓存中)
const indexCache = stats.totalIndexSize * 0.5;
// 3. 活跃工作集估算
const workingSet = Math.min(stats.size * 0.2, 1024); // 最多1GB工作集
// 4. 排序/聚合内存(基于操作统计)
const operationMemory = 100; // 基础100MB
const total = dataCache + indexCache + workingSet + operationMemory;
return {
dataCacheMB: dataCache,
indexCacheMB: indexCache,
workingSetMB: workingSet,
operationMemoryMB: operationMemory,
totalMB: total
};
}
// 计算集合热评分
calculateHotScore(stats, indexes) {
let score = 0;
// 基于文档数量(权重20%)
const docScore = Math.min(stats.count / 1000000, 1) * 20;
score += docScore;
// 基于索引访问(权重30%)
const totalAccesses = indexes.reduce((sum, idx) => sum + idx.accesses, 0);
const accessScore = Math.min(Math.log10(totalAccesses + 1) / 3, 1) * 30;
score += accessScore;
// 基于索引大小(权重20%)
const indexRatio = stats.totalIndexSize / (stats.size || 1);
const indexScore = Math.min(indexRatio * 2, 1) * 20;
score += indexScore;
// 基于文档大小(权重30%)
const avgDocSizeKB = stats.avgObjSize / 1024;
const sizeScore = Math.min(avgDocSizeKB / 10, 1) * 30;
score += sizeScore;
return Math.min(score, 100);
}
// 生成集合内存报告
generateCollectionMemoryReport(analysis) {
// 按总内存估计排序
const sortedByMemory = [...analysis].sort((a, b) =>
b.memoryEstimate.totalMB - a.memoryEstimate.totalMB
);
// 按热评分排序
const sortedByHotScore = [...analysis].sort((a, b) =>
b.hotScore - a.hotScore
);
console.log('\n' + '='.repeat(100));
console.log('集合内存使用分析报告');
console.log('='.repeat(100));
// 内存使用最多的集合
console.log('\n🏆 内存使用TOP 10集合:');
sortedByMemory.slice(0, 10).forEach((coll, idx) => {
console.log(`${idx + 1}. ${coll.name}`);
console.log(` 文档数: ${coll.count.toLocaleString()}`);
console.log(` 数据大小: ${coll.sizeMB.toFixed(2)} MB`);
console.log(` 索引大小: ${coll.indexSizeMB.toFixed(2)} MB`);
console.log(` 估算内存: ${coll.memoryEstimate.totalMB.toFixed(2)} MB`);
console.log(` 热评分: ${coll.hotScore.toFixed(1)}/100`);
});
// 最热门的集合
console.log('\n🔥 最热门集合(TOP 10):');
sortedByHotScore.slice(0, 10).forEach((coll, idx) => {
const heatIcon = coll.hotScore > 80 ? '🔥' : coll.hotScore > 60 ? '⚡' : '🌡️';
console.log(`${idx + 1}. ${coll.name} ${heatIcon}`);
console.log(` 热评分: ${coll.hotScore.toFixed(1)}`);
console.log(` 估算缓存: ${coll.memoryEstimate.dataCacheMB.toFixed(2)} MB`);
console.log(` 估算索引缓存: ${coll.memoryEstimate.indexCacheMB.toFixed(2)} MB`);
});
// 索引使用分析
console.log('\n📊 索引内存使用分析:');
const totalIndexSize = analysis.reduce((sum, coll) => sum + coll.indexSizeMB, 0);
const totalDataSize = analysis.reduce((sum, coll) => sum + coll.sizeMB, 0);
const indexToDataRatio = (totalIndexSize / totalDataSize * 100).toFixed(1);
console.log(` 总索引大小: ${totalIndexSize.toFixed(2)} MB`);
console.log(` 总数据大小: ${totalDataSize.toFixed(2)} MB`);
console.log(` 索引/数据比例: ${indexToDataRatio}%`);
if (indexToDataRatio > 30) {
console.log(` ⚠️ 索引比例较高,建议检查索引设计`);
}
// 识别潜在问题
console.log('\n🔍 潜在问题识别:');
// 大集合识别
const largeCollections = analysis.filter(coll => coll.sizeMB > 1024); // >1GB
if (largeCollections.length > 0) {
console.log(` 发现 ${largeCollections.length} 个大集合(>1GB):`);
largeCollections.slice(0, 5).forEach(coll => {
console.log(` - ${coll.name}: ${(coll.sizeMB/1024).toFixed(2)} GB`);
});
}
// 索引过多的集合
const heavyIndexed = analysis.filter(coll => coll.indexes > 5);
if (heavyIndexed.length > 0) {
console.log(` 发现 ${heavyIndexed.length} 个索引过多的集合(>5个索引):`);
heavyIndexed.slice(0, 5).forEach(coll => {
console.log(` - ${coll.name}: ${coll.indexes} 个索引`);
});
}
// 生成优化建议
this.generateMemoryOptimizationSuggestions(analysis);
}
// 生成内存优化建议
generateMemoryOptimizationSuggestions(analysis) {
console.log('\n💡 内存优化建议:');
// 1. 压缩建议
const compressionCandidates = analysis.filter(coll =>
coll.storageMB > coll.sizeMB * 1.5 && coll.sizeMB > 500
);
if (compressionCandidates.length > 0) {
console.log(' 1. 考虑启用压缩的集合:');
compressionCandidates.slice(0, 5).forEach(coll => {
const ratio = (coll.storageMB / coll.sizeMB).toFixed(2);
console.log(` - ${coll.name}: 存储效率 ${ratio}`);
console.log(` 命令: db.${coll.name}.compact()`);
});
}
// 2. 索引优化建议
const inefficientIndexes = [];
for (const [collName, indexes] of this.indexStats.entries()) {
const unusedIndexes = indexes.filter(idx => idx.accesses === 0);
if (unusedIndexes.length > 0) {
inefficientIndexes.push({
collection: collName,
indexes: unusedIndexes
});
}
}
if (inefficientIndexes.length > 0) {
console.log(' 2. 未使用的索引:');
inefficientIndexes.slice(0, 3).forEach(item => {
console.log(` - ${item.collection}:`);
item.indexes.slice(0, 3).forEach(idx => {
console.log(` ${idx.name} (${idx.sizeMB.toFixed(2)}MB)`);
});
});
}
// 3. 分区建议
const partitionCandidates = analysis.filter(coll =>
coll.count > 1000000 && coll.hotScore > 70
);
if (partitionCandidates.length > 0) {
console.log(' 3. 考虑分区的集合:');
partitionCandidates.slice(0, 3).forEach(coll => {
console.log(` - ${coll.name}: ${coll.count.toLocaleString()} 文档`);
});
}
// 4. 归档建议
const archiveCandidates = analysis.filter(coll => {
// 检查是否有时间字段
try {
const sample = db[coll.name].findOne({}, { _id: 0 });
const hasDateField = Object.values(sample || {}).some(
val => val instanceof Date
);
return hasDateField && coll.sizeMB > 2048; // >2GB
} catch {
return false;
}
});
if (archiveCandidates.length > 0) {
console.log(' 4. 考虑归档的集合:');
archiveCandidates.slice(0, 3).forEach(coll => {
console.log(` - ${coll.name}: ${(coll.sizeMB/1024).toFixed(2)} GB`);
});
}
}
// 分析内存使用模式
analyzeMemoryPatterns(days = 7) {
console.log(`\n📈 分析最近${days}天的内存使用模式...`);
// 这里需要访问历史监控数据
// 实际应用中可以从监控系统获取数据
const patterns = {
peakHours: [],
growthTrend: 0,
seasonalPatterns: []
};
console.log(' 内存使用模式分析需要历史监控数据支持');
console.log(' 建议使用MongoDB Atlas或第三方监控工具');
return patterns;
}
}
// 使用示例
const memoryAnalyzer = new DeepMemoryAnalyzer();
// 分析集合内存使用
memoryAnalyzer.analyzeCollectionMemoryUsage();
三、内存优化实战方案
3.1 WiredTiger缓存优化配置
# MongoDB配置文件优化示例 (mongod.conf)
systemLog:
destination: file
path: /var/log/mongodb/mongod.log
logAppend: true
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
engine: wiredTiger
wiredTiger:
engineConfig:
# 🔥 关键配置:缓存大小,推荐值为以下两者中较小的:
# 1. 50%的(RAM - 1GB)
# 2. 256GB(WiredTiger最大限制)
cacheSizeGB: 8
# 内部缓存最大值(默认与cacheSizeGB相同)
# maximumCacheOverflowFileSizeGB: 0
# 检查点间隔(秒)
checkpoint: (wait=60)
# 日志压缩
journalCompressor: snappy
# 数据块压缩
blockCompressor: zstd # 可选:none, snappy, zlib, zstd
collectionConfig:
# 集合配置
blockCompressor: zstd
indexConfig:
# 索引前缀压缩
prefixCompression: true
# 操作限制
operationProfiling:
mode: slowOp
slowOpThresholdMs: 100
# 限制排序内存使用(默认100MB)
# filter: { "planCacheKey": { $exists: false } }
# 网络配置
net:
port: 27017
bindIp: 127.0.0.1
maxIncomingConnections: 1000 # 根据实际情况调整
wireObjectCheck: false # 减少对象检查开销
# 进程管理
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod.pid
# 安全配置
security:
authorization: enabled
keyFile: /var/lib/mongodb/keyfile
3.2 动态调整内存参数
// 动态内存参数调整工具
class MemoryTuner {
constructor() {
this.baselineMetrics = null;
this.tuningHistory = [];
}
// 自动调优WiredTiger缓存
async autoTuneWiredTigerCache() {
console.log('开始自动调优WiredTiger缓存...');
// 1. 获取当前系统信息
const systemInfo = await this.getSystemInfo();
const currentStatus = db.runCommand({ serverStatus: 1 });
// 2. 计算推荐缓存大小
const recommendedCache = this.calculateOptimalCacheSize(systemInfo, currentStatus);
// 3. 获取当前配置
const currentCache = await this.getCurrentCacheSize();
console.log(`当前缓存: ${currentCache}GB`);
console.log(`推荐缓存: ${recommendedCache}GB`);
// 4. 判断是否需要调整
if (Math.abs(recommendedCache - currentCache) > 1) {
console.log(`建议调整缓存大小到 ${recommendedCache}GB`);
const confirm = await this.promptConfirmation('是否应用新的缓存大小?');
if (confirm) {
await this.adjustCacheSize(recommendedCache);
console.log('缓存大小已调整,需要重启MongoDB服务生效');
}
} else {
console.log('当前缓存大小合理,无需调整');
}
return {
current: currentCache,
recommended: recommendedCache,
adjusted: confirm ? recommendedCache : null
};
}
// 计算最优缓存大小
calculateOptimalCacheSize(systemInfo, mongoStatus) {
const totalMemoryGB = systemInfo.totalMemoryGB;
// 规则1: 保留至少1GB给操作系统和其他进程
const availableForMongo = totalMemoryGB - 1;
// 规则2: 计算工作集大小
const workingSetGB = this.calculateWorkingSetSize(mongoStatus);
// 规则3: WiredTiger最大支持256GB
const maxWiredTigerCache = 256;
// 推荐值:工作集大小、可用内存的50%、256GB 三者取最小
const recommended = Math.min(
workingSetGB * 1.2, // 工作集的120%
availableForMongo * 0.5, // 可用内存的50%
maxWiredTigerCache // WiredTiger最大限制
);
// 确保最小1GB
return Math.max(1, Math.floor(recommended));
}
// 计算工作集大小
calculateWorkingSetSize(mongoStatus) {
// 工作集 = 活跃数据 + 索引 + 操作内存
let workingSetMB = 0;
// 获取所有数据库统计
const dbs = db.adminCommand({ listDatabases: 1 }).databases;
dbs.forEach(dbInfo => {
if (!['admin', 'local', 'config'].includes(dbInfo.name)) {
try {
const dbStats = db.getSiblingDB(dbInfo.name).runCommand({
dbStats: 1,
scale: 1024 * 1024 // MB
});
// 假设20%的数据是活跃的
const activeData = dbStats.dataSize * 0.2;
// 假设50%的索引是活跃的
const activeIndexes = dbStats.indexSize * 0.5;
workingSetMB += activeData + activeIndexes;
} catch (error) {
// 跳过无法访问的数据库
}
}
});
// 转换为GB
const workingSetGB = workingSetMB / 1024;
return workingSetGB;
}
// 获取当前缓存大小
async getCurrentCacheSize() {
const config = db.adminCommand({ getCmdLineOpts: 1 });
const wiredTigerConfig = config.parsed?.storage?.wiredTiger?.engineConfig;
if (wiredTigerConfig?.cacheSizeGB) {
return wiredTigerConfig.cacheSizeGB;
}
// 从启动参数中查找
const argv = config.argv || [];
for (let i = 0; i < argv.length; i++) {
if (argv[i] === '--wiredTigerCacheSizeGB' && argv[i + 1]) {
return parseFloat(argv[i + 1]);
}
}
// 默认值
return 1;
}
// 调整缓存大小
async adjustCacheSize(newSizeGB) {
console.log(`正在调整缓存大小为 ${newSizeGB}GB...`);
// 这里需要修改配置文件或启动参数
// 实际应用中需要根据部署方式调整
const recommendation = {
action: 'update_config',
configFile: '/etc/mongod.conf',
changes: [
{
section: 'storage.wiredTiger.engineConfig',
key: 'cacheSizeGB',
value: newSizeGB
}
],
restartCommand: 'sudo systemctl restart mongod'
};
console.log('请手动执行以下操作:');
console.log(`1. 编辑配置文件: ${recommendation.configFile}`);
console.log(`2. 将 storage.wiredTiger.engineConfig.cacheSizeGB 设置为 ${newSizeGB}`);
console.log(`3. 重启MongoDB: ${recommendation.restartCommand}`);
return recommendation;
}
// 优化索引内存使用
optimizeIndexMemory() {
console.log('\n优化索引内存使用...');
const recommendations = [];
// 分析所有集合的索引
const collections = db.getCollectionNames();
collections.forEach(collName => {
try {
const coll = db.getCollection(collName);
const stats = coll.stats();
// 检查索引数量
if (stats.nindexes > 10) {
recommendations.push({
collection: collName,
issue: '索引数量过多',
current: stats.nindexes,
suggestion: '考虑合并或删除不常用的索引',
command: `db.${collName}.getIndexes()`
});
}
// 检查索引大小
const indexToDataRatio = stats.totalIndexSize / (stats.size || 1);
if (indexToDataRatio > 2) {
recommendations.push({
collection: collName,
issue: '索引大小远超数据大小',
ratio: indexToDataRatio.toFixed(2),
suggestion: '检查索引设计,删除重复或冗余索引'
});
}
} catch (error) {
// 跳过无法访问的集合
}
});
// 显示建议
if (recommendations.length > 0) {
console.log('索引优化建议:');
recommendations.slice(0, 5).forEach(rec => {
console.log(`- ${rec.collection}: ${rec.issue}`);
console.log(` 当前: ${rec.current || rec.ratio}`);
console.log(` 建议: ${rec.suggestion}`);
});
} else {
console.log('索引使用情况良好');
}
return recommendations;
}
// 获取系统信息
async getSystemInfo() {
// 这里简化处理,实际中可以通过系统命令获取
return {
totalMemoryGB: 16,
freeMemoryGB: 4,
cpuCores: 8,
os: 'Linux'
};
}
// 用户确认提示
async promptConfirmation(question) {
// 实际应用中可以使用readline模块
return true; // 简化处理
}
}
// 使用示例
const tuner = new MemoryTuner();
// 自动调优缓存
tuner.autoTuneWiredTigerCache();
// 优化索引内存
tuner.optimizeIndexMemory();
3.3 紧急内存问题处理脚本
// MongoDB内存紧急处理工具箱
class MemoryEmergencyToolkit {
constructor() {
this.problemHistory = [];
}
// 诊断内存紧急问题
diagnoseMemoryEmergency() {
console.log('🔍 诊断内存紧急问题...');
const diagnosis = {
timestamp: new Date(),
issues: [],
immediateActions: [],
rootCauses: []
};
try {
// 1. 检查服务器状态
const status = db.runCommand({ serverStatus: 1 });
// 2. 检查WiredTiger缓存
const wtCache = status.wiredTiger?.cache || {};
const cacheUsage = wtCache['bytes currently in the cache'] / wtCache['maximum bytes configured'];
if (cacheUsage > 0.95) {
diagnosis.issues.push({
type: 'CACHE_CRITICAL',
severity: 'CRITICAL',
message: 'WiredTiger缓存使用率超过95%',
details: `使用率: ${(cacheUsage * 100).toFixed(1)}%`
});
diagnosis.immediateActions.push('清理WiredTiger缓存');
diagnosis.rootCauses.push('缓存配置过小或工作集过大');
}
// 3. 检查连接内存
const conn = status.connections;
const connUsage = conn.current / conn.available;
if (connUsage > 0.9) {
diagnosis.issues.push({
type: 'CONNECTIONS_CRITICAL',
severity: 'CRITICAL',
message: '连接数使用率超过90%',
details: `当前: ${conn.current}, 可用: ${conn.available}`
});
diagnosis.immediateActions.push('减少应用连接或重启MongoDB');
diagnosis.rootCauses.push('连接泄露或配置不当');
}
// 4. 检查操作内存
const currentOps = db.currentOp().inprog;
const largeOps = currentOps.filter(op =>
op.memUsage > 100 * 1024 * 1024 && // 超过100MB
(op.secs_running || 0) > 60 // 运行超过60秒
);
if (largeOps.length > 0) {
diagnosis.issues.push({
type: 'LARGE_OPERATIONS',
severity: 'HIGH',
message: `发现 ${largeOps.length} 个大内存操作`,
details: '可能消耗过多内存'
});
diagnosis.immediateActions.push('终止大内存操作');
diagnosis.rootCauses.push('查询或聚合操作设计不当');
}
// 5. 检查系统Swap使用
// 实际中需要通过系统命令获取
// 6. 检查内存分配器
const tcmalloc = status.tcmalloc || {};
if (tcmalloc.generic?.heap_size > 10 * 1024 * 1024 * 1024) { // >10GB
diagnosis.issues.push({
type: 'MALLOC_FRAGMENTATION',
severity: 'MEDIUM',
message: '内存分配器碎片化',
details: `堆大小: ${(tcmalloc.generic.heap_size / (1024*1024*1024)).toFixed(2)}GB`
});
}
} catch (error) {
diagnosis.issues.push({
type: 'DIAGNOSIS_FAILED',
severity: 'CRITICAL',
message: '诊断失败',
details: error.message
});
}
this.displayEmergencyReport(diagnosis);
this.problemHistory.push(diagnosis);
return diagnosis;
}
// 紧急处理措施
emergencyProcedures() {
console.log('\n🚨 紧急处理措施:');
const procedures = [
{
id: 1,
action: '清理WiredTiger缓存',
command: 'db.adminCommand({clearLog: "global"})',
description: '清理全局日志,释放部分内存',
risk: '低',
impact: '可能丢失部分诊断信息'
},
{
id: 2,
action: '终止大内存操作',
command: 'db.killOp(opid)',
description: '终止消耗过多内存的操作',
risk: '中',
impact: '操作被中断,需要客户端重试'
},
{
id: 3,
action: '减少连接数',
command: 'db.adminCommand({setParameter: 1, maxConnections: 500})',
description: '临时减少最大连接数',
risk: '中',
impact: '新连接可能被拒绝'
},
{
id: 4,
action: '刷新文件系统缓存',
command: 'echo 3 > /proc/sys/vm/drop_caches',
description: '清理Linux文件系统缓存',
risk: '高',
impact: '可能导致性能下降,需要root权限'
},
{
id: 5,
action: '重启MongoDB',
command: 'sudo systemctl restart mongod',
description: '彻底重启释放所有内存',
risk: '高',
impact: '服务暂时不可用'
}
];
procedures.forEach(proc => {
console.log(`${proc.id}. ${proc.action}`);
console.log(` 命令: ${proc.command}`);
console.log(` 风险: ${proc.risk}, 影响: ${proc.impact}`);
console.log(` 描述: ${proc.description}`);
});
return procedures;
}
// 清理WiredTiger缓存
cleanWiredTigerCache() {
console.log('清理WiredTiger缓存...');
try {
// 1. 清理全局日志
db.adminCommand({ clearLog: 'global' });
console.log('✅ 全局日志已清理');
// 2. 清理慢查询日志
db.setProfilingLevel(0);
db.system.profile.drop();
db.setProfilingLevel(1, 100);
console.log('✅ 慢查询日志已清理');
// 3. 清理诊断数据
db.adminCommand({ clearLog: 'diagnostic' });
console.log('✅ 诊断日志已清理');
return true;
} catch (error) {
console.error(`清理失败: ${error.message}`);
return false;
}
}
// 终止大内存操作
killLargeMemoryOperations(thresholdMB = 100) {
console.log(`查找并终止内存使用超过${thresholdMB}MB的操作...`);
const operations = db.currentOp().inprog;
const largeOps = operations.filter(op => {
// 估算操作内存使用
const memUsage = this.estimateOperationMemory(op);
return memUsage > thresholdMB * 1024 * 1024;
});
if (largeOps.length === 0) {
console.log('✅ 未发现大内存操作');
return [];
}
console.log(`发现 ${largeOps.length} 个大内存操作:`);
const killedOps = [];
largeOps.forEach(op => {
console.log(`\n操作ID: ${op.opid}`);
console.log(`命名空间: ${op.ns}`);
console.log(`运行时间: ${op.secs_running}秒`);
console.log(`操作: ${op.op}`);
if (op.command) {
console.log(`命令: ${JSON.stringify(op.command).substring(0, 100)}...`);
}
// 在实际应用中,这里需要用户确认
// const shouldKill = confirm('是否终止此操作?');
// if (shouldKill) {
// db.killOp(op.opid);
// console.log('操作已终止');
// killedOps.push(op.opid);
// }
});
return killedOps;
}
// 估算操作内存使用
estimateOperationMemory(operation) {
// 简化估算,实际中需要更复杂的逻辑
let estimatedMemory = 0;
if (operation.op === 'query') {
// 查询操作:基于返回文档大小估算
estimatedMemory = 10 * 1024 * 1024; // 10MB基础
} else if (operation.op === 'insert' || operation.op === 'update') {
// 写入操作:基于写入数据量
estimatedMemory = 5 * 1024 * 1024; // 5MB基础
} else if (operation.op === 'command' && operation.command?.aggregate) {
// 聚合操作:可能使用大量内存
estimatedMemory = 50 * 1024 * 1024; // 50MB基础
} else if (operation.op === 'getmore') {
// 游标获取:基于批次大小
estimatedMemory = 2 * 1024 * 1024; // 2MB基础
}
// 基于运行时间调整
const runtimeFactor = Math.min((operation.secs_running || 0) / 10, 5);
estimatedMemory *= (1 + runtimeFactor);
return estimatedMemory;
}
// 优化内存配置
optimizeMemoryConfig() {
console.log('优化内存配置...');
const optimizations = [];
// 检查当前配置
const config = db.adminCommand({ getCmdLineOpts: 1 });
// 1. 检查WiredTiger缓存
const currentCache = this.getCurrentCacheSize();
const recommendedCache = this.calculateRecommendedCache();
if (currentCache < recommendedCache * 0.8) {
optimizations.push({
parameter: 'wiredTigerCacheSizeGB',
current: currentCache,
recommended: recommendedCache,
command: 'mongod --wiredTigerCacheSizeGB=' + recommendedCache
});
}
// 2. 检查连接配置
const maxConnections = config.parsed?.net?.maxIncomingConnections || 65536;
if (maxConnections > 10000) {
optimizations.push({
parameter: 'maxIncomingConnections',
current: maxConnections,
recommended: 10000,
command: 'net.maxIncomingConnections: 10000'
});
}
// 3. 检查聚合内存限制
const internalQueryExecMaxBlockingSortBytes =
db.adminCommand({ getParameter: 1, internalQueryExecMaxBlockingSortBytes: 1 })
.internalQueryExecMaxBlockingSortBytes || 100 * 1024 * 1024;
if (internalQueryExecMaxBlockingSortBytes > 100 * 1024 * 1024) {
optimizations.push({
parameter: 'internalQueryExecMaxBlockingSortBytes',
current: internalQueryExecMaxBlockingSortBytes / (1024*1024) + 'MB',
recommended: '100MB',
command: 'setParameter internalQueryExecMaxBlockingSortBytes=104857600'
});
}
// 显示优化建议
if (optimizations.length > 0) {
console.log('配置优化建议:');
optimizations.forEach(opt => {
console.log(`\n- ${opt.parameter}:`);
console.log(` 当前: ${opt.current}`);
console.log(` 推荐: ${opt.recommended}`);
console.log(` 命令: ${opt.command}`);
});
} else {
console.log('✅ 内存配置合理');
}
return optimizations;
}
// 获取当前缓存大小
getCurrentCacheSize() {
// 简化实现,同前文
return 1;
}
// 计算推荐缓存大小
calculateRecommendedCache() {
// 简化实现
return 8;
}
// 显示紧急报告
displayEmergencyReport(diagnosis) {
console.log('\n' + '='.repeat(80));
console.log('内存紧急问题诊断报告');
console.log('='.repeat(80));
console.log(`时间: ${diagnosis.timestamp.toLocaleString()}`);
if (diagnosis.issues.length === 0) {
console.log('✅ 未发现紧急内存问题');
return;
}
// 按严重程度排序
const severityOrder = { CRITICAL: 0, HIGH: 1, MEDIUM: 2, LOW: 3 };
const sortedIssues = [...diagnosis.issues].sort(
(a, b) => severityOrder[a.severity] - severityOrder[b.severity]
);
console.log('\n🚨 发现问题:');
sortedIssues.forEach(issue => {
const icon = {
CRITICAL: '🔴',
HIGH: '🟠',
MEDIUM: '🟡',
LOW: '🟢'
}[issue.severity] || '⚪';
console.log(`${icon} ${issue.type}: ${issue.message}`);
console.log(` 详情: ${issue.details}`);
});
if (diagnosis.immediateActions.length > 0) {
console.log('\n⚡ 立即行动:');
diagnosis.immediateActions.forEach((action, idx) => {
console.log(`${idx + 1}. ${action}`);
});
}
if (diagnosis.rootCauses.length > 0) {
console.log('\n🔍 根本原因:');
diagnosis.rootCauses.forEach((cause, idx) => {
console.log(`${idx + 1}. ${cause}`);
});
}
}
// 生成恢复计划
generateRecoveryPlan() {
console.log('\n📋 内存问题恢复计划:');
const plan = [
{
phase: '立即行动(5分钟内)',
actions: [
'使用 emergencyProcedures() 选择并执行紧急措施',
'监控内存使用变化',
'记录所有操作步骤'
]
},
{
phase: '短期缓解(1小时内)',
actions: [
'优化索引配置',
'调整连接池设置',
'设置内存使用告警'
]
},
{
phase: '中期解决(24小时内)',
actions: [
'重新评估工作集大小',
'优化查询和聚合操作',
'考虑数据归档或分区'
]
},
{
phase: '长期预防(1周内)',
actions: [
'建立内存使用基线',
'实现自动化监控',
'定期进行内存健康检查',
'制定容量规划策略'
]
}
];
plan.forEach(phase => {
console.log(`\n${phase.phase}:`);
phase.actions.forEach((action, idx) => {
console.log(` ${idx + 1}. ${action}`);
});
});
return plan;
}
}
// 使用示例
const emergencyKit = new MemoryEmergencyToolkit();
// 诊断紧急问题
emergencyKit.diagnoseMemoryEmergency();
// 查看紧急处理措施
emergencyKit.emergencyProcedures();
// 生成恢复计划
emergencyKit.generateRecoveryPlan();
四、内存问题预防策略
4.1 监控与告警配置
# 内存监控告警规则配置
memory_monitoring:
# WiredTiger缓存监控
wiredtiger_cache:
warning_threshold: 80% # 使用率超过80%告警
critical_threshold: 90% # 使用率超过90%紧急告警
check_interval: 60s # 检查间隔
# 连接内存监控
connections:
warning_threshold: 70% # 连接使用率
max_connections: 1000 # 最大连接数
# 操作内存监控
operations:
max_sort_memory: 100MB # 排序内存限制
max_agg_memory: 100MB # 聚合内存限制
# 系统内存监控
system:
swap_warning: 10% # Swap使用警告
swap_critical: 30% # Swap使用紧急
# 趋势分析
trends:
memory_growth_warning: 10%/day # 日增长警告
peak_usage_analysis: true # 峰值使用分析
seasonal_patterns: true # 季节模式分析
4.2 容量规划工具
// 内存容量规划工具
class MemoryCapacityPlanner {
constructor(historicalData = []) {
this.historicalData = historicalData;
this.growthRates = {};
}
// 预测未来内存需求
forecastMemoryRequirements(months = 12, confidenceLevel = 0.95) {
console.log(`预测未来 ${months} 个月的内存需求...`);
const forecast = {
timestamp: new Date(),
months: months,
confidenceLevel: confidenceLevel,
scenarios: {
optimistic: {},
realistic: {},
pessimistic: {}
},
recommendations: []
};
// 分析历史增长趋势
this.analyzeGrowthTrends();
// 计算基础工作集
const baseWorkingSet = this.calculateBaseWorkingSet();
// 预测不同场景
forecast.scenarios.optimistic = this.calculateScenario(
baseWorkingSet,
this.growthRates.avg * 0.8, // 80%的平均增长率
months
);
forecast.scenarios.realistic = this.calculateScenario(
baseWorkingSet,
this.growthRates.avg, // 平均增长率
months
);
forecast.scenarios.pessimistic = this.calculateScenario(
baseWorkingSet,
this.growthRates.avg * 1.2, // 120%的平均增长率
months
);
// 生成建议
forecast.recommendations = this.generateCapacityRecommendations(forecast);
this.displayForecastReport(forecast);
return forecast;
}
// 分析增长趋势
analyzeGrowthTrends() {
if (this.historicalData.length < 2) {
console.log('历史数据不足,使用默认增长率');
this.growthRates = {
avg: 0.1, // 月增长10%
min: 0.05,
max: 0.15,
trend: 'steady'
};
return;
}
// 计算月增长率
const growthRates = [];
for (let i = 1; i < this.historicalData.length; i++) {
const prev = this.historicalData[i - 1].workingSetGB;
const curr = this.historicalData[i].workingSetGB;
if (prev > 0) {
const growthRate = (curr - prev) / prev;
growthRates.push(growthRate);
}
}
const avg = growthRates.reduce((a, b) => a + b, 0) / growthRates.length;
const min = Math.min(...growthRates);
const max = Math.max(...growthRates);
// 判断趋势
let trend = 'steady';
if (avg > 0.15) trend = 'rapid_growth';
else if (avg < 0.05) trend = 'slow_growth';
this.growthRates = {
avg: avg,
min: min,
max: max,
trend: trend
};
console.log(`平均月增长率: ${(avg * 100).toFixed(1)}%`);
console.log(`增长趋势: ${trend}`);
}
// 计算基础工作集
calculateBaseWorkingSet() {
// 获取当前数据库统计
let totalDataGB = 0;
let totalIndexGB = 0;
const dbs = db.adminCommand({ listDatabases: 1 }).databases;
dbs.forEach(dbInfo => {
if (!['admin', 'local', 'config'].includes(dbInfo.name)) {
try {
const dbStats = db.getSiblingDB(dbInfo.name).runCommand({
dbStats: 1,
scale: 1024 * 1024 * 1024 // GB
});
totalDataGB += dbStats.dataSize;
totalIndexGB += dbStats.indexSize;
} catch (error) {
// 跳过无法访问的数据库
}
}
});
// 工作集估算(活跃数据 + 活跃索引)
const workingSetGB = totalDataGB * 0.2 + totalIndexGB * 0.5;
return {
totalDataGB: totalDataGB,
totalIndexGB: totalIndexGB,
workingSetGB: workingSetGB,
timestamp: new Date()
};
}
// 计算场景预测
calculateScenario(baseWorkingSet, monthlyGrowthRate, months) {
const predictions = [];
let current = baseWorkingSet.workingSetGB;
for (let i = 1; i <= months; i++) {
current = current * (1 + monthlyGrowthRate);
predictions.push({
month: i,
workingSetGB: current,
totalDataGB: baseWorkingSet.totalDataGB * (1 + monthlyGrowthRate) ** i,
totalIndexGB: baseWorkingSet.totalIndexGB * (1 + monthlyGrowthRate) ** i,
// 推荐缓存大小 = 工作集的120%,最小1GB,最大256GB
recommendedCacheGB: Math.min(
Math.max(current * 1.2, 1),
256
),
// 推荐总内存 = 缓存的2倍 + 10GB(操作系统和其他)
recommendedTotalMemoryGB: Math.min(
Math.max(current * 1.2 * 2 + 10, 16),
512
)
});
}
return {
monthlyGrowthRate: monthlyGrowthRate,
predictions: predictions,
finalPrediction: predictions[predictions.length - 1]
};
}
// 生成容量建议
generateCapacityRecommendations(forecast) {
const recommendations = [];
const finalRealistic = forecast.scenarios.realistic.finalPrediction;
// 1. 内存建议
if (finalRealistic.recommendedTotalMemoryGB > 256) {
recommendations.push({
type: 'MEMORY_UPGRADE',
priority: 'HIGH',
message: '需要升级服务器内存',
details: `预计 ${forecast.months} 个月后需要 ${finalRealistic.recommendedTotalMemoryGB}GB 内存`,
timeline: '3个月内'
});
}
// 2. 缓存配置建议
if (finalRealistic.recommendedCacheGB > 128) {
recommendations.push({
type: 'CACHE_CONFIG',
priority: 'MEDIUM',
message: '需要调整WiredTiger缓存配置',
details: `建议设置 wiredTigerCacheSizeGB=${Math.floor(finalRealistic.recommendedCacheGB)}`,
timeline: '1个月内'
});
}
// 3. 架构调整建议
if (finalRealistic.totalDataGB > 1000) {
recommendations.push({
type: 'ARCHITECTURE_REVIEW',
priority: 'HIGH',
message: '考虑分片集群架构',
details: `数据量预计超过 ${Math.floor(finalRealistic.totalDataGB)}GB`,
timeline: '6个月内'
});
}
// 4. 数据管理建议
if (finalRealistic.workingSetGB / finalRealistic.totalDataGB < 0.1) {
recommendations.push({
type: 'DATA_MANAGEMENT',
priority: 'MEDIUM',
message: '数据归档策略',
details: '活跃数据比例较低,考虑归档旧数据',
timeline: '3个月内'
});
}
return recommendations;
}
// 显示预测报告
displayForecastReport(forecast) {
console.log('\n' + '='.repeat(80));
console.log('内存容量规划报告');
console.log('='.repeat(80));
console.log(`预测周期: ${forecast.months} 个月`);
console.log(`置信水平: ${forecast.confidenceLevel * 100}%`);
// 现实场景预测
const realistic = forecast.scenarios.realistic;
const final = realistic.finalPrediction;
console.log('\n📊 现实场景预测:');
console.log(` 月增长率: ${(realistic.monthlyGrowthRate * 100).toFixed(1)}%`);
console.log(` 当前工作集: ${final.workingSetGB.toFixed(1)}GB`);
console.log(` ${forecast.months}个月后工作集: ${final.workingSetGB.toFixed(1)}GB`);
console.log(` 推荐缓存大小: ${final.recommendedCacheGB.toFixed(1)}GB`);
console.log(` 推荐总内存: ${final.recommendedTotalMemoryGB.toFixed(1)}GB`);
// 预测表格
console.log('\n📈 月度预测表:');
console.log('月份 | 工作集(GB) | 数据总量(GB) | 推荐缓存(GB)');
console.log('-'.repeat(60));
realistic.predictions.slice(0, 6).forEach(pred => {
console.log(
`${pred.month.toString().padStart(4)} | ` +
`${pred.workingSetGB.toFixed(1).padStart(10)} | ` +
`${pred.totalDataGB.toFixed(1).padStart(12)} | ` +
`${pred.recommendedCacheGB.toFixed(1).padStart(12)}`
);
});
if (realistic.predictions.length > 6) {
console.log('...');
const last = realistic.predictions[realistic.predictions.length - 1];
console.log(
`${last.month.toString().padStart(4)} | ` +
`${last.workingSetGB.toFixed(1).padStart(10)} | ` +
`${last.totalDataGB.toFixed(1).padStart(12)} | ` +
`${last.recommendedCacheGB.toFixed(1).padStart(12)}`
);
}
// 建议
if (forecast.recommendations.length > 0) {
console.log('\n💡 容量规划建议:');
const priorityOrder = { HIGH: 0, MEDIUM: 1, LOW: 2 };
const sortedRecs = [...forecast.recommendations].sort(
(a, b) => priorityOrder[a.priority] - priorityOrder[b.priority]
);
sortedRecs.forEach(rec => {
const icon = {
HIGH: '🔴',
MEDIUM: '🟡',
LOW: '🟢'
}[rec.priority] || '⚪';
console.log(`${icon} ${rec.type}: ${rec.message}`);
console.log(` 详情: ${rec.details}`);
console.log(` 时间线: ${rec.timeline}`);
});
}
// 行动计划
console.log('\n📋 建议行动计划:');
const actionPlan = [
{ time: '立即', action: '建立基线监控', owner: 'DBA团队' },
{ time: '1个月', action: '优化当前配置', owner: 'DBA团队' },
{ time: '3个月', action: '评估升级需求', owner: '基础设施团队' },
{ time: '6个月', action: '架构评审', owner: '架构师团队' }
];
actionPlan.forEach(item => {
console.log(` ${item.time}: ${item.action} (${item.owner})`);
});
}
}
// 使用示例
const planner = new MemoryCapacityPlanner();
// 预测12个月的内存需求
planner.forecastMemoryRequirements(12);
4.3 自动化内存健康检查
// 自动化内存健康检查系统
class MemoryHealthCheckSystem {
constructor(config = {
checkInterval: 3600000, // 每小时检查一次
retentionDays: 90, // 保留90天数据
alertConfig: {
enabled: true,
channels: ['email', 'slack'],
recipients: ['dba-team@company.com']
}
}) {
this.config = config;
this.checkHistory = [];
this.healthScores = new Map(); // collectionName -> healthScore
}
// 执行健康检查
async performHealthCheck() {
console.log('执行内存健康检查...');
const check = {
timestamp: new Date(),
score: 0,
categories: {},
issues: [],
recommendations: []
};
try {
// 1. 缓存健康检查
const cacheHealth = await this.checkCacheHealth();
check.categories.cache = cacheHealth;
check.score += cacheHealth.score * 0.4; // 40%权重
// 2. 连接健康检查
const connectionHealth = await this.checkConnectionHealth();
check.categories.connection = connectionHealth;
check.score += connectionHealth.score * 0.2; // 20%权重
// 3. 操作健康检查
const operationHealth = await this.checkOperationHealth();
check.categories.operation = operationHealth;
check.score += operationHealth.score * 0.2; // 20%权重
// 4. 配置健康检查
const configHealth = await this.checkConfigHealth();
check.categories.config = configHealth;
check.score += configHealth.score * 0.2; // 20%权重
// 5. 总体评估
check.overall = this.evaluateOverallHealth(check.score);
// 6. 生成报告
this.generateHealthReport(check);
// 7. 保存检查结果
this.saveCheckResult(check);
// 8. 发送告警(如果需要)
if (this.config.alertConfig.enabled && check.overall.level === 'UNHEALTHY') {
await this.sendHealthAlert(check);
}
return check;
} catch (error) {
console.error(`健康检查失败: ${error.message}`);
return null;
}
}
// 检查缓存健康
async checkCacheHealth() {
const status = db.runCommand({ serverStatus: 1 });
const wtCache = status.wiredTiger?.cache || {};
const cacheUsage = wtCache['bytes currently in the cache'] / wtCache['maximum bytes configured'];
const dirtyRatio = wtCache['tracked dirty bytes in the cache'] / wtCache['bytes currently in the cache'];
let score = 100;
const issues = [];
// 使用率评分
if (cacheUsage > 0.9) {
score -= 40;
issues.push({
type: 'CACHE_USAGE_HIGH',
severity: 'HIGH',
message: '缓存使用率超过90%',
value: `${(cacheUsage * 100).toFixed(1)}%`
});
} else if (cacheUsage > 0.8) {
score -= 20;
issues.push({
type: 'CACHE_USAGE_WARNING',
severity: 'MEDIUM',
message: '缓存使用率超过80%',
value: `${(cacheUsage * 100).toFixed(1)}%`
});
}
// 脏缓存评分
if (dirtyRatio > 0.2) {
score -= 20;
issues.push({
type: 'DIRTY_CACHE_HIGH',
severity: 'MEDIUM',
message: '脏缓存比例超过20%',
value: `${(dirtyRatio * 100).toFixed(1)}%`
});
}
// 驱逐率评分
const evictionRate = wtCache['pages evicted from cache'] / (wtCache['pages read into cache'] || 1);
if (evictionRate > 0.1) {
score -= 10;
issues.push({
type: 'HIGH_EVICTION_RATE',
severity: 'LOW',
message: '缓存驱逐率较高',
value: `${(evictionRate * 100).toFixed(1)}%`
});
}
return {
score: Math.max(0, score),
metrics: {
cacheUsage: cacheUsage,
dirtyRatio: dirtyRatio,
evictionRate: evictionRate
},
issues: issues
};
}
// 检查连接健康
async checkConnectionHealth() {
const status = db.runCommand({ serverStatus: 1 });
const conn = status.connections || {};
const usage = conn.current / conn.available;
let score = 100;
const issues = [];
if (usage > 0.8) {
score -= 30;
issues.push({
type: 'CONNECTION_USAGE_HIGH',
severity: 'HIGH',
message: '连接使用率超过80%',
value: `${(usage * 100).toFixed(1)}%`
});
} else if (usage > 0.6) {
score -= 15;
issues.push({
type: 'CONNECTION_USAGE_WARNING',
severity: 'MEDIUM',
message: '连接使用率超过60%',
value: `${(usage * 100).toFixed(1)}%`
});
}
// 检查连接数趋势
if (this.checkHistory.length >= 3) {
const recentConnections = this.checkHistory.slice(-3).map(h =>
h.categories.connection?.metrics?.current || 0
);
const increasingTrend = this.detectIncreasingTrend(recentConnections);
if (increasingTrend && usage > 0.5) {
score -= 10;
issues.push({
type: 'CONNECTION_GROWTH',
severity: 'LOW',
message: '检测到连接数增长趋势',
value: `最近增长: ${recentConnections[recentConnections.length-1] - recentConnections[0]}`
});
}
}
return {
score: Math.max(0, score),
metrics: {
current: conn.current,
available: conn.available,
usage: usage
},
issues: issues
};
}
// 检查操作健康
async checkOperationHealth() {
const status = db.runCommand({ serverStatus: 1 });
const currentOps = db.currentOp().inprog;
let score = 100;
const issues = [];
// 检查大内存操作
const largeOps = currentOps.filter(op => {
const memUsage = this.estimateOperationMemory(op);
return memUsage > 100 * 1024 * 1024; // 超过100MB
});
if (largeOps.length > 0) {
score -= largeOps.length * 5; // 每个大操作扣5分
issues.push({
type: 'LARGE_MEMORY_OPS',
severity: 'MEDIUM',
message: `发现 ${largeOps.length} 个大内存操作`,
value: largeOps.length
});
}
// 检查长运行操作
const longOps = currentOps.filter(op => (op.secs_running || 0) > 300); // 超过5分钟
if (longOps.length > 0) {
score -= longOps.length * 3; // 每个长操作扣3分
issues.push({
type: 'LONG_RUNNING_OPS',
severity: 'LOW',
message: `发现 ${longOps.length} 个长运行操作`,
value: longOps.length
});
}
// 检查排序内存使用
const mem = status.mem || {};
if (mem.resident > mem.mapped * 1.5) {
score -= 10;
issues.push({
type: 'HIGH_RESIDENT_MEMORY',
severity: 'MEDIUM',
message: '常驻内存明显高于映射内存',
value: `${(mem.resident / 1024).toFixed(1)}GB / ${(mem.mapped / 1024).toFixed(1)}GB`
});
}
return {
score: Math.max(0, score),
metrics: {
largeOperations: largeOps.length,
longRunningOperations: longOps.length,
residentMemoryGB: mem.resident / 1024,
mappedMemoryGB: mem.mapped / 1024
},
issues: issues
};
}
// 检查配置健康
async checkConfigHealth() {
const config = db.adminCommand({ getCmdLineOpts: 1 });
let score = 100;
const issues = [];
const recommendations = [];
// 检查WiredTiger缓存配置
const cacheSize = await this.getConfiguredCacheSize();
const recommendedCache = await this.calculateRecommendedCache();
if (cacheSize < recommendedCache * 0.5) {
score -= 30;
issues.push({
type: 'UNDERSIZED_CACHE',
severity: 'HIGH',
message: '缓存配置过小',
value: `${cacheSize}GB / 推荐 ${recommendedCache}GB`
});
recommendations.push({
action: '增加WiredTiger缓存',
command: `--wiredTigerCacheSizeGB=${recommendedCache}`,
priority: 'HIGH'
});
} else if (cacheSize > recommendedCache * 2) {
score -= 10;
issues.push({
type: 'OVERSIZED_CACHE',
severity: 'LOW',
message: '缓存配置过大',
value: `${cacheSize}GB / 推荐 ${recommendedCache}GB`
});
}
// 检查连接配置
const maxConnections = config.parsed?.net?.maxIncomingConnections || 65536;
if (maxConnections > 10000) {
score -= 10;
issues.push({
type: 'HIGH_MAX_CONNECTIONS',
severity: 'LOW',
message: '最大连接数配置过高',
value: maxConnections
});
recommendations.push({
action: '优化最大连接数',
command: 'net.maxIncomingConnections: 10000',
priority: 'LOW'
});
}
return {
score: Math.max(0, score),
issues: issues,
recommendations: recommendations
};
}
// 评估总体健康状态
evaluateOverallHealth(score) {
if (score >= 90) {
return { level: 'EXCELLENT', color: 'green', emoji: '✅' };
} else if (score >= 75) {
return { level: 'HEALTHY', color: 'blue', emoji: '👍' };
} else if (score >= 60) {
return { level: 'WARNING', color: 'yellow', emoji: '⚠️' };
} else {
return { level: 'UNHEALTHY', color: 'red', emoji: '🚨' };
}
}
// 生成健康报告
generateHealthReport(check) {
console.log('\n' + '='.repeat(80));
console.log('MongoDB内存健康检查报告');
console.log('='.repeat(80));
console.log(`时间: ${check.timestamp.toLocaleString()}`);
console.log(`总体健康: ${check.overall.emoji} ${check.overall.level} (${check.score.toFixed(1)}分)`);
// 各分类得分
console.log('\n📊 分类得分:');
Object.entries(check.categories).forEach(([category, data]) => {
console.log(` ${category}: ${data.score.toFixed(1)}分`);
});
// 问题汇总
const allIssues = [];
Object.values(check.categories).forEach(category => {
allIssues.push(...(category.issues || []));
});
if (allIssues.length > 0) {
console.log('\n🚨 发现问题:');
const severityOrder = { HIGH: 0, MEDIUM: 1, LOW: 2 };
const sortedIssues = allIssues.sort(
(a, b) => severityOrder[a.severity] - severityOrder[b.severity]
);
sortedIssues.slice(0, 10).forEach(issue => {
const icon = {
HIGH: '🔴',
MEDIUM: '🟡',
LOW: '🟢'
}[issue.severity] || '⚪';
console.log(`${icon} ${issue.type}: ${issue.message}`);
console.log(` 值: ${issue.value}`);
});
}
// 建议汇总
const allRecommendations = [];
Object.values(check.categories).forEach(category => {
allRecommendations.push(...(category.recommendations || []));
});
if (allRecommendations.length > 0) {
console.log('\n💡 优化建议:');
const priorityOrder = { HIGH: 0, MEDIUM: 1, LOW: 2 };
const sortedRecs = allRecommendations.sort(
(a, b) => priorityOrder[a.priority] - priorityOrder[b.priority]
);
sortedRecs.forEach(rec => {
const icon = {
HIGH: '🔴',
MEDIUM: '🟡',
LOW: '🟢'
}[rec.priority] || '⚪';
console.log(`${icon} ${rec.action}`);
if (rec.command) {
console.log(` 命令: ${rec.command}`);
}
});
}
// 趋势分析(如果有历史数据)
if (this.checkHistory.length >= 2) {
const recentScores = this.checkHistory.slice(-5).map(h => h.score);
const avgScore = recentScores.reduce((a, b) => a + b, 0) / recentScores.length;
console.log('\n📈 趋势分析:');
console.log(` 最近5次平均分: ${avgScore.toFixed(1)}`);
console.log(` 当前分: ${check.score.toFixed(1)}`);
if (check.score < avgScore - 10) {
console.log(' ⚠️ 健康度下降明显');
} else if (check.score > avgScore + 10) {
console.log(' ✅ 健康度有所改善');
}
}
}
// 保存检查结果
saveCheckResult(check) {
this.checkHistory.push(check);
// 清理旧数据
const cutoff = new Date();
cutoff.setDate(cutoff.getDate() - this.config.retentionDays);
this.checkHistory = this.checkHistory.filter(
item => item.timestamp > cutoff
);
}
// 发送健康告警
async sendHealthAlert(check) {
console.log('发送健康告警...');
const alertMessage = {
title: `🚨 MongoDB内存健康告警: ${check.overall.level}`,
timestamp: check.timestamp,
score: check.score,
level: check.overall.level,
issues: []
};
// 收集所有问题
Object.values(check.categories).forEach(category => {
category.issues.forEach(issue => {
if (issue.severity === 'HIGH') {
alertMessage.issues.push(issue);
}
});
});
// 这里可以集成邮件、Slack、钉钉等通知
// 例如: sendEmail(this.config.alertConfig.recipients, alertMessage);
console.log(`已生成告警消息,共 ${alertMessage.issues.length} 个严重问题`);
}
// 辅助方法
detectIncreasingTrend(values, threshold = 0.1) {
if (values.length < 2) return false;
const first = values[0];
const last = values[values.length - 1];
const growth = (last - first) / first;
return growth > threshold;
}
estimateOperationMemory(op) {
// 简化实现,同前文
return 0;
}
async getConfiguredCacheSize() {
// 简化实现,同前文
return 1;
}
async calculateRecommendedCache() {
// 简化实现
return 8;
}
// 启动定期健康检查
startPeriodicChecks() {
console.log(`启动定期健康检查,间隔: ${this.config.checkInterval / 1000 / 60}分钟`);
// 立即执行一次
this.performHealthCheck();
// 设置定时器
setInterval(() => {
this.performHealthCheck();
}, this.config.checkInterval);
}
}
// 使用示例
const healthCheck = new MemoryHealthCheckSystem();
// 执行一次健康检查
healthCheck.performHealthCheck();
// 或者启动定期检查
// healthCheck.startPeriodicChecks();
五、总结
通过本文的系统介绍,我们掌握了MongoDB内存问题的全面解决方案:
5.1 关键知识点总结
- 理解内存架构:WiredTiger多层次缓存设计
- 掌握监控方法:实时监控、深度分析、趋势预测
- 学会问题诊断:从现象到根源的系统分析方法
- 掌握优化技巧:配置调优、索引优化、操作优化
- 实施预防策略:容量规划、健康检查、告警配置
5.2 黄金法则
- 不要害怕高内存使用:MongoDB利用内存提升性能是正常现象
- 关注使用模式而非绝对值:内存使用率比使用量更重要
- 预防优于治疗:定期健康检查比问题发生后再处理更有效
- 数据驱动决策:基于监控数据而非直觉进行优化
5.3 行动清单
✅ 立即行动:
- 部署基础监控系统
- 建立内存使用基线
- 配置关键告警规则
✅ 短期优化:
- 调整WiredTiger缓存大小
- 优化索引设计
- 审查大内存操作
✅ 长期规划:
- 实施容量规划流程
- 建立定期健康检查机制
- 制定架构演进路线图
记住:内存管理是一门平衡艺术,需要在性能、成本和稳定性之间找到最佳平衡点。希望本文能为您的MongoDB内存优化之旅提供有力支持!


2万+

被折叠的 条评论
为什么被折叠?



