Analyze MongoDB 3.x slow query logs with Filebeat and Elasticsearch
The log format of MongoDB v3.x is very different from that of later versions, and it does not work with the default Filebeat pipeline. However, there is a way we can parse it.
Default Filebeat MongoDB pipeline
The grok pattern below does not work with the old MongoDB log format.
Log example (truncated):
{ aggregate: "orders", pipeline: [ { $match: { items._profiles.type: "sale", items._profiles.Id: ObjectId('xxxxxxxx'), createdAt: { $gte: new Date(1483203600000), $lt: new Date(1623776399999) } } }, { $unwind: "$items" }, { $unwind: "$items._profiles" }, { $match: { items._profiles.type: "sale", items._profiles.Id: ObjectId('xxxxxxxxxx'), createdAt: { $gte: new Date(1483203600000), $lt: new Date(1623776399999) } } }, { $group: { _id: "$_id", riskScoreByOrder: { $avg: "$_riskScore" } } }, { $group: { _id: 0, avgRiskScore: { $avg: "$riskScoreByOrder" } } } ], allowDiskUse: true, maxTimeMS: 60000, cursor: {}, $readPreference: { mode: "secondaryPreferred" }, $db: "testDB" } planSummary: IXSCAN { items._profiles.Id: 1, items._profiles.type: 1, createdAt: -1 } keysExamined:120572 docsExamined:120518 cursorExhausted:1 numYields:1039 nreturned:1 reslen:283 locks:{ Global: { acquireCount: { r: 2308 }, acquireWaitCount: { r: 1017 }, timeAcquiringMicros: { r: 11453345 } }, Database: { acquireCount: { r: 1154 } }, Collection: { acquireCount: { r: 1154 } } }
Pipeline:
GET _ingest/pipeline/filebeat-7.13.2-mongodb-log-pipeline
{
"filebeat-7.13.2-mongodb-log-pipeline" : {
"description" : "Pipeline for parsing MongoDB logs",
"processors" : [
{
"set" : {
"field" : "event.ingested",
"value" : "{{_ingest.timestamp}}"
}
},
{
"grok" : {
"ignore_missing" : true,
"field" : "message",
"patterns" : [
"""%{TIMESTAMP_ISO8601:mongodb.log.timestamp}%{SPACE}%{MONGO3_SEVERITY:log.level}%{SPACE}%{MONGO3_COMPONENT:mongodb.log.component}%{SPACE}(?:\[%{DATA:mongodb.log.context}\])?%{SPACE}%{GREEDYDATA:message}"""
]
}
},
{
"rename" : {
"field" : "@timestamp",
"target_field" : "event.created"
}
},
{
"date" : {
"field" : "mongodb.log.timestamp",
"target_field" : "@timestamp",
"formats" : [
"yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
]
}
},
...
],
"on_failure" : [
{
"set" : {
"field" : "error.message",
"value" : "{{ _ingest.on_failure_message }}"
}
}
]
}
}
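To see what the stock pipeline makes of a 3.x line, you can run one through the simulate API. The log line below is a shortened, made-up sample in the v3.x format (connection id, timestamp, and query body are invented for illustration):
POST _ingest/pipeline/filebeat-7.13.2-mongodb-log-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": """2021-06-15T10:23:45.678+0800 I COMMAND  [conn12345] command testDB.orders command: aggregate { aggregate: "orders", cursor: {}, $db: "testDB" } planSummary: IXSCAN keysExamined:120572 docsExamined:120518 protocol:op_query 11453ms"""
      }
    }
  ]
}
The timestamp, severity, component, and context are extracted, but everything after the context, including the database, the command, and the trailing duration, stays in message as one opaque string, so none of the slow-query details are searchable.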
Custom pipeline
We can copy the official pipeline and create a custom one with an extra grok pattern to make parsing the data easier. One grok processor is added, along with one convert processor to change time_taken to an integer.
PUT _ingest/pipeline/mongodb-log-pipeline-custom
{
"description": "Pipeline for parsing MongoDB logs",
"processors": [
{
"set": {
"field": "event.ingested",
"value": "{{_ingest.timestamp}}"
}
},
{
"grok": {
"ignore_missing": true,
"field": "message",
"patterns": [
"""(-|%{WORD:severity}\:\s%{GREEDYDATA:details}\s+)command\s+%{GREEDYDATA:database}\.%{GREEDYDATA:collection}\s+command:\s+%{WORD:command_type}\s+%{GREEDYDATA:mongo_query}\s+protocol:%{GREEDYDATA:protocal}\s%{GREEDYDATA:time_taken}ms""",
"""command\s+%{GREEDYDATA:database}\.%{GREEDYDATA:collection}\s+command:\s+%{WORD:command_type}\s+%{GREEDYDATA:mongo_query}\s+protocol:%{GREEDYDATA:protocal}\s%{GREEDYDATA:time_taken}ms"""
]
}
},
...,
{
"convert": {
"field": "time_taken",
"type": "integer"
}
}
],
"on_failure": [
{
"set": {
"field": "error.message",
"value": "{{ _ingest.on_failure_message }}"
}
}
]
}
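Before wiring the custom pipeline into Filebeat, you can test it with the same simulate API and the same made-up sample line as above:
POST _ingest/pipeline/mongodb-log-pipeline-custom/_simulate
{
  "docs": [
    {
      "_source": {
        "message": """2021-06-15T10:23:45.678+0800 I COMMAND  [conn12345] command testDB.orders command: aggregate { aggregate: "orders", cursor: {}, $db: "testDB" } planSummary: IXSCAN keysExamined:120572 docsExamined:120518 protocol:op_query 11453ms"""
      }
    }
  ]
}
If the grok matches, the response should include fields along the lines of database: testDB, collection: orders, command_type: aggregate, the query body in mongo_query, and time_taken: 11453 as an integer; if it does not, the on_failure handler reports the mismatch in error.message.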
Modify the Filebeat module
Add the custom pipeline to the mongodb module (in modules.d/mongodb.yml) and restart Filebeat:
- module: mongodb
  # All logs
  log:
    enabled: true
    var.paths: ["/var/log/mongodb/mongod.log"]
    input:
      pipeline: mongodb-log-pipeline-custom
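Since the pipeline's on_failure handler writes grok failures into error.message, one quick health check after the restart is to search for documents that failed to parse (the index pattern below assumes the default filebeat-* indices):
GET filebeat-*/_search
{
  "size": 5,
  "query": {
    "exists": { "field": "error.message" }
  }
}
An empty result here means every shipped log line made it through the custom grok.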
Grafana + Elasticsearch graph
In the end, you can use the graph to quickly spot slow queries and monitor the trend.
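For a quick list outside Grafana, a search sorted by the parsed duration does the job (field names follow the custom pipeline above; adjust the index pattern and time range to your setup):
GET filebeat-*/_search
{
  "size": 10,
  "sort": [
    { "time_taken": { "order": "desc" } }
  ],
  "query": {
    "bool": {
      "filter": [
        { "exists": { "field": "time_taken" } },
        { "range": { "@timestamp": { "gte": "now-24h" } } }
      ]
    }
  }
}
This returns the ten slowest commands of the last 24 hours.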
MongoDB Slow query determination
TBC