Setting up Filebeat with a custom Nginx log format

Izek Chen
Sep 15, 2020

Background

When you set up parsing for a custom Nginx log format, there are a few things you need to pay attention to. When Filebeat starts, it sends a PUT request to Elasticsearch to create or update the default ingest pipeline. For example, “filebeat-7.7.1-nginx-access-default” is the default Nginx access pipeline of Filebeat 7.7.1.
If you run multiple versions of Filebeat, you end up with one set of pipelines per version.
Things get tricky when you use only one version of Filebeat with multiple versions of Nginx. The trap we fell into: we had Filebeat running with the default pipeline in several production environments, and we wanted to test the custom log format on a staging server. We ran the command and forced Filebeat to use the Nginx module.

We didn’t notice that this had already overwritten the shared default pipeline with the contents of our custom default.json file, and we spent a lot of time trying to work out how that had happened.
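
The overwrite is easier to see once the pipeline ID is written out. Here is a minimal Python sketch, assuming the ID always follows the `filebeat-<version>-<module>-<fileset>-<pipeline>` shape of the example above. The helper name `default_pipeline_id` and the second version number are only illustrative and are not part of Filebeat.

```python
def default_pipeline_id(version, module="nginx", fileset="access", pipeline="default"):
    """Build the ingest pipeline ID, assuming the naming shape shown above."""
    return f"filebeat-{version}-{module}-{fileset}-{pipeline}"


# Different Filebeat versions get different pipeline IDs ...
production = default_pipeline_id("7.7.1")
other_version = default_pipeline_id("7.6.2")  # illustrative second version

# ... but two hosts on the same version share one ID, so whichever host
# loads its pipeline last decides what every host on that version uses.
staging = default_pipeline_id("7.7.1")

print(production)
print("differs across versions:", production != other_version)
print("shared by staging and production:", production == staging)
```

Because staging and production resolve to the same ID, a customized default.json loaded from staging replaces the pipeline that production is using as well.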

Procedure

The procedure itself is pretty straightforward:

1. Modify the Nginx custom log format
log_format auth_password_log '$remote_addr $env_final - $http_authorization [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $upstream_response_time';

2. Work out the grok pattern

The (-|%{WORD:http.request.x_environment_id}) group captures the $env_final value from the Nginx log line. The - alternative covers lines where that value is empty.

(%{NGINX_HOST} )?"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) (-|%{WORD:http.request.x_environment_id}) - %{DATA:user.name} \[%{HTTPDATE:nginx.access.time}\] "%{DATA:nginx.access.info}" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} "%{DATA:http.request.referrer}" "%{DATA:user_agent.original}" (-|%{NUMBER:http.response.time:float})
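
Before loading the pattern into Elasticsearch, it can help to check its shape against a sample line locally. The sketch below is a rough Python approximation of the grok pattern, not the grok engine itself. It leaves out the optional NGINX_HOST prefix and the address-list alternative, and the sample log lines and the `parse_access_line` helper are made up for illustration.

```python
import re

# Simplified stand-in for the grok pattern: NOTSPACE -> \S+, WORD -> \w+,
# DATA -> .*?, NUMBER -> digits with an optional fraction.
LINE_RE = re.compile(
    r'(?P<source_address>\S+) '
    r'(?:-|(?P<environment_id>\w+)) - '
    r'(?P<user_name>.*?) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<info>.*?)" '
    r'(?P<status>\d+) '
    r'(?P<body_bytes>\d+) '
    r'"(?P<referrer>.*?)" '
    r'"(?P<user_agent>.*?)" '
    r'(?:-|(?P<response_time>\d+(?:\.\d+)?))$'
)


def parse_access_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    # Drop groups that did not take part in the match (the "-" placeholders).
    return {key: value for key, value in match.groupdict().items() if value is not None}


# Hypothetical lines in the auth_password_log format.
first = parse_access_line(
    '203.0.113.7 staging - - [15/Sep/2020:10:12:01 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.68.0" 0.004'
)
second = parse_access_line(
    '198.51.100.9 - - Basic abc123 [15/Sep/2020:10:12:02 +0000] '
    '"POST /login HTTP/1.1" 401 0 "-" "Mozilla/5.0" -'
)

print(first)
print(second)
```

One thing this makes visible: WORD only matches letters, digits and underscores, so an environment ID that contains a hyphen would not match and the pattern would need adjusting.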

3. Create a custom pipeline and update it with the pattern

  • Get the default pipeline:
GET _ingest/pipeline/filebeat-7.7.1-nginx-access-default
  • Use the definition you get back to create a new pipeline:
PUT _ingest/pipeline/filebeat-pipeline-custom
{
  "description" : "Pipeline for parsing Nginx access logs. Requires the geoip and user_agent plugins.",
  "processors" : [
    {
      "grok" : {
        "field" : "message",
        "patterns" : [
          """(%{NGINX_HOST} )?"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) (-|%{WORD:http.request.x_environment_id}) - %{DATA:user.name} \[%{HTTPDATE:nginx.access.time}\] "%{DATA:nginx.access.info}" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} "%{DATA:http.request.referrer}" "%{DATA:user_agent.original}" (-|%{NUMBER:http.response.time:float})"""
        ],
        "pattern_definitions" : {
          "NGINX_HOST" : "(?:%{IP:destination.ip}|%{NGINX_NOTSEPARATOR:destination.domain})(:%{NUMBER:destination.port})?",
          "NGINX_NOTSEPARATOR" : "[^\t ,:]+",
          "NGINX_ADDRESS_LIST" : """(?:%{IP}|%{WORD})("?,?\s*(?:%{IP}|%{WORD}))*"""
        },
        "ignore_missing" : true
      }
    },
    {
      "grok" : {
        "patterns" : [
          "%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}",
          ""
        ],
        "ignore_missing" : true,
        "field" : "nginx.access.info"
      }
    },
    {
      "remove" : {
        "field" : "nginx.access.info"
      }
    },
    {
      "split" : {
        "field" : "nginx.access.remote_ip_list",
        "separator" : """"?,?\s+""",
        "ignore_missing" : true
      }
    },
    {
      "split" : {
        "field" : "nginx.access.origin",
        "separator" : """"?,?\s+""",
        "ignore_missing" : true
      }
    },
    {
      "set" : {
        "value" : "",
        "field" : "source.address",
        "if" : "ctx.source?.address == null"
      }
    },
    {
      "script" : {
        "source" : "boolean isPrivate(def dot, def ip) { try { StringTokenizer tok = new StringTokenizer(ip, dot); int firstByte = Integer.parseInt(tok.nextToken()); int secondByte = Integer.parseInt(tok.nextToken()); if (firstByte == 10) { return true; } if (firstByte == 192 && secondByte == 168) { return true; } if (firstByte == 172 && secondByte >= 16 && secondByte <= 31) { return true; } if (firstByte == 127) { return true; } return false; } catch (Exception e) { return false; } } try { ctx.source.address = null; if (ctx.nginx.access.remote_ip_list == null) { return; } def found = false; for (def item : ctx.nginx.access.remote_ip_list) { if (!isPrivate(params.dot, item)) { ctx.source.address = item; found = true; break; } } if (!found) { ctx.source.address = ctx.nginx.access.remote_ip_list[0]; }} catch (Exception e) { ctx.source.address = null; }",
        "params" : {
          "dot" : "."
        },
        "if" : "ctx.nginx?.access?.remote_ip_list != null && ctx.nginx.access.remote_ip_list.length > 0",
        "lang" : "painless"
      }
    },
    {
      "remove" : {
        "field" : "source.address",
        "if" : "ctx.source.address == null"
      }
    },
    {
      "grok" : {
        "field" : "source.address",
        "patterns" : [
          "^%{IP:source.ip}$"
        ],
        "ignore_failure" : true
      }
    },
    {
      "remove" : {
        "field" : "message"
      }
    },
    {
      "rename" : {
        "field" : "@timestamp",
        "target_field" : "event.created"
      }
    },
    {
      "date" : {
        "field" : "nginx.access.time",
        "target_field" : "@timestamp",
        "formats" : [
          "dd/MMM/yyyy:H:m:s Z"
        ],
        "on_failure" : [
          {
            "append" : {
              "field" : "error.message",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    },
    {
      "remove" : {
        "field" : "nginx.access.time"
      }
    },
    {
      "user_agent" : {
        "field" : "user_agent.original"
      }
    },
    {
      "geoip" : {
        "ignore_missing" : true,
        "field" : "source.ip",
        "target_field" : "source.geo"
      }
    },
    {
      "geoip" : {
        "target_field" : "source.as",
        "properties" : [
          "asn",
          "organization_name"
        ],
        "ignore_missing" : true,
        "database_file" : "GeoLite2-ASN.mmdb",
        "field" : "source.ip"
      }
    },
    {
      "rename" : {
        "field" : "source.as.asn",
        "target_field" : "source.as.number",
        "ignore_missing" : true
      }
    },
    {
      "rename" : {
        "ignore_missing" : true,
        "field" : "source.as.organization_name",
        "target_field" : "source.as.organization.name"
      }
    }
  ],
  "on_failure" : [
    {
      "set" : {
        "field" : "error.message",
        "value" : "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}
  • Update nginx.yml in the modules.d directory:
- module: nginx
  # Access logs
  access:
    enabled: true
    input:
      pipeline: filebeat-pipeline-custom
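
The "copy the default pipeline, swap the pattern" part of step 3 can also be scripted. This is a minimal Python sketch using only the standard library and no HTTP client. The helper names, the trimmed stand-in for the GET response and the sample log line are all assumptions for illustration. The second helper builds a request body for Elasticsearch's simulate pipeline API (POST _ingest/pipeline/filebeat-pipeline-custom/_simulate), which lets you try a log line without indexing anything.

```python
import copy
import json

CUSTOM_PATTERN = (
    r'(%{NGINX_HOST} )?"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}'
    r'|%{NOTSPACE:source.address}) (-|%{WORD:http.request.x_environment_id}) - '
    r'%{DATA:user.name} \[%{HTTPDATE:nginx.access.time}\] "%{DATA:nginx.access.info}" '
    r'%{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} '
    r'"%{DATA:http.request.referrer}" "%{DATA:user_agent.original}" '
    r'(-|%{NUMBER:http.response.time:float})'
)


def build_custom_pipeline(default_pipeline, custom_pattern):
    """Return a copy of the pipeline whose grok processor on the
    "message" field uses custom_pattern; the input is left untouched."""
    pipeline = copy.deepcopy(default_pipeline)
    for processor in pipeline["processors"]:
        grok = processor.get("grok")
        if grok is not None and grok.get("field") == "message":
            grok["patterns"] = [custom_pattern]
            return pipeline
    raise ValueError("no grok processor on the 'message' field found")


def build_simulate_body(sample_lines):
    """Build a _simulate request body with one document per log line.
    The pipeline renames @timestamp, so each document carries one."""
    return {
        "docs": [
            {"_source": {"@timestamp": "2020-09-15T10:12:01.000Z", "message": line}}
            for line in sample_lines
        ]
    }


# Heavily trimmed stand-in for the GET response, which is keyed by pipeline ID.
get_response = {
    "filebeat-7.7.1-nginx-access-default": {
        "description": "Pipeline for parsing Nginx access logs.",
        "processors": [
            {"grok": {"field": "message", "patterns": ["STAND_IN_DEFAULT_PATTERN"]}},
            {"remove": {"field": "message"}},
        ],
    }
}
default_pipeline = get_response["filebeat-7.7.1-nginx-access-default"]

SAMPLE_LINE = (
    '203.0.113.7 staging - - [15/Sep/2020:10:12:01 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.68.0" 0.004'
)

custom = build_custom_pipeline(default_pipeline, CUSTOM_PATTERN)
body = build_simulate_body([SAMPLE_LINE])

# json.dumps escapes the backslashes and quotes, so the output is plain JSON
# and does not need the triple-quote syntax of the Kibana console.
print(json.dumps(custom, indent=2))
print(json.dumps(body, indent=2))
```

Writing the result under a separate ID such as filebeat-pipeline-custom, rather than under the default ID, is what keeps the hosts that still rely on the default pipeline unaffected.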

Done
