Search for [error] in message content

Hello,

I am new to ELK, and I apologize if this question has already been asked.
I am having trouble writing a query that returns only the messages containing the string [error] (including the square brackets) in the message field. I tried escaping the [ and ], but without success. The result always includes every message that contains the word error, whether or not it has square brackets around it.
Is there a solution for this?

[ and ] are reserved characters and must be escaped with backslashes, written as \\[ and \\] inside a JSON string. See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters
How do you escape them?
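For reference, here is a minimal Python sketch of where the double backslash comes from. It assumes the request body is built in client code and serialized with the standard json module; the body layout and the "message" field are only illustrative.

```python
import json

# The query_string syntax needs ONE backslash before each reserved bracket.
query_text = r"\[error\]"

# Illustrative request body; "message" is assumed to be the field being searched.
body = {"query": {"query_string": {"query": query_text, "fields": ["message"]}}}

# JSON encoding escapes every backslash, so the serialized body contains TWO.
encoded = json.dumps(body)
print(encoded)
```

Decoding the JSON again gives back the single-backslash form, which is what the query parser sees.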

Hello @Mikhail_Shustov,
Thank you for your reply.
Yes, I tried that.
With this request:

GET /filebeat-2020.03.25/_search
{
  "query" : {
    "query_string" : {
      "query" : "\[error\]",
      "fields" : ["message"]
    }
  }
}

I still see messages like this in the output:

"message" : "[INFO ] 2020-03-25T07:55:17,624Z [flow-worker] statemachine.StaffedFlowHospital.flowErrored - Flow [b06d70c9-a175-46d5-b3ec-d1dca838fe6a] error allowed to propagate"

Sorry, the query above should show a double \ before each bracket.
It is not visible in the post, but I am executing the query with a double \.

Hello @Mikhail_Shustov,

Any idea what is causing this issue?
I tried an older version of ELK, and the result is the same.

Thanks.

If your field is a text field, its contents will have been chopped into individual words in the search index. This process is called "analysis", and the default configuration uses the "standard" analyzer, which is optimised for text like this comment. Analyzers typically throw away punctuation and lowercase words. You can see what effect they have using the _analyze API:

GET /_analyze
{
  "analyzer":"standard",
  "text":"[error] Something happened."
}

This produces this output:

{
  "tokens" : [
    {
      "token" : "error",
      "start_offset" : 1,
      "end_offset" : 6,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "something",
      "start_offset" : 8,
      "end_offset" : 17,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "happened",
      "start_offset" : 18,
      "end_offset" : 26,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}

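Note that the tokens put in the index are stripped of punctuation, including the square brackets. You cannot search for a square bracket because it is not in the index. The query text goes through the same analysis at search time, so your escaped [error] is reduced to the single term error, which matches any message containing that word.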
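To make the matching behaviour concrete, here is a rough Python sketch. The function below is only a crude stand-in for the standard analyzer (lowercase, keep runs of letters and digits), not the real tokenizer, and the document text is a shortened version of the message quoted earlier in this thread.

```python
import re

def rough_standard_analyze(text):
    """Crude approximation of the standard analyzer: lowercase the text
    and keep only runs of ASCII letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Tokens that would roughly end up in the index for a shortened log line.
indexed = rough_standard_analyze("[INFO ] Flow error allowed to propagate")

# The query text is analyzed too, so the brackets disappear here as well.
query_tokens = rough_standard_analyze(r"\[error\]")

# The document matches because every query token is present in the index.
matches = all(token in indexed for token in query_tokens)
print(indexed, query_tokens, matches)
```

Because both sides lose the brackets, a message with a bare "error" and a message with "[error]" look identical to the query.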

You can switch to a less aggressive analyzer that preserves the brackets in the index, but I expect a better approach is to pre-process the docs so that the log level (warn/error/debug etc.) is a separate structured keyword field in your JSON docs. This structured field would also allow you to display analytic charts summarising logs by type.
You can pre-process your docs in any number of ways:

  • using regex patterns in custom client code
  • using logstash or other forms of ETL tool (many of which recognise common log formats)
  • using ingest-pipelines
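As an illustration of the first option, here is a minimal Python sketch of regex-based pre-processing in client code. It assumes the level appears in square brackets at the very start of the message (as in "[INFO ] ..."), and the field name log_level is a hypothetical choice, not something Filebeat or Elasticsearch defines for you.

```python
import re

# Assumed log format: the level sits in square brackets at the start of the line,
# possibly padded with spaces, e.g. "[INFO ] ..." or "[ERROR] ...".
LEVEL_PATTERN = re.compile(r"^\[\s*([A-Za-z]+)\s*\]")

def add_log_level(doc):
    """Return a copy of doc with a hypothetical 'log_level' field
    extracted from the 'message' field, if a level is found."""
    enriched = dict(doc)
    match = LEVEL_PATTERN.match(doc.get("message", ""))
    if match:
        enriched["log_level"] = match.group(1).lower()
    return enriched

doc = {"message": "[INFO ] 2020-03-25T07:55:17,624Z [flow-worker] error allowed to propagate"}
print(add_log_level(doc)["log_level"])
```

If log_level is mapped as a keyword field, an exact-match query on it would find error-level logs without depending on how the message text is analyzed.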

Hello @Mark_Harwood,

Thank you for your answer.
I solved the issue by using filters in Logstash. The solution was to add a new field based on whether the message contains the string "[ERROR]".

Thanks,
Tihomir
