How to search for a bracket using simple_query_string?

I'm trying to search for documents that contain a bracket in the text using simple_query_string. The documentation here says that "To use one of these characters literally, escape it with a preceding backslash (\)."

However, when I try this

{
  "query": {
    "simple_query_string": {
      "fields": [ "source_title" ],
      "query": "\("
    }
  }
}

I get this error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "json_parse_exception",
        "reason" : "Unrecognized character escape '(' (code 40)\n at [Source: (org.elasticsearch.common.io.stream.ByteBufferStreamInput); line: 5, column: 19]"
      }
    ],
    "type" : "json_parse_exception",
    "reason" : "Unrecognized character escape '(' (code 40)\n at [Source: (org.elasticsearch.common.io.stream.ByteBufferStreamInput); line: 5, column: 19]"
  },
  "status" : 400
}

How do I search for brackets using simple_query_string?

What is the mapping for that field?

                "source_title": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },

I do not think parentheses require escaping, so I would try the query without the backslash and see whether the error disappears or changes.
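As a side note on the error itself: it is a json_parse_exception, so the request is rejected by the JSON parser before simple_query_string sees it. In JSON, a backslash inside a string has to be written as two backslashes. The Python sketch below only illustrates that escaping rule using the standard json module; it does not contact Elasticsearch, and the variable names are made up for the example.

```python
import json

# What the query parser should receive: a backslash followed by "(".
query_text = "\\("  # Python literal for the two characters \ and (

body = {
    "query": {
        "simple_query_string": {
            "fields": ["source_title"],
            "query": query_text,
        }
    }
}

# json.dumps doubles the backslash, as the JSON grammar requires.
encoded = json.dumps(body)
print(encoded)

# The request in the question used a single backslash, which is not valid JSON.
try:
    json.loads(r'{"query": "\("}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False

print("single backslash parses:", single_backslash_ok)
```

Writing "\\(" in the request body should get past the parse error, although whether the query then matches anything depends on how the field was analyzed.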

You are querying a field mapped as text, which by default uses the standard analyzer. I believe this removes parentheses from the indexed text, so you will not be able to run your query against that field. You can use the analyze API to see how the content and the query string are tokenized, which will help you troubleshoot this type of issue.
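A minimal sketch of such an analyze request, built in Python so the body is valid JSON. The index name my-index and the sample text are placeholders, not values from this thread; passing "field" asks the analyze API to use whatever analyzer is configured for source_title.

```python
import json

# Hypothetical index name; replace with the index that holds source_title.
index_name = "my-index"

# "field" selects the analyzer configured for that field in the index mapping.
analyze_body = {
    "field": "source_title",
    "text": "Annual report (draft)",  # sample text, chosen for illustration
}

# Print the request in console form so it can be pasted into Dev Tools.
print(f"GET /{index_name}/_analyze")
print(json.dumps(analyze_body, indent=2))
```

If the standard analyzer is in use, the returned tokens are expected to contain no parentheses.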

The source_title.keyword subfield will contain the full string indexed as a single term (as long as it is no longer than 256 characters). You should be able to query this field if the values are not too long, but you will probably need to use a wildcard query, which can be very slow.
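For illustration, a wildcard query against the keyword subfield could look roughly like the body built below. This is a sketch under the assumption that matching any title containing an opening parenthesis is the goal; the pattern "*(*" is my choice, and a leading wildcard is the slow case mentioned above.

```python
import json

# Match any source_title.keyword value that contains "(" anywhere.
# The leading "*" forces a scan over many terms, so expect this to be slow.
wildcard_body = {
    "query": {
        "wildcard": {
            "source_title.keyword": {
                "value": "*(*"
            }
        }
    }
}

print(json.dumps(wildcard_body, indent=2))
```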

Hi @Christian_Dahlqvist

Thanks so much for the information. You're right that when I search for the literal bracket, I get no matches. I've also confirmed with the analyze API that the brackets are removed during tokenization. I'll have to figure out whether I want to keep this functionality or not.
