Regexp parsing_exception for \[mytag\]. Why do regexp escapes for [, ], (, ), break the parser?

I'm trying to use a regexp query to match a value in title that is surrounded by [] or () characters, but I get an unexpected error when I escape these characters. I have tried both \[ and \\[. Looking for some guidance.

'{
    "query": {
        "regexp":{
            "title": {
                "value": "\[mytag\]",
            }
        }
    }
}'

Seeing the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parsing_exception",
        "reason" : "Failed to parse",
        "line" : 6,
        "col" : 26
      }
    ],
    "type" : "parsing_exception",
    "reason" : "Failed to parse",
    "line" : 6,
    "col" : 26,
    "caused_by" : {
      "type" : "json_parse_exception",
      "reason" : "Unrecognized character escape '[' (code 91)\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3596d5c9; line: 6, column: 29]"
    }
  },
  "status" : 400
}

Hi @foobarbaz,

for me it works fine with \\; see the following example (tested on Elasticsearch 5.2.1):

DELETE library

PUT library
{
   "mappings": {
      "quotes": {
         "properties": {
            "content": {
               "type": "text",
               "fields": {
                  "raw": {
                      "type": "keyword"
                  }
               }
            }
         }
      }
   }
}

POST /library/quotes
{
    "content": "Hello [here] I am"
}

GET /library/quotes/_search
{
    "query": {
        "regexp":{
            "content.raw": {
                "value": ".*\\[here\\].*"
            }
        }
    }
}
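
The json_parse_exception in the question suggests why the single backslash fails: the request body is parsed as JSON before the regexp is ever evaluated, and \[ is not a valid JSON escape. Doubling the backslash gives the JSON parser a literal backslash, which the regexp engine then treats as the escape for [. Here is a small Python sketch of that two-layer escaping. It only uses the standard json module as a stand-in for the server's JSON parser, so treat it as an illustration rather than what Elasticsearch runs internally:

```python
import json

# What the regexp engine should receive: one backslash before each bracket.
regex = r".*\[here\].*"

# Serializing the request body doubles each backslash automatically.
body = json.dumps({"query": {"regexp": {"content.raw": {"value": regex}}}})
print(body)

# A single backslash before '[' is not a valid JSON escape, so parsing
# fails before the regexp is ever looked at.
try:
    json.loads(r'{"value": "\[mytag\]"}')
    single_backslash_ok = True
except json.JSONDecodeError as err:
    single_backslash_ok = False
    print("rejected:", err)

# With doubled backslashes the body parses, and the decoded string
# carries exactly one backslash per bracket.
decoded = json.loads(r'{"value": "\\[mytag\\]"}')["value"]
print(decoded)
```

If you build the request from a client library or script, letting the JSON serializer do the escaping avoids having to count backslashes by hand.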

By the way, if you want to store something like tags, it might be worth checking whether you can index them as an array. Then there is no need for regexes.
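
A rough sketch of what that could look like, written as Python dicts that serialize to the request bodies. The tags field name and its keyword mapping are assumptions for illustration, not something taken from your index:

```python
import json

# Hypothetical document: tags live in their own array field instead of
# being embedded in the title as "[mytag]".
doc = {"title": "Some title", "tags": ["mytag", "othertag"]}

# Exact-match lookup of one tag; assumes "tags" is mapped as a keyword field.
query = {"query": {"term": {"tags": "mytag"}}}

print(json.dumps(doc))
print(json.dumps(query, indent=2))
```

Because the tag is matched as a plain value, there are no brackets to escape at either the JSON or the regexp level.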

Daniel

Daniel,

Thanks so much for this! It worked, and I ended up taking your advice about indexing the tags as an array.

Appreciate your time and help!

Cheers mate,
Joshua
