I'm trying to run a regexp query on a not_analyzed field containing text that conforms to the following format:
word word keyOne=(20),keyTwo=(30)
I've tried various permutations to get a regexp query working that finds documents with two or more digits inside the round brackets after keyOne:
{
  "query": {
    "regexp": {
      "message_raw": "keyOne=\\([0-9]{2,}\\)"
    },
    "analyzer": "simple"
  }
}
This does not match:
{
  "_index" : "test",
  "_type" : "messages",
  "_id" : "AVUQAVW63lfkFHq9cox-",
  "matched" : false,
  "explanation" : {
    "value" : 0.0,
    "description" : "Failure to meet condition(s) of required/prohibited clause(s)",
    "details" : [ {
      "value" : 0.0,
      "description" : "no match on required clause (message_raw:/keyOne=\\([0-9]{2,}\\)/)",
      "details" : [ {
        "value" : 0.0,
        "description" : "message_raw:/keyOne=\\([0-9]{2,}\\)/ doesn't match id 0",
        "details" : [ ]
      } ]
    }, {
      "value" : 0.0,
      "description" : "match on required clause, product of:",
      "details" : [ {
        "value" : 0.0,
        "description" : "# clause",
        "details" : [ ]
      }, {
        "value" : 1.0,
        "description" : "_type:messages, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "boost",
          "details" : [ ]
        }, {
          "value" : 1.0,
          "description" : "queryNorm",
          "details" : [ ]
        } ]
      } ]
    } ]
  }
}
The only query I could get to match was the following:
{
  "query": {
    "regexp": {
      "message_raw": "[0-9]{2,}"
    },
    "analyzer": "simple"
  }
}
And the explanation:
{
  "_index" : "test",
  "_type" : "messages",
  "_id" : "AVUQAVW63lfkFHq9cox-",
  "matched" : true,
  "explanation" : {
    "value" : 1.0,
    "description" : "sum of:",
    "details" : [ {
      "value" : 1.0,
      "description" : "message_raw:/[0-9]{2,}/, product of:",
      "details" : [ {
        "value" : 1.0,
        "description" : "boost",
        "details" : [ ]
      }, {
        "value" : 1.0,
        "description" : "queryNorm",
        "details" : [ ]
      } ]
    }, {
      "value" : 0.0,
      "description" : "match on required clause, product of:",
      "details" : [ {
        "value" : 0.0,
        "description" : "# clause",
        "details" : [ ]
      }, {
        "value" : 1.0,
        "description" : "_type:messages, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "boost",
          "details" : [ ]
        }, {
          "value" : 1.0,
          "description" : "queryNorm",
          "details" : [ ]
        } ]
      } ]
    } ]
  }
}
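As a sanity check outside Elasticsearch, here is a minimal Python sketch that runs both patterns against the sample text. It rests on some assumptions. Python's re module is only a stand-in for Lucene's regexp engine, which should be close enough for these simple patterns. Lucene regexps are, as far as I understand, anchored to the whole term, which re.fullmatch mimics. The tokenization step is purely hypothetical, a rough guess at what an analyzed field might hold. The helper names are made up for illustration.

```python
import re

# Sample value from the question.
SAMPLE = "word word keyOne=(20),keyTwo=(30)"

# Patterns as they look after JSON unescaping (\\( in JSON becomes \( ).
KEY_PATTERN = r"keyOne=\([0-9]{2,}\)"
DIGITS_PATTERN = r"[0-9]{2,}"


def matches_whole_term(pattern, term):
    """Anchored match: the pattern has to cover the entire term."""
    return re.fullmatch(pattern, term) is not None


def matches_any_token(pattern, tokens):
    """Anchored match against each token, as if the field were analyzed."""
    return any(matches_whole_term(pattern, token) for token in tokens)


# Hypothetical tokenization: split on non-alphanumerics and lowercase.
# This is an assumption, not the actual mapping of message_raw.
tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9]+", SAMPLE) if t]

# Case 1: the field holds one term (truly not_analyzed).
print(matches_whole_term(KEY_PATTERN, SAMPLE))                # False
print(matches_whole_term(".*" + KEY_PATTERN + ".*", SAMPLE))  # True
print(matches_whole_term(DIGITS_PATTERN, SAMPLE))             # False

# Case 2: the field holds separate tokens (analyzed).
print(matches_any_token(DIGITS_PATTERN, tokens))              # True
print(matches_any_token(KEY_PATTERN, tokens))                 # False
```

In this sketch, only the tokenized case reproduces what I see (the digits-only pattern matches, the keyOne pattern does not), which would be consistent with message_raw being tokenized at index time. That is a guess from the sketch, not something the explain output confirms.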
I know it would be best to parse the message before sending it into Elasticsearch, but I'd still like to know why the regular expression doesn't work properly.
Any thoughts? Should I create an issue?