Hi All,
Could you help me understand, by example, what complement in regexp queries really does?
I've read the documentation, but it's still not clear to me.
For example, I have the following regexp query that tries to select only Android 4.0 user agents. The problem is that Windows Phone 8.1 user agents contain the same "Android 4.0" part, so I've created the following:
"regexp" : {
  "userAgent" : {
    "value" : "~(.*Windows Phone)(.*)Android 4\\.0(.*)",
    "flags" : "COMPLEMENT"
  }
}
The problem is that user agent values like
Mozilla/5.0 (Mobile; Windows Phone 8.1; Android 4.0; ARM; Trident/7.0; Touch; rv:11.0; IEMobile/11.0; Microsoft; Lumia 640 Dual SIM) like iPhone OS 7_0_3 Mac OS X AppleWebKit/537 (KHTML, like Gecko) Mobile Safari/537
do match the regexp, even though I've tried to exclude "Windows Phone" from the match.
What am I missing? How should I think about the complement operator ~ in Lucene regexps?
Doesn't my query say something like:
every value that doesn't have "Windows Phone" at the beginning, then has anything else, then "Android 4.0", and then once again anything else?
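To check my reading, here is a rough Python sketch (not Lucene itself) of how I think the first query could be evaluated. It rests on two assumptions I'm not sure about: that the pattern has to match the whole field value, and that ~ applies only to the parenthesized group right after it, with the rest of the pattern concatenated. The helper names are made up for illustration.

```python
import re

UA = ("Mozilla/5.0 (Mobile; Windows Phone 8.1; Android 4.0; ARM; Trident/7.0; "
      "Touch; rv:11.0; IEMobile/11.0; Microsoft; Lumia 640 Dual SIM) like iPhone "
      "OS 7_0_3 Mac OS X AppleWebKit/537 (KHTML, like Gecko) Mobile Safari/537")


def in_complement(prefix):
    # ~(.*Windows Phone): any string that is NOT "anything ending in Windows Phone"
    return re.fullmatch(r".*Windows Phone", prefix, re.DOTALL) is None


def in_rest(suffix):
    # (.*)Android 4\.0(.*), matched against the whole suffix
    return re.fullmatch(r".*Android 4\.0.*", suffix, re.DOTALL) is not None


def first_matching_split(value):
    # Concatenation (assumed): the value matches if SOME split into
    # prefix + suffix satisfies both parts. Return the first such split point.
    for i in range(len(value) + 1):
        if in_complement(value[:i]) and in_rest(value[i:]):
            return i
    return None


print(first_matching_split(UA))  # 0
```

Under those assumptions the empty prefix already satisfies the complement part, which would explain why the Windows Phone value still matches.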
Update:
I've managed to get what I want with intersection; however, the manual advises rethinking the approach rather than using it:
"regexp" : {
  "userAgent" : {
    "value" : "~(.*Windows Phone.*)&.*Android 4\\.0.*",
    "flags" : "COMPLEMENT|INTERSECTION"
  }
}
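As far as I can tell, the intersection version behaves like the Python sketch below. This is only an emulation with Python's re module, assuming both sides of & are matched against the whole value; the function name and the sample user agents are hypothetical.

```python
import re


def intersection_match(value):
    # ~(.*Windows Phone.*): the whole value must NOT contain "Windows Phone"
    no_windows_phone = re.fullmatch(r".*Windows Phone.*", value, re.DOTALL) is None
    # .*Android 4\.0.*: the whole value must contain "Android 4.0"
    has_android = re.fullmatch(r".*Android 4\.0.*", value, re.DOTALL) is not None
    # & (assumed): both conditions must hold for the same whole value
    return no_windows_phone and has_android


# Hypothetical, shortened sample values for illustration only
android_ua = "Mozilla/5.0 (Linux; U; Android 4.0.3; en-us) AppleWebKit/534.30"
winphone_ua = "Mozilla/5.0 (Mobile; Windows Phone 8.1; Android 4.0; ARM; Trident/7.0)"

print(intersection_match(android_ua))   # True
print(intersection_match(winphone_ua))  # False
```

Here the complement covers the whole value, so there is no split point that could skip over "Windows Phone".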
Thanks in advance