Regex search with the Elasticsearch Python API


(Diegoorellanaga) #1

Hi,

I have been trying to run a regex search against ES using the Python API, but I have not been able to get it working. My regex tries to detect 8 consecutive digits: ".*(^|[^\d])\d{8}($|[^\d]).*". The tester at https://regex101.com/ tells me this regex is OK, but when I use it through the ES Python API I get no hits. If I change \d{8} to \d{1} I do get hits, so I guess it is a regex problem somehow. (And yes, that index does contain names with 8 consecutive digits.)

> from elasticsearch import Elasticsearch
> es = Elasticsearch(['xxx.xxx.xxx:9200'])
> match = es.search(index="suricata-fileinfo-2017.07", doc_type='suricata-fileinfo', body={"query": {"regexp" : {"fileinfo_filename" : ".*(^|[^\d])\d{8}($|[^\d]).*"}}}, request_timeout=60)

Any idea what is wrong?


(Daniel Mitterdorfer) #2

Hi @diegoorellanaga,

I don't think this is Python specific. In the regexp query docs I do not see that the \d character class is supported. Can you replace your regex with ".*(^|[^0-9])[0-9]{8}($|[^0-9]).*" instead and retry?
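
One more thing to keep in mind: the Lucene regex engine behind the regexp query always matches against the whole term and does not treat ^ and $ as anchors, so the "(^|...)" and "($|...)" alternatives may not behave the way they do on regex101. Below is a rough sketch of how the pattern could be written without anchors. The helper names (consecutive_digits_pattern, build_regexp_query, matches_locally) are made up for illustration, and the sketch assumes fileinfo_filename is indexed as a single, un-analyzed term. The local check uses Python's re.fullmatch only as an approximation of whole-term matching; it is not the Lucene engine.

```python
import re


def consecutive_digits_pattern(n):
    """Build a pattern for values containing a run of exactly n digits.

    Uses only constructs the Lucene regexp syntax supports ([0-9] classes,
    groups, ?, {n}); no \\d, ^ or $. The optional groups stand in for the
    anchors: the run must touch the start/end of the value or a non-digit.
    """
    digit, non_digit = "[0-9]", "[^0-9]"
    return "(.*{nd})?{d}{{{n}}}({nd}.*)?".format(d=digit, nd=non_digit, n=n)


def build_regexp_query(field, pattern):
    """Return a search body for a regexp query on the given field."""
    return {"query": {"regexp": {field: pattern}}}


pattern = consecutive_digits_pattern(8)
body = build_regexp_query("fileinfo_filename", pattern)


def matches_locally(value):
    """Approximate whole-term matching with Python's re (sanity check only)."""
    return re.fullmatch(pattern, value) is not None
```

The resulting body can be passed to es.search(index=..., body=body) in the same way as in the original post. Building the pattern in a helper keeps the digit count in one place, and the local check is a cheap way to catch typos before querying the cluster.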

Daniel

