Wildcard in search query


(anonymous) #1

Hi,
I am using Sense to extract data from Elasticsearch with a wildcard query. I have an analyzed field 'data':
GET /_search
{
  "query": {
    "wildcard": {
      "data": "http*.???"
    }
  }
}
This is not working.
What I actually need is to extract only the URL from the field. The field holds a longer string, and the URL sits somewhere inside that string. The extracted URL should look like http://localhost?get.pdf


(Alexander Reelsen) #2

Hey,

While this might work using the uax_url_email tokenizer, I strongly recommend extracting URLs as a preprocessing step before indexing your documents. That will make searching for URLs much easier, and also much faster.

You could, for example, write your own ingest processor, as mentioned in that blog post.
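
If a custom ingest processor is more than you need, the same preprocessing can be done on the client before the document is sent to Elasticsearch. The following is only a rough sketch of that idea in Python, not the approach from the blog post. The regular expression, the function names extract_urls and prepare_document, and the target field name 'urls' are all assumptions made for illustration. It also assumes that a URL never contains whitespace.

```python
import re

# Assumed pattern: an http(s) URL runs until the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")

# Punctuation that commonly follows a URL in running text and is not part of it.
TRAILING_PUNCTUATION = ".,;:!?)\"'"


def extract_urls(text):
    """Return all http(s) URLs found in free text, without trailing punctuation."""
    return [match.rstrip(TRAILING_PUNCTUATION) for match in URL_PATTERN.findall(text)]


def prepare_document(doc):
    """Return a copy of doc with a hypothetical 'urls' field added.

    The URLs are taken from the 'data' field used in the question.
    """
    enriched = dict(doc)
    enriched["urls"] = extract_urls(doc.get("data", ""))
    return enriched
```

For example, prepare_document({"data": "see http://localhost?get.pdf for details"}) adds the field "urls": ["http://localhost?get.pdf"]. If the 'urls' field is then mapped as a non-analyzed (keyword) field, each URL is indexed as a single term, so a wildcard or prefix query can match against the whole URL.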

--Alex

