Wildcard in search query

anonymous · December 5, 2016, 5:06am

Hi,
I am using Sense to extract data from Elasticsearch with a wildcard query. I have an analyzed field 'data':
GET /_search
{
  "query": {
    "wildcard": {
      "data": "http*.???"
    }
  }
}
This is not working.
What I actually need is to extract only the URL from the field, which contains a longer string with the URL embedded in it. The extracted URL should look like http://localhost?get.pdf

spinscale · December 5, 2016, 9:25am

Hey,

While this might work with the uax_url_email tokenizer, I strongly recommend extracting URLs as a preprocessing step before indexing your documents. That makes searching for URLs much easier and also much faster.
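
As a rough illustration of that preprocessing step, here is a minimal Python sketch that pulls URLs out of the raw text before the document is sent to Elasticsearch. The regular expression, the helper names (extract_urls, prepare_document) and the extra 'urls' field are assumptions made for this example, not something from the original question, and the pattern is deliberately simple rather than a complete URL grammar.

```python
import re

# Assumed pattern: an http/https scheme followed by characters that are
# not whitespace, angle brackets, or quotes. Adjust it to your data.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text):
    """Return all http(s) URLs found in text, in order of appearance."""
    # Trailing punctuation usually belongs to the sentence, not the URL.
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def prepare_document(raw):
    """Build the document to index: the original text plus a 'urls' field.

    'urls' is a hypothetical field name chosen for this sketch.
    """
    return {"data": raw, "urls": extract_urls(raw)}


if __name__ == "__main__":
    doc = prepare_document("see http://localhost?get.pdf for details")
    print(doc["urls"])
```

If the hypothetical 'urls' field is then mapped as a non-analyzed (keyword) field, each URL is stored as a single term, so exact-match or wildcard queries run against the whole URL instead of against the individual tokens produced by the analyzer.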

You could write your own ingest processor as mentioned in that blog post for example.

--Alex
