I am using Sense to extract data from Elasticsearch with a wildcard query. I have an analyzed field 'data'.
GET /_search
{
  "query": {
    "wildcard": {
      "data": "http*.???"
    }
  }
}
This is not working.
What I actually need is to extract only the URL from the field. The field contains a longer string, and the URL sits somewhere inside that string. The extracted URL should look like http://localhost?get.pdf

While this might work using the uax_url_email tokenizer, I strongly recommend extracting URLs as a preprocessing step before indexing your documents. That will make searching for URLs much easier, and also much faster.
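As a rough sketch of that preprocessing step (in Python; the regex, the function names and the separate `urls` field are assumptions for illustration only, not something Elasticsearch prescribes), you could pull the URLs out of the text before sending the document to the index:

```python
import re

# Assumed pattern: an http/https scheme followed by anything that is not
# whitespace, an angle bracket, or a quote. Adjust it to your data.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text):
    """Return all URL-like substrings found in text, in order of appearance."""
    # Trailing punctuation usually belongs to the sentence, not to the URL.
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def prepare_document(raw):
    """Build the document to index: the original text plus a 'urls' field."""
    # 'urls' is a hypothetical field name chosen for this example.
    return {"data": raw, "urls": extract_urls(raw)}


doc = prepare_document("see http://localhost?get.pdf for details")
print(doc["urls"])
```

If the extracted URLs are stored in their own field that is not analyzed, a wildcard query against that field should be matched against each whole URL rather than against the individual tokens produced by the analyzer.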

You could, for example, write your own ingest processor, as mentioned in that blog post.


