Querying text that starts with a slash in Elasticsearch 6.4.3

Hi all,
Here is a sample document from my Elasticsearch index, as returned by a search:

{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 5.7112346,
    "hits" : [
      {
        "_index" : "psql-widget-raw-log-2019.03",
        "_type" : "fluentd",
        "_id" : "33590",
        "_score" : 5.7112346,
        "_source" : {
          "id" : 33590,
          "domain_info_id" : 1,
          "path_string" : "/test.html?p1=aaa&p2=bbb",
          "user_group_id" : "gid01",
          "user_id" : "uid-01",
          "ctime" : "2019-03-19 02:29:15.026200+0000",
          "@timestamp" : "2019-03-19T10:29:15.000000000+08:00"
        }
      }
    ]
  }
}

I want to get the documents whose path_string starts with "/test.html".

I've tried prefix and wildcard queries, as well as escaping the slash,
but none of them seems to work.

I can only get the data with a match query.

Any help would be greatly appreciated
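
One quick way to see why a prefix or wildcard query might miss here is to ask Elasticsearch how the value is tokenized. The request below is only a sketch: it assumes the index name from the sample hit and that path_string is an analyzed text field, which the thread has not confirmed at this point. With the "field" parameter, the _analyze API uses whatever analyzer that field is mapped with.

```console
# Assumption: index name taken from the sample hit above
GET /psql-widget-raw-log-2019.03/_analyze
{
  "field": "path_string",
  "text": "/test.html?p1=aaa&p2=bbb"
}
```

If none of the returned tokens begins with "/", a prefix query for "/test.html" has no indexed term to match, because prefix queries are not analyzed. A match query can still work because its query text goes through the same analyzer as the field.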

What is the mapping?

BTW, I'm not sure what your use case is, but maybe you'd like to look at the path tokenizer.
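
For anyone following along, the effective mapping (including dynamically added fields) can be read back from the index itself. This is a small sketch that assumes the index name shown in the sample hit; the second request narrows the output to the one field in question.

```console
# Assumption: index name taken from the sample hit in the question
GET /psql-widget-raw-log-2019.03/_mapping

# Only the mapping of path_string
GET /psql-widget-raw-log-2019.03/_mapping/field/path_string
```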

Thanks, @dadoonet

Here is my mapping

PUT /_template/psql-widget-raw-log-format
{
  "index_patterns": ["psql-widget-raw-log-*"],
  "settings": {
    "number_of_shards": 2
  },
  "mappings": {
    "fluentd": {
      "properties": {
        "id": {
          "type": "integer"
        },
        "domain_info_id": {
          "type": "integer"
        }
      }
    }
  }
}

My use case looks like this in SQL:

select count(*) from psql-widget-raw-log where path_string like '/test.html%'

I still couldn't query by prefix, wildcard, or regexp after changing the field value from '/test.html' to '%2Ftest.html' (or '%2Ftest.html', '\%2Ftest.html').

The same queries work on other fields such as 'user_group_id' or 'user_id', so I think I'm querying in the right way.

Thanks for your help
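
Since the template maps only id and domain_info_id, path_string was presumably mapped dynamically. With Elasticsearch's default dynamic mapping, a string becomes an analyzed text field plus an unanalyzed path_string.keyword subfield. That is an assumption about this index, not something the thread shows. The standard analyzer drops the leading slash at index time, which would explain why a prefix query works on user_id but not on path_string. Under that assumption, the closest equivalent of the SQL LIKE '/test.html%' count would be a prefix query on the keyword subfield:

```console
# Assumption: path_string.keyword exists (default dynamic mapping)
GET /psql-widget-raw-log-*/_count
{
  "query": {
    "prefix": {
      "path_string.keyword": "/test.html"
    }
  }
}
```

Note that the default keyword subfield is created with ignore_above: 256, so values longer than 256 characters are not indexed there and would be missing from the count.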

I'd use a path tokenizer. See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pathhierarchy-tokenizer.html
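
To get a feel for what the path tokenizer produces before changing any mapping, the _analyze API can be called with the tokenizer directly. This is just a sketch using the sample value from the question; no index is needed.

```console
GET /_analyze
{
  "tokenizer": "path_hierarchy",
  "text": "/test.html?p1=aaa&p2=bbb"
}
```

The tokenizer splits on "/" by default, so a value like "/a/b/c" yields "/a", "/a/b" and "/a/b/c". A query string is not split off: it stays attached to the last path segment, so matching every document under "/test.html" regardless of parameters would still call for a prefix query on the resulting term.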

So should I change my mapping like this:

PUT /_template/psql-widget-raw-log-format
{
  "index_patterns": ["psql-widget-raw-log-*"],
  "settings": {
    "number_of_shards": 2
  },
  "mappings": {
    "fluentd": {
      "properties": {
        "id": {
          "type": "integer"
        },
        "domain_info_id": {
          "type": "integer"
        },
        "path_string": {
          "type": "string",
          "analyzer": {"tokenizer": "path_hierarchy"}
        }
      }
    }
  }
}

and then use a query like this

curl -X GET "localhost:9200/psql-widget-raw-log-*/_search?pretty" -H 'Content-Type: application/json' -d'
{
    "query": {
        "bool": {
            "must": [
                {"prefix": {"path_string": "/test.html"}} 
            ],
            "filter":[ 
                {"range" : 
                    {"@timestamp" : 
                        {"gte" : "2019-03-19T10:00:00","lte" : "2019-03-20T00:59:59",
                        "time_zone": "+08:00"}
                    }
                },
                {"match": {"domain_info_id": "1"}}
            ]
        }
    }
}'

to get all documents whose path_string starts with "/test.html", e.g. "/test.html%3Fa%3Dthanku", "/test.html", etc.?

Is this correct?

Not exactly. You need to define an analyzer first. Then use the analyzer in your mapping.
Then you can try just with a term or match query instead of a prefix query.
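
As a rough sketch of what "define an analyzer first" could look like for this template: the analyzer is declared under settings.analysis and then referenced by name in the mapping. The name path_analyzer is made up for illustration, and the "keyword" search analyzer is one possible choice (it keeps the query text as a single term), not something the thread prescribes. The field type is "text" because the "string" type no longer exists in 6.x.

```console
# path_analyzer is an illustrative name, not from the thread
PUT /_template/psql-widget-raw-log-format
{
  "index_patterns": ["psql-widget-raw-log-*"],
  "settings": {
    "number_of_shards": 2,
    "analysis": {
      "analyzer": {
        "path_analyzer": {
          "type": "custom",
          "tokenizer": "path_hierarchy"
        }
      }
    }
  },
  "mappings": {
    "fluentd": {
      "properties": {
        "id": { "type": "integer" },
        "domain_info_id": { "type": "integer" },
        "path_string": {
          "type": "text",
          "analyzer": "path_analyzer",
          "search_analyzer": "keyword"
        }
      }
    }
  }
}
```

A template only applies to indices created after it is stored, so existing psql-widget-raw-log-* indices would keep their old mapping until they are reindexed.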

If you need further help, please provide a full recreation script as described in "About the Elasticsearch category". It will help to better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.
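
For this question, a recreation script could be as small as the following Console requests. It is a sketch, not the poster's actual script: test-path is a hypothetical throwaway index, the document reuses two fields from the sample hit, and no mapping is defined so that dynamic mapping applies.

```console
# Hypothetical throwaway index; DELETE returns 404 if it does not exist yet
DELETE /test-path

# Index one document and make it searchable immediately
PUT /test-path/_doc/1?refresh
{
  "path_string": "/test.html?p1=aaa&p2=bbb",
  "user_id": "uid-01"
}

# The thread reports that prefix queries work on this field
GET /test-path/_search
{ "query": { "prefix": { "user_id": "uid" } } }

# The thread reports that this kind of query finds nothing
GET /test-path/_search
{ "query": { "prefix": { "path_string": "/test.html" } } }
```

Anyone can paste this into Kibana Dev Tools and see the same behaviour without access to the original data.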
