How to use regexp query filter in ElasticSearch Source Filtering?


(Srikanth) #1

Given the test data below from an Elasticsearch index, how do we exclude the field(s) whose names contain a UUID?
Can we use regular expressions here? As I understand it, regexp matches field values, not field names (keys).

{
  "_index": "twitter",
  "_type": "test_data",
  "_id": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28",
  "_score": 8.843938,
  "_source": {
    "test_data-created_time": 1485858118,
    "test_data-status": 1,
    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28": "ElasticSearch"
  }
}

I can achieve this with the query below, but many fields have a UUID as part of their name, so listing each one explicitly does not scale. Does anyone have a better idea for how to approach this?

GET /twitter/test_data/_search
{
  "_source": {
    "excludes": [
      "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28"
    ]
  }
}


(David Pilato) #2

Maybe use https://www.elastic.co/guide/en/elasticsearch/reference/6.2/mapping-field-names-field.html


(Srikanth) #3

Can we use regex to exclude the list of fields from "_field_names"?


(David Pilato) #4

Do you mean excluding documents that have that field?


(Srikanth) #5

Hi David, I don't want to exclude the document altogether, only the matching fields from the output. A better way to put the question is as follows.

Can we exclude the fields "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28", "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28_1" and "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28_2" while querying using regex in "_field_names"?

Given Input:
{
  "_id": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28",
  "_index": "cluster",
  "_type": "cluster_info",
  "_score": 8.843938,
  "_source": {
    "cluster-created_time": 1485858118,
    "cluster-status": "active",
    "cluster_uuid": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28",
    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28": "abc",
    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28_1": "pqr",
    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28_2": "xyz"
  }
}

Desired Output:
"hits": {
  "total": 2,
  "max_score": 1,
  "hits": [
    {
      "_id": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28",
      "_index": "cluster",
      "_type": "cluster_info",
      "_score": 1,
      "_source": {
        "cluster-created_time": 1485858118,
        "cluster-status": "active",
        "cluster_uuid": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28"
      }
    }
  ]
}


(David Pilato) #6

So you are probably looking for https://www.elastic.co/guide/en/elasticsearch/reference/6.2/search-request-source-filtering.html


(Srikanth) #7

Thanks for the reply, David. I am relatively new to ES and still getting up to speed with its terminology.

The query below is only a partial solution, because the wildcard pattern "cluster-*-*-*-*-*" also matches a few other fields that I want to keep:

GET /twitter/test_data/_search
{
  "_source": {
    "excludes": [
    "cluster-*-*-*-*-*"
    ]
  }
}

I tried the regex "cluster-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}" instead, but it does not give the expected results.
Any idea why this regex is not accepted as a source filtering excludes pattern?
Or is there a way to make this regex work as part of an excludes pattern?
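
To illustrate the difference between the two patterns, here is a minimal Python sketch. It assumes the excludes patterns are matched as plain wildcards where only "*" is special; simple_wildcard_match is a hypothetical helper written for this illustration, not Elasticsearch code, and "cluster-a-b-c-d-e" is a made-up field name.

```python
import re

def simple_wildcard_match(pattern: str, name: str) -> bool:
    """Approximate wildcard matching where '*' is the only special character.

    Assumption: this mimics how a wildcard-only matcher would treat a pattern;
    it is not the actual Elasticsearch implementation.
    """
    # Escape every literal chunk and let each '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None

uuid_field = "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28"
other_field = "cluster-a-b-c-d-e"  # hypothetical non-UUID field name

wildcard = "cluster-*-*-*-*-*"
regex_text = "cluster-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

print(simple_wildcard_match(wildcard, uuid_field))    # True
print(simple_wildcard_match(wildcard, other_field))   # True: the wildcard over-matches
print(simple_wildcard_match(regex_text, uuid_field))  # False: brackets and braces are literal text
```

Under that assumption, the wildcard cannot tell a UUID from any other five-part name, and the regex text never matches because it is compared character by character.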


(David Pilato) #8

I think only wildcards are supported, not regex.

Maybe you would be better off doing that at the application level.
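
A minimal sketch of what application-level filtering could look like in Python, assuming the search response has already been parsed into a dict by whatever client you use. The helper names (drop_uuid_fields, filter_hits) are hypothetical, and the optional "_<digits>" suffix in the regex is an assumption added to cover the "_1" and "_2" fields from post #5.

```python
import re

# Regex from post #7. The optional "_<digits>" suffix is an assumption added
# here so that the "_1" and "_2" variants from post #5 are removed as well.
UUID_FIELD = re.compile(
    r"cluster-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(_\d+)?"
)

def drop_uuid_fields(source):
    """Return a copy of a _source dict without the UUID-named fields."""
    return {k: v for k, v in source.items() if not UUID_FIELD.fullmatch(k)}

def filter_hits(response):
    """Replace _source in every hit of a parsed search response (in place)."""
    for hit in response.get("hits", {}).get("hits", []):
        hit["_source"] = drop_uuid_fields(hit.get("_source", {}))
    return response

# Sample response built from the document in post #5.
response = {
    "hits": {
        "hits": [
            {
                "_id": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28",
                "_source": {
                    "cluster-created_time": 1485858118,
                    "cluster-status": "active",
                    "cluster_uuid": "05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28",
                    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28": "abc",
                    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28_1": "pqr",
                    "cluster-05ed7d0d-ec9e-4e48-a5a3-cdef11e7be28_2": "xyz",
                },
            }
        ]
    }
}

filtered = filter_hits(response)
print(sorted(filtered["hits"]["hits"][0]["_source"]))
# ['cluster-created_time', 'cluster-status', 'cluster_uuid']
```

Because the regex is applied with fullmatch on the field name, "cluster_uuid" is kept even though its value is a UUID.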

