Querying Nested Structure Without Hardcoding Keys


#1

Hi,

I am totally new to Elasticsearch 6.0 and need help in querying documents with a nested structure (below). I have kept the example below simple, but in total the payload is around 28 KB.

My Query
What I am trying to ask is "Give me documents where any image.wl_w = 909". It works if I know the image id (e.g. below overlay_json.images.600960.wl_w), but the image id could be anything, so hardcoding is not an option

GET cwl-2018.02.06/_search
{
    "query": {
        "bool" : {
            "must": [
            {"match" : { "overlay_json.images.600960.wl_w" : "909" }}]}
        }
}

JSON Structure

  "_source": {
    "overlay_id": 12223,
    "overlay_created_at": "2017-09-04 21:09:41.592475+00:00",
    "overlay_json": {
      "images": [
        {
          "600960": [
            {
              "wl_c": 11793,
              "wl_w": 2028,
              ...
            },
            {
              "wl_c": 999,
              "wl_w": 1000,
              ...
            }]
        },
        {
          "83343": [
            {
              "wl_c": 4351,
              "wl_w": 3442,
              ...
            },
            {
              "wl_c": 56642,
              "wl_w": 909,
              ...
            }]
        },
        ...
        ]}}

** Snippet of Mapping Created by ES**

{
  "cwl-2018.02.06": {
    "mappings": {
      "data_overlays": {
        "properties": {
          "overlay_created_at": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "overlay_id": {
            "type": "long"
          },
          "overlay_json": {
            "properties": {
              "images": {
                "properties": {
                  "154528": {
                    "properties": {
                      "wl_c": {
                        "type": "long"
                      },
                      "wl_w": {
                        "type": "long"
                      }
                    }
                  },
                  "154531": {
                    "properties": {
                      "wl_c": {
                        "type": "long"
                      },
                      "wl_w": {
                        "type": "long"
                      },
                    }
                  }
                },
              }
            }
          }
        }
      }
    }
  }
}

Any help is much appreciated
Thanks
Jag


(Mark Harwood) #2

Generally, the left hand side of "foo" : "bar" JSON documents doesn't want to be an endlessly growing list of values. The list of unique fieldnames is one area in which elasticsearch is not endlessly scalable.
If you need huge numbers of fieldnames then you need to look at ways of shifting these values to the right hand side of the JSON e.g.
nested objects:

{
    "my_key_val_props": [ 
         { "key" : "foo1021371", "value": "bar"},
         { "key" : "foo9090112", "value": "bar"}
   ]

}

Or concatenated strings:

{
  "keyAndValue" : "foo1021371_bar"
}  

There's pros and cons to these alternative models but endless new field names is generally a bad practice.


#3

Thank you for the quick reply! I have transformed the json according to your suggestion and it works!


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.