Totally lost, why won't simple queries work?

Hello,

For some reason I cannot get a search on anything but _all to work for my index. I currently have an index (mapping included) that is populated using the ingest attachment plugin. I am trying to search against a few specific fields using simple_query_string (specifically document.name, document.section.name, and document.section.content) but they just don't work.

First, my mapping:

{
  "fbs": {
    "mappings": {
      "institution": {
        "properties": {
          "document": {
            "type": "nested",
            "properties": {
              "expiration_date": {
                "type": "date"
              },
              "flags": {
                "type": "text",
                "norms": false
              },
              "id": {
                "type": "keyword"
              },
              "is_active": {
                "type": "boolean"
              },
              "is_current": {
                "type": "boolean"
              },
              "name": {
                "type": "text"
              },
              "section": {
                "type": "nested",
                "properties": {
                  "created_at": {
                    "type": "date"
                  },
                  "data": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  },
                  "file": {
                    "type": "nested",
                    "properties": {
                      "author": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "content": {
                        "type": "text"
                      },
                      "content_length": {
                        "type": "long"
                      },
                      "content_type": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "date": {
                        "type": "date"
                      },
                      "language": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "title": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      }
                    }
                  },
                  "filename": {
                    "type": "text",
                    "norms": false
                  },
                  "fingerprint": {
                    "type": "text",
                    "norms": false
                  },
                  "flags": {
                    "type": "text",
                    "norms": false
                  },
                  "id": {
                    "type": "keyword"
                  },
                  "is_active": {
                    "type": "boolean"
                  },
                  "name": {
                    "type": "text"
                  },
                  "updated_at": {
                    "type": "date"
                  }
                }
              },
              "start_date": {
                "type": "date"
              }
            }
          },
          "id": {
            "type": "keyword"
          },
          "name_en": {
            "type": "text",
            "norms": false
          },
          "name_fr": {
            "type": "text",
            "norms": false
          },
          "region_id": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Here's a document in my index:

{
  "_index": "fbs",
  "_type": "institution",
  "_id": "7691fcaa54659612b42b8a158d5d4b9e.pdf",
  "_version": 1,
  "found": true,
  "_source": {
    "name_fr": "Western University",
    "document": {
      "is_active": 1,
      "name": "Full-time Collective Agreement",
      "flags": "full-time",
      "section": {
        "filename": "7691fcaa54659612b42b8a158d5d4b9e.pdf",
        "is_active": 1,
        "file": {
          "date": "2005-03-17T18:16:19Z",
          "content_type": "application/pdf",
          "author": "TSm",
          "language": "en",
          "title": "Collective Agreement",
          "content": """<CONTENT OF PDF>
""",
          "content_length": 2791
        },
        "data": "<BASE 64 PDF>",
        "updated_at": "2017-02-23",
        "name": "Table of Contents",
        "flags": "full-time",
        "fingerprint": "a4ae893e292dd8a58e906eba7ddb8d70",
        "created_at": "2017-02-23",
        "id": 1
      },
      "id": 1,
      "is_current": 0,
      "expiration_date": "2006-06-30",
      "start_date": "2002-07-01"
    },
    "region_id": 2,
    "name_en": "Western University"
  }
}

If I perform this query, I get the result I want:

GET fbs/institution/_search
    "query": {
      "simple_query_string": {
        "query": "table",
        "fields": [
          "_all"  
        ]
      }
    }
}

If I perform this query, it fails -- when I try to explain why I get "no matching term"

GET fbs/institution/7691fcaa54659612b42b8a158d5d4b9e.pdf/_explain
{
    "query": {
      "simple_query_string": {
        "query": "table",
        "fields": [
          "document.section.name"  
        ]
      }
    }
}

{
  "_index": "fbs",
  "_type": "institution",
  "_id": "7691fcaa54659612b42b8a158d5d4b9e.pdf",
  "matched": false,
  "explanation": {
    "value": 0,
    "description": "Failure to meet condition(s) of required/prohibited clause(s)",
    "details": [
      {
        "value": 0,
        "description": "no match on required clause (document.section.name:table)",
        "details": [
          {
            "value": 0,
            "description": "no matching term",
            "details": []
          }
        ]
      },
...

So I'm positive I've done something very stupid here, but I can't for the life of me figure out what the hell I've done wrong. Anyone please point me in the right direction?!

John

It's because you are using nested fields here which requires a specific way to query on them.

Look at https://www.elastic.co/guide/en/elasticsearch/reference/5.2/query-dsl-nested-query.html

1 Like

Thanks for getting back to me, but it still doesn't seem to work.. reading the docs it appears this query should return correctly with a hit does it not?

GET fbs/institution/_search
{
    "query": {
      "nested": {
        "path": "document",
        "query": {
          "term": {
            "document.section.name": "Table of Contents"
          }
        }
      }
    }
}

I've also tried document.section for the path with no luck.

But you have more than one level of nested here.

Could you provide an example of what you mean of what should be a working query?

I've tried this as well to no avail, if that's what you're suggesting:

GET fbs/institution/_search
{
    "query": {
      "nested": {
        "path": "document",
        "query": {
          "nested": {
            "path": "document.section",
            "query": {
              "term": {
                 "document.section.name": "Table of Contents"
              }
            }
          }
        }
      }
    }
}

I've tried this as well to no avail, if that's what you're suggesting:

Yes that's what I'm suggesting.

Note that your are using a Term query on a text field which is analyzed.

Table of Contents can't match in such a case. table should I believe.

And you were correct! Doh of course a literal wouldn't work on an analyzed field.

Thank you -- I'm starting to lose myself in the forrest here so I appreciate you pointing that out. Now that I have at least something working here, I can start playing with it and see where I get. If I have any more problems I'll ping but just having something actually working that makes sense helps a lot.

Out of curiosity, did this behavior change in a meaningful way? I literally had a different version of this without mappings (dynamic) where my queries worked fine without all this nested stuff so that's what was really throwing me off this time. I don't quite understand why one worked and the other didn't.

If you are using dynamic mappings, then your mapping does not have any nested objects.
Just regular objects.

That's probably why it works in one case and not in the other one.

Ah, makes sense.

If I could trouble you for one more question... Thanks to your help I've built my query and it all seems to be working exactly (at least for now) as I intended.

One last thing: In my old dynamic based mapping, I was doing an agg to return the total number of unique institutions returned in the results. This is in the mapping we're discussing done by counting the total unique _source.id (not _source._id) values we have in our search results. I tried just re-using what I had before that was working and adjusting the field names but it's not working any more... any thoughts?

    "aggs": {
        "total_institutions": {
            "cardinality": {
                "field": "id"
            }
        }
    }

I don't see any source.id field in your mapping. What are you talking about? May be provide an example?

FWIW you can use copy_to feature in the mapping to copy a nested value to a parent field so the aggregation will behave as it was previously?

My 2 cents

In the mapping there is this:

"id": {
            "type": "keyword"
          },
          "name_en": {
            "type": "text",
            "norms": false
          },

It's not clear in the mapping but the various 'id' fields are actually all different keys in a database I've included just so I can reference any section. But specifically the ID field above (which is top-level to the institution type) is what I used to count unique values of in an aggregation so I knew how many unique institutions were returned in the results. That's what isn't working anymore as it just returns 0 (although I would have expected it to return 1 since I did return one result)

Make more sense now?

I can see id but I can't see source.id in mapping or in the sample document.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.