Is there something wrong with my query JSON?


(Zi Yi Wang(Keen)) #1

Here are my ES index settings:

// http://10.94.107.21:9200/ugc

{
  "ugc": {
    "aliases": {
      
    },
    "mappings": {
      "article": {
        "_all": {
          "store": true,
          "analyzer": "ik_smart"
        },
        "properties": {
          "authorDept1": {
            "type": "string"
          },
          "authorDept2": {
            "type": "string"
          },
          "authorDept3": {
            "type": "string"
          },
          "authorDept4": {
            "type": "string"
          },
          "authorDept5": {
            "type": "string"
          },
          "authorDisplayName": {
            "type": "string"
          },
          "authorDisplayNamePinYin": {
            "type": "string"
          },
          "authorEmail": {
            "type": "string"
          },
          "authorEmployeeId": {
            "type": "string"
          },
          "authorId": {
            "type": "long"
          },
          "authorUsername": {
            "type": "string"
          },
          "content": {
            "type": "string",
            "boost": 4.0,
            "analyzer": "ik_smart",
            "include_in_all": true
          },
          "id": {
            "type": "long"
          },
          "publishTime": {
            "type": "string"
          },
          "tags": {
            "type": "string",
            "boost": 10.0,
            "analyzer": "ik_smart",
            "include_in_all": true
          },
          "title": {
            "type": "string",
            "boost": 10.0,
            "analyzer": "ik_smart",
            "include_in_all": true
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1480584398456",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "ly7I_JPFQZSX5xRL1XOGow",
        "version": {
          "created": "2040199"
        }
      }
    },
    "warmers": {
      
    }
  }
}

curl http://10.94.107.21:9200/ugc/article/_search

The response:


{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 47,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "ugc",
        "_type": "article",
        "_id": "3858",
        "_score": 1.0,
        "_source": {
          "id": 3858,
          "title": "漫谈TCP",
          "content": "content of this document",
          "publishTime": "Nov 9, 2016 6:10:14 PM",
          "tags": [
            "tcp"
          ],
          "authorId": 5768,
          "authorUsername": "yejianfeng",
          "authorDisplayName": "叶剑峰",
          "authorDisplayNamePinYin": [
            "yejianfeng",
            "xiejianfeng"
          ],
          "authorEmail": "yejianfeng@bing.com",
          "authorEmployeeId": "05616",
          "authorDept1": "CEO",
          "authorDept2": "工程生产力部",
          "authorDept3": "运力中心",
          "authorDept4": "产品技术中心",
          "authorDept5": "运力中心技术部"
        }
      },
      {
        "_index": "ugc",
        "_type": "article",
        "_id": "3852",
        "_score": 1.0,
        "_source": {
          "id": 3852,
          "title": ""
.......

And my query JSON is:


{
   "query": {
       "filtered":{
           "filter":{
               "bool":{
                   "must":{
                       "bool": {
                           "must":{"term":{"authorDept2": "工程生产力部"}}
                       }
                   },
                   "must":{
                       "bool": {
                           
                       }
                   },
                   "must":{
                       "bool": {
                           
                       }
                   }
               }
           }
       }
   }
}

the query response:


{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
    }
}

Is there something wrong with my query?
I would like to retrieve documents from ES using filters only.

Thanks


(Mark Harwood) #2

In all likelihood it is the dodgy JSON.
You have 3 "must" clauses under the same object. JSON parsers will typically not raise an error when this happens; they keep the last of the values for the duplicate key (here, the value { "bool": {} }). The first "must" value, the one with the useful-looking criteria, will be ignored.
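
To make the duplicate-key behaviour concrete, here is a small sketch using Python's standard `json` module. It only shows how one common parser treats repeated keys; it is not the parser Elasticsearch uses, and the `reject_duplicates` helper is a hypothetical name introduced for illustration.

```python
import json

# A trimmed-down copy of the filter from the question: the key "must" appears twice.
raw = '''
{
  "bool": {
    "must": {"bool": {"must": {"term": {"authorDept2": "工程生产力部"}}}},
    "must": {"bool": {}}
  }
}
'''

# Default behaviour: no error is raised and the last duplicate wins.
parsed = json.loads(raw)
print(parsed["bool"]["must"])


def reject_duplicates(pairs):
    """Hypothetical object_pairs_hook: raise on repeated keys instead of dropping them."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Strict parsing: the same text is now rejected.
try:
    json.loads(raw, object_pairs_hook=reject_duplicates)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```

When several conditions are needed under one "must", the usual way to avoid duplicate keys is to give "must" a JSON array of clauses.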


(Zi Yi Wang(Keen)) #3

First of all, I did suspect the three must blocks. But even when I use only one block, it still does not work.

With a single block:


{
    "filter": {
        "bool": {
            "must": {
                "term": {"authorDept2": "工程生产力部"}
            }
        }
    }
}


(Mark Harwood) #4

You can debug why your doc is not matching using the explain API [1].
It will likely be a mismatch between the un-tokenized query term (un-tokenized because you used the term query) and the tokens that were actually indexed for that field. If you use the match query instead of the term query, the query text is analyzed in the same way as the field, so you can normally rely on your query terms matching whatever tokenization policy is used to index your content.

You can debug what tokens are being created in your index using the analyze API [2]

[1] https://www.elastic.co/guide/en/elasticsearch/reference/5.0/search-explain.html
[2] https://www.elastic.co/guide/en/elasticsearch/reference/5.0/_testing_analyzers.html
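
The term-versus-match difference can be sketched with a toy model in Python. This is not Elasticsearch code: `toy_analyze` is a hypothetical stand-in that emits one token per character, which is only an assumption about how a default analyzer might split the Chinese department name. The real tokens should be checked with the analyze API.

```python
def toy_analyze(text):
    """Toy stand-in for an analyzer: one token per non-space character.

    Assumption: this mimics how a default analyzer might split Chinese text.
    """
    return [ch for ch in text if not ch.isspace()]


def term_matches(indexed_tokens, query_text):
    """term query: the query text is looked up as-is, with no analysis."""
    return query_text in indexed_tokens


def match_matches(indexed_tokens, query_text):
    """match query: the query text is analyzed first, then its tokens are looked up."""
    return any(token in indexed_tokens for token in toy_analyze(query_text))


# Tokens stored for the field at index time (under the toy assumption).
indexed = toy_analyze("工程生产力部")

# The whole string is not one of the stored tokens, so the term lookup fails.
term_hit = term_matches(indexed, "工程生产力部")

# The analyzed query tokens are all present, so the match lookup succeeds.
match_hit = match_matches(indexed, "工程生产力部")

print(term_hit, match_hit)
```

If the field really is split this way, a term query for the full department name can never match, while a match query can.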


(Zi Yi Wang(Keen)) #5

I tried to use the explain API to find out what is wrong with my query, but the output contains nothing useful.

the full output:


{
    "_index": "ugc",
    "_type": "article",
    "_id": "_explain",
    "_version": 2,
    "found": true,
    "_source": {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": {
                            "bool": {
                                "must": {
                                    "term": {
                                        "authorDept2": "工程生产力部"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

On the other hand, all I want to do is build a filter query that uses multiple fields, like this logical statement:


fieldA = 1 AND (fieldB = "some word" OR fieldC = "some value")

What is the best way to do this?


(Mark Harwood) #6

That looks like you've created a document in your index with the ID "_explain" containing the source of a query.
What version of elasticsearch are you on and what command did you issue to get to this point?
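
For reference, a minimal sketch of how the explain request path is usually laid out: the ID of the document to be explained sits in the path before `_explain`, and the query goes in the request body. The `explain_url` helper is a hypothetical name, and the exact layout is an assumption that should be checked against the explain API documentation for the version in use.

```python
def explain_url(base, index, doc_type, doc_id):
    """Build the explain endpoint for one document (hypothetical helper).

    Assumption: the layout is <base>/<index>/<type>/<id>/_explain.
    """
    return f"{base}/{index}/{doc_type}/{doc_id}/_explain"


# Host, index, type and document ID are taken from the earlier posts in this thread.
url = explain_url("http://10.94.107.21:9200", "ugc", "article", 3858)
print(url)
```

If the document ID is left out, `_explain` itself may end up being treated as the document ID, which would be consistent with the output shown above.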


(Zi Yi Wang(Keen)) #7

My Elasticsearch version is 2.4.1.
Thanks for pointing out that I created a document whose ID is _explain. That was my mistake.

I am a newbie with Elasticsearch. What I want is a query that filters data according to a logical statement.

For example:
Suppose there are 5 fields in my index: Field1, Field2, Field3, Field4, Field5. Field1 to Field5 may contain Chinese or Japanese characters. I use the ik analyzer for word segmentation and analysis.

I want to search ES with a filter like:

(Field1 = 1 OR Field2 = "some text") AND (Field3 = "some text" OR Field4 = 10002 OR Field5 = "some content")

How should I write this query?


(Mark Harwood) #8

Roughly:

 bool
    must
        bool
            should
                field1: 1
                field2: some text
        bool
            should
                field3: some text
                field4: 10002
                field5: some content

See https://www.elastic.co/blog/lost-in-translation-boolean-operations-and-filters-in-the-bool-query
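
The outline above could be turned into a request body along these lines. This is a sketch, not a definitive query: the helper names (`any_of`, `all_of`, `term`, `match`) are hypothetical, and the choice of `term` for numeric fields and `match` for analyzed text fields is an assumption based on the earlier replies. The whole condition is placed in the `filter` clause of a `bool` query so that it does not affect scoring.

```python
import json


def any_of(*clauses):
    """OR: at least one of the clauses must match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


def all_of(*clauses):
    """AND: every clause must match."""
    return {"bool": {"must": list(clauses)}}


def term(field, value):
    """Exact lookup, used here for numeric fields (assumption)."""
    return {"term": {field: value}}


def match(field, text):
    """Analyzed lookup, used here for text fields (assumption)."""
    return {"match": {field: text}}


# (Field1 = 1 OR Field2 = "some text")
# AND (Field3 = "some text" OR Field4 = 10002 OR Field5 = "some content")
condition = all_of(
    any_of(term("Field1", 1), match("Field2", "some text")),
    any_of(
        match("Field3", "some text"),
        term("Field4", 10002),
        match("Field5", "some content"),
    ),
)

# Filter context only: the documents are selected but not scored by the condition.
body = {"query": {"bool": {"filter": condition}}}
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Note that a plain `match` on a multi-word value matches documents containing any of the words, so a stricter variant may be closer to an equality test, depending on how the fields are analyzed.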


(Zi Yi Wang(Keen)) #9

Thanks for your reply. Give me some time to read this article; I'll try what the post says.

