Need help with Elasticsearch query

I am fairly new with ES queries. Need urgent help with this:

Elastic has sql extension.
I don't know much about rest query but this is how I will write sql

GET _sql?format=txt
{
  "query": """
              select uuid from "index-name" where keywords in ('google', 'microsoft','tesla') and 
              timestamp between 'date1' and 'date2' group by uuid
        """
}

Hey Sachin, thanks for your response.

But what I need is to get the uuids which satisfies both the conditions. That is,

select uuid from "index-name" where (keywords in ('google', 'microsoft','tesla') and timestamp between 'date1' and 'date2') AND (keywords in ('apple', 'youtube'','spotify') and timestamp between 'date1' and 'date2') group by uuid

In my example, it is user_uuid 1234. How do I write this in ES query? Can this be done in a single query? do I need to use any aggregation? I am very new to ES query. Need help!

This works.
    {
  "query": {
    "bool": {
      "must": [
        {
        "bool": {
          "should": [
            { "term" : { "keywords" : "google" } },
            { "term" : { "keywords" : "microsoft" } },
            { "term" : { "keywords" : "tesla" } }
          ],
          "minimum_should_match": 1, 
          "must": [
            { "range": { 
                "@timestamp": {
                  "gte": "now-5d/d",
                  "lte": "now",
                  "format": "date_optional_time"
                } 
              } 
            }
          ]
        }}
      ]
    }
  },
  "aggs": {
         "uniq_uuids": {
             "terms": {
                 "field": "user_uuid.keyword",
                 "size": 10000
             }
         }
     }
}

But when I try to add another bool inside must array, it returns zero hits. See below:

POST _search
{
  "query": {
    "bool": {
      "must": [
        {
        "bool": {
          "should": [
            { "term" : { "keywords" : "google" } },
            { "term" : { "keywords" : "microsoft" } },
            { "term" : { "keywords" : "tesla" } }
          ],
          "minimum_should_match": 1, 
          "must": [
            { "range": { 
                "@timestamp": {
                  "gte": "now-5d/d",
                  "lte": "now",
                  "format": "date_optional_time"
                } 
              } 
            }
          ]
        }},
        {"bool": {
          "should": [
            { "term" : { "keywords" : "apple" } },
            { "term" : { "keywords" : "youtube" } },
            { "term" : { "keywords" : "spotify" } }
          ],
          "minimum_should_match": 1, 
          "must": [
            { "range": { 
                "@timestamp": {
                  "gte": "now-9d/d",
                  "lte": "now",
                  "format": "date_optional_time"
                } 
              } 
            }
          ]
        }}
      ]
    }
  },
  "aggs": {
         "uniq_uuids": {
             "terms": {
                 "field": "user_uuid.keyword",
                 "size": 10000
             }
         }
     }
}

Do you have a document which should match the second query? What does this document look like?

Yes. When I try the queries individually, they return correct documents. But when I add both the queries together inside must, it returns empty result.

For your ref:

You did not share a JSON document which is supposed to match your query.
Could you do that please?

Even better: could you provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.

Here's an example of the raw JSON of the documents:

{
  "user_uuid": 1234,
  "keywords": "apple",
  "@timestamp": "2020-01-15",
},
{
  "user_uuid": 1234,
  "keywords": "google",
  "@timestamp": "2020-01-21",
},
{
  "user_uuid": 9876,
  "keywords": "youtube",
  "@timestamp": "2020-01-25",
}

The query I am trying is mentioned above. Simple example query can be:

POST _search
{
  "query": {
    "bool": {
      "must": [
        {
        "bool": {
          "should": [
            { "term" : { "keywords" : "google" } },
            { "term" : { "keywords" : "microsoft" } },
            { "term" : { "keywords" : "tesla" } }
          ],
          "minimum_should_match": 1
        }},
        {"bool": {
          "should": [
            { "term" : { "keywords" : "apple" } },
            { "term" : { "keywords" : "youtube" } },
            { "term" : { "keywords" : "spotify" } }
          ],
          "minimum_should_match": 1
        }}
      ]
    }
  },
  "aggs": {
         "uniq_uuids": {
             "terms": {
                 "field": "user_uuid.keyword",
                 "size": 10000
             }
         }
     }
}

what I am trying to find is the user_uuid which satisfies both the bool conditions. In this case, the user_uuid 1234. please let me know if I have not described the problem correctly.

Queries are designed to match properties of documents and user 1234 is not a single document.

Elasticsearch, like any other distributed system will have problems with your sorts of query if each user's documents can be spread across different machines. To tackle this you need to bring the related content closer together by either:

  1. Storing each user's content in a single JSON document e.g. {"user_uuid":1234, "keywords":["apple", "google"]}
  2. Storing each user's content in a single JSON document but indexing as multiple Lucene documents (see nested type).
  3. Storing each user's content in separate JSON documents but hosted on the same machine (see parent/child )

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.