How can I compare fields between indices, or compare fields between result hits?

Hi, can anyone point me in the right direction on this subject? I have tried for a long time (e.g. scripted_metric, bucket_path) and didn't find anything that helps. Does anyone know how I can do such a thing in Elasticsearch?

Thank you so much for the help.

Could you explain what you are trying to achieve?

Hi, thank you for the reply!

Yes, I am trying to compare the close values of the data in the buckets. Let's say I have this bucket aggregation result, drawn from two indices A & B, with data for key 1 and key 2 after the query. I would like to filter on whether the 2010-01-04 close value is larger than the 2009-12-31 close value.

"buckets" : [
    {
      "key" : 1,
      "doc_count" : 2,
      "group_by_date" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : 1262217600000,
            "key_as_string" : "2009-12-31",
            "doc_count" : 1,
            "close_value" : {
              "value" : 65.884
            }
          },
          {
            "key" : 1262563200000,
            "key_as_string" : "2010-01-04",
            "doc_count" : 1,
            "close_value" : {
              "value" : 65.687
            }
          }
        ]
      }
    },
    {
      "key" : 2,
      "doc_count" : 2,
      "group_by_date" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : 1262217600000,
            "key_as_string" : "2009-12-31",
            "doc_count" : 1,
            "close_value" : {
              "value" : 50.232
            }
          },
          {
            "key" : 1262563200000,
            "key_as_string" : "2010-01-04",
            "doc_count" : 1,
            "close_value" : {
              "value" : 50.328
            }
          }
        ]
      }
    }]

What would you like to filter out?

I would like only the second bucket to be left in the result, because for key '2' the 2010-01-04 close value is larger (50.328 > 50.232 = true), while for key '1' it is not (65.687 > 65.884 = false). Thanks.

If the current aggregations don't support doing it this way, would you mind guiding me on how I can achieve the same result? Thank you very much.

You can do it using the bucket selector aggregation.
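
For reference, a bucket_selector is a pipeline aggregation that sits alongside sibling aggregations inside a multi-bucket parent: buckets_path maps script parameters onto those sibling metrics, and a parent bucket is kept only when the script returns true. A minimal illustrative shape (the index, field, and aggregation names below are placeholders, not taken from your data):

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "group_by_code": {
      "terms": { "field": "code" },
      "aggs": {
        "max_close": {
          "max": { "field": "close" }
        },
        "keep_if_large": {
          "bucket_selector": {
            "buckets_path": { "maxClose": "max_close" },
            "script": "params.maxClose > 60"
          }
        }
      }
    }
  }
}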

I've checked the docs, but may I know how I can get the second close price to compare against the first close price? Thank you.

I am really unsure how you get these buckets and where this data is coming from. If you can create a reproduction of your issue, I might try to add the proper aggregation on top of it.

OK, I have posted my index mappings, JSON data, search query, and the query output below. Thanks for helping me.

This is my index A mapping

{
  "mapping": {
    "sales_daily": {
      "properties": {
        "sales": {
          "type": "double",
          "coerce": false
        },
        "code": {
          "type": "integer"
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd || yyyy-MM-dd HH:mm:ss || epoch_millis"
        },
        "name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}

This is my index B mapping

{
  "mapping": {
    "sales_daily": {
      "properties": {
        "sales": {
          "type": "double",
          "coerce": false
        },
        "code": {
          "type": "integer"
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd || yyyy-MM-dd HH:mm:ss || epoch_millis"
        },
        "name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}

This is my JSON data from index A and index B in Kibana (both indices contain similar data; only the dates differ)

{
  "_index": "A",
  "_type": "sales_daily",
  "_id": "1LvEu2cBCkPfXhZI_QY3",
  "_version": 1,
  "_score": null,
  "_source": {
    "code": 2,
    "name": "RedTB",
    "date": "2009-12-31",
    "sales": 87.85
  },
  "fields": {
    "date": [
      "2009-12-31T00:00:00.000Z"
    ]
  },
  "sort": [
    1544745600000
  ]
}
{
  "_index": "A",
  "_type": "sales_daily",
  "_id": "1LvEu2cBCkPfXhZI_QY3",
  "_version": 1,
  "_score": null,
  "_source": {
    "code": 1,
    "name": "BlueTB",
    "date": "2009-12-31",
    "sales": 87.85
  },
  "fields": {
    "date": [
      "2009-12-31T00:00:00.000Z"
    ]
  },
  "sort": [
    1544745600000
  ]
}
{
  "_index": "B",
  "_type": "sales_daily",
  "_id": "1LvEu2cBCkPfXhZI_QY3",
  "_version": 1,
  "_score": null,
  "_source": {
    "code": 2,
    "name": "RedTB",
    "date": "2010-01-04",
    "sales": 90.85
  },
  "fields": {
    "date": [
      "2010-01-04T00:00:00.000Z"
    ]
  },
  "sort": [
    1544745600000
  ]
}
{
  "_index": "B",
  "_type": "sales_daily",
  "_id": "1LvEu2cBCkPfXhZI_QY3",
  "_version": 1,
  "_score": null,
  "_source": {
    "code": 1,
    "name": "BlueTB",
    "date": "2010-01-04",
    "sales": 80.85
  },
  "fields": {
    "date": [
      "2010-01-04T00:00:00.000Z"
    ]
  },
  "sort": [
    1544745600000
  ]
}

This is my test query

GET A,B/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": {
        "range": {
          "date": {
            "gte": "2009-12-31 00:00:00",
            "lt": "2010-01-05 00:00:00"
          }
        }
      }
    }
  },
  "aggs": {
    "group_by_code": {
      "terms": {
        "field": "code"
      },
      "aggs": {
        "group_by_date": {
          "terms": {
            "field": "date"
          },
          "aggs": {
            "sales_value": {
              "max": {
                "field": "sales"
              }
            }
          }
        }
      }
    }
  }
}

This is the output of my test query

{
"took" : 29,
"timed_out" : false,
"_shards" : {
  "total" : 6,
  "successful" : 6,
  "skipped" : 0,
  "failed" : 0
},
"hits" : {
  "total" : 20,
  "max_score" : 0.0,
  "hits" : [ ]
},
"aggregations" : {
  "group_by_code" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : 1,
        "doc_count" : 2,
        "group_by_date" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : 1262217600000,
              "key_as_string" : "2009-12-31",
              "doc_count" : 1,
              "sales_value" : {
                "value" : 87.85
              }
            },
            {
              "key" : 1262563200000,
              "key_as_string" : "2010-01-04",
              "doc_count" : 1,
              "sales_value" : {
                "value" : 80.85
              }
            }
          ]
        }
      },
      {
        "key" : 2,
        "doc_count" : 2,
        "group_by_date" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : 1262217600000,
              "key_as_string" : "2009-12-31",
              "doc_count" : 1,
              "sales_value" : {
                "value" : 87.85
              }
            },
            {
              "key" : 1262563200000,
              "key_as_string" : "2010-01-04",
              "doc_count" : 1,
              "sales_value" : {
                "value" : 90.85
              }
            }
          ]
        }
      }
    ]
  }
}
}

The problem is still unclear to me. Do you always have 2 buckets in the range? Is it possible to have only one bucket, or more than 2 buckets? If that is the case, how does the bucket selection rule change?

Without knowing the details, here is a simple generic solution that should work for your examples, though perhaps not for other cases. It has its limitations, but I need to know more about the issue to address them properly.

DELETE a
DELETE b
PUT a
{
  "mappings": {
    "sales_daily": {
      "properties": {
        "sales": {
          "type": "double",
          "coerce": false
        },
        "code": {
          "type": "integer"
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd || yyyy-MM-dd HH:mm:ss || epoch_millis"
        },
        "name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}

PUT b
{
  "mappings": {
    "sales_daily": {
      "properties": {
        "sales": {
          "type": "double",
          "coerce": false
        },
        "code": {
          "type": "integer"
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd || yyyy-MM-dd HH:mm:ss || epoch_millis"
        },
        "name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}

PUT a/sales_daily/1
{
  "code": 2,
  "name": "RedTB",
  "date": "2009-12-31",
  "sales": 87.85
}

PUT a/sales_daily/2
{
  "code": 1,
  "name": "BlueTB",
  "date": "2009-12-31",
  "sales": 87.85
}
  
PUT b/sales_daily/1
{
  "code": 2,
  "name": "RedTB",
  "date": "2010-01-04",
  "sales": 90.85
}

PUT b/sales_daily/2
{
  "code": 1,
  "name": "BlueTB",
  "date": "2010-01-04",
  "sales": 80.85
}



GET a,b/_search

GET a,b/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": {
        "range": {
          "date": {
            "gte": "2009-12-31 00:00:00",
            "lt": "2010-01-05 00:00:00"
          }
        }
      }
    }
  },
  "aggs": {
    "group_by_code": {
      "terms": {
        "field": "code"
      },
      "aggs": {
        "first_sale": {
          "terms": {
            "field": "date",
            "size": 1
          },
          "aggs": {
            "sales_value": {
              "max": {
                "field": "sales"
              }
            }
          }
        },
        "first_sale_value": {
          "max_bucket": {
            "buckets_path": "first_sale>sales_value"
          }
        },
        "last_sale": {
          "terms": {
            "field": "date",
            "size": 1,
            "order": {
            "_term": "desc"
            }
          },
          "aggs": {
            "sales_value": {
              "max": {
                "field": "sales"
              }
            }
          }
        },
        "last_sale_value": {
          "max_bucket": {
            "buckets_path": "last_sale>sales_value"
          }
        },
        "filter_positive": {
          "bucket_selector": {
            "buckets_path": {
              "first": "first_sale_value",
              "last": "last_sale_value"
            },
            "script": "params.last > params.first"
          }
        }
      }
    }
  }
}

Do you always have 2 buckets in the range? No, I don't.
Is it possible to have only one bucket or more than 2 buckets? Yes, it is possible. Maybe I need to explain the details more, so we can get closer to the problem and how I can work it out.

Without knowing details - please let me explain more about the problem I am facing.
I have a dataset of products with sales revenue (from 2010 to 2018), and we want to compare the sales across the years from 2010 to 2018, for example under the following condition:
Is the sales revenue in 2018 > 2017, and the sales revenue in 2017 < 2016?

The previous sample covers two years only. If I load data that spans more than 2 years, do you have any suggestions, in case aggregations can't achieve this?

Thank you for your kind assistance.
BTW, the sample script provided in the previous message works well for the two-year data set. Thanks for the sample.

It would have been much easier to help you if you had started with this :slight_smile:

Yeah, sorry for the unclear info before. So... do you have any idea or suggestion for where I should start? Thanks very much.

for example under the following condition:
Is the sales revenue in 2018 > 2017, and the sales revenue in 2017 < 2016?

What does this mean?

Please allow me to use the case below to explain.

I have a dataset like the one below:
product : RedTB , year : 2018 , sales_revenue : 1000
product : RedTB , year : 2017 , sales_revenue : 600
product : RedTB , year : 2016 , sales_revenue : 800
product : BlueTB , year : 2018 , sales_revenue : 1000
product : BlueTB , year : 2017 , sales_revenue : 920
product : BlueTB , year : 2016 , sales_revenue : 900

We want to find the products whose sales_revenue in 2018 is greater than the sales_revenue in 2017 AND whose sales_revenue in 2017 is less than the sales_revenue in 2016.
According to the sample above, the result should be RedTB, because 1000 > 600 and 600 < 800. BlueTB should not appear in the result.

Thanks
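
To sketch just the comparison (a fragment, not a complete request; all names in it are placeholders): assuming each year's value is exposed as a sibling metric, for example via one filter aggregation per year wrapping a max metric named sales_value, the condition above could be expressed with a bucket_selector like this:

"filter_condition": {
  "bucket_selector": {
    "buckets_path": {
      "s2016": "sales_2016>sales_value",
      "s2017": "sales_2017>sales_value",
      "s2018": "sales_2018>sales_value"
    },
    "script": "params.s2018 > params.s2017 && params.s2017 < params.s2016"
  }
}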

Do you have a single record per year? If you have more than 1 record, how do you find sales_revenue?

Single record per year - do you mean the product record? Each product has (only) one record per year.
The sales revenue figure is entered by another department's unit, so we don't have many details on it.

DELETE a
DELETE b
PUT a
{
  "mappings": {
    "sales_daily": {
      "properties": {
        "sales": {
          "type": "double",
          "coerce": false
        },
        "code": {
          "type": "integer"
        },
        "year": {
          "type": "integer"
        },
        "name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}

PUT b
{
  "mappings": {
    "sales_daily": {
      "properties": {
        "sales": {
          "type": "double",
          "coerce": false
        },
        "code": {
          "type": "integer"
        },
        "year": {
          "type": "integer"
        },
        "name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}

PUT a/sales_daily/1
{
  "code": 2,
  "name": "RedTB",
  "year": 2018,
  "sales": 1000
}

PUT a/sales_daily/2
{
  "code": 1,
  "name": "BlueTB",
  "year": 2018,
  "sales": 1000
}
  
PUT b/sales_daily/1
{
  "code": 2,
  "name": "RedTB",
  "year": 2017,
  "sales": 600
}

PUT b/sales_daily/2
{
  "code": 1,
  "name": "BlueTB",
  "year": 2017,
  "sales": 920
}

PUT b/sales_daily/3
{
  "code": 2,
  "name": "RedTB",
  "year": 2016,
  "sales": 800
}

PUT b/sales_daily/4
{
  "code": 1,
  "name": "BlueTB",
  "year": 2016,
  "sales": 900
}


GET a,b/_search

GET a,b/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": {
        "range": {
          "year": {
            "gte": "2016",
            "lte": "2018"
          }
        }
      }
    }
  },
  "aggs": {
    "group_by_code": {
      "terms": {
        "field": "code"
      },
      "aggs": {
        "first_sale": {
          "filter": {
            "term": {
              "year": "2016"
            }
          }, 
          "aggs": {
            "sales_value": {
              "max": {
                "field": "sales"
              }
            }
          }
        },
        "mid_sale": {
          "filter": {
            "term": {
              "year": "2017"
            }
          }, 
          "aggs": {
            "sales_value": {
              "max": {
                "field": "sales"
              }
            }
          }
        },
        "last_sale": {
          "filter": {
            "term": {
              "year": "2018"
            }
          }, 
          "aggs": {
            "sales_value": {
              "max": {
                "field": "sales"
              }
            }
          }
        },
        "filter_positive": {
          "bucket_selector": {
            "buckets_path": {
              "first": "first_sale>sales_value",
              "mid": "mid_sale>sales_value",
              "last": "last_sale>sales_value"
            },
            "script": "params.last > params.mid && params.mid < params.first"
          }
        }
      }
    }
  }
}

Thank you for all of your support. :grinning:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.