Access to a field of another type when doing query


(Saeed Zhiany) #1

relevant part of my mapping is here

{
  "mappings": {
    "message": {
      "properties": {
        "id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "sourceId": {
          "type": "string",
          "index": "not_analyzed"
        },
        "sourceName": {
          "type": "string",
          "index": "analyzed"
        },
        "text": {
          "type": "string",
          "store": true
        },
        "creationTimestamp": {
          "type": "integer",
          "index": "not_analyzed"
        },
        "histories": {
          "type": "nested",
          "properties": {
            "views": {
              "type": "integer",
              "index": "not_analyzed"
            },
            "timestamp": {
              "type": "integer",
              "index": "not_analyzed"
            }
          }
        }
      }
    },
    "source": {
      "properties": {
        "id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "name": {
          "type": "string",
          "store": true
        },
        "membersCount": {
          "type": "long",
          "index": "not_analyzed"
        }
      }
    }
  }
}

about mapping

sourceId and sourceName in message is id and name in source.
each source can have a lots of messages and most fields of data (my focus is membersCount field in here) has been updated periodically. so I think it's best mapping for my case for implementing relation between data (rather than parent-child approach)

what I want to achieve is sorting sources based on custom scoring by this formula:

scoreOfSource = (sumOfViewsOfMessagesBelongToCurrentSourceInAPeriod) / membersCount

to achieve this, I created below query on message type

{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {
              "range": {
                "creationTimestamp": {
                  "gte": 1475325259,
                  "lte": 1476102877
                }
              }
            },
            {
              "match": {
                "text": {
                  "query": "car",
                  "fuzziness": 0
                }
              }
            }
          ]
        }
      },
      "script_score": {
        "script": "_source.histories[_source.histories.values.size()-1].views"
      },
      "boost_mode": "replace"
    }
  },
  "aggs": {
    "sources": {
      "terms": {
        "field": "sourceId",
        "order": {
          "myscore": "desc"
        },
        "size": 100
      },
      "aggs": {
        "myscore": {
          "sum": {
            "script": "_score"
          }
        }
      }
    }
  },
  "size": 0
}

by the aggregation I can get sum of total views of each source's messages in myscore but I don't know how can I access to membersCount of each source because for Division. it is a field of source type and I searching on messages. as I said, membersCount has been updated periodically. so it's not efficient to store the field in message type.

is that a way to access the field in aggregation? for example using a subquery on another type in within of aggregation or any other solution.


(system) #2