Performance: Sorting / Filtering by runtime fields

Hey,

I'm looking into getting price conversions up to date in my indices.

  • Price data: Updated index time
  • Price conversions: Updated in query time (via runtime fields)
    • The data for the currency conversion changes often, so that's why I'm approaching this do be done via query time rather than index time.

The problem is that it seems to be too slow for acceptable usage.
Indexing the prices with conversion already done will possibly too slow, as it's hundreds of millions of documents in my index.

So the question is if there is any recommendations to make my approach faster some how, or do i need to resort to a index-time solution instead (which would not always be up to date etc)?

Here is some technical background on my implementation, in case you want to try it out:

Add fake data to an existing index:

POST some-index-with-many-documents/_update_by_query?refresh
{
  "query": {
    "match_all": {}
  },
  "script": {
    "source": """
      Random rand = new Random();
      
      ctx._source.prices = 
        [
          'lowest':
          [
            'SEK': rand.nextInt(2000),
            'NOK': rand.nextInt(2000),
            'DKK': rand.nextInt(2000),
            'GBP': rand.nextInt(2000),
            'EUR': rand.nextInt(2000)
          ]
        ]
    """
  }
}

Setting the runtime field:

PUT some-index-with-many-documents/_mapping
{
  "runtime": {
    "lowestPrice": {
      "type": "double",
      "script": {
        "lang": "painless",
        "source":
        """
          String targetKey = params['targetKey'];
          HashMap conversionRates = params['conversionRates'];
          HashMap prices = params['_source']["prices"].get(targetKey);
          double lowestPrice = Double.MAX_VALUE;
          
          for(Map.Entry price : prices.entrySet()) {
            double convertedPrice = price.getValue() * conversionRates.get(price.getKey());
            if(lowestPrice < 0 || convertedPrice < lowestPrice) {
              lowestPrice = convertedPrice;
            }
          }
          
          emit(lowestPrice)
        """,
        "params": {
          "targetKey": "lowest",
          "conversionRates": {
            "DKK": 0.679828,
            "EUR": 0.091418,
            "GBP": 0.079018,
            "NOK": 0.93673,
            "SEK": 1
          }
        }
      }
    }
}

Sorting by price:

GET some-index-with-many-documents/_search
{
  "query": {
    "match_all": {}
  },
  "sort" : [
    { "lowestPrice" : {"order" : "asc"}}
    ],
  "fields": ["lowestPrice"]
}

Filtering by price:

GET development-discovery-flink-offer_product_se-v2022.11.15-19.44.00_price_test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": [
        {
          "range": {
            "lowestPrice": {
              "lte": "50"
            }
          }
        }
      ]
    }
  },
  "fields": ["lowestPrice"]
}

Thanks in advance,
Kenny

That script looks fine. Pulling source is slow, so avoid it if possible.

If the set of prices is known ahead of time, you can access it via doc values (doc['SEK'].value or $('SEK', 0) after 8.1) which should be much faster.

1 Like

Thanks for your input! I will try it out when i have time

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.