Super slow bool query with terms filter array


#1

My query is running insanely slow. This one is over 1.3 seconds with just one should matches and I need 3-4 more which can take it up to 14-15 seconds! First I run a store search to yield a set of menuIds to filter against in an items index.

Shouldn't ES perform the filter first and then search against those documents. My index has over 3 million documents which doesn't seem long and I'm only searching against one or a few fields which are indexed. It almost seems like a full text search is occurring without index. Could the synonyms included in the mapping be contributing this problem?

Here is a link to the mapping and long profile query result: https://drive.google.com/open?id=1yaFFKutDYiwlTdjseZZqIijCrUXncyzf

Here's the query:

GET /item/_doc/_search
{
  "profile": true,
  "query": {
    "bool": {
      "filter": {
        "terms": {
          "menuId": [
            "1362806",
            "1220659",
            "1220282",
            "1410886",
            "1408407",
            "1220203",
            "1362729",
            "1389199",
            "1230880",
            "1222243",
            "1226957",
            "1187670",
            "1226979",
            "1223458",
            "1222824",
            "1303641",
            "1220260",
            "1226806",
            "1226635",
            "1226987",
            "1187857",
            "1224780",
            "1345293",
            "1367150",
            "1219483",
            "1367476",
            "1265920",
            "1226960",
            "1367486",
            "1222986",
            "1409449",
            "1223819",
            "1227004",
            "1396947",
            "1396899",
            "1226976",
            "1409786",
            "1220517",
            "1280015",
            "1219434",
            "1409491",
            "1307692",
            "1219721",
            "1410684",
            "1227037",
            "1401613",
            "1221572",
            "1224560",
            "1223365",
            "1411075",
            "1225831",
            "1410090",
            "1374307",
            "1255994",
            "1226986",
            "1363398",
            "1227332",
            "1410279",
            "1401560",
            "1409946",
            "1226996",
            "1399023",
            "1253851",
            "1410280",
            "1221602",
            "1268223",
            "1226394",
            "1219534",
            "1304872",
            "1410880",
            "1372924",
            "1307391",
            "1227826",
            "1223129",
            "1393406",
            "1222275",
            "1409495",
            "1295376",
            "1347656",
            "1410078",
            "1220572",
            "1311836",
            "1409484",
            "1410077",
            "1295066",
            "1305062",
            "1214538",
            "1409485",
            "1219827",
            "1304871",
            "1363298",
            "1224902",
            "1410505",
            "1279475",
            "1222201",
            "1225956",
            "1386873",
            "1221805",
            "1220307",
            "1226974",
            "1222328",
            "1410485",
            "1221940",
            "1253502",
            "1399371"
          ]
        }
      },
      "minimum_should_match": 1,
      "should": [
        {
          "match": {
            "name": {
              "query": "chicken pot pie",
              "operator": "and",
              "fuzziness": "AUTO",
              "prefix_length": 2,
              "boost": 1100
            }
          }
        }
      ]
    }
  }
}

(ddorian43) #2

Try profiling. It's probably the terms filter.


#3

Thanks, yes, I profiled it and supplied the profile output in a link above.

  1. I moved the filter into a "must" clause and removed the filter:
{
      "must": {
        "terms": {
          "menuId": [
            "1246",
            "124",
            "103",
            "113",
            "124",
            "135",
            "1239",
            "1239",
            "217",
            "116",
            "163",
            "116",
            "28",
            "97",
            "206",
            "1340",
            "1064",
            "103",
            "3",
            "1094",
            "111",
            "61",
            "222",
            "111",
            "225",
            "156",
            "40950",
            "1087",
            "62",
            "229",
            "1339",
            "62",
            "1245",
            "231",
            "62"
          ]
        }
      }
  1. Same horrible slowness.
  2. I tried removing all but one term in the query and still horribly bad.
  3. I tried removing the terms query entirely and still bad.
  4. I wonder if any of the Lucene indices are bad.
  5. Note that when I perform a reindex on this index, it oddly finished prematurely and looses 95% of the data! And the mapping hasn't changed, except for synonym settings.
  6. When I run the same query against a 2nd duplicate cloud deployment with 1/10 the index size of data, it performs in 30ms, so the first is 500x slower!

Very strange behavior is concerning as we depend on ES now for our search solution.


(ddorian43) #4

Open access to the gdrive link.


#5

Sorry, access should be open now.


#6

I'm still pretty novice with ES and I believe my deployment shards became damaged. When I created a fresh deployment and reinserted my docs, the search dropped to 600-700ms. Same query, same data. Very strange, I wish I understood why. But the query time is still too slow.


#7

@ddorian43 Hi, were you able to take a look? I'm floored that a simple query with even just one must term query clause is slowing down a search of 2-6 million documents.

Is there anyone who can take a critical look at our mapping and query DSL to optimize or point out flaws?