How to optimize the term query?

I used a term query to search for some entries. The search returned 4714 entries, but it took 7 minutes.
My index settings and mapping look like this:

{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "my_tokenizer",
          "filter": [
            "length_filter"
          ]
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "pattern",
          "pattern": "(?<=[EW])(?!R)",
          "lowercase": false
        }
      },
      "filter": {
        "length_filter": {
          "type": "length",
          "max": 70
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "id": {
          "type": "keyword"
        },
        "sequence1": {
          "type": "text",
          "analyzer": "my_analyzer"
        },
        "sequence2": {
          "type": "text",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}

A document to be inserted looks like this:

{
    "id": "001",
    "sequence1": "AAAAAAAAAAAAABBBBBBBBBBBBBBBBBBDDDDDDDDDDDDDDDDDEGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGGGW",
    "sequence2": "AAAAAAAAAAAAABBBBBBBBBBBBBBBBBBDDDDDDDDDDDDDDDDDEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGWWJHSACSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSYBIUIBBSAHBSJB AAAAAAAAAAAAAAAAAIOJIONMBJKSSSSSSSSSSSSSSSJKNJKNJUHIOHJKJTIBBFAFOBS DHUAGABSBIGUIGBHJBKNIOHIUHHHIOHIUGVAOAPPPPPPP"
}
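
To make the analysis chain concrete, here is a rough sketch in plain Java of what I understand the analyzer above to do: split the text at every position that follows an E or W and is not followed by an R, then drop tokens longer than 70 characters. It only approximates the pattern tokenizer and length filter with java.util.regex; the class name TokenizerSketch and the short sample strings are made up for illustration and are not part of my index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class TokenizerSketch {
    // Same regex as "my_tokenizer": split after E or W unless the next char is R.
    private static final Pattern SPLIT = Pattern.compile("(?<=[EW])(?!R)");
    // Same limit as "length_filter": tokens longer than this are dropped.
    private static final int MAX_LENGTH = 70;

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : SPLIT.split(text)) {
            if (!piece.isEmpty() && piece.length() <= MAX_LENGTH) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical short inputs, not real documents.
        System.out.println(tokenize("AAEGGEGGW")); // [AAE, GGE, GGW]
        System.out.println(tokenize("AAERGGW"));   // [AAERGGW], no split before R
    }
}
```

Under this approximation the search text GGGGGGGGGGGGGGGW contains no E and ends in a single W, so it stays one token, and the phrase query amounts to a lookup of a single term.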

My Java search code is:

SearchResponse response = esClient.prepareSearch("my_index")
        .addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
        .setScroll(new TimeValue(60_000))
        .setQuery(QueryBuilders.matchPhraseQuery("sequence1", "GGGGGGGGGGGGGGGW"))
        .setFetchSource(null, new String[]{"sequence1", "sequence2"})
        .setSize(100)
        .get();
do {
    for (SearchHit hit : response.getHits().getHits()) {
        // intentionally empty: the hits are not processed
    }
    response = esClient.prepareSearchScroll(response.getScrollId())
            .setScroll(new TimeValue(60_000))
            .get();
} while (response.getHits().getHits().length != 0);

I created a line chart to check the time consumed per scroll. The x-axis is the scroll number (0 is the first scroll request) and the y-axis is the time consumed.
[Image: 2019-04-24_22-59, line chart of time consumed per scroll]
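
The code I used for the measurement is not shown here. As a minimal sketch of how such per-scroll timings could be collected, the loop below times each page fetch separately. The fetchPage supplier is a hypothetical stand-in for the scroll call, and the fake data source in main exists only so the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

public class ScrollTimingSketch {

    // Calls fetchPage until it returns an empty page and records the
    // wall-clock time of every call (including the last, empty one) in ms.
    static List<Long> timePages(Supplier<List<String>> fetchPage) {
        List<Long> elapsedMillis = new ArrayList<>();
        List<String> page;
        do {
            long start = System.nanoTime();
            page = fetchPage.get();
            elapsedMillis.add((System.nanoTime() - start) / 1_000_000);
        } while (!page.isEmpty());
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Fake source: three non-empty pages, then an empty page.
        int[] remaining = {3};
        List<Long> timings = timePages(() -> remaining[0]-- > 0
                ? Collections.singletonList("hit")
                : Collections.<String>emptyList());
        for (int i = 0; i < timings.size(); i++) {
            System.out.println("scroll " + i + ": " + timings.get(i) + " ms");
        }
    }
}
```

With 4714 hits and a page size of 100, a loop like this would record 48 non-empty pages plus one final empty page.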

The Elasticsearch version is 6.6.1 with the default settings. I only modified jvm.options:

-Xms4g
-Xmx4g

The index size is 74.03 GB and it holds 66,006,373 docs (one shard, no replicas).
