Increased Search Latency After Closing Index, Adding Analyzer, and Then Reopening Index

Hey, we have a production AWS OpenSearch cluster running Elasticsearch 7.10.

We needed to add a new custom analyzer to a running cluster, so we closed the index, added the analyzer, and then reopened the index. No searches or indexing should even be using this analyzer yet. Here is the Python code we used with the Elasticsearch client to do this:

# es is a Python Elasticsearch client already connected to our cluster
index_name = 'library'

# close the index so the analysis settings can be updated
es.indices.close(index=index_name)

# add the new custom analyzer
es.indices.put_settings(body={
    'analysis': {
        'analyzer': {
            'whitespace_splitter': {
                'type': 'custom',
                'tokenizer': 'whitespace',
                'filter': [],
            }
        }
    }
}, index=index_name)

# reopen the index
es.indices.open(index=index_name)
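
For what it's worth, we can see the new analyzer is registered by running some text through the analyze API. This is just a sketch (the sample text is made up):

# Quick check that the new analyzer is registered on the index
# (sample text is arbitrary; just confirming tokenization works).
tokens = es.indices.analyze(index=index_name, body={
    'analyzer': 'whitespace_splitter',
    'text': 'The Quick Brown-Fox jumps',
})
print([t['token'] for t in tokens['tokens']])
# expected: ['The', 'Quick', 'Brown-Fox', 'jumps'] since it only splits on whitespace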

Everything seemed to work, but search latency has been higher ever since we made this change.

Any help/insights would be appreciated!

OpenSearch/OpenDistro are AWS-run products and differ from the original Elasticsearch and Kibana products that Elastic builds and maintains. You may need to contact them directly for further assistance.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns :elasticheart: )

Welcome to our community! :smiley:

As the bot mentions, OpenSearch is not Elasticsearch, and there are code changes that AWS has made to their offering that we do not know about. So while we will endeavour to help, you probably also need to ask AWS.

What sort of data do you have in the index? What does the mapping look like prior to the update? How large is the index? What sort of queries are you running? What is the output from the _cluster/stats?pretty&human API?
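
If it's easier to grab those programmatically, something along these lines with the Python client you're already using should pull most of it. This is just a sketch, assuming elasticsearch-py 7.x and the index from your snippet:

index_name = 'library'

mapping = es.indices.get_mapping(index=index_name)     # field mappings before/after the change
settings = es.indices.get_settings(index=index_name)   # shard count, analysis settings
index_stats = es.indices.stats(index=index_name)       # size, doc counts, search stats
cluster_stats = es.cluster.stats()                     # same data as _cluster/stats?pretty&human

print(index_stats['_all']['primaries']['store']['size_in_bytes'])
print(cluster_stats['indices']['docs']['count'])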

@warkolm Thanks for the reply.
Yeah, I understand AWS OpenSearch is different. This was just a long-shot request to see if anyone had run into a similar issue before. Maybe I don't quite understand what happens under the hood when we close and reopen the index. Either the close/open caused the issue (I think so) or adding the analyzer did (which, to the best of my knowledge, is not used anywhere yet). We added the analyzer because we are going to add a new field to the mapping in a future PR that requires it. I have asked my devops team to reach out to AWS directly.
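
For context, the future change is roughly the following. This is only a sketch and the field name is hypothetical; nothing like it has been applied yet:

# Planned (not yet applied) mapping update that will use the new analyzer.
# 'title_ws' is a placeholder field name for illustration only.
es.indices.put_mapping(body={
    'properties': {
        'title_ws': {
            'type': 'text',
            'analyzer': 'whitespace_splitter',
        }
    }
}, index=index_name)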

The mapping is quite large so I won't share it here, but we have various types of fields such as long, integer, keyword, date, text, and float. A lot of the queries are actually not even on the text fields, so there isn't really much need for the analyzer there. We noticed roughly 50 ms added to all queries after closing and reopening the index. We did not change the mapping after this update.
There is more than one index on the cluster, but the main index of concern, which gets most of the traffic, has 200 million documents. We were using 21 i3.xlarge instances, but since we hit this issue we have been making some other changes on the fly to try to alleviate it; we are currently running 9 nodes of type i3.2xlarge. We don't need that much storage, we were just trying different things to get the latency down.
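
To put a rough number on the latency increase, we have mainly been watching the cumulative search counters from the index stats API, something like this (a sketch; the counters are cumulative, so this gives an average since the nodes started rather than a live number):

# Rough average query latency from the cumulative search counters.
stats = es.indices.stats(index=index_name, metric='search')
search = stats['_all']['total']['search']
if search['query_total'] > 0:
    avg_ms = search['query_time_in_millis'] / search['query_total']
    print('avg query time: %.1f ms over %d queries' % (avg_ms, search['query_total']))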

Here are the index settings; whitespace_splitter is the analyzer we just added:

{'library': {'settings': {'index': {'number_of_shards': '100',
    'provided_name': 'library',
    'creation_date': '1634230174562',
    'analysis': {'filter': {'my_stemmer': {'type': 'stemmer',
       'language': 'minimal_english'}},
     'analyzer': {'my_analyzer': {'filter': ['lowercase', 'my_stemmer'],
       'type': 'custom',
       'tokenizer': 'standard'},
      'whitespace_splitter': {'filter': [],
       'type': 'custom',
       'tokenizer': 'whitespace'}}},
    'number_of_replicas': '1',
    'uuid': '<>',
    'version': {'created': '7100299'}}}}}

Output from _cluster/stats?pretty&human:

{
  "_nodes" : {
    "total" : 12,
    "successful" : 12,
    "failed" : 0
  },
  "cluster_name" : "<>",
  "cluster_uuid" :"<>",
  "timestamp" : 1653316143311,
  "status" : "green",
  "indices" : {
    "count" : 5,
    "shards" : {
      "total" : 246,
      "primaries" : 123,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 200,
          "avg" : 49.2
        },
        "primaries" : {
          "min" : 1,
          "max" : 100,
          "avg" : 24.6
        },
        "replication" : {
          "min" : 1.0,
          "max" : 1.0,
          "avg" : 1.0
        }
      }
    },
    "docs" : {
      "count" : 213372031,
      "deleted" : 74542341
    },
    "store" : {
      "size" : "1.8tb",
      "size_in_bytes" : 2052500656121,
      "reserved" : "0b",
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size" : "0b",
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "4.3gb",
      "memory_size_in_bytes" : 4649357299,
      "total_count" : 3489160673,
      "hit_count" : 351944234,
      "miss_count" : 3137216439,
      "cache_size" : 1074799,
      "cache_count" : 1112397,
      "evictions" : 37598
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 5539,
      "memory" : "198.7mb",
      "memory_in_bytes" : 208394906,
      "terms_memory" : "48.5mb",
      "terms_memory_in_bytes" : 50876800,
      "stored_fields_memory" : "2.9mb",
      "stored_fields_memory_in_bytes" : 3087592,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "4.2mb",
      "norms_memory_in_bytes" : 4475328,
      "points_memory" : "0b",
      "points_memory_in_bytes" : 0,
      "doc_values_memory" : "143mb",
      "doc_values_memory_in_bytes" : 149955186,
      "index_writer_memory" : "112.4mb",
      "index_writer_memory_in_bytes" : 117915792,
      "version_map_memory" : "62.9kb",
      "version_map_memory_in_bytes" : 64465,
      "fixed_bit_set" : "192b",
      "fixed_bit_set_memory_in_bytes" : 192,
      "max_unsafe_auto_id_timestamp" : -1,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [ {
        "name" : "binary",
        "count" : 1,
        "index_count" : 1
      }, {
        "name" : "boolean",
        "count" : 12,
        "index_count" : 2
      }, {
        "name" : "byte",
        "count" : 1,
        "index_count" : 1
      }, {
        "name" : "date",
        "count" : 12,
        "index_count" : 4
      }, {
        "name" : "float",
        "count" : 48,
        "index_count" : 1
      }, {
        "name" : "integer",
        "count" : 39,
        "index_count" : 3
      }, {
        "name" : "keyword",
        "count" : 72,
        "index_count" : 5
      }, {
        "name" : "long",
        "count" : 106,
        "index_count" : 3
      }, {
        "name" : "nested",
        "count" : 1,
        "index_count" : 1
      }, {
        "name" : "object",
        "count" : 39,
        "index_count" : 4
      }, {
        "name" : "text",
        "count" : 46,
        "index_count" : 3
      } ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [ {
        "name" : "stemmer",
        "count" : 2,
        "index_count" : 2
      } ],
      "analyzer_types" : [ {
        "name" : "custom",
        "count" : 3,
        "index_count" : 2
      } ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [ {
        "name" : "standard",
        "count" : 2,
        "index_count" : 2
      }, {
        "name" : "whitespace",
        "count" : 1,
        "index_count" : 1
      } ],
      "built_in_filters" : [ {
        "name" : "lowercase",
        "count" : 2,
        "index_count" : 2
      } ],
      "built_in_analyzers" : [ ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 12,
      "coordinating_only" : 0,
      "data" : 9,
      "ingest" : 9,
      "master" : 3,
      "remote_cluster_client" : 12
    },
    "versions" : [ "7.10.2" ],
    "os" : {
      "available_processors" : 84,
      "allocated_processors" : 84,
      "names" : [ {
        "count" : 12
      } ],
      "pretty_names" : [ {
        "count" : 12
      } ],
      "mem" : {
        "total" : "561.1gb",
        "total_in_bytes" : 602526834688,
        "free" : "7.2gb",
        "free_in_bytes" : 7759409152,
        "used" : "553.9gb",
        "used_in_bytes" : 594767425536,
        "free_percent" : 1,
        "used_percent" : 99
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 19
      },
      "open_file_descriptors" : {
        "min" : 2346,
        "max" : 2915,
        "avg" : 2725
      }
    },
    "jvm" : {
      "max_uptime" : "23.3h",
      "max_uptime_in_millis" : 84178991,
      "mem" : {
        "heap_used" : "91.5gb",
        "heap_used_in_bytes" : 98314006664,
        "heap_max" : "288.8gb",
        "heap_max_in_bytes" : 310116089856
      },
      "threads" : 2002
    },
    "fs" : {
      "total" : "15.3tb",
      "total_in_bytes" : 16852141891584,
      "free" : "13.4tb",
      "free_in_bytes" : 14765537472512,
      "available" : "13.4tb",
      "available_in_bytes" : 14765336145920
    },
    "network_types" : {
      "transport_types" : {
        "com.amazon.opendistroforelasticsearch.security.ssl.http.netty.OpenDistroSecuritySSLNettyTransport" : 12
      },
      "http_types" : {
        "filter-jetty" : 12
      }
    },
    "discovery_types" : {
      "zen" : 12
    },
    "packaging_types" : [ {
      "flavor" : "oss",
      "type" : "tar",
      "count" : 12
    } ],
    "ingest" : {
      "number_of_pipelines" : 0,
      "processor_stats" : { }
    }
  }
}

I understand it's almost impossible to help us out without all of our info, details, etc.

I was mostly just curious whether closing and reopening an index could have such an effect in general, and whether anyone had run into that before.
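
In case it helps anyone who finds this later, the things we have been checking since the reopen are whether the caches went cold and whether shards had to recover. This is only a sketch using the stats and cat recovery APIs:

# Are the query/request caches still warming back up after the reopen?
cache_stats = es.indices.stats(index=index_name, metric=['query_cache', 'request_cache'])
totals = cache_stats['_all']['total']
print('query cache hit/miss:', totals['query_cache']['hit_count'], totals['query_cache']['miss_count'])
print('request cache hit/miss:', totals['request_cache']['hit_count'], totals['request_cache']['miss_count'])

# Did shards have to be rebuilt or relocated when the index reopened?
print(es.cat.recovery(index=index_name, v=True))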

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.