Request to close index fails due to socket timeout

Scenario:
We create a new index on a daily basis to keep the data in the index up to date. After the new index is created, the close request times out with a socket timeout exception.

{
class: "org.quartz.simpl.SimpleThreadPool$WorkerThread"
exact: true
file: "SimpleThreadPool.java"
}]
localizedMessage: "30,000 milliseconds timeout on connection http-outgoing-26 [ACTIVE]"
message: "30,000 milliseconds timeout on connection http-outgoing-26 [ACTIVE]"
name: "java.net.SocketTimeoutException"
}

The client used here is the Java High Level REST Client:

    compile "org.elasticsearch.client:elasticsearch-rest-high-level-client: 7.9.2"

What could be causing the timeout, and what approach or practice should be followed to close the index effectively and avoid it?

Which version of Elasticsearch are you using? What is the full output of the cluster stats API?

One reason this could happen is that you have a very large number of shards (open and closed), which results in a large cluster state that takes time to update and propagate. How well Elasticsearch handles this depends a lot on the version used, though.

The Elasticsearch version used is 7.9.2.
Could you please specify which information from the cluster stats is needed?

It would be good to see it all, as it is hard to tell which parts are most important. Initially, though, I would be most interested in the following:

  • Number of nodes in the cluster
  • System resources assigned to the nodes, e.g. RAM and heap
  • Total number of indices and shards in the cluster (open and closed)
  • Total data volume stored in the cluster
  • Memory usage information
  • List of third-party plugins installed that could have an impact on cluster state updates and distribution

It would also be good to get an understanding of the load the cluster is under and what kind of hardware and storage supports it.

Is there anything in the logs, e.g. around long GC?
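
If it helps, everything in that list is available from the cluster stats API; a minimal sketch for dumping it via the low-level client that ships alongside the high-level one (the host is a placeholder):

    import org.apache.http.HttpHost;
    import org.apache.http.util.EntityUtils;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.Response;
    import org.elasticsearch.client.RestClient;

    import java.io.IOException;

    public class ClusterStatsDump {
        public static void main(String[] args) throws IOException {
            // Point this at any node in the cluster; the host is a placeholder.
            try (RestClient restClient = RestClient.builder(
                    new HttpHost("localhost", 9200, "http")).build()) {
                // _cluster/stats covers node count, heap/RAM, index and shard counts,
                // data volume, memory usage and installed plugins in one response.
                Request request = new Request("GET", "/_cluster/stats");
                request.addParameter("human", "true");
                Response response = restClient.performRequest(request);
                System.out.println(EntityUtils.toString(response.getEntity()));
            }
        }
    }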

There are 7 nodes in total. Please check the details below:

mem": {
        "total_in_bytes": 375809638400,
        "free_in_bytes": 58817269760,
        "used_in_bytes": 316992368640,
        "free_percent": 16,
        "used_percent": 84
      }
    },
    "process": {
      "cpu": {
        "percent": 38
      },
      "open_file_descriptors": {
        "min": 709,
        "max": 819,
        "avg": 767
      }
    },
    "jvm": {
      "max_uptime_in_millis": 4945262384,
      "versions": [
        {
          "version": "15",
          "vm_name": "OpenJDK 64-Bit Server VM",
          "vm_version": "15+36",
          "vm_vendor": "AdoptOpenJDK",
          "bundled_jdk": true,
          "using_bundled_jdk": true,
          "count": 7
        }
      ],
      "mem": {
        "heap_used_in_bytes": 80353920208,
        "heap_max_in_bytes": 187904819200
      },
      "threads": 893
    },
    "fs": {
      "total_in_bytes": 1472131325952,
      "free_in_bytes": 1164132966400,
      "available_in_bytes": 1164015525888
    }

There are no such logs on the ES server.

What about answers to the remaining questions? In my experience, the most likely reason you are seeing this is that you have too many shards and indices in the cluster (open and closed all count), which has increased the size of the cluster state and affected how quickly it updates and replicates.
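
A quick way to verify those counts, including closed indices, is the _cat/indices API; a small sketch with the low-level client (the host is a placeholder):

    import org.apache.http.HttpHost;
    import org.apache.http.util.EntityUtils;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.Response;
    import org.elasticsearch.client.RestClient;

    import java.io.IOException;

    public class ListAllIndices {
        public static void main(String[] args) throws IOException {
            try (RestClient restClient = RestClient.builder(
                    new HttpHost("localhost", 9200, "http")).build()) {
                Request request = new Request("GET", "/_cat/indices");
                request.addParameter("v", "true");                 // show column headers
                request.addParameter("h", "index,status,pri,rep"); // status is open/close
                request.addParameter("expand_wildcards", "all");   // include closed indices
                Response response = restClient.performRequest(request);
                System.out.println(EntityUtils.toString(response.getEntity()));
            }
        }
    }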

"index : {
    "count": 26,
    "shards": {
      "total": 94,
      "primaries": 26,
      "replication": 2.6153846153846154,
      "index": {
        "shards": {
          "min": 1,
          "max": 7,
          "avg": 3.6153846153846154
        },
        "primaries": {
          "min": 1,
          "max": 1,
          "avg": 1.0
        },
        "replication": {
          "min": 0.0,
          "max": 6.0,
          "avg": 2.6153846153846154
        }
      }
    },
    "docs": {
      "count": 6156944,
      "deleted": 574568
    },
    "store": {
      "size_in_bytes": 200301943826,
      "reserved_in_bytes": 0
    },
    "fielddata": {
      "memory_size_in_bytes": 23387224,
      "evictions": 0
    },
    "query_cache": {
      "memory_size_in_bytes": 922891449,
      "total_count": 25291617563,
      "hit_count": 339591942,
      "miss_count": 24952025621,
      "cache_size": 172299,
      "cache_count": 904183,
      "evictions": 731884
    },
    "completion": {
      "size_in_bytes": 15990602714
    },
    "segments": {
      "count": 1233,
      "memory_in_bytes": 16087510648,
      "terms_memory_in_bytes": 16066293914,
      "stored_fields_memory_in_bytes": 1421048,
      "term_vectors_memory_in_bytes": 0,
      "norms_memory_in_bytes": 5009536,
      "points_memory_in_bytes": 0,
      "doc_values_memory_in_bytes": 14786150,
      "index_writer_memory_in_bytes": 275889624,
      "version_map_memory_in_bytes": 345014,
      "fixed_bit_set_memory_in_bytes": 0,
      "max_unsafe_auto_id_timestamp": -1,
      "file_sizes": {
        
      }
    },

Please check the details above for the remaining questions.

Does that count include the closed indices?

That is the total number of indices (closed + open).

That does not look excessive. What kind of storage are you using?

The storage type is SSD on Google Cloud.

Then I am unfortunately out of ideas.

It would be great if you could suggest some best practices around closing indices effectively.

I would recommend not closing indices, as well as upgrading to a more recent version. The overhead of open indices has been reduced in recent versions, which reduces the benefit of closing indices.
