@pokaleshrey,
I'm sorry for the delay in getting back to you.
You asked: "Is there a way to see what does memory_in_bytes hold actually?"
There are two options that I know of. The index _stats API has a high-level summary of what the index is holding in memory. From my test cluster:
GET kibana_sample_data_flights/_stats
{
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_all" : {
"primaries" : {
[...]
"segments" : {
"count" : 10,
"memory_in_bytes" : 81110,
"terms_memory_in_bytes" : 53288,
"stored_fields_memory_in_bytes" : 4040,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 0,
"points_memory_in_bytes" : 886,
"doc_values_memory_in_bytes" : 22896,
"index_writer_memory_in_bytes" : 0,
"version_map_memory_in_bytes" : 0,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : 1563803176248,
"file_sizes" : { }
},
[...]
},
[...]
},
[...]
}
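As a quick sanity check on that response, here is a minimal Python sketch that adds up the per-component fields and compares the total with memory_in_bytes. The numbers are copied from the output above. The choice of which fields count as components is an assumption based on this one sample, not documented behavior, and the helper name memory_breakdown is made up for illustration.

```python
# Segment stats copied from the sample _stats response above.
segments = {
    "count": 10,
    "memory_in_bytes": 81110,
    "terms_memory_in_bytes": 53288,
    "stored_fields_memory_in_bytes": 4040,
    "term_vectors_memory_in_bytes": 0,
    "norms_memory_in_bytes": 0,
    "points_memory_in_bytes": 886,
    "doc_values_memory_in_bytes": 22896,
    "index_writer_memory_in_bytes": 0,
    "version_map_memory_in_bytes": 0,
    "fixed_bit_set_memory_in_bytes": 0,
}

# Assumption: these six fields are the components of memory_in_bytes.
# The remaining *_memory_in_bytes fields are all 0 in this sample, so the
# sample cannot confirm whether they are included in the total or not.
COMPONENT_FIELDS = [
    "terms_memory_in_bytes",
    "stored_fields_memory_in_bytes",
    "term_vectors_memory_in_bytes",
    "norms_memory_in_bytes",
    "points_memory_in_bytes",
    "doc_values_memory_in_bytes",
]


def memory_breakdown(stats):
    """Return the per-component memory figures, treating missing fields as 0."""
    return {name: stats.get(name, 0) for name in COMPONENT_FIELDS}


breakdown = memory_breakdown(segments)
component_total = sum(breakdown.values())

for name, size in breakdown.items():
    print(f"{name}: {size}")
print("sum of components:", component_total)
print("memory_in_bytes:  ", segments["memory_in_bytes"])
```

For this sample the two totals match, with terms and doc values accounting for most of the memory.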
If you want to see Lucene internals on a shard-by-shard basis, use the verbose=true option on an index's _segments endpoint:
GET kibana_sample_data_flights/_segments?verbose=true
{
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"indices" : {
"kibana_sample_data_flights" : {
"shards" : {
"0" : [
{
"routing" : {
"state" : "STARTED",
"primary" : false,
"node" : "7MIorM8CQ8SU0OB4ySU3FQ"
},
"num_committed_segments" : 10,
"num_search_segments" : 10,
"segments" : {
"_0" : {
"generation" : 0,
"num_docs" : 500,
"deleted_docs" : 0,
"size_in_bytes" : 290988,
"memory_in_bytes" : 7256,
"committed" : true,
"search" : true,
"version" : "8.0.0",
"compound" : true,
"ram_tree" : [
{
"description" : "postings [PerFieldPostings(segment=_0 formats=1)]",
"size_in_bytes" : 4924,
"children" : [...]
},
{
"description" : "docvalues [PerFieldDocValues(formats=1)]",
"size_in_bytes" : 1964,
"children" : [...]
},
{
"description" : "stored fields [CompressingStoredFieldsReader(mode=FAST,chunksize=16384)]",
"size_in_bytes" : 344,
"children" : [...]
},
{
"description" : "points [org.apache.lucene.codecs.lucene60.Lucene60PointsReader@4124869]",
"size_in_bytes" : 24,
"children" : [...]
}
],
"attributes" : {
"Lucene50StoredFieldsFormat.mode" : "BEST_SPEED"
}
},
[...]
}
}
]
}
}
}
}
Note that the RAM tree may contain a lot of nested data, so be ready for a lot of output from this command. The output for my simple test index was about 6,000 lines of pretty-printed JSON. However, if you look at my example, the top-level entries in the ram_tree add up to the memory_in_bytes value (4924 + 1964 + 344 + 24 = 7256). Unfortunately, since this extra verbose output comes from a Lucene API, I am not an expert at interpreting it.
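If it helps, the same arithmetic can be scripted. This is only a rough Python sketch: the segment dictionary is hand-copied from the "_0" segment in the example (with the descriptions shortened and the elided children left out), and ram_tree_total is a hypothetical helper name, not part of any Elasticsearch client.

```python
# The "_0" segment from the sample _segments?verbose=true response.
# Descriptions are shortened and the nested "children" are omitted.
segment = {
    "memory_in_bytes": 7256,
    "ram_tree": [
        {"description": "postings", "size_in_bytes": 4924},
        {"description": "docvalues", "size_in_bytes": 1964},
        {"description": "stored fields", "size_in_bytes": 344},
        {"description": "points", "size_in_bytes": 24},
    ],
}


def ram_tree_total(seg):
    """Sum size_in_bytes over the top-level ram_tree entries only."""
    return sum(node["size_in_bytes"] for node in seg.get("ram_tree", []))


total = ram_tree_total(segment)

# Print the largest consumers first.
for node in sorted(segment["ram_tree"], key=lambda n: n["size_in_bytes"], reverse=True):
    print(f'{node["description"]}: {node["size_in_bytes"]} bytes')
print("top-level total:", total)
print("memory_in_bytes:", segment["memory_in_bytes"])
```

Only the top-level entries are summed here. I have not verified how the nested children relate to their parent's size_in_bytes, so the sketch does not descend into them.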
I hope this is helpful.
-William