Hi all,
I store an array of arrays of JSON objects in Elasticsearch. The outer array has 100 sub-arrays of 20 JSON objects each, so 2,000 objects per document.
Example of the field:
"matrix": [
[
{
"field1" : value,
"field2" : value,
"field3" : value
},
{
"field1" : value,
"field2" : value,
"field3" : value
},
......... 20 x ...........
],
[
{
"field1" : value,
"field2" : value,
"field3" : value
},
{
"field1" : value,
"field2" : value,
"field3" : value
},
......... 20 x ...........
],
.........100 x .........
]
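I let Elasticsearch build the mapping dynamically, so I believe it ends up with something like the sketch below (field names and types are placeholders here; my real ones differ):

"matrix": {
    "properties": {
        "field1": { "type": "keyword" },
        "field2": { "type": "keyword" },
        "field3": { "type": "keyword" }
    }
}

Since Elasticsearch flattens arrays, each matrix.fieldN effectively holds all 2,000 values (100 x 20) from a single document.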
I ran a performance test indexing this document (25 "normal" fields plus this array field), using JMeter with 100 threads.
The cluster was a single node on an m3.2xlarge Amazon instance (30 GiB of memory, 8 vCPUs).
Results with the array:
- Index speed: 26 docs/second
- Average response time: 3.5 s
- CPU usage: 99%
- Memory usage: 65%
Results without the array:
- Index speed: 500 docs/second
- Average response time: 0.15 s
- CPU usage: 14%
- Memory usage: 3%
With the array, the CPU is saturated, and both index speed and response time are far worse.
Without the array field the results are normal (my documents have 25 fields).
Has anyone ever worked with arrays this big?
I could scale up the instance or add more nodes to the cluster, but maybe I first need to improve the way I store this large object.
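If I don't actually need to search or aggregate inside the matrix, one change I'm considering (a minimal sketch, untested; it assumes I only ever need the matrix returned with the document) is to disable indexing of the field, so Elasticsearch keeps it in _source but skips parsing and indexing its 2,000 inner objects:

"matrix": {
    "type": "object",
    "enabled": false
}

Would that be the right direction, or is there a better way to model this?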
Thank you