Hello. I just saw something strange while running a single-shard test.
In my metrics the error rate is 0.01% and the service time looks reasonable, but the latency is really bad.
My questions are:

1. What causes the difference between latency and service time?
2. If latency is a lot higher than service time, does it make sense that the error rate goes up?
3. To build the query body, my track.py reads a CSV file containing 1,000,000 words. Could that be a potential problem?
4. If (3) is the problem, how can I simulate real-world query parameters?
1. Command and metrics

cmd: esrally race --pipeline=benchmark-only --target-host=192.168.35.177:9200 --track-path=~/rallyCustomTrack/custom-track-search --track-params="t_qps:500,t_period:300" --report-file=~/{path}/report_500.json
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-------------:|-----------------:|-------:|
| Cumulative indexing time of primary shards | | 4.8841 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 2.44205 | min |
| Max cumulative indexing time across primary shards | | 4.8841 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0.910017 | min |
| Cumulative merge count of primary shards | | 7 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0.455008 | min |
| Max cumulative merge time across primary shards | | 0.910017 | min |
| Cumulative merge throttle time of primary shards | | 0.0854833 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0.0427417 | min |
| Max cumulative merge throttle time across primary shards | | 0.0854833 | min |
| Cumulative refresh time of primary shards | | 0.224667 | min |
| Cumulative refresh count of primary shards | | 23 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.112333 | min |
| Max cumulative refresh time across primary shards | | 0.224667 | min |
| Cumulative flush time of primary shards | | 0.000716667 | min |
| Cumulative flush count of primary shards | | 2 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0.000358333 | min |
| Max cumulative flush time across primary shards | | 0.000716667 | min |
| Total Young Gen GC time | | 0.966 | s |
| Total Young Gen GC count | | 263 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.799961 | GB |
| Translog size | | 1.02445e-07 | GB |
| Heap used for segments | | 0.0501366 | MB |
| Heap used for doc values | | 0.00166702 | MB |
| Heap used for terms | | 0.03302 | MB |
| Heap used for norms | | 0.00415039 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0.0112991 | MB |
| Segment count | | 23 | |
| Total Ingest Pipeline count | | 0 | |
| Total Ingest Pipeline time | | 0 | s |
| Total Ingest Pipeline failed | | 0 | |
| Min Throughput | query_string | 177.7 | ops/s |
| Mean Throughput | query_string | 213.13 | ops/s |
| Median Throughput | query_string | 215.15 | ops/s |
| Max Throughput | query_string | 218.91 | ops/s |
| 50th percentile latency | query_string | 83098.6 | ms |
| 90th percentile latency | query_string | 153420 | ms |
| 99th percentile latency | query_string | 169491 | ms |
| 99.9th percentile latency | query_string | 171335 | ms |
| 99.99th percentile latency | query_string | 171642 | ms |
| 100th percentile latency | query_string | 171732 | ms |
| 50th percentile service time | query_string | 20.4667 | ms |
| 90th percentile service time | query_string | 31.5086 | ms |
| 99th percentile service time | query_string | 49.6949 | ms |
| 99.9th percentile service time | query_string | 150.478 | ms |
| 99.99th percentile service time | query_string | 179.699 | ms |
| 100th percentile service time | query_string | 664.647 | ms |
| error rate | query_string | 0 | % |
2. track.json
{
    "version": 2,
    "description": "Custom track example: search with test param",
    "indices": [
        {
            "name": "text-index"
        }
    ],
    "schedule": [
        {
            "operation": {
                "name": "query_string",
                "operation-type": "search",
                "param-source": "my-custom-term-param-source"
            },
            "clients": 20,
            "target-throughput": {{ t_qps|default(1000) }},
            "time-period": {{ t_period|default(60) }}
        }
    ]
}
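For context: Rally treats track.json as a Jinja2 template, so the --track-params from the command line fill the {{ ... }} placeholders before the track is loaded. A minimal sketch of that substitution using plain jinja2 (outside Rally, just to illustrate; Rally does the real rendering internally):

import jinja2

# The two placeholders from track.json above.
template = jinja2.Template(
    '"target-throughput": {{ t_qps|default(1000) }}, '
    '"time-period": {{ t_period|default(60) }}'
)

# With --track-params="t_qps:500,t_period:300":
print(template.render(t_qps=500, t_period=300))
# "target-throughput": 500, "time-period": 300

# Without track params, the default() filters apply:
print(template.render())
# "target-throughput": 1000, "time-period": 60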
3. track.py
import csv
import random


class TermParamSource:
    def __init__(self, track, params, **kwargs):
        # Choose a suitable index: if there is only one defined for this track,
        # choose that one, but let the user always override index and type.
        if len(track.indices) == 1:
            default_index = track.indices[0].name
            if len(track.indices[0].types) == 1:
                default_type = track.indices[0].types[0].name
            else:
                default_type = None
        else:
            default_index = "text-index"
            default_type = "_doc"
        # We can eagerly resolve these parameters already in the constructor...
        self._index_name = params.get("index", default_index)
        self._type_name = params.get("type", default_type)
        self._cache = params.get("cache", False)
        # ... but the query terms are resolved lazily on each invocation later.
        self._params = params
        # Determines whether this parameter source will be "exhausted" at some
        # point or whether Rally can draw values from it infinitely.
        self.infinite = True
        self.words = self.init_word()

    def init_word(self):
        # Load the whole word list into memory once, at track load time.
        word_list = []
        with open('/Users/gimdohyeon/rallyCustomTrack/custom-track-search/paramWord.csv',
                  'r', encoding='utf-8') as f:
            reader = csv.reader(f, delimiter=",", quotechar='"', quoting=csv.QUOTE_ALL)
            for row in reader:
                word_list.extend(row)
        return word_list

    def partition(self, partition_index, total_partitions):
        return self

    def buildParam(self, length):
        # Concatenate `length` randomly chosen words into one query string.
        return " ".join(random.choice(self.words) for _ in range(length))

    def params(self):
        # You must provide all parameters that the runner expects.
        return {
            "body": {
                "query": {
                    "query_string": {
                        "query": self.buildParam(4)
                    }
                }
            },
            "index": self._index_name,
            "type": self._type_name,
            "cache": self._cache
        }


def register(registry):
    registry.register_param_source("my-custom-term-param-source", TermParamSource)
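In case it is useful for reproducing this, here is a minimal harness to run TermParamSource outside Rally (same file, appended below) and check what params() emits and how long loading the 1,000,000-word CSV takes. The Fake* classes are stand-ins I made up, not Rally APIs:

import time

class FakeType:
    # Stand-in for Rally's type model; only the name attribute is needed.
    def __init__(self, name):
        self.name = name

class FakeIndex:
    # Stand-in for Rally's index model.
    def __init__(self, name, types):
        self.name = name
        self.types = types

class FakeTrack:
    # Mimics the single-index track defined in track.json.
    def __init__(self):
        self.indices = [FakeIndex("text-index", [FakeType("_doc")])]

if __name__ == "__main__":
    start = time.perf_counter()
    source = TermParamSource(FakeTrack(), {})
    elapsed = time.perf_counter() - start
    print("init_word() loaded %d words in %.2fs" % (len(source.words), elapsed))
    print(source.params())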