By better search performance I mean faster searches.
I work for a company and they have all the resources that I need.
My data source is an Oracle database.
The data volume is in the billions of records.
My use case:
Let's say I have a father with thousands of children.
In Oracle I have one table for all the fathers and another table for all the children.
The keys let me relate a father to his children.
With five keys in the father record I can find all of that father's children in the children table.
In Oracle each key is defined as an index.
Every query is about the children of one father at a time. This is a very important detail.
I have just two numeric fields. the rest of data's type is numeric.
In Elasticsearch I decided that there will be one index for each father's children.
Because of the drawbacks of joins, I decided it is better to have only one type of record (document), which holds one child's data together with its father's data.
So the father's data is duplicated: it is repeated once for each of his children, but I don't need a join. I prefer to add servers rather than wait longer for the search results.
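To make the duplication concrete, here is a minimal sketch of flattening one father and his children into one document per child. The field names (`id`, `region`, `child_id`, `hemoglobin`) and the `father_` prefix are hypothetical, chosen only for illustration; the real documents have about 40 fields.

```python
def denormalize(father, children):
    """Return one flat document per child, with the father's fields copied in."""
    docs = []
    for child in children:
        # Prefix the father's fields so they cannot clash with the child's fields.
        doc = {f"father_{key}": value for key, value in father.items()}
        doc.update(child)
        docs.append(doc)
    return docs


# Hypothetical sample data.
father = {"id": 7, "region": 3}
children = [
    {"child_id": 1, "hemoglobin": 13.5},
    {"child_id": 2, "hemoglobin": 11.2},
]
docs = denormalize(father, children)
```

Each father field is stored once per child, so the storage cost of the father's data grows with the number of children.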
So each father's children will be in their own index, and each child will be routed to its shard according to its teacher id routing value.
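A rough sketch of what one index per father plus a routing value could look like as a bulk request body. The index naming scheme, the field names, and the use of the father's id as the routing value are all assumptions for illustration; the `routing` key in the action line is the one accepted by recent Elasticsearch versions.

```python
import json


def bulk_lines(father_id, docs):
    """Build an NDJSON bulk body: one action line and one source line per document."""
    index_name = f"children-of-father-{father_id}"  # hypothetical naming scheme
    lines = []
    for doc in docs:
        action = {
            "index": {
                "_index": index_name,
                "_id": str(doc["child_id"]),   # hypothetical id field
                "routing": str(father_id),     # assumed routing value
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


body = bulk_lines(7, [{"child_id": 1, "father_id": 7, "hemoglobin": 13.5}])
```

By default, documents in an index that share the same routing value are stored on the same shard, so a routing value that is constant within an index places all of that index's documents on one shard.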
So I have 40 fields in each document.
There are no inner objects.
I understood that Elasticsearch performs best with a flat data structure.
In Kibana I visualize all indices together.
I need to benchmark Oracle against Elasticsearch and compare their capabilities.
My hope is that Elasticsearch search results will be much better than Oracle's.
But because I am a student, I have no time to try all the scenarios.
The maximum number of expected concurrent searches is about 20.
Latency just needs to be much lower than Oracle's.
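One possible way to compare the two systems under this load is a small harness that runs the same query from 20 threads and reports latency percentiles. This is only a sketch: `run_query` is a hypothetical callable that would wrap a real Oracle or Elasticsearch call, and `fake_query` below is a stand-in so the example runs on its own.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def timed(run_query):
    """Return the wall-clock duration of one call, in seconds."""
    start = time.perf_counter()
    run_query()
    return time.perf_counter() - start


def benchmark(run_query, concurrency=20, total=200):
    """Run `total` queries with at most `concurrency` in flight; return latency stats."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: timed(run_query), range(total)))
    p95_index = math.ceil(0.95 * len(latencies)) - 1  # nearest-rank percentile
    return {
        "median_s": latencies[len(latencies) // 2],
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


def fake_query():
    """Stand-in for a real database call."""
    time.sleep(0.001)


stats = benchmark(fake_query, concurrency=20, total=100)
```

Running the same harness once with an Oracle-backed `run_query` and once with an Elasticsearch-backed one would give directly comparable numbers for the 20-concurrent-search case.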
All my queries are simple filter queries; I don't need to score the results.
So a query in Oracle/Elasticsearch can be, for example:
Give me all the children of father x whose hemoglobin level is between y and z.
Give me all the children of father x whose hemoglobin level is y.
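The two example queries above could be expressed as Elasticsearch query bodies along these lines. This is a sketch under assumptions: the field names `father_id` and `hemoglobin` are hypothetical, and "between y and z" is taken to be inclusive on both ends. Putting the clauses under `bool.filter` keeps them in filter context, so no scores are computed.

```python
def children_filter_query(father_id, low, high=None):
    """Build a non-scoring query for one father's children by hemoglobin level.

    With only `low`, match that exact level; with `low` and `high`,
    match the inclusive range [low, high].
    """
    if high is None:
        level_clause = {"term": {"hemoglobin": low}}
    else:
        level_clause = {"range": {"hemoglobin": {"gte": low, "lte": high}}}
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"father_id": father_id}},
                    level_clause,
                ]
            }
        }
    }


range_query = children_filter_query(7, 11.0, 14.0)   # father x, level between y and z
exact_query = children_filter_query(7, 12.5)         # father x, level equal to y
```

If each father's children live in their own index, the `father_id` clause would be redundant for a search sent to that single index; it is kept here so the sketch also works when several indices are searched together.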
The volume of data I need to search is tens of thousands of records each time.
Please help me understand whether I have organized my data correctly so that it can beat Oracle in the benchmark.
I need to know whether I should change anything so that Elasticsearch performs at its best before I present my project to my directors.