Need help with Performance and Storage Factors

sidra_89 · October 6, 2017, 4:16am

Hi,

I am new to Elasticsearch and have only basic knowledge of it, so at this point I don't know how the choices I make will affect performance. I have come across some questions and am hoping to get answers:

1. AWS provides a built-in ES 5.5. Should I opt for it even though 5.6 is the latest, and what other options do I need to consider when installing ES on AWS?
2. How can I estimate my storage requirements in ES? If I have 800GB+ of data, how much space will I need in ES?
3. Logstash is used to get data from MySQL into ES. On a live server, are we still going to use Logstash, and if so, what is the proper way to transfer data from our MySQL to ES?
4. Is a _bulk update faster than one-by-one updates? For example, whenever an order is created it will sync with ES. Which is better suited for performance: one-by-one updates or bulk updates?
5. From my research, the ngram tokenizer is used to search parts of words. Is there any other way?
6. Should we use default mappings or custom mappings for our fields? And how much space will these mappings take?
7. What other factors do I need to consider to improve ES performance?
8. How much knowledge do I need to run ES without any errors?

Thank you.

warkolm · October 6, 2017, 7:11am

1. The only real choice is to use https://www.elastic.co/cloud/as-a-service, which provides the latest releases and does not place any limitations on Elasticsearch.
2. Depends. What sort of data is it? What sort of analysis are you going to apply (i.e. the mapping)?
3. Logstash.
4. Bulk, always!
5. Depends on what problem you want to solve here.
6. Custom mappings are always better because you are explicit. How much space they take is something you need to test.
7. What problems are you having?
8. What errors are you having?
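To illustrate the bulk answer: a _bulk request body is newline-delimited JSON, with one action line followed by one source line per document, and it must end with a newline. Below is a minimal sketch of building such a body for several orders at once. The helper name `build_bulk_body`, the index name `orders`, the type name `order`, and the document fields are all made up for illustration; only the body layout follows the _bulk API.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON body for the _bulk API.

    `docs` is an iterable of (doc_id, source_dict) pairs.
    Each document becomes two lines: an "index" action line
    and the document source itself.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where to index the next line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical orders; in practice these would come from MySQL.
orders = [
    (1, {"product": "world map", "total": 10}),
    (2, {"product": "globe", "total": 5}),
]
body = build_bulk_body("orders", "order", orders)
```

The resulting string would be sent in a single POST to the `_bulk` endpoint, so many orders cost one round trip instead of one request per order.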

sidra_89 · October 6, 2017, 7:17am

Basically, I want to search for parts of words. For example, if I have "world" and I type "orld" in the search, it should show me all the values matching "orld". I need this to search product names, ASINs, etc. What is the best way to solve this in Elasticsearch?

warkolm · October 6, 2017, 7:18am

ngrams are, but they can be expensive.

sidra_89 · October 6, 2017, 7:18am

What do you mean by expensive?

warkolm · October 6, 2017, 7:20am

From the NGram Tokenizer page of the Elasticsearch Reference [5.6]:

The ngram tokenizer first breaks text down into words whenever it encounters one of a list of specified characters, then it emits N-grams of each word of the specified length.

Have a read of the rest of that page; it runs through an example.
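To get a rough feel for why ngrams can be expensive, here is a small sketch that imitates what the quoted description says for a single word: it emits every substring whose length is between a minimum and a maximum. The function `ngrams` is a plain-Python stand-in written for this illustration, not Elasticsearch code, and the parameter names simply mirror the tokenizer's `min_gram` and `max_gram` settings.

```python
def ngrams(word, min_gram, max_gram):
    """Return all substrings of `word` with length in [min_gram, max_gram].

    For each start position, grams are emitted from shortest to longest.
    """
    grams = []
    for start in range(len(word)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(word):
                grams.append(word[start:start + size])
    return grams


# "orld" appears among the grams of "world", which is why a search
# for "orld" can match a document containing "world".
print(ngrams("world", 3, 4))
# → ['wor', 'worl', 'orl', 'orld', 'rld']

# The cost: a single 10-character word with grams of length 1 to 10
# produces 55 tokens instead of 1.
print(len(ngrams("abcdefghij", 1, 10)))
# → 55
```

Every one of those grams is stored in the index, so a wide gap between the minimum and maximum gram length multiplies index size and indexing work.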

system · November 3, 2017, 7:21am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
