Elasticsearch configuration for Windows Server


(Saineshwar Bageri) #1

I have about 20 million records that I want to push into a single Elasticsearch instance.

  • What server configuration would I need? I will be using Windows Server.
  • Is a single instance enough?
  • Is creating an Elasticsearch cluster mandatory?

(David Pilato) #2

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing


(Saineshwar Bageri) #3

Hi Dadoonet sir,

I have one more question. I want to update data in Elasticsearch. If about 1 million documents need updating, which API should I use, and which tool (Logstash or something else)?


(David Pilato) #4

I believe it depends on which tool you used for the initial load.


(Saineshwar Bageri) #5

I am using Logstash with the JDBC input plugin, and it runs on Windows Server.


(David Pilato) #6

Then use the same tools to update. Note that if you are going to update a lot of documents, it might be better to reindex the whole dataset instead.


(Saineshwar Bageri) #7

Hi dadoonet sir,

Are there any documentation links for the Update API and for reindexing documents?


(David Pilato) #8

Updating a document uses the same API as creating one: indexing a document under an existing ID replaces it.
When I say reindex, I mean index again, as you did the first time.
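
With Logstash, replacing rather than duplicating documents on a second run depends on each row mapping to a stable document ID; without one, Elasticsearch generates a new ID for every event. A minimal sketch of the output section, assuming the table's primary key is a column named BusinessEntityID (an assumption; substitute your own key) and that the jdbc input lowercases column names, which is its default. The host and index name are placeholders:

```
output {
    elasticsearch {
        hosts => "http://localhost:9200"
        index => "humanresources"
        # Assumed key column, lowercased by the jdbc input by default.
        # Reusing the same ID makes a re-run replace the existing document
        # instead of creating a duplicate.
        document_id => "%{businessentityid}"
    }
}
```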


(Saineshwar Bageri) #9

Hi dadoonet sir,

How can I speed up document inserts into Elasticsearch using Logstash?

Can you give me some suggestions, sir?

I am using the JDBC plugin; my configuration is below.

input {  
    jdbc {  
        jdbc_driver_library => "D:\sqljdbc_6.4.0.0_enu\sqljdbc_6.4\enu\mssql-jdbc-6.4.0.jre8.jar"  
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"  
        jdbc_connection_string => "jdbc:sqlserver://SAI-PC;user=sa;password=Pass$123;"  
        jdbc_user => "sa"  
        jdbc_password => "Pass$123"  
        statement => "SELECT * FROM [AdventureWorks2008R2].[HumanResources].[Employee]"  
    }  
}  
filter {}  
output {  
    stdout {  
        codec => json_lines  
    }  
    elasticsearch {  
        hosts => "http://localhost:9200"  
        index => "humanresources"  
    }  
}

(David Pilato) #10

In my experience, most of the time is spent reading from the source database.
In that case, you could add a WHERE clause to your query so that it selects only a subset of your documents, and then run multiple Logstash pipelines in parallel.
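
A rough sketch of the input for one such subset, assuming the table has a numeric key column named BusinessEntityID and that the range below suits your data (both are assumptions). Each pipeline would use the same config with a different range. The password is read from a hypothetical environment variable named JDBC_PASSWORD:

```
input {
    jdbc {
        jdbc_driver_library => "D:\sqljdbc_6.4.0.0_enu\sqljdbc_6.4\enu\mssql-jdbc-6.4.0.jre8.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        jdbc_connection_string => "jdbc:sqlserver://SAI-PC;"
        jdbc_user => "sa"
        # Hypothetical environment variable; keeps the password out of the file.
        jdbc_password => "${JDBC_PASSWORD}"
        # This pipeline reads only the key range given in "parameters".
        statement => "SELECT * FROM [AdventureWorks2008R2].[HumanResources].[Employee] WHERE BusinessEntityID BETWEEN :min_id AND :max_id"
        # Illustrative range; pick boundaries that split your rows evenly.
        parameters => { "min_id" => 1 "max_id" => 100 }
    }
}
```

On Logstash 6.0 or later, each copy can be declared as its own entry in pipelines.yml so that all ranges load at the same time.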

