Considerations for Elasticsearch topology for handling a heavy search load


(NM) #1

Hey, Hi

Problem: We are using Elasticsearch for our search engine. There won't be any daily indexing; the cluster will only serve searches, at an estimated load of 1000-2000 queries per second.

Topology: The topology we decided on for our cluster is
Total: 5 node cluster
master+data nodes: 3
only client nodes: 2

Our machines: Each machine/node has 4 GB of RAM and 4 cores

Do you have any suggestions for a better topology? If so, please share them. If you think this topology will work, please comment and say so.

Thanks for reading and helping.


(Mark Walkom) #2

Depends on your datasize, type and what your queries do.

That is a pretty high data load so try what you have and see how it performs - just make sure you have monitoring in place.
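
One thing worth watching under this kind of load is the search thread pool: a growing queue or a rising rejected count on a data node usually means that node cannot keep up. The snippet below is only a minimal sketch of that check. It assumes the JSON shape returned by `GET /_nodes/stats/thread_pool`; the function name `searchPoolPressure` and the `localhost:9200` URL in the usage comment are made up for illustration.

```php
<?php
// Sketch: pull search thread pool pressure out of a node stats response.
// Assumes the payload shape of GET /_nodes/stats/thread_pool.

/**
 * @param string $statsJson Raw JSON body from the node stats API.
 * @return array Map of node name => ['queue' => int, 'rejected' => int].
 */
function searchPoolPressure(string $statsJson): array
{
    $stats = json_decode($statsJson, true);
    if (!is_array($stats) || !isset($stats['nodes'])) {
        throw new InvalidArgumentException('Unexpected node stats payload');
    }

    $pressure = [];
    foreach ($stats['nodes'] as $nodeId => $node) {
        $search = $node['thread_pool']['search'] ?? null;
        if ($search === null) {
            continue; // node reported no search pool stats, skip it
        }
        $name = $node['name'] ?? $nodeId; // fall back to the node id
        $pressure[$name] = [
            'queue'    => (int) ($search['queue'] ?? 0),    // searches waiting for a thread
            'rejected' => (int) ($search['rejected'] ?? 0), // searches refused because the queue was full
        ];
    }
    return $pressure;
}

// Hypothetical usage against a local node (the URL is an assumption):
// $json = file_get_contents('http://localhost:9200/_nodes/stats/thread_pool');
// print_r(searchPoolPressure($json));
```

Polling this every few seconds during a load test and alerting when `rejected` starts to climb is one simple way to see where the cluster stops coping.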


(NM) #3

This is our query. I hope you can help me now.

$query = [
    "filtered" => [
        "query" => [
            "bool" => [
                "should" => [
                    [
                        'query_string' => [
                            'fields' => [
                                'Title.title^4',
                                'Title.ngrams_front^2',
                                'Title.ngrams_back'
                            ],
                            'default_operator' => 'or',
                            'query' => $paramsObj->q
                        ]
                    ],
                    [
                        'query_string' => [
                            'auto_generate_phrase_queries' => 0,
                            'enable_position_increments' => false,
                            'fields' => [
                                'Title.title',
                                'Address',
                                'keys'
                            ],
                            'query' => $paramsObj->q,
                            'use_dis_max' => false,
                            'boost' => 2
                        ]
                    ],
                    [
                        'fuzzy' => [
                            'Title.title' => [
                                'value' => $paramsObj->q,
                                'boost' => 1,
                                'min_similarity' => 0.5,
                                'max_expansions' => 20,
                                'prefix_length' => 0
                            ]
                        ]
                    ]
                ]
            ]
        ],
        "filter" => $filters
    ]
];

One more thing: 1000-2000 queries per second is our peak load. We don't get that every day, but we will on weekends. We are planning to sustain that load, so Elasticsearch must not be brought to its knees. Some lag is acceptable, but going down is not.
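
As a rough sanity check on those peak numbers, here is a back-of-envelope estimate of how much CPU time each query can use on average. The inputs (1000-2000 queries per second, 3 data nodes, 4 cores each) come from this thread. The even spread of queries across data nodes, and each query being served entirely by one data node, are assumptions. The function name `cpuBudgetMsPerQuery` is made up for illustration.

```php
<?php
// Back-of-envelope CPU budget per query.
// Assumptions: load is spread evenly over the data nodes, and each query
// is handled by a single data node (no fan-out to shards on other nodes).

function cpuBudgetMsPerQuery(float $peakQps, int $dataNodes, int $coresPerNode): float
{
    $qpsPerNode     = $peakQps / $dataNodes;   // queries each data node must serve per second
    $cpuMsPerSecond = $coresPerNode * 1000.0;  // CPU-milliseconds one node has per second
    return $cpuMsPerSecond / $qpsPerNode;      // max average CPU time a single query may use
}

foreach ([1000, 2000] as $qps) {
    printf("%d qps -> %.1f ms of CPU per query\n", $qps, cpuBudgetMsPerQuery($qps, 3, 4));
}
```

Under these assumptions the budget works out to about 12 ms per query at 1000 queries per second and about 6 ms at 2000. If every query instead fans out to shards on all three data nodes, each node sees the full load and the budget drops to roughly 4 ms and 2 ms. That is tight for a query combining two `query_string` clauses and a `fuzzy` clause, so it is worth measuring rather than guessing.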

