Query takes too much time in Elasticsearch compared to Sphinx

KrishK · July 29, 2020, 10:28am

Hi Support,

We migrated from Sphinx to Elasticsearch.
The Elasticsearch and Sphinx servers have the same configuration.

OS: Ubuntu 18.04
RAM: 60 GB (50% used by Elasticsearch)
Disk: 2 TB
Processor: 4 cores

Elastic Configuration: ELK - 7.7.X

We have 450 million records in production, and the index was created with the mapping and settings below.

Index shards: 25
Index size: 890 GB

Mapping:
{
  "mappings": {
    "_doc": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "@version": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "group_id": {
          "type": "long"
        },
        "member_id": {
          "type": "integer"
        },
        "foldername": {
          "type": "keyword",
          "ignore_above": 256
        },
        "f_length": {
          "type": "long"
        },
        "fol_id": {
          "type": "long"
        },
        "y_m": {
          "type": "integer"
        },
        "com_id": {
          "type": "integer"
        },
        "doc_type": {
          "properties": {
            "d": {
              "type": "short"
            }
          }
        },
        "doc_order": {
          "type": "long"
        },
        "f_data": {
          "type": "text"
        },
        "id": {
          "type": "long"
        },
        "document_name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "analyzer": "splchar_analyzer"
        },
        "flg": {
          "type": "short"
        },
        "isdisplay": {
          "type": "short"
        },
        "deleted": {
          "type": "short"
        },
        "doc_section": {
          "properties": {
            "s": {
              "type": "short"
            }
          }
        },
        "type": {
          "type": "integer"
        }
      }
    }
  }
}

Settings:
{
  "index.blocks.read_only_allow_delete": "false",
  "index.priority": "1",
  "index.query.default_field": [
    "*"
  ],
  "index.write.wait_for_active_shards": "1",
  "index.highlight.max_analyzed_offset": "60000000",
  "index.refresh_interval": "300s",
  "index.analysis.analyzer.splchar_analyzer.filter": [
    "lowercase"
  ],
  "index.analysis.analyzer.splchar_analyzer.char_filter": [
    "spl_char_filter"
  ],
  "index.analysis.analyzer.splchar_analyzer.tokenizer": "standard",
  "index.analysis.char_filter.spl_char_filter.pattern": "\\.",
  "index.analysis.char_filter.spl_char_filter.type": "pattern_replace",
  "index.analysis.char_filter.spl_char_filter.replacement": " ",
  "index.number_of_replicas": "0"
}

When we run the query below for the first time, it takes too long (5-6 sec). After running it 2-3 times, we get a response in 1 sec.

GET _sql?format=txt
{
  "query":"""SELECT id FROM indexname WHERE QUERY('(f_data:("testing") OR document_name:("testing"))','default_operator=AND')  AND member_id = 5002 AND deleted in(0) AND type NOT IN (0,10,13,15,16) and fol_id > 0 and isdisplay = 0"""
}
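For reference, the SQL request above can also be written in Query DSL. The following is a minimal sketch, not a confirmed fix: it assumes the exact-match and range conditions can move into `filter` and `must_not` clauses, which do not contribute to scoring, while the full-text part stays in `must`. The helper name `build_search_body` is hypothetical, and the translation is approximate, so results should be compared against the SQL version before relying on it.

```python
import json


def build_search_body(term: str, member_id: int) -> dict:
    """Build a Query DSL body approximating the SQL request in the question."""
    return {
        # Only the id column is selected in the SQL query.
        "_source": ["id"],
        "query": {
            "bool": {
                # Full-text part: same query string and default operator as QUERY(...).
                "must": [
                    {
                        "query_string": {
                            "query": f'(f_data:("{term}") OR document_name:("{term}"))',
                            "default_operator": "AND",
                        }
                    }
                ],
                # Exact-match and range conditions, evaluated without scoring.
                "filter": [
                    {"term": {"member_id": member_id}},
                    {"term": {"deleted": 0}},
                    {"term": {"isdisplay": 0}},
                    {"range": {"fol_id": {"gt": 0}}},
                ],
                # type NOT IN (0,10,13,15,16)
                "must_not": [
                    {"terms": {"type": [0, 10, 13, 15, 16]}}
                ],
            }
        },
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a _search request for the index.
    print(json.dumps(build_search_body("testing", 5002), indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index. Whether this is any faster than the SQL form on this data set is something only a benchmark can show.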

The equivalent query on Sphinx returns a response in 0.5 sec.

I have implemented this on the production server and am facing this slowness issue.

Can anyone help me with this? Is anything missing in the configuration, or do we need to increase the heap size?

Thanks

Mark_Harwood · July 30, 2020, 8:57am

Try keyword fields in place of numbers if you're doing exact-match lookups only.
Numeric fields are optimised for range queries.
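As a rough illustration of that suggestion, the sketch below builds a mapping fragment in which the fields that the posted query only matches exactly are declared as `keyword`. The choice of fields is an assumption read off the SQL query in the question (`member_id`, `deleted`, `isdisplay`, `type`); `fol_id` is left out because the query compares it with `>`, which is a range condition. The helper name `keyword_mapping` is hypothetical, and the remaining fields would keep their existing definitions.

```python
import json

# Assumption: these fields are only used with =, IN or NOT IN in the posted query.
EXACT_MATCH_FIELDS = ["member_id", "deleted", "isdisplay", "type"]


def keyword_mapping(fields):
    """Return a mapping fragment that declares each given field as keyword."""
    return {
        "mappings": {
            "properties": {name: {"type": "keyword"} for name in fields}
        }
    }


if __name__ == "__main__":
    # Print the fragment; merge it with the rest of the mapping for a new test index.
    print(json.dumps(keyword_mapping(EXACT_MATCH_FIELDS), indent=2))
```

Because the type of an existing field cannot be changed in place, a fragment like this would be applied to a new index and the data reindexed into it, ideally in a test environment first.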

KrishK · July 30, 2020, 11:34am

Hi Mark_Harwood,

Thank you for your reply.

Is there any issue with this configuration? Please let me know if any configuration changes are required.

Can we optimize the query?

Changing the mapping on a production server is quite a long process.

FYI: we have two separate Elasticsearch nodes that are not in a cluster. We indexed on each node separately, without replicas.

Node 1: Shards 25, Replica 0
Node 2: Shards 25, Replica 0

Thanks

Mark_Harwood · July 30, 2020, 12:14pm

As with any production change, it would make sense to benchmark in another environment to prove the cost/benefits first.
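One simple way to run such a comparison is to time the same query several times against each candidate index and look at the first run separately from the later ones, since the thread reports 5-6 sec on the first run and about 1 sec afterwards. The sketch below is only a timing harness under that assumption: `time_runs` and `summarize` are hypothetical helper names, and `run_query` stands for whatever function actually sends the request.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(run_query: Callable[[], object], runs: int = 5) -> List[float]:
    """Return the wall-clock latency in seconds of each consecutive run."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Report the first (cold) run separately from the median of the later runs."""
    warm = latencies[1:] or latencies
    return {"cold": latencies[0], "warm_median": statistics.median(warm)}


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with a real call to each index.
    print(summarize(time_runs(lambda: sum(range(100_000)))))
```

Running the harness once against the current mapping and once against the modified one gives two summaries that can be compared directly.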

KrishK · July 30, 2020, 1:06pm

Thank you.

Yes, we will make the changes in another environment.

Can you please suggest the best configuration and mapping to overcome this issue?

Do we need to use a cluster instead of indexing on separate nodes?

Thanks

system · August 27, 2020, 1:06pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
