Elasticsearch topology


#1

Hello,

I have set up a topology with one Elasticsearch node acting as a load balancer and 3 other nodes acting as master and data nodes. I have one index per day, with 5 shards and 1 replica, for about 100 GB of data per day.

Everything works very well, except that search time is very high (around 15 s for a single day of data).
I don't see what I can do to improve this :confused:

I allocated 4 GB of RAM, then 8 GB, and nothing changed. Adding nodes made no difference either, and changing the number of shards had no visible effect. I am surely missing something, but I have run out of things to test.

I'm using the latest version of the whole Stack.

I hope you will be able to help.


(David Pilato) #2

What kind of machine?
What query are you running?

Could you also share a typical JSON document that you get back?


#3

These are Ubuntu 16.04 servers.
It is simply a search by date.
If I pick a date to get all the logs for that day, I wait 15 seconds, and it is the same if I select "Today". Beyond a certain number of logs, performance drops sharply.

{
  "_index": "filebeat-2018.05.07",
  "_type": "doc",
  "_id": "VX_6OmMBUBkisn7v1Gd6",
  "_version": 1,
  "_score": null,
  "_source": {
    "beat": {
      "hostname": "die138",
      "version": "6.2.4",
      "name": "filebeat"
    },
    "@version": "1",
    "@timestamp": "2018-05-07T14:21:45.000Z",
    "host": "die138",
    "source": "/var/log/syslog",
    "fileset": {
      "module": "system",
      "name": "syslog"
    },
    "prospector": {
      "type": "log"
    },
    "system": {
      "syslog": {
        "pid": "381",
        "message": "2018-05-07T14:21:45Z E! InfluxDB Output Error: Post http://toto:8086/write?db=toto: net/http: request canceled (Client.Timeout exceeded while awaiting headers)",
        "hostname": "die138",
        "timestamp": "May  7 16:21:45",
        "program": "toto"
      }
    },
    "offset": 1314094,
    "tags": [
      "beats_input_codec_plain_applied"
    ],
    "message": "May  7 16:21:45 die138 telegraf[381]: 2018-05-07T14:21:45Z E! InfluxDB Output Error: Post http://toto:8086/write?db=toto: net/http: request canceled (Client.Timeout exceeded while awaiting headers)",
    "fields": {
      "env": "die",
      "document_type": "system"
    }
  },
  "fields": {
    "@timestamp": [
      "2018-05-07T14:21:45.000Z"
    ]
  },
  "sort": [
    1525702905000
  ]
}

(David Pilato) #4

What kind of server? (Not which OS, but what kind of machine is it? What kind of disk?)

Can you give me a typical query that you run?

For example, run this in the Kibana console:

GET /filebeat-*/_search

And share the result here.
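Since the slow case described above is a search restricted to one day, a date-filtered variant of that request may be more representative than the bare `_search`. The following is only a sketch: the helper name `day_range_query` is hypothetical, and it assumes the documents carry an `@timestamp` date field, as in the sample document shared earlier in the thread.

```python
import json
from datetime import date, timedelta


def day_range_query(day: date, size: int = 10) -> dict:
    """Build a search body limited to one calendar day on @timestamp.

    Hypothetical helper for illustration; the field name comes from the
    sample document in this thread.
    """
    next_day = day + timedelta(days=1)
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # A filter clause does not compute relevance scores.
                "filter": [
                    {
                        "range": {
                            "@timestamp": {
                                "gte": day.isoformat(),
                                "lt": next_day.isoformat(),
                            }
                        }
                    }
                ]
            }
        },
    }


body = day_range_query(date(2018, 5, 7))
# Paste the two printed parts into the Kibana console to run the request.
print("GET /filebeat-2018.05.07/_search")
print(json.dumps(body, indent=2))
```

Comparing the `took` value of this request with the time Discover needs for the same day helps separate time spent in Elasticsearch from time spent elsewhere.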


#5

These are virtual machines, and the storage is shared storage on HDD.

Here is the result of the request:

{
  "took": 260,
  "timed_out": false,
  "_shards": {
    "total": 36,
    "successful": 36,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 54053687,
    "max_score": 1,
    "hits": [
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "Tpb_GGMBAaZyQaoS1Qq1",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "access",
            "module": "nginx"
          },
          "tags": [
            "beats_input_codec_plain_applied",
            "_geoip_lookup_failure"
          ],
          "nginx": {
            "access": {
              "remote_addr": "10.1.2.171",
              "http_user_agent": "Apache CXF 2.6.9",
              "upstream_response_time": "0.700",
              "msec": "1525132804.190",
              "server_name": "die-el2-adm.lampiris.be",
              "upstream_addr": "10.10.154.155:8080",
              "request_body": "-",
              "timestamp": "01/May/2018:02:00:04 +0200",
              "http_referer": "-",
              "request": "GET /el2-adm-backend/comeca/billing/accesspoint/21113892853300/billingdata?Apr+18+00%3A00%3A00+CEST+2017 HTTP/1.1",
              "remote_name": "-",
              "request_time": "0.700",
              "body_bytes_sent": "181",
              "status": "500"
            }
          },
          "read_timestamp": "2018-05-01T00:00:07.537Z",
          "source": "/var/log/nginx/die-el2-adm.lampiris.be-access.log",
          "offset": 1138445,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "nginx"
          },
          "host": "die158",
          "@timestamp": "2018-05-01T00:00:04.000Z",
          "prospector": {
            "type": "log"
          },
          "beat": {
            "name": "filebeat",
            "hostname": "die158",
            "version": "6.2.4"
          },
          "message": "[01/May/2018:02:00:04 +0200] remote_addr : 10.1.2.171 remote_user : - server_name : backend/comeca/billing/accesspoint/21113892853300/billingdata?Apache CXF 2.6.9"
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "VZb_GGMBAaZyQaoS1Qq1",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "access",
            "module": "nginx"
          },
          "tags": [
            "beats_input_codec_plain_applied",
            "_geoip_lookup_failure"
          ],
          "nginx": {
            "access": {
              "remote_addr": "10.1.2.171",
              "http_user_agent": "Apache CXF 2.6.9",
              "upstream_response_time": "0.364",
              "msec": "1525132803.926",
              "server_name": "die-el2-adm.lampiris.be",
              "upstream_addr": "10.10.154.156:8080",
              "request_body": "-",
              "timestamp": "01/May/2018:02:00:03 +0200",
              "http_referer": "-",
              "request": "GET /el2-adm-backend/comeca/billing/accesspoint/22582199703209/billingdata?te=Thu+May+23+00%3A00%3A00+CEST+2013 HTTP/1.1",
              "remote_name": "-",
              "request_time": "0.364",
              "body_bytes_sent": "177",
              "status": "500"
            }
          },
          "read_timestamp": "2018-05-01T00:00:07.537Z",
          "source": "/var/log/nginx/die-el2-adm.lampiris.be-access.log",
          "offset": 1137914,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "nginx"
          },
          "host": "die158",
          "@timestamp": "2018-05-01T00:00:03.000Z",
          "prospector": {
            "type": "log"
          },
          "message": "[01/May/2018:02:00:03 +0200] remote_addr : 10.1.2.171 remote_user : - server_name : backend/comeca/billing/accesspoint/22582199703209/billingdata?http_user_agent : Apache CXF 2.6.9",
          "beat": {
            "name": "filebeat",
            "hostname": "die158",
            "version": "6.2.4"
          }
        }
      },

#6

      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "ZZb_GGMBAaZyQaoS1woK",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "log",
            "module": "postgresql"
          },
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/postgresql/postgresql-9.3-main.log",
          "postgresql": {
            "message": "GMT LOG:  duration: 216.794 ms  execute S_2: select processorc0_.id as id1_122_0_, processorc0_.createdDate as createdD2_122_0_, processorc0_.modifiedDate as ",
            "timestamp": "2018-05-01 00:00:07"
          },
          "offset": 9213032,
          "fields": {
            "env": "die",
            "document_type": "postgresql"
          },
          "@version": "1",
          "host": "p2diel2findbm",
          "@timestamp": "2018-05-01T00:00:08.888Z",
          "prospector": {
            "type": "log"
          },
          "message": "2018-05-01 00:00:07 GMT LOG:  duration: 216.794 ms  execute S_2: select processorc0_.id as id1_122_0_, processorc0_.createdDate as createdD2_122_0_, process_orchestrator ",
          "beat": {
            "name": "filebeat",
            "hostname": "p2diel2findbm",
            "version": "6.2.4"
          }
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "YJb_GGMBAaZyQaoS1woK",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "log",
            "module": "postgresql"
          },
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/postgresql/postgresql-9.3-main.log",
          "postgresql": {
            "message": "GMT LOG:  duration: 236.699 ms  execute S_4: update eventindexsource set createdDate=$1, modifiedDate=$2, obsoleteDate=$3, accountReference=$4, admCallDate=$5, id=$24",
            "timestamp": "2018-05-01 00:00:07"
          },
          "offset": 9211858,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "postgresql"
          },
          "host": "p2diel2findbm",
          "@timestamp": "2018-05-01T00:00:08.888Z",
          "prospector": {
            "type": "log"
          },
          "message": "2018-05-01 00:00:07 GMT LOG:  duration: 236.699 ms  execute S_4: update eventindexsource set createdDate=$1, modifiedDate=$2, obsoleteDate=$3, accountReference=$4, id=$24",
          "beat": {
            "name": "filebeat",
            "hostname": "p2diel2findbm",
            "version": "6.2.4"
          }
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "aZb_GGMBAaZyQaoS1wrd",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "syslog",
            "module": "system"
          },
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/syslog",
          "offset": 2623,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "system"
          },
          "host": "die132",
          "@timestamp": "2018-05-01T00:00:01.000Z",
          "system": {
            "syslog": {
              "program": "CRON",
              "pid": "15879",
              "hostname": "die132",
              "message": "(clusterlauncher) CMD (/application/batch_advances.sh)",
              "timestamp": "May  1 02:00:01"
            }
          },
          "prospector": {
            "type": "log"
          },
          "message": "May  1 02:00:01 die132 CRON[15879]: (clusterlauncher) CMD (/application/batch_advances.sh)",
          "beat": {
            "name": "filebeat",
            "hostname": "die132",
            "version": "6.2.4"
          }
        }
      },

#7

      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "apb_GGMBAaZyQaoS1wrd",
        "_score": 1,
        "_source": {
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/karaf/karaf.log",
          "offset": 53012025,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "karaf"
          },
          "host": "p2dl2busesbfin",
          "@timestamp": "2018-05-01T00:00:08.088Z",
          "karaf": {
            "loglevel": "INFO",
            "message": " | Context_Worker-2 | route7 | 82 - org.apache.camel.camel-core - 2.18.2 | Start cron at 2018/05/01 02:00:00",
            "timestamp": "2018-05-01 02:00:00,001"
          },
          "prospector": {
            "type": "log"
          },
          "beat": {
            "name": "filebeat",
            "hostname": "p2dl2busesbfin",
            "version": "6.2.4"
          },
          "message": "2018-05-01 02:00:00,001 | INFO  | Context_Worker-2 | route7 | 82 - org.apache.camel.camel-core - 2.18.2 | Start cron at 2018/05/01 02:00:00"
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "a5b_GGMBAaZyQaoS1wrd",
        "_score": 1,
        "_source": {
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/karaf/karaf.log",
          "offset": 53012577,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "karaf"
          },
          "host": "p2dl2busesbfin",
          "@timestamp": "2018-05-01T00:00:08.088Z",
          "karaf": {
            "loglevel": "INFO",
            "message": " | Context_Worker-2 | route7 | 82 - org.apache.camel.camel-core - 2.18.2 | Total processed 0 invoices",
            "timestamp": "2018-05-01 02:00:00,003"
          },
          "prospector": {
            "type": "log"
          },
          "message": "2018-05-01 02:00:00,003 | INFO  | Context_Worker-2 | route7 | 82 -org.apache.camel.camel-core - 2.18.2 | Total processed 0 invoices",
          "beat": {
            "name": "filebeat",
            "hostname": "p2dl2busesbfin",
            "version": "6.2.4"
          }
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "gZb_GGMBAaZyQaoS3Ar6",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "syslog",
            "module": "system"
          },
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/syslog",
          "offset": 2440,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "system"
          },
          "host": "die148",
          "@timestamp": "2018-05-01T00:00:00.000Z",
          "system": {
            "syslog": {
              "program": "el2-fo-tariff.jar",
              "pid": "1907",
              "hostname": "die148",
              "message": "2018-05-01 02:00:00.000  INFO 1918 --- [pool-2-thread-1] b.l.e.f.t.s.s.ElecConsumptionCalculator  : Updating elec simulation values started",
              "timestamp": "May  1 02:00:00"
            }
          },
          "prospector": {
            "type": "log"
          },
          "message": "May  1 02:00:00 die148 el2-fo-tariff.jar[1907]: 2018-05-01 02:00:00.000  INFO 1918 --- [pool-2-thread-1] b.l.e.f.t.s.s.ElecConsumptionCalculator  : Updating elec simulation values started",
          "beat": {
            "name": "filebeat",
            "hostname": "die148",
            "version": "6.2.4"
          }
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "j5b_GGMBAaZyQaoS3Qqp",
        "_score": 1,
        "_source": {
          "fileset": {
            "name": "access",
            "module": "nginx"
          },
          "tags": [
            "beats_input_codec_plain_applied",
            "_geoip_lookup_failure"
          ],
          "nginx": {
            "access": {
              "remote_addr": "10.1.2.171",
              "http_user_agent": "Apache CXF 2.6.9",
              "upstream_response_time": "0.191",
              "msec": "1525132811.430",
              "server_name": "die-el2-adm.lampiris.be",
              "upstream_addr": "10.10.154.155:8080",
              "request_body": "-",
              "timestamp": "01/May/2018:02:00:11 +0200",
              "http_referer": "-",
              "request": "GET /el2-adm-backend/comeca/billing/accesspoint/19809406621850/billingdatae=Wed+Dec+13+00%3A00%3A00+CET+2017 HTTP/1.1",
              "remote_name": "-",
              "request_time": "0.191",
              "body_bytes_sent": "181",
              "status": "500"
            }
          },
          "read_timestamp": "2018-05-01T00:00:11.543Z",
          "source": "/var/log/nginx/die-el2-adm.lampiris.be-access.log",
          "offset": 1149019,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "nginx"
          },
          "host": "die158",
          "@timestamp": "2018-05-01T00:00:11.000Z",
          "prospector": {
            "type": "log"
          },
          "message": "[01/May/2018:02:00:11 +0200] remote_addr : 10.1.2.171 remote_user : - server_name : backend/comeca/billing/accesspoint/19809406621850/billingdata?http_user_agent : Apache CXF 2.6.9",
          "beat": {
            "name": "filebeat",
            "hostname": "die158",
            "version": "6.2.4"
          }
        }
      },
      {
        "_index": "filebeat-2018.05.01",
        "_type": "doc",
        "_id": "kZb_GGMBAaZyQaoS3Qqp",
        "_score": 1,
        "_source": {
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "source": "/var/log/wildfly/server.log",
          "offset": 11616,
          "@version": "1",
          "fields": {
            "env": "die",
            "document_type": "wildfly"
          },
          "host": "die161",
          "@timestamp": "2018-05-01T00:00:04.570Z",
          "wildfly": {
            "loglevel": "ERROR",
            "message": """
[org.meveo.security.keycloak.CurrentUserProvider] (EJB default - 1) No session context=WELD-001303: No active contexts for scope type javax.enterprise.context.SessionScoped
""",
            "timestamp": "2018-05-01 02:00:00,016"
          },
          "prospector": {
            "type": "log"
          },
          "beat": {
            "name": "filebeat",
            "hostname": "die161",
            "version": "6.2.4"
          },
          "message": """
2018-05-01 02:00:00,016 ERROR [org.meveo.security.keycloak.CurrentUserProvider] (EJB default - 1) No 
"""
        }
      }
    ]
  }
}

(David Pilato) #8

A 260 ms response time on the Elasticsearch side is not bad at all.

On top of that, you are not using local SSDs. I would call that quite respectable.
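The 260 ms figure is the `took` field of the response. It is measured inside Elasticsearch and does not include network transfer or rendering on the client side. As a rough sketch, the header fields of a response like the one posted above can be summarized as follows; the `summarize` helper is hypothetical, and the sample string reuses the numbers from this thread with the hits list emptied.

```python
import json


def summarize(response_text: str) -> dict:
    """Extract the timing and shard fields from a search response.

    Assumes the Elasticsearch 6.x response shape shown in this thread,
    where hits.total is a plain integer.
    """
    r = json.loads(response_text)
    return {
        "took_ms": r["took"],
        "timed_out": r["timed_out"],
        "shards_total": r["_shards"]["total"],
        "shards_failed": r["_shards"]["failed"],
        "total_hits": r["hits"]["total"],
    }


# Header of the response posted in the thread, with the hits list emptied.
sample = """
{
  "took": 260,
  "timed_out": false,
  "_shards": {"total": 36, "successful": 36, "skipped": 0, "failed": 0},
  "hits": {"total": 54053687, "max_score": 1, "hits": []}
}
"""

summary = summarize(sample)
print(summary)
```

If `took_ms` stays low while the user-facing wait is long, the remaining time is being spent outside the query execution itself.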


#9

That's true: when I run the command, the response time is more than acceptable, but as soon as I go to Discover and ask it to display the logs for my week, it is slow.

So would that be normal with the current hardware?


(David Pilato) #10

In Discover, the thing is that Kibana fetches more documents than a default search does.
I don't remember the exact number offhand, but it may be 1000.

Since your disk is slow, reading all of those JSON documents can be slow as a result.

You may want to adjust the number of documents fetched in the Kibana options.
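To make the difference concrete: a plain `_search` returns 10 hits unless `size` is set, while Discover asks for a larger sample. In Kibana 6.x that sample is, as far as I know, controlled by the `discover:sampleSize` advanced setting, whose default I believe is 500; check your own instance. The sketch below only compares request bodies and a back-of-the-envelope payload size. The average document size is an assumed value for illustration, not something measured in this thread.

```python
def search_body(size: int) -> dict:
    """Search body that returns the `size` most recent documents."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {"match_all": {}},
    }


def estimated_payload_bytes(size: int, avg_doc_bytes: int) -> int:
    """Rough size of the _source data that must be read and returned."""
    return size * avg_doc_bytes


AVG_DOC_BYTES = 1500  # assumption for illustration only, not measured

for size in (10, 500):
    body = search_body(size)
    payload = estimated_payload_bytes(body["size"], AVG_DOC_BYTES)
    print(f"size={size}: about {payload} bytes of _source to fetch")
```

The point of the comparison is only that the amount of stored JSON to read grows linearly with the number of rows requested, which matters more on slow shared disks.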


#11

OK, thanks a lot :slight_smile:

Just one last small question. How many shards per index (so per day, in my case) would you recommend? I left the default value of 5; does that seem right to you?


(David Pilato) #12

In general, the default value is not the right one.
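One way to reason about the shard count is to work backwards from the daily volume. The sketch below uses the figures given in the first post (about 100 GB per day, 5 primary shards, 1 replica) and assumes that the 100 GB refers to primary data, which the thread does not state. The target shard size is a hypothetical parameter that should come from your own benchmarks; the value used here is only a placeholder.

```python
import math


def primary_shards_for(daily_gb: float, target_shard_gb: float) -> int:
    """Number of primary shards needed to stay at or under a target size."""
    return max(1, math.ceil(daily_gb / target_shard_gb))


def shard_copies(primaries: int, replicas: int) -> int:
    """Total shard copies (primaries plus replicas) for one index."""
    return primaries * (1 + replicas)


DAILY_GB = 100          # from the first post; assumed to be primary data
CURRENT_PRIMARIES = 5   # default mentioned in the thread
REPLICAS = 1            # from the first post
TARGET_SHARD_GB = 40    # placeholder assumption, tune from benchmarks

current_shard_gb = DAILY_GB / CURRENT_PRIMARIES
suggested = primary_shards_for(DAILY_GB, TARGET_SHARD_GB)

print(f"current: {current_shard_gb:.0f} GB per primary shard, "
      f"{shard_copies(CURRENT_PRIMARIES, REPLICAS)} shard copies per day")
print(f"with a {TARGET_SHARD_GB} GB target: {suggested} primary shards, "
      f"{shard_copies(suggested, REPLICAS)} shard copies per day")
```

Every shard copy has a fixed overhead and every search over a time range touches all shards of the matching indices, so the number of copies per day multiplied by the number of days retained is worth tracking alongside shard size.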

I suggest you take a look at:


#13

Great, thanks a lot

