High CPU utilization on master node

I recycled the pods and ran the process again. Can you please have a look at the logs again?

The snapshot started at 2021-12-02T14:41:01,809Z; you've only shared logs from one data node and they only start at 2021-12-02T14:46:05,544Z.

Hi @DavidTurner , here are the logs again. The file is huge (308 MB) and I was unable to create a gist, so I have created a repo and uploaded the file there. I would appreciate it if you could look at the logs.

It seems they are too big for GitHub too; at least, it's using some large-file feature that I don't have installed (and won't be spending time installing either):

$ git clone git@github.com:guptaparv-rlr/snapshot.git
Cloning into 'snapshot'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 5 (delta 0), reused 5 (delta 0), pack-reused 0
Receiving objects: 100% (5/5), done.
$ ls -al snapshot
total 24
drwxr-xr-x   6 davidturner  staff  192  8 Dec 07:48 .
drwxr-xr-x   5 davidturner  staff  160  8 Dec 07:48 ..
drwxr-xr-x  12 davidturner  staff  384  8 Dec 07:48 .git
-rw-r--r--   1 davidturner  staff  143  8 Dec 07:48 .gitattributes
-rwxr-xr-x   1 davidturner  staff  134  8 Dec 07:48 data-1.json
-rwxr-xr-x   1 davidturner  staff  131  8 Dec 07:48 master-1.json
$ cat snapshot/data-1.json
version https://git-lfs.github.com/spec/v1
oid sha256:b103a5ac01171adfe54e8ad8a1a8d2af4576b60f2c458e0609ed21eef1b7df52
size 323753450

Could you just pick out the time range for the one specific snapshot and then gzip the files first?
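For example, something along these lines should do it (a rough sketch: the end-of-snapshot timestamp below is just a placeholder, and I'm assuming the data node log is the data-1.json file from your repo):

$ # keep only the lines from the snapshot start up to roughly when it finished,
$ # then compress the result
$ sed -n '/2021-12-02T14:41/,/<end-of-snapshot-timestamp>/p' data-1.json > data-1-snapshot.json
$ gzip data-1-snapshot.json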

I have zipped the files. You should be able to download and view the files now.

Thank you

Looks like it's performing fine, but you've configured ES to split each file into chunks of 32 bytes?

writeBlob(indices/aGkkRKEdQ4qxP_n3An93tg/0/__dqJn2U4XQpSdpaLa4n8zAw.part1650, stream, 32) - done

Typical chunk sizes are measured in GBs, not bytes. Uploading a 1 MB blob normally takes around 50 ms, but if you split it into ~30,000 32-byte chunks then even a few milliseconds of per-chunk overhead add up to minutes of wasted time.
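For reference, chunk_size is a repository setting, so the fix is to re-register the snapshot repository with a sensible value (or omit the setting entirely to get the default). A rough sketch, assuming an S3 repository; the repository and bucket names here are just placeholders, so adjust for whichever repository type you actually use:

$ # re-register the repository with a larger chunk_size (placeholder names)
$ curl -X PUT "localhost:9200/_snapshot/my_repository?pretty" \
    -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "my-snapshot-bucket",
    "chunk_size": "1gb"
  }
}'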


Thanks @DavidTurner. I'm not really sure where I saw the 32-byte chunk size (maybe somewhere in the docs or on GitHub), but I have updated it to the default setting of 64 MB and now the snapshots are back to 1-3 s.
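In case anyone else hits this later: you can check which chunk_size a repository is actually using with something like the following (the repository name is a placeholder):

$ curl -X GET "localhost:9200/_snapshot/my_repository?pretty"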

Thank you so much for your help. Really grateful.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.