Kibana v7.16.1 login issue. Not able to login (ResponseError: version_conflict_engine_exception)

Hello,

We are on v7.16.1 of Kibana/Elastic and have suddenly started having issues with Kibana login. It was working fine a couple of hours ago. When we checked the logs, we saw the exceptions below.

{"type":"log","@timestamp":"2022-07-01T12:01:16+00:00","tags":["error","plugins","security","session","index"],"pid":115029,"message":"Failed to create session value: version_conflict_engine_exception: [version_conflict_engine_exception] Reason: [F8GE4AZDdmPIGpePXKJ/T4fzvV2jwC5FQdz1N4mPseo=]: version conflict, document already exists (current version [1])"}
{"type":"log","@timestamp":"2022-07-01T12:01:16+00:00","tags":["error","http"],"pid":115029,"message":"ResponseError: version_conflict_engine_exception: [version_conflict_engine_exception] Reason: [F8GE4AZDdmPIGpePXKJ/T4fzvV2jwC5FQdz1N4mPseo=]: version conflict, document already exists (current version [1])\n at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:367:23)\n at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:291:11)\n at IncomingMessage.emit (node:events:402:35)\n at endReadableNT (node:internal/streams/readable:1343:12)\n at processTicksAndRejections (node:internal/process/task_queues:83:21) {\n meta: {\n body: { error: [Object], status: 409 },\n statusCode: 409,\n headers: {\n 'x-elastic-product': 'Elasticsearch',\n 'content-type': 'application/json; charset=UTF-8',\n 'content-length': '547'\n },\n meta: {\n context: null,\n request: [Object],\n name: 'elasticsearch-js',\n connection: [Object],\n attempts: 1,\n aborted: false\n }\n }\n}"}
{"type":"error","@timestamp":"2022-07-01T11:59:46+00:00","tags":,"pid":115029,"level":"error","error":{"message":"Internal Server Error","name":"Error","stack":"Error: Internal Server Error\n at HapiResponseAdapter.toInternalError (/usr/share/kibana/src/core/server/http/router/response_adapter.js:61:19)\n at Router.handle (/usr/share/kibana/src/core/server/http/router/router.js:172:34)\n at runMicrotasks ()\n at processTicksAndRejections (node:internal/process/task_queues:96:5)\n at handler (/usr/share/kibana/src/core/server/http/router/router.js:124:50)\n at exports.Manager.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)\n at Object.internals.handler (/usr/share/kibana/node_modules/@hapi/hapi/lib/handler.js:46:20)\n at exports.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/handler.js:31:20)\n at Request._lifecycle (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:371:32)\n at Request._execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:281:9)"},"url":"http://<Kibana_Host>:<Kibana_Port>/internal/security/login","message":"Internal Server Error"}
{"type":"log","@timestamp":"2022-07-01T12:01:18+00:00","tags":["error","plugins","taskManager"],"pid":115029,"message":"Failed to poll for work: Error: work has timed out"}
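For anyone triaging similar output, here is a minimal sketch of pulling only the version-conflict entries out of a Kibana JSON log. It assumes one JSON object per line, as in the excerpt above; the function name and the log path in the usage comment are hypothetical.

```python
import json


def find_version_conflicts(lines):
    """Return (timestamp, tags, message) for entries mentioning a version conflict."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not valid JSON (for example, truncated or mangled entries).
            continue
        if not isinstance(entry, dict):
            continue
        message = str(entry.get("message", ""))
        if "version_conflict_engine_exception" in message:
            hits.append((entry.get("@timestamp"), entry.get("tags", []), message))
    return hits


# Usage (the path is an assumption; adjust to your installation):
# with open("/var/log/kibana/kibana.log") as fh:
#     for timestamp, tags, message in find_version_conflicts(fh):
#         print(timestamp, tags, message[:120])
```

Grouping the hits by timestamp can show whether the conflicts cluster around login attempts.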

I have attached a screenshot of the login error as well. Has anyone faced a similar issue? If so, please post the solution.

Cheers.

How are you creating your index pattern and did you also create an index pattern template?

@Pierre_Gayvallet how do we fix this?

Thanks,
Bhavya

@bhavyarm

Yes, we have created index templates and specified the index patterns they apply to. Also, we are using data streams instead of indices.

Any input on resolving this would be much appreciated. Thanks.

@bhavyarm Any update on the issue?

@Avinash_09 - Did you happen to disable index.refresh_interval (i.e. value = -1)? We have seen similar errors in the past where the refresh interval was disabled on the .security-tokens-7, .security-7 or .kibana_security_session_1 indices, and this would result in similar issues. I would recommend checking the settings of these indices (e.g. GET .kibana_security_session_1/_settings).
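As a rough illustration of that check, the sketch below takes the JSON body returned by a `GET <index>/_settings` request (already parsed into a dict) and lists the indices whose refresh interval is set to -1. The function name is hypothetical. I am assuming the usual response shape of index name, then `settings`, then `index`; an absent `refresh_interval` key is treated as "default in effect".

```python
def indices_with_refresh_disabled(settings_response):
    """Return the names of indices whose index.refresh_interval is -1."""
    disabled = []
    for index_name, body in settings_response.items():
        settings = body.get("settings", {})
        # Nested form: {"settings": {"index": {"refresh_interval": "-1"}}}
        value = settings.get("index", {}).get("refresh_interval")
        if value is None:
            # Flat form, as returned when flat_settings is requested.
            value = settings.get("index.refresh_interval")
        if value is not None and str(value).strip() == "-1":
            disabled.append(index_name)
    return sorted(disabled)


# Illustrative input only, not real output from this cluster:
example = {
    ".kibana_security_session_1": {"settings": {"index": {"refresh_interval": "1s"}}},
    ".security-7": {"settings": {"index": {"refresh_interval": "-1"}}},
}
print(indices_with_refresh_disabled(example))
```

An empty result would mean none of the inspected indices has refresh explicitly disabled.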

Hi @ropc ,

No, we haven't disabled the refresh interval on any of these indices in our environment. We have recently upgraded the Elastic Stack to 8.3.3 to overcome this issue.

However, when we checked these indices after the upgrade, the refresh interval for the .kibana_security_session_1 and .security-7 indices is 1s, and there is no index named .security-tokens-7. We get the message below in Dev Tools:
'''
"type": "index_not_found_exception",
"reason": "no such index [.security-tokens-7]"
'''
What could be the next steps to resolve this issue? Would re-installing the Kibana nodes and joining them to the existing Elastic cluster help? If so, are there any steps we should follow before re-installing Kibana to avoid losing data in the Kibana-related indices?

Regards,
Avinash

No, we haven't disabled the refresh interval on any of these indices in our environment. We have recently upgraded the Elastic Stack to 8.3.3 to overcome this issue.

I am glad this is working after the upgrade.

there is no index named .security-tokens-7. We get the message below in Dev Tools

The .security-tokens-7 index will only exist if you have authentication realms where access tokens need to be stored (e.g. a SAML authentication realm).

Hi @ropc,

Thanks for the quick reply. However, even after the upgrade we are still seeing the issue intermittently when accessing Kibana via its internal URL. In addition, we now have a new issue when accessing Kibana via the proxy/public URL.

Below is the error message:

Please upgrade your browser

This Elastic installation has strict security requirements enabled that your current browser does not meet

PFA screenshot for the same.

Can you please help us here?

Regards,
Avinash

even after the upgrade we are still seeing the issue intermittently when accessing Kibana via its internal URL

Are you referring to the same error: version_conflict_engine_exception? Do you happen to have multiple Kibana instances running?

we now have a new issue when accessing Kibana via the proxy/public URL.

It may be related to Content Security Policy (CSP) and/or some outdated browsers. You may want to check this.

Yes @ropc,

To your first question: yes, I am referring to the same error. And yes, we have 2 Kibana instances running for this cluster. Is there anything else we should check for this issue?

Thanks for the info on the second point. I will check the Content Security Policy (CSP) settings and see whether the outdated-browser issue gets resolved.

Regards,
Avinash

To your first question: yes, I am referring to the same error. And yes, we have 2 Kibana instances running for this cluster. Is there anything else we should check for this issue?

Is this proxy used as a load-balancer as well? Perhaps you may want to check the following documentation:

Hi @ropc,

Yes, the proxy is used as a load balancer. We followed the necessary steps as per the documentation, but we will cross-check them and get back to you.

Regarding the browser issue (Content Security Policy), even after applying the settings we still see the same message in the browser.
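One way such a cross-check could be scripted is sketched below: compare the encryption-key settings from the kibana.yml of both instances. The list of keys is my reading of the Kibana guidance for running multiple instances behind a load balancer, so please verify it against the documentation for your version. The parser is deliberately naive and only understands flat `key: value` lines, and all function names are hypothetical. Nothing in this thread confirms that a key mismatch is the cause of the version conflict.

```python
# Settings assumed to need identical values on every Kibana instance.
MUST_MATCH = (
    "xpack.security.encryptionKey",
    "xpack.reporting.encryptionKey",
    "xpack.encryptedSavedObjects.encryptionKey",
)


def parse_flat_settings(text):
    """Parse flat 'dotted.key: value' lines; nested YAML is not supported."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def check_shared_settings(a, b, keys=MUST_MATCH):
    """Report keys that differ, or that are unset on at least one instance."""
    problems = {}
    for key in keys:
        va, vb = a.get(key), b.get(key)
        if va is None or vb is None:
            # An unset key cannot be assumed to be shared between instances.
            problems[key] = "not set on at least one instance"
        elif va != vb:
            problems[key] = "values differ"
    return problems
```

Feeding it the contents of both configuration files and getting an empty dict back would rule out this particular class of misconfiguration.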

Please upgrade your browser

This Elastic installation has strict security requirements enabled that your current browser does not meet.

PFA Screenshot for the same.

Also, I can see the error below in the journal logs. Does it have anything to do with this issue?

Aug 11 12:55:52 dev5083 kibana[97108]: [2022-08-11T12:55:52.888+00:00][ERROR][plugins.taskManager] Failed to poll for work: Error: work has timed out

  1. What is the current setting in Kibana for csp.strict and are you facing this issue on all browsers? Which browser / browser version are you using?

  2. Have you tried to bypass the proxy and check if the same problem happens?

csp.strict: true is the current setting in Kibana. Yes, I am facing the issue on all browsers. I have tried the browsers below, with their versions:

Chrome Version: 104.0.5112.81 (Official Build) (64-bit)
Firefox Version: 103.0.2 (64-bit)
Microsoft Edge version: Version 104.0.1293.47 (Official build) (64-bit)

Yes, we tried bypassing the proxy, and the problem does not occur when the proxy is bypassed.

Regards,
Avinash

If bypassing the proxy works, then it must be something related to the proxy and its configuration. This is not really my expertise but I can ask around...

Which proxy are you using?
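In the meantime, a possible way to narrow it down is to compare the response headers Kibana sends directly with the ones that arrive through the proxy. One hypothesis, not confirmed here, is that the proxy drops or rewrites the Content-Security-Policy header, which could matter with csp.strict enabled. The sketch below is only that: the function names are hypothetical, and the default header list is an assumption you may want to extend.

```python
import urllib.request


def fetch_headers(url):
    """Fetch a URL and return its response headers as a dict (needs network access)."""
    with urllib.request.urlopen(url) as resp:
        return dict(resp.headers.items())


def headers_lost_via_proxy(direct_headers, proxied_headers,
                           names=("content-security-policy",)):
    """Report watched headers that are missing or changed behind the proxy."""
    direct = {k.lower(): v for k, v in direct_headers.items()}
    proxied = {k.lower(): v for k, v in proxied_headers.items()}
    report = {}
    for name in names:
        if name not in direct:
            continue  # Kibana itself did not send it, so the proxy is not at fault.
        if name not in proxied:
            report[name] = "missing via proxy"
        elif proxied[name] != direct[name]:
            report[name] = "changed via proxy"
    return report


# Usage (placeholders, fill in your own hosts):
# direct = fetch_headers("http://<Kibana_Host>:<Kibana_Port>/login")
# proxied = fetch_headers("https://<public URL>/login")
# print(headers_lost_via_proxy(direct, proxied))
```

An empty report would suggest the watched headers survive the proxy unchanged, in which case the cause lies elsewhere.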

@ropc, thanks for the update. We suspect the same. We are using Apache httpd as the proxy server. Can you check and let us know what changes need to be made on the proxy server to fix this issue?