Elastic Cloud SAML XSRF Header

We've been updating our deployments to Elastic Cloud 6.5.2 so we can take advantage of the new SAML integration.

Two of our deployments are now using the SAML integration to connect to our AD FS IdP without issue. However, our third deployment is not working, even with an equivalent configuration in Elasticsearch and Kibana.
The configuration on the IdP side is also equivalent between the three deployments. When we attempt an IdP-initiated sign-on, we receive an error message from Kibana after the SAML request:

{"statusCode":400,"error":"Bad Request","message":"Request must contain a kbn-xsrf header."}

My understanding is that this relates to the kibana.yml setting for whitelisting paths from XSRF header validation. Beyond that, we are not sure what the issue is.
I should note that we first attempted running the 'bad deployment' behind a proxy so we could use our own domain name and certificate. We originally thought this was the source of the problem, but since reverting the configuration to the default Kibana URL without the proxy, we still receive the error. Here are two of the configuration sets; they appear to be equivalent:

Working Deployment:

xpack.security.authc.realms.cloud-saml: 
  type: saml
  order: 2
  attributes.principal:        "nameid:persistent" 
  attributes.groups:           "groups" 
  idp.metadata.path:           "https://adfs.example.com/FederationMetadata/2007-06/FederationMetadata.xml" 
  idp.entity_id:               "http://adfs.example.com/adfs/services/trust"
  sp.entity_id:                "https://good-deployment.us-east-1.aws.found.io:9243/" 
  sp.acs:                      "https://good-deployment.us-east-1.aws.found.io:9243/api/security/v1/saml"
  sp.logout:                   "https://good-deployment.us-east-1.aws.found.io:9243/logout"
xpack.security.public:
  protocol: "https"
  hostname: "good-deployment.us-east-1.aws.found.io"
  port: 9243

Non-working Deployment:

xpack.security.authc.realms.cloud-saml: 
  type: saml
  order: 2
  attributes.principal:        "nameid:persistent" 
  attributes.groups:           "groups" 
  idp.metadata.path:           "https://adfs.example.com/FederationMetadata/2007-06/FederationMetadata.xml" 
  idp.entity_id:               "http://adfs.example.com/adfs/services/trust"
  sp.entity_id:                "https://bad-deployment.us-east-1.aws.found.io:9243/" 
  sp.acs:                      "https://bad-deployment.us-east-1.aws.found.io:9243/api/security/v1/saml"
  sp.logout:                   "https://bad-deployment.us-east-1.aws.found.io:9243/logout"
xpack.security.public:
  protocol: "https"
  hostname: "bad-deployment.us-east-1.aws.found.io"
  port: 9243

I'm looking for suggestions on next steps for figuring out what the issue might be.

Hey,

Have you gone through the documentation? We have a section on Kibana configuration (which also links to Kibana's relevant documentation page) that describes that you need to set

server.xsrf.whitelist: [/api/security/v1/saml]

in kibana.yml, among other settings. Can you verify this is set in all Kibana deployments (especially the non-working one)? Also, can you please share a larger part of the Kibana logs?

Whoops, I did not include the entire Kibana configuration. Yes, we have that setting in all three deployments:

Working Deployment:

xpack.security.authProviders: [ "saml", "basic" ]
server.xsrf.whitelist: [ "/api/security/v1/saml" ]
xpack.security.public:
  protocol: "https"
  hostname: "good-deployment.us-east-1.aws.found.io"
  port: 9243

Non-working Deployment:

xpack.security.authProviders: [ "saml", "basic" ]
server.xsrf.whitelist: [ "/api/security/v1/saml" ]
xpack.security.public:
  protocol: "https"
  hostname: "bad-deployment.us-east-1.aws.found.io"
  port: 9243

Sorry about that.

We have looked at this documentation for the latest release of Elastic Cloud. We also looked at some other settings from the non-cloud SAML realms, but they don't seem to be supported in the cloud realm, so we've stuck to the EC-specific documentation since then.

What specifically am I looking for in the Kibana logs? Are those stored with the Elasticsearch logs in the Elastic Cloud console?

We tried restarting Kibana and the Elasticsearch cluster to make sure that the configuration was applied correctly, but we are still running into the same issue.

Is there anything else you could suggest we try?

This all points to a misconfiguration in the non-working instance. Apologies if this sounds too obvious, but have you checked your Kibana SAML config for typos, trailing spaces, extra characters, etc.?

@Brandon_Kobel, do you happen to have an idea of what could be the issue here from Kibana's perspective?

The {"statusCode":400,"error":"Bad Request","message":"Request must contain a kbn-xsrf header."} response alludes to the server.xsrf.whitelist: setting in the kibana config being incorrect, which you've already recommended they double check.
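
To make the role of the whitelist concrete, here is a simplified Python model of this kind of XSRF check. It is not Kibana's source code: the function name is hypothetical, and the exact-string comparison of the request path against the whitelist is an assumption made for illustration. The point it sketches is that a browser form POST coming from the IdP cannot carry a custom kbn-xsrf header, so the request only passes if the path it actually hits is whitelisted.

```python
SAFE_METHODS = {"GET", "HEAD"}  # assumed: read-only methods are not checked


def xsrf_check(method: str, path: str, headers: dict, whitelist: list[str]):
    """Return None if the request passes, otherwise the error message.

    Simplified model only; exact path matching is an assumption.
    """
    if path in whitelist:
        return None
    if method.upper() in SAFE_METHODS:
        return None
    header_names = {name.lower() for name in headers}
    if "kbn-xsrf" in header_names or "kbn-version" in header_names:
        return None
    return "Request must contain a kbn-xsrf header."


whitelist = ["/api/security/v1/saml"]
# A POST to the whitelisted path passes without the header.
print(xsrf_check("POST", "/api/security/v1/saml", {}, whitelist))
# Under the exact-match assumption, a POST to any other path is rejected.
print(xsrf_check("POST", "/api/security/v1/saml/", {}, whitelist))
```

If the model holds, one thing worth checking is the exact URL the IdP posts the SAML response to, since a POST that lands on a different path than the whitelisted one would produce this error even with a correct kibana.yml.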

I was able to leave the xpack.security.public settings out of the Kibana config when running in Cloud; perhaps that is worth a try?

Thanks for the reply.

Yeah, I understand it's hard to suggest something else for me to check; it really does seem like a misconfiguration.

I ran both configurations through a diffing tool, and there doesn't appear to be anything unusual like trailing whitespace, improper indentation, or extra slashes. From what I can tell, they still appear to be equivalent configurations between deployments.
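
For anyone repeating this comparison, a minimal Python sketch of the same idea follows. It replaces each deployment's hostname with a placeholder so that only unexpected differences remain, and separately flags characters that are easy to miss by eye. The function names and the sample strings are hypothetical; the real input would be the settings pasted from the Elastic Cloud console.

```python
import difflib


def normalize(text: str, hostname: str) -> list[str]:
    # Swap the deployment-specific hostname for a placeholder.
    return [line.replace(hostname, "<HOST>") for line in text.splitlines()]


def config_diff(good: str, good_host: str, bad: str, bad_host: str) -> list[str]:
    # Unified diff of the two configs with hostnames normalized away.
    return list(difflib.unified_diff(
        normalize(good, good_host), normalize(bad, bad_host),
        "good", "bad", lineterm=""))


def suspicious_lines(text: str) -> list[tuple[int, str]]:
    # Report invisible or unusual characters by 1-based line number.
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            findings.append((number, "trailing whitespace"))
        if "\t" in line:
            findings.append((number, "tab character"))
        if not line.isascii():
            findings.append((number, "non-ASCII character"))
    return findings


good = 'server.xsrf.whitelist: [ "/api/security/v1/saml" ]\nhostname: "good.example"'
bad = 'server.xsrf.whitelist: [ "/api/security/v1/saml" ] \nhostname: "bad.example"'
for line in config_diff(good, "good.example", bad, "bad.example"):
    print(line)
print(suspicious_lines(bad))
```

An empty diff together with an empty findings list would support the conclusion that the two configurations really are equivalent.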

I tried removing the xpack.security.public segment of the Kibana configuration. The already working configuration continued to function, but the non-working configuration is still running into the same issue. This makes configuration simpler though, which is nice!

I also captured the SAML response headed to each deployment and diffed those as well; they only appear to differ in the places you'd expect (audience, destination, etc.). Both of those fields have the same format in both SAML responses.
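
The check on the captured responses can be sketched in Python as follows. It decodes the base64 SAMLResponse form value, reads the Destination attribute and any Audience elements, and compares them with the realm's sp.acs and sp.entity_id. The function names are hypothetical, and the sketch assumes the assertion is not encrypted; with an encrypted assertion the audience list comes back empty and is reported as a mismatch.

```python
import base64
import xml.etree.ElementTree as ET

# Standard SAML 2.0 assertion namespace.
SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"


def summarize_saml_response(b64_response: str) -> dict:
    # Decode the form value and pull out the fields that tie it to one SP.
    root = ET.fromstring(base64.b64decode(b64_response))
    audiences = [el.text.strip()
                 for el in root.iter(f"{{{SAML_NS}}}Audience") if el.text]
    return {"destination": root.get("Destination"), "audiences": audiences}


def find_mismatches(summary: dict, sp_acs: str, sp_entity_id: str) -> list[str]:
    # Compare the response against the values configured in the SAML realm.
    problems = []
    if summary["destination"] != sp_acs:
        problems.append(
            f"Destination {summary['destination']!r} != sp.acs {sp_acs!r}")
    if sp_entity_id not in summary["audiences"]:
        problems.append(
            f"sp.entity_id {sp_entity_id!r} not in Audience {summary['audiences']!r}")
    return problems
```

Running find_mismatches once per deployment, with that deployment's own sp.acs and sp.entity_id, gives a stricter check than a visual diff because it compares the strings exactly, including trailing slashes and ports.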

At this point, it's probably worthwhile to open a Cloud Support ticket to have them look at the logs for your specific cluster to figure out what could be going awry.

Will do, thanks for the help!
