LDAPS causing "com.unboundid.ldap.sdk.LDAPSearchException: time limit exceeded"

Hey guys, hope you're all doing well :smile:

I'm facing an extremely strange issue with my LDAP configuration for Shield. It only occurs when I connect to my LDAP server over SSL (i.e. LDAPS), and it happens around 70% of the time (the other 30% or so works as expected).

Basically what happens is that immediately (i.e. in a few milliseconds) after attempting to authenticate against Elasticsearch (using curl), the following error shows up in the logs and the auth attempt fails:

[2016-05-04 22:02:23,292][WARN ][shield.authc.ldap        ] [elasticsearch-client-node] authentication failed for user [fotis]: could not search for LDAP groups for DN [uid=fotis,ou=people,ou=staff,dc=aaa,dc=example,dc=com]
cause: com.unboundid.ldap.sdk.LDAPSearchException: time limit exceeded

My configuration is as follows:

shield:
  authc:
    realms:
      file1:
        order: 0
        type: file
      ldap1:
        connect_timeout: 120s
        read_timeout: 120s
        order: 1
        type: ldap
        url: ldaps://ldap.example.com
        user_search:
          base_dn: ou=staff,dc=aaa,dc=example,dc=com
          pool:
            health_check:
              enabled: false
        group_search:
          base_dn: ou=staff,dc=aaa,dc=example,dc=com
  ssl:
    keystore:
      path: /etc/elasticsearch/client-node/shield/node01.jks
      password: abcabc

Any help would be greatly appreciated!

Thanks so much
Fotis

I have the same issue. Is there any solution for this?

Maybe you can add timeout.ldap_search: 10s to your realm's configuration? The default is 5 seconds. I am not sure why LDAPS would trigger this, as the time limit for the search is enforced by the LDAP server.
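
Something like the following, as a rough sketch that assumes the ldap1 realm from the original post (the other realm settings are omitted here, and 10s is only a starting value to experiment with, not a tested recommendation):

```yaml
shield:
  authc:
    realms:
      ldap1:
        type: ldap
        order: 1
        url: ldaps://ldap.example.com
        # Raise the LDAP search time limit from the 5s default
        timeout.ldap_search: 10s
```

The node would need a restart to pick up the change.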

We had to do the same, but with a default timeout of 5 seconds, why is this failing instantly? I suspect that there's a bug here. The default value of 5 seconds should have been more than enough for our LDAP server.

Do you know what type of LDAP server you are connecting to? The LDAP server is returning a result code of 3 (timeLimitExceeded). I know some implementations will do this for a broad filter, such as one using wildcards, but that doesn't seem to be the case here.

Good point mate, well yeah this is a strange Sun LDAP server running behind Oracle Sentinel that I have been asked to use.

Perhaps this behaviour won't be present with a regular OpenLDAP implementation but I don't have the ability to test that right now sadly.