Authentication Error setting up Eland on Jupyter Notebook

Hi, I am trying to use Eland on Jupyter Notebook. Here is what I wrote on Jupyter

import eland as ed
import pandas as pd
import numpy as np
from elasticsearch import Elasticsearch

def json(x):
    import json
    print(json.dumps(x, indent=2, sort_keys=True))

es = Elasticsearch(
    cloud_id="<cloud id>",
    basic_auth=("<username>", "<password>"),
)

However, when I run it, this is the error I get:

AuthenticationException: AuthenticationException(401, 'security_exception', 'unable to authenticate user [username] for REST request [/]')

What am I doing wrong? I entered the username and password that I use to log in to the Elastic Cloud portal, but they do not seem to work. I also used basic_auth instead of the http_auth shown in the Elastic video demo I followed, because http_auth is reported as deprecated.

Hello, and thank you for your interest in Eland!

As you have realized, you should not use your Elastic Cloud password, but your Elasticsearch password. A common choice is the elastic user: its password was shown to you when you created the deployment, but you can also generate a new one. See "Reset the elastic user password" in the Elasticsearch Service documentation.

Then basic_auth=("elastic", "password") is indeed the correct form to use.

Note that the warnings in the docs above apply: for better security, consider using an API key or a user account with fewer privileges. See "Connecting" in the Elasticsearch Python Client [8.9] documentation for how to use those authentication methods with the Elasticsearch Python client that Eland relies on.
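As a rough sketch of the API key option: the helper below only assembles the keyword arguments and prefers an API key over a username and password when both are set. The function names and the environment variable names (ES_CLOUD_ID, ES_API_KEY, ES_USERNAME, ES_PASSWORD) are made up for this example; cloud_id, api_key and basic_auth are the parameters of the Python client itself.

```python
import os


def build_client_kwargs(env=None):
    """Assemble keyword arguments for Elasticsearch(...) from environment-style settings.

    The ES_* names are hypothetical and exist only for this example.
    """
    env = os.environ if env is None else env
    kwargs = {"cloud_id": env["ES_CLOUD_ID"]}
    if env.get("ES_API_KEY"):
        # Prefer an API key when one is available.
        kwargs["api_key"] = env["ES_API_KEY"]
    elif env.get("ES_USERNAME") and env.get("ES_PASSWORD"):
        # Fall back to the Elasticsearch (not Elastic Cloud) username and password.
        kwargs["basic_auth"] = (env["ES_USERNAME"], env["ES_PASSWORD"])
    else:
        raise ValueError("set ES_API_KEY, or both ES_USERNAME and ES_PASSWORD")
    return kwargs


def connect(env=None):
    """Create the client; imported lazily so the helper above works without the package."""
    from elasticsearch import Elasticsearch

    return Elasticsearch(**build_client_kwargs(env))
```

In a notebook you would then call es = connect() once the variables are set, which also keeps the credentials out of the notebook itself.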

Hi Quentin, that works flawlessly! Didn't know I have to use Elasticsearch password instead of Elastic Cloud. Thanks.

Also, on a separate note, I managed to pull data into Jupyter successfully when I used
df = ed.DataFrame(es, "logs-aws*"). However, the moment I used
df = ed.DataFrame(es, "logs-*"), I received the warnings below. May I know why this happens?

<directory>\ UserWarning: Field message has conflicting types ('match_only_text', None) != text
<directory>\ UserWarning: Field has conflicting types ('text', None) != match_only_text
<directory>\ UserWarning: Field error.message has conflicting types ('match_only_text', None) != keyword
<directory>\ UserWarning: Field event.dataset has conflicting types ('constant_keyword', None) != keyword
<directory>\ UserWarning: Field event.ingested has conflicting types ('date', 'strict_date_time_no_millis||strict_date_optional_time||epoch_millis') != keyword
<directory>\ UserWarning: Field has conflicting types ('match_only_text', None) != text
<directory>\ UserWarning: Field has conflicting types ('match_only_text', None) != text

When converting indices to a dataframe representation, Eland needs to figure out the type of each column. It uses the Elasticsearch mappings to do so. For example, the integer Elasticsearch type gets mapped to the int64 pandas type.

This is fine for your logs-aws* indices. But what should happen when you ask for multiple indices that use different mappings, such as your logs-* indices, which apparently include more than just logs-aws*? A given dataframe column can have only one type. So Eland takes the first definition it finds and sticks with it. If it then finds another index with a different type for the same field, it ignores that type but warns you about it. This is what you've seen. It's only a warning.

(An additional issue here is that Eland does not know about match_only_text and constant_keyword yet, so it maps them to object in Pandas.)

In your case, the message field was first seen as match_only_text and presumably mapped to object in pandas. Another index then had it mapped as text; Eland ignored that mapping and is telling you so.
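To make the "first definition wins" behaviour concrete, here is a small standalone sketch. It is not Eland's actual code: resolve_columns and the one-entry type table are invented for illustration, and every type other than integer simply falls back to object.

```python
import warnings

# Illustrative only: the single mapping named in this thread.
# Anything else falls back to "object" in this sketch.
ES_TO_PANDAS = {"integer": "int64"}


def resolve_columns(index_mappings):
    """Pick one Elasticsearch type per field across several indices.

    index_mappings is a list of {field_name: es_type} dicts, one per index.
    The first type seen for a field is kept; later conflicting types are
    ignored and reported with a UserWarning.
    """
    chosen = {}
    for mapping in index_mappings:
        for field, es_type in mapping.items():
            if field not in chosen:
                chosen[field] = es_type
            elif chosen[field] != es_type:
                warnings.warn(
                    f"Field {field} has conflicting types {chosen[field]} != {es_type}",
                    UserWarning,
                )
    # Translate the chosen Elasticsearch types into pandas dtypes.
    return {field: ES_TO_PANDAS.get(t, "object") for field, t in chosen.items()}
```

Calling resolve_columns([{"message": "match_only_text", "bytes": "integer"}, {"message": "text"}]) keeps match_only_text for message, emits one warning about text, and returns object for message and int64 for bytes.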

If it does not cause any issues for you, you can ignore the warnings.
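If you do decide to ignore them, one option is Python's standard warnings module. The sketch below assumes the warning text starts with "Field" and contains "has conflicting types", as in the output quoted earlier; the function name is made up for this example.

```python
import warnings


def silence_conflict_warnings():
    """Hide only the 'conflicting types' UserWarnings, leaving all others visible."""
    warnings.filterwarnings(
        "ignore",
        # The pattern is matched against the start of the warning message.
        message=r"Field .*has conflicting types",
        category=UserWarning,
    )
```

Call silence_conflict_warnings() once before creating the dataframe. Filtering by message rather than silencing every UserWarning means unrelated warnings still show up.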
