Hi, I am trying to use Eland in a Jupyter Notebook. Here is what I wrote:
import eland as ed
import pandas as pd
import numpy as np
from elasticsearch import Elasticsearch
def json(x):
    import json
    print(json.dumps(x, indent=2, sort_keys=True))
es = Elasticsearch(
    cloud_id="<cloud id>",
    basic_auth=("<username>", "<password>")
)
json(es.info())
However, when I ran it, this is the error I got:
AuthenticationException: AuthenticationException(401, 'security_exception', 'unable to authenticate user [username] for REST request [/]')
What am I doing wrong here? I used the username and password that I use to log in to the Elastic Cloud portal, but they do not seem to work. I also used basic_auth instead of the http_auth shown in the Elastic video demo that I followed, because the client indicated that http_auth has been deprecated.
As you have realized, you should use your Elasticsearch password, not your Elastic Cloud password. A common choice is the elastic user: its password was shown to you when you created the deployment, but you can also generate a new one (see "Reset the elastic user password" in the Elasticsearch Service documentation).
Then you can use basic_auth=("elastic", "<password>"), which is the correct parameter.
Note that the warnings in the docs above apply: for better security, consider using an API key or a user account with fewer privileges. See "Connecting" in the Elasticsearch Python Client [8.9] documentation for how to use those authentication methods with the Elasticsearch Python client that Eland relies on.
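To make the two options concrete, here is a minimal sketch that picks between an API key and basic_auth. The environment variable names (ES_CLOUD_ID, ES_API_KEY, ES_USERNAME, ES_PASSWORD) and the helper names are made up for this example; only cloud_id, api_key and basic_auth are actual parameters of the Python client.

```python
import os


def connection_kwargs(env=os.environ):
    """Build keyword arguments for Elasticsearch(...).

    The environment variable names are hypothetical; adapt them to your setup.
    """
    kwargs = {"cloud_id": env["ES_CLOUD_ID"]}
    if env.get("ES_API_KEY"):
        # Preferred: an API key that can be scoped and revoked.
        kwargs["api_key"] = env["ES_API_KEY"]
    else:
        # Fallback: an Elasticsearch user (not the Elastic Cloud login).
        kwargs["basic_auth"] = (env.get("ES_USERNAME", "elastic"), env["ES_PASSWORD"])
    return kwargs


def connect():
    # Imported here so the helper above can be used without the client installed.
    from elasticsearch import Elasticsearch

    return Elasticsearch(**connection_kwargs())
```

In the notebook you would then write es = connect() and pass es to Eland as before. Keeping credentials out of the notebook itself also avoids leaking them when the notebook is shared.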
Hi Quentin, that works flawlessly! I didn't know I had to use the Elasticsearch password instead of the Elastic Cloud one. Thanks.
Also, on a separate note, I managed to pull data into Jupyter successfully with df = ed.DataFrame(es, "logs-aws*"). However, as soon as I used df = ed.DataFrame(es, "logs-*"), I received the messages below. Why is that?
<directory>\field_mappings.py:324: UserWarning: Field message has conflicting types ('match_only_text', None) != text
warnings.warn(
<directory>\field_mappings.py:324: UserWarning: Field host.os.name.text has conflicting types ('text', None) != match_only_text
warnings.warn(
<directory>\field_mappings.py:324: UserWarning: Field error.message has conflicting types ('match_only_text', None) != keyword
warnings.warn(
<directory>\field_mappings.py:324: UserWarning: Field event.dataset has conflicting types ('constant_keyword', None) != keyword
warnings.warn(
<directory>\field_mappings.py:324: UserWarning: Field event.ingested has conflicting types ('date', 'strict_date_time_no_millis||strict_date_optional_time||epoch_millis') != keyword
warnings.warn(
<directory>\field_mappings.py:324: UserWarning: Field destination.as.organization.name.text has conflicting types ('match_only_text', None) != text
warnings.warn(
<directory>\field_mappings.py:324: UserWarning: Field destination.user.name.text has conflicting types ('match_only_text', None) != text
warnings.warn(
......
When converting indices to a dataframe representation, Eland needs to figure out the type of each column. It uses the Elasticsearch mappings to do so. For example, the integer Elasticsearch type gets mapped to the int64 pandas type.
This is fine for your logs-aws* indices. But what should happen when you ask for multiple indices that use different mappings, such as your logs-* indices, which apparently include more than just logs-aws-*? A dataframe column can have only one type. So Eland takes the first definition it finds and sticks with it. If it then finds another index with a different type for the same field, it ignores that type but warns you about it. This is what you've seen. It's only a warning, not an error.
(An additional issue here is that Eland does not know about match_only_text and constant_keyword yet, so it maps them to object in Pandas.)
In your case, the message field was first seen as match_only_text and presumably mapped to object in Pandas. Another index then had it mapped as text; that second mapping is ignored, and Eland is telling you so.
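The "first definition wins, later conflicts only warn" behaviour described above can be illustrated with a small standalone sketch. This is not Eland's actual code: the function name, the index names and the simplified mapping structure are all assumptions made for the example.

```python
import warnings


def merge_field_types(index_mappings):
    """Merge per-index field types, keeping the first type seen for each field.

    index_mappings: dict of index name -> {field name: Elasticsearch type}.
    Illustrative only; Eland's real logic lives in field_mappings.py.
    """
    merged = {}
    for fields in index_mappings.values():
        for field, es_type in fields.items():
            if field not in merged:
                # First time we see this field: remember its type.
                merged[field] = es_type
            elif merged[field] != es_type:
                # A later index disagrees: keep the first type and warn.
                warnings.warn(
                    f"Field {field} has conflicting types {merged[field]} != {es_type}",
                    UserWarning,
                )
    return merged


# Hypothetical indices matching a wildcard such as logs-*
mappings = {
    "logs-aws-example": {"message": "match_only_text", "event.dataset": "constant_keyword"},
    "logs-other-example": {"message": "text", "event.dataset": "keyword"},
}
print(merge_field_types(mappings))
```

Running it emits one warning per conflicting field, and the result keeps the types from the first index. This mirrors why logs-aws* alone is silent while the broader logs-* pattern is noisy.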
If it does not cause any issues for you, you can ignore the warnings.
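If you decide to ignore them, one option is to filter just these messages with Python's standard warnings module. The sketch below assumes the warning text keeps the "Field ... has conflicting types" shape shown in your output; the helper name is made up for this example.

```python
import warnings


def silence_conflict_warnings():
    """Hide only the 'conflicting types' UserWarnings, leaving other warnings visible."""
    warnings.filterwarnings(
        "ignore",
        message=r"Field .* has conflicting types",  # regex matched from the start of the message
        category=UserWarning,
    )


# Call this once in the notebook before creating the dataframe, for example:
# silence_conflict_warnings()
# df = ed.DataFrame(es, "logs-*")
```

Filtering on the message text rather than on all UserWarnings keeps unrelated warnings from Eland or pandas visible.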