Elasticsearch-py index and get errors


(Paul Starrett) #1

The code below produces the traceback that follows it. I am trying to get a JSON file into and out of ES. I have checked the documentation but cannot resolve the error. I'd appreciate any help!

Code:

d = {}
def text_to_json(data):
    for k, v in data.items():
        with open(v, 'r') as r:
            cont = r.read().strip()
            d[k] = cont

    a = json.dumps(d)
    return a

engine_and_file = {
    'Google': './page_content_first.txt/part-00000',
}

a = (text_to_json(engine_and_file))

es.index('webpage-import', 'html', a)
es.get('webpage-import', 'html', id=1)

Traceback (most recent call last):
  File "/home/ubuntu/Python_Projects/Spark/spark-1.5.1-bin-hadoop2.6/./Spark_FileImport_Test.py", line 60, in <module>
    es.get('webpage-import', 'html', id=1)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
TypeError: get() got multiple values for keyword argument 'id'


(Magnus Bäck) #2

By mixing positional and keyword arguments you are effectively supplying the id argument twice. Here's the definition of the get() call:

https://github.com/elastic/elasticsearch-py/blob/2.1.0/elasticsearch/client/__init__.py#L296

Note how id is the second positional argument, so you're actually passing "html" as the id, and then you're passing id again as a keyword argument.

The documentation doesn't list any positional arguments at all, only keyword arguments, so you should only use keyword arguments.
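To make the mechanism concrete, here is a minimal sketch that reproduces the same TypeError without a running cluster. The module-level get() below is a hypothetical stand-in, not the library's code: it only mirrors the parameter order described above (index first, then id), and the doc_type default shown is an assumption.

```python
def get(index, id, doc_type='_all', params=None):
    """Hypothetical stand-in that mirrors the client's parameter order."""
    return {'index': index, 'id': id, 'doc_type': doc_type}


error = None
try:
    # 'html' fills the second positional slot (id), and id=1 then
    # supplies the same parameter a second time.
    get('webpage-import', 'html', id=1)
except TypeError as exc:
    error = str(exc)

# Naming every argument removes the ambiguity.
result = get(index='webpage-import', doc_type='html', id=1)
```

With keyword arguments only, the order in which the parameters are declared no longer matters.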


(Paul Starrett) #3

Magnus, yes, that was the trick. I am now passing the arguments by name. I also found that I was not using the correct doc ID: I was passing id='1', but ES was supplying the doc ID itself. Once I used the correct ID, the get call returned the doc. Thank you very much!
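For anyone landing here later, a rough sketch of what ended up working, assuming a 2.x-era elasticsearch-py client that still accepts doc_type. The helper name index_and_fetch is hypothetical, and es is expected to be an already constructed client.

```python
def index_and_fetch(es, index, doc_type, body):
    """Index a document without an explicit id, then read it back."""
    # When no id is passed, Elasticsearch generates one and returns it
    # in the response under '_id'.
    created = es.index(index=index, doc_type=doc_type, body=body)
    doc_id = created['_id']
    # Use the generated id, with keyword arguments only, for the lookup.
    return es.get(index=index, doc_type=doc_type, id=doc_id)
```

It would be called as index_and_fetch(es, 'webpage-import', 'html', a), where a is the JSON string built in the first post.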

