How to overwrite docs with same id?


(project2501) #1

Hi,
I notice I cannot store a document and control the _id field. It gives an
error. I read that I need to 'enable' storing _id?
That is very odd.

I want to store a bunch of documents, and then update some of them by
simply overwriting them with the same _id.
I don't want to have to query to get an _id; that is a waste of time.

How can this be done? Upsert? I need to control the _id.

thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(project2501) #2

Ok, I was able to find info on starting Titan using titan.sh. The Titan
docs are pretty bad. They don't even tell you how to start the server (in the
getting started page).

Anyway, once I did that, I now get this:

from bulbs.titan import Graph
g = Graph()
james = g.vertices.create(name="James")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/bulbs/element.py", line 565, in create
    resp = self.client.create_vertex(data)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rexster/client.py", line 370, in create_vertex
    return self.request.post(vertex_path,data)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rest.py", line 128, in post
    return self.request(POST, path, params)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rest.py", line 183, in request
    return self.response_class(http_resp, self.config)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rexster/client.py", line 198, in __init__
    self.handle_response(response)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rexster/client.py", line 222, in handle_response
    response_handler(http_resp)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rest.py", line 39, in not_found
    raise LookupError(http_resp)
LookupError: ({'status': '404', 'transfer-encoding': 'chunked', 'vary': 'Accept', 'server': 'grizzly/2.2.16', 'date': 'Wed, 13 Nov 2013 15:51:54 GMT', 'access-control-allow-origin': '*', 'content-type': 'application/json;charset=ISO-8859-1'}, '{"message":"Graph [titanexample] could not be found"}')

I realize that it's probably my ignorance here, but the docs on both sides
for this are pretty lean. I do appreciate the help.



(project2501) #3

Ok, I was able to find info on starting Titan using titan.sh (on
Stack Overflow, of all places). The Titan docs are pretty bad. They don't even
tell you how to start the server (in the getting started page).

Anyway, once I did that, I now get this:

from bulbs.titan import Graph
g = Graph()
james = g.vertices.create(name="James")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/bulbs/element.py", line 565, in create
    resp = self.client.create_vertex(data)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rexster/client.py", line 370, in create_vertex
    return self.request.post(vertex_path,data)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rest.py", line 128, in post
    return self.request(POST, path, params)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rest.py", line 183, in request
    return self.response_class(http_resp, self.config)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rexster/client.py", line 198, in __init__
    self.handle_response(response)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rexster/client.py", line 222, in handle_response
    response_handler(http_resp)
  File "/usr/local/lib/python2.7/dist-packages/bulbs/rest.py", line 39, in not_found
    raise LookupError(http_resp)
LookupError: ({'status': '404', 'transfer-encoding': 'chunked', 'vary': 'Accept', 'server': 'grizzly/2.2.16', 'date': 'Wed, 13 Nov 2013 15:51:54 GMT', 'access-control-allow-origin': '*', 'content-type': 'application/json;charset=ISO-8859-1'}, '{"message":"Graph [titanexample] could not be found"}')

I realize that it's probably my ignorance here, but the docs on both sides
for this are pretty lean. I do appreciate the help.



(Luca Cavanna) #4

What do you mean by "controlling the _id"? Using the index API, you need to
send the index, type and id (optional), plus the document itself as the
request body.
If you do send the _id, you control the value used for it; otherwise it is
auto-generated. If it is auto-generated, then to overwrite an existing
document you first need to retrieve it and get its _id back.

Does this help?
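
To make the two cases concrete, here is a minimal sketch of how an index
request is shaped with and without an explicit id. The helper name
build_index_request is hypothetical; it only assembles the HTTP method, path
and body, and does not talk to a cluster.

```python
import json


def build_index_request(index, doc_type, doc, doc_id=None):
    """Return (method, path, body) for an index request.

    Hypothetical helper for illustration only; nothing is sent anywhere.
    """
    body = json.dumps(doc, sort_keys=True)
    if doc_id is None:
        # No id supplied: POST and let Elasticsearch auto-generate one.
        return "POST", "/%s/%s" % (index, doc_type), body
    # Id supplied: a later PUT to the same path overwrites the stored document.
    return "PUT", "/%s/%s/%s" % (index, doc_type, doc_id), body
```

Sending the same index, type and id a second time replaces the earlier
document, so no lookup is needed as long as you choose the ids yourself.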



(project2501) #5

Hi Luca,

I tried to set the '_id' field, but it throws an error when I do. I did some
searches about this, but it's still not clear why I can't specify _id.

Traceback (most recent call last):
  File "load-pyes.py", line 25, in <module>
    conn.index('default','lexeme',entry,id=i)
  File "/usr/local/lib/python2.7/dist-packages/pyelasticsearch/client.py", line 96, in decorate
    return func(*args, query_params=query_params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pyelasticsearch/client.py", line 344, in index
    query_params)
  File "/usr/local/lib/python2.7/dist-packages/pyelasticsearch/client.py", line 254, in send_request
    self._raise_exception(resp, prepped_response)
  File "/usr/local/lib/python2.7/dist-packages/pyelasticsearch/client.py", line 269, in _raise_exception
    raise error_class(response.status_code, error_message)
pyelasticsearch.exceptions.ElasticHttpError: (400, u'MapperParsingException[failed to parse [_id]]; nested: MapperParsingException[Provided id [1] does not match the content one [f5e638cc78dd325906c1298a0c21fb6b]]; ')
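
The message "Provided id [1] does not match the content one [...]" suggests
that the document body itself carries an _id value that differs from the id
argument. One possible workaround, assuming entry is a plain dict containing
an '_id' key, is to take the id out of the body and pass it only as the id
parameter. split_id below is a hypothetical helper, not part of
pyelasticsearch, and the "word" field is made up for the example.

```python
def split_id(entry, id_field="_id"):
    """Return (doc_id, body), where body is a copy of entry without the id field."""
    body = dict(entry)  # shallow copy so the caller's dict is left untouched
    doc_id = body.pop(id_field, None)
    return doc_id, body


# Sample document; the "word" field is an assumption for illustration.
entry = {"_id": "f5e638cc78dd325906c1298a0c21fb6b", "word": "example"}
doc_id, body = split_id(entry)

# With a live connection, the call from the traceback would then become:
# conn.index('default', 'lexeme', body, id=doc_id)
```

Whether this is the actual cause depends on what entry contains and on the
mapping of the 'lexeme' type, so treat it as a starting point for debugging.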



(system) #6