Index by query?

sbellem · February 21, 2013, 10:37pm

Hi,

is it possible (in one step) to index documents that match the results of
a given query? That is, similar to the Delete by Query API, can one index
by query?

Thanks,
Sylvain

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ivan · February 21, 2013, 10:47pm

Do you mean update by query? If so, there is an open issue to support it:

https://github.com/elasticsearch/elasticsearch/issues/1607

Update API: update by query (opened 12 Jan 2012 by monken, closed 18 Jul 2014, label :Distributed/CRUD)

#1583 allows to update individual documents. Update by query will reduce the network roundtrips radically if you want to update a number of documents and push work from the client to ES.

```
curl -XPOST localhost:9200/index/type/_update -d '{
  "query" : { "constant_score" : { "filter" : { "term" : { "counter" : 0 } } } },
  "script" : "ctx._source.counter += count",
  "params" : { "count" : 4 }
}'
```

If you really mean index by query, you would need to provide an example,
because I don't see how something would be returned by a query unless it
already existed.

Cheers,

Ivan

Clinton_Gormley · February 22, 2013, 10:43am

Hiya

> is it possible (in one step) to index documents that match the
> results of a given query? That is, similar to the Delete by Query API,
> can one index by query?

This is something you would need to handle yourself. Depending on which
client API you use, you may already have some utilities available to
make it easy.

For instance, in the Perl API, you could do:

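# Scroll over every document that matches the query
my $search = $es->scrolled_search(
    query       => { ... whatever query ... },
    search_type => 'scan'
);

# Bulk-index the scrolled documents into the destination index
$es->reindex( source => $search, dest_index => 'new_index' );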

If your API doesn't have anything similar, then you can write it
yourself. All it consists of is:

* a scrolled search (with scanning)
* bulk indexing

To implement a scrolled search, run an ordinary search request, with
these parameters:

scroll: "1m"
    This says that you want to take a "snapshot" of the current state
    of your data and keep it around for, e.g., one minute.
search_type: "scan"
    This search type disables sorting and is very efficient for
    retrieving large numbers of documents from Elasticsearch.
size: 1000
    The number of docs to return in each request. With 'scan', this is
    actually the number of docs to return from EACH shard in each
    request, e.g. 5 shards * 1000 = a maximum of 5000 docs per request.
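
For illustration, here is a minimal Python sketch of how the initial request could be assembled from the three parameters above. The helper name `build_scan_request`, the base URL, the index name `old_index` and the example query are all assumptions made for this example, and it targets the API as described in this thread (newer Elasticsearch releases no longer have the "scan" search type). It only builds the URL and body, it does not send anything.

```python
import json
from urllib.parse import urlencode


def build_scan_request(base_url, index, query, scroll="1m", size=1000):
    """Return (url, body) for the initial scan-type scrolled search.

    Hypothetical helper: the parameter names mirror the ones in the post.
    """
    # scroll, search_type and size are passed as URL query parameters
    params = urlencode({"search_type": "scan", "scroll": scroll, "size": size})
    url = f"{base_url}/{index}/_search?{params}"
    # the query itself goes into the JSON request body
    body = json.dumps({"query": query})
    return url, body


if __name__ == "__main__":
    # Assumed values, for demonstration only
    url, body = build_scan_request(
        "http://localhost:9200", "old_index", {"term": {"counter": 0}}
    )
    print(url)
    print(body)
```

The response to this request carries the scroll ID that the follow-up scroll requests need.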

The above search request will return a scroll ID. You pass that scroll
ID to each subsequent "scroll" request /_search/scroll, until you get no
more hits.

The parameters are:

scroll: "1m"
    Refresh the lock on the scroll snapshot and keep it in place for
    another minute.
scroll_id: "xxxx"
    The scroll ID returned by the original search request, or by the
    previous scroll request. You MUST always pass the scroll ID from the
    most recent response, because it can change between requests.

Each time you call /_search/scroll you will get another batch of
documents.
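
The scroll loop could look roughly like the Python sketch below. It is not taken from any client library: `iter_scrolled_hits` is a made-up name, and `fetch` is a hypothetical callable that you would supply yourself to perform the HTTP request against /_search/scroll and return the decoded JSON response. The demo at the bottom uses canned responses instead of a real cluster.

```python
def iter_scrolled_hits(fetch, first_scroll_id, scroll="1m"):
    """Yield every hit, following scroll IDs until a batch comes back empty.

    `fetch(path, params)` is an assumed callable that sends the request and
    returns the parsed JSON response as a dict.
    """
    scroll_id = first_scroll_id
    while True:
        response = fetch(
            "/_search/scroll", {"scroll": scroll, "scroll_id": scroll_id}
        )
        # Always carry forward the ID from the most recent response
        scroll_id = response.get("_scroll_id", scroll_id)
        hits = response["hits"]["hits"]
        if not hits:
            return  # no more hits: the scroll is exhausted
        yield from hits


if __name__ == "__main__":
    # Canned responses standing in for a real cluster
    pages = {
        "id-0": {"_scroll_id": "id-1", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
        "id-1": {"_scroll_id": "id-2", "hits": {"hits": [{"_id": "3"}]}},
        "id-2": {"_scroll_id": "id-3", "hits": {"hits": []}},
    }

    def fake_fetch(path, params):
        return pages[params["scroll_id"]]

    for hit in iter_scrolled_hits(fake_fetch, "id-0"):
        print(hit["_id"])
```

Passing the transport in as a callable keeps the loop independent of whichever HTTP client you use.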

You can reindex those (to a new index, or after making any changes)
using the "bulk" API.
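
As a rough sketch of the bulk step, the Python below turns one batch of hits into a request body for the "bulk" API: one action line plus one source line per document, newline-delimited. The function name `build_bulk_body`, the optional `transform` hook and the sample hit are assumptions for this example. The `_type` field reflects the document model of the Elasticsearch versions discussed in this thread.

```python
import json


def build_bulk_body(hits, dest_index, transform=None):
    """Build a newline-delimited bulk body that indexes `hits` into `dest_index`.

    `transform` is an optional, hypothetical hook for changing each
    document's source before it is reindexed.
    """
    lines = []
    for hit in hits:
        source = hit["_source"]
        if transform is not None:
            source = transform(source)
        # Action line: index this document under its original type and ID
        action = {
            "index": {
                "_index": dest_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        # Source line: the document itself
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk body must end with a trailing newline
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up hit in the shape returned by a search request
    batch = [{"_id": "1", "_type": "doc", "_source": {"counter": 0}}]
    print(build_bulk_body(batch, "new_index"), end="")
```

The resulting string would be sent as the body of a POST to /_bulk, once per batch of hits.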

clint

sylvain · February 22, 2013, 6:03pm

Hi Ivan,

yes! I meant update by query, thanks for the link!

Best,
Sylvain

sbellem · February 22, 2013, 6:10pm

Thanks Clint!
