Restrict a field from going to _source field or easily print out all stored fields?


(ppearcy) #1

Hello,
I like the concept of the _source field. However, in some cases I
have large content fields of arbitrary length. I have no need to
return these content fields in search results and was wondering if
it is possible to selectively disable what goes into the _source
field. I understand that this removes the usefulness of _source for
the re-index case.

If this isn't possible, could a * or _all value be added to the
fields parameter on the search as a convenience to show all the stored
fields? When debugging different indexes with different fields, it is
painful to manually list every field I want returned; I would like to
be able to get a snapshot of everything I have.
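
To make the two forms concrete, here is a minimal Python sketch of the request bodies being compared. The helper name `build_search_body` and the field names are hypothetical, and the `"*"` value is only the convenience requested in this post, not confirmed behavior of any release.

```python
import json


def build_search_body(fields):
    """Build a search request body that asks for the given stored fields.

    Hypothetical helper for illustration only.
    """
    return {"query": {"match_all": {}}, "fields": list(fields)}


# Today: every stored field has to be listed by hand (names are made up).
explicit = build_search_body(["title", "author", "created"])

# Requested convenience: a single wildcard standing in for all stored fields.
# Whether "*" is accepted is exactly what this post is asking for.
wildcard = build_search_body(["*"])

print(json.dumps(explicit))
print(json.dumps(wildcard))
```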

As always, thanks a ton for the help.

Best Regards,
Paul


(ppearcy) #2

Actually, in retrospect, my preferred approach would be to keep the
_source field as is, but be able to suppress certain fields within it
from being returned.

I could probably hack out a patch for this, if there is interest.

Thanks,
Paul


(Shay Banon) #3

Hi,

Both this and the previous email are valid requests. I think the
ability to return all the stored fields is a valid one; you can open an
issue for this. But the more stored fields you have, the slower the "fetch"
process will be.

Regarding the source field: when indexing, it is never actually converted
to an in-memory representation; it is pull parsed directly into the index
structure and stored as bytes. Removing some fields from it requires either
moving it to an in-memory representation and munging it, or smartly
recreating it while it is being pull parsed.

Returning part of the source field has the same problem. Since it is
stored as a byte array, it is just fetched and returned. To extract
part of the data from it, it would need to be parsed and munged on each
fetch. Not ideal either.
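
The per-fetch cost described above can be sketched in Python, assuming the stored source is a JSON document held as raw bytes. This is only an illustration of the parse-and-munge step, not how Elasticsearch implements anything; `strip_fields` and the sample document are hypothetical.

```python
import json


def strip_fields(source_bytes, excluded):
    """Drop excluded top-level keys from a stored JSON source.

    The stored bytes cannot be sliced directly: they have to be parsed
    into an in-memory structure, filtered, and serialized again.
    """
    doc = json.loads(source_bytes)  # full parse of the stored bytes
    filtered = {k: v for k, v in doc.items() if k not in excluded}
    return json.dumps(filtered, separators=(",", ":")).encode("utf-8")


# Hypothetical stored source with one large field the caller does not want.
stored = b'{"title":"hello","content":"a very large body of text"}'

# Returning the source untouched is just handing back the same bytes.
untouched = stored

# Returning part of it means a parse and re-serialize on every fetch.
trimmed = strip_fields(stored, {"content"})
print(trimmed)
```

The same parse and rebuild would be needed at index time to keep a field out of the stored source in the first place.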

-shay.banon


(ppearcy) #4

Sweet, thanks.


(Shay Banon) #5

Pushed to master.

On Wed, Aug 4, 2010 at 10:17 AM, Paul ppearcy@gmail.com wrote:

Sweet, thanks.

http://github.com/elasticsearch/elasticsearch/issues/issue/296

