Hello,
I like the concept of the _source field. However, in some cases I
have large content fields of arbitrary length. I have no need to
return these content fields in search results and was wondering if
it was possible to selectively disable what goes into the _source
field. I understand that this removes the usefulness of _source for
the re-index case.
If this isn't possible, is it possible to add a * or _all input to the
fields parameter on the search as a convenience to show all the stored
fields? When debugging different indexes with different fields, it is
painful to have to manually list every field you want returned; it would
help to be able to get a snapshot of everything you have.
Actually, in retrospect, my preferred approach would be to keep the
_source field as is, but be able to suppress certain fields within it
from being returned.
I could probably hack out a patch for this, if there is interest.
All, both this and the previous email are valid requests. I think that the
ability to return all the stored fields is a valid one; you can open an
issue for this. But the more stored fields you have, the slower the "fetch"
process will be.
Regarding the source field: when indexing, it is never actually converted
to an in-memory representation. It is pull parsed directly into the index
structure and stored as bytes. Removing some fields from it requires either
converting it to an in-memory representation and munging it, or smartly
recreating it while it is being pull parsed.
Returning part of the source field has the same problem. Since it is
stored as a byte array, it is just fetched and returned. To extract part
of the data from it, it would need to be parsed and munged on each
fetch. Not ideal either.
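To make the cost described above concrete, here is a minimal sketch of what "parse and munge" means for a source stored as raw bytes. It is not how Elasticsearch does or would do it; the function name strip_fields, the sample document, and the assumption that the source is a JSON object with the unwanted fields at the top level are all hypothetical and chosen only for illustration.

```python
import json


def strip_fields(source_bytes: bytes, excluded: set) -> bytes:
    """Return a copy of a JSON source with the excluded top-level fields removed.

    Hypothetical helper: the whole document has to be parsed into memory
    and serialized again, which is the per-document overhead discussed above.
    """
    doc = json.loads(source_bytes)  # bytes -> in-memory representation
    filtered = {key: value for key, value in doc.items() if key not in excluded}
    # Compact separators keep the output close to the stored form.
    return json.dumps(filtered, separators=(",", ":")).encode("utf-8")


# Assumed sample document with one large field we do not want returned.
stored = b'{"title":"hello","content":"a very large body ...","tags":["a","b"]}'
print(strip_fields(stored, {"content"}))
```

The same parse, filter, and re-serialize step would be paid once per document at index time if fields were dropped before storing, or once per hit at fetch time if they were suppressed on return.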