How to index parts of JSON files using the Java API?

I've parsed a JSON file into a JSONObject and indexed the whole
JSONObject.
But how can I index only part of this JSONObject, for instance only
certain fields or nested parts?
Should I extract the required parts of the JSON object and create another
JSON object for indexing? I think there should be a better way to do this.
Could anyone give me some advice? Thanks a lot!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi,
There are a couple of options...

The Elasticsearch defaults are great for playing around, but it is almost
always a good idea to disable dynamic mapping and set up explicit mappings
(see the Elasticsearch mapping documentation) to configure which fields get
indexed and which ones don't.

The next question is whether you want to keep the original JSON or have
Elasticsearch strip out elements you don't want. You can accomplish this
with the includes/excludes settings of the _source field.

Best Regards,
Paul

Thanks a lot.
I've read a bit about mappings and noticed there is an "enabled" flag (see
the Elasticsearch object mapping documentation). I tried to create an
explicit mapping with this flag; here is my code:
    // Explicit mapping for type "testpbl"; the "name" object has "enabled": false.
    String mapping = XContentFactory.jsonBuilder()
            .startObject()
                .startObject("testpbl")
                    .startObject("properties")
                        .startObject("person").field("type", "object")
                            .startObject("properties")
                                .startObject("name").field("type", "object")
                                    .field("enabled", false)
                                .endObject()
                                .startObject("sid").field("type", "string").endObject()
                            .endObject()
                            .startObject("message").field("type", "string").endObject()
                        .endObject()
                    .endObject()
                .endObject()
            .endObject()
            .string();

    // Create the index "pbl" with the mapping above.
    node.client().admin()
            .indices().prepareCreate("pbl")
            .addMapping("testpbl", mapping)
            .execute().actionGet();

    // Parse the JSON file and index the whole document under id "1".
    JSONParser parser = new JSONParser();
    Object object = parser.parse(new FileReader("mydoc\\test.js"));
    JSONObject jsonData = (JSONObject) object;

    node.client().prepareIndex("pbl", "testpbl", "1")
            .setSource(jsonData)
            .execute().actionGet();

And test.js is:

    {
        "tweet": {
            "person": {
                "name": {
                    "first_name": "Shay",
                    "last_name": "Banon"
                },
                "sid": "12345"
            },
            "message": "This is a tweet!"
        }
    }

The problem is that my mapping doesn't seem to work. When I retrieve what I
indexed and check its content, I still get the content of "name", which
should not be indexed according to my mapping.

Is there an error in my code? I would appreciate any help.
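
One thing that may be worth checking in the code above, offered as a guess rather than a diagnosis: test.js nests everything under a top-level `tweet` object, while the posted mapping starts at `person`, and `message` is opened before the `person` object is closed. The sketch below builds a mapping body that mirrors the nesting of test.js and also shows the `dynamic` and `_source` excludes settings mentioned earlier in the thread. The class name and the `tweet.person.name.*` pattern are assumptions made for illustration; the resulting string would be passed to `addMapping("testpbl", ...)` as in the original code. It keeps the `string` field type used in this thread.

```java
// Hypothetical helper: builds the mapping JSON by hand so the nesting is easy to see.
class SourceExcludeMappingSketch {

    static String mappingJson() {
        return "{\n"
            + "  \"testpbl\": {\n"
            // Assumption: do not add unmapped fields automatically.
            + "    \"dynamic\": false,\n"
            // Assumption: also drop the name object from the stored _source.
            + "    \"_source\": { \"excludes\": [\"tweet.person.name.*\"] },\n"
            + "    \"properties\": {\n"
            // The mapping mirrors test.js, so it starts at "tweet".
            + "      \"tweet\": {\n"
            + "        \"properties\": {\n"
            + "          \"person\": {\n"
            + "            \"properties\": {\n"
            + "              \"name\": { \"type\": \"object\", \"enabled\": false },\n"
            + "              \"sid\": { \"type\": \"string\" }\n"
            + "            }\n"
            + "          },\n"
            // "message" is a sibling of "person", as in test.js.
            + "          \"message\": { \"type\": \"string\" }\n"
            + "        }\n"
            + "      }\n"
            + "    }\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(mappingJson());
    }
}
```

With `enabled` set to false the `name` object should not be indexed; the `excludes` entry is what should keep it out of the stored `_source`, so that a get no longer returns it.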

Just a note on this: there is a big difference between indexing a document and storing its source.

Elasticsearch won't touch your source document and will store it as is.
For example, if you put a line break after each field, you will get it back unchanged when doing get or search operations.
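
The distinction can be illustrated with a toy model. This is not Elasticsearch code: `ToySourceVsIndex` and all of its members are hypothetical names, and the "index" is just a map from terms to document ids. The sketch keeps the raw source string untouched and adds terms to the index only for fields that are not disabled.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: source is stored verbatim, indexing is a separate step.
class ToySourceVsIndex {
    private final Map<String, String> storedSource = new HashMap<>();
    private final Map<String, Set<String>> invertedIndex = new HashMap<>();
    private final Set<String> disabledPaths;

    ToySourceVsIndex(Set<String> disabledPaths) {
        this.disabledPaths = disabledPaths;
    }

    void index(String id, String rawJson, Map<String, String> flattenedFields) {
        storedSource.put(id, rawJson); // kept exactly as received
        for (Map.Entry<String, String> field : flattenedFields.entrySet()) {
            if (isDisabled(field.getKey())) {
                continue; // disabled fields never reach the inverted index
            }
            for (String term : field.getValue().toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    invertedIndex.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
                }
            }
        }
    }

    private boolean isDisabled(String path) {
        for (String disabled : disabledPaths) {
            if (path.equals(disabled) || path.startsWith(disabled + ".")) {
                return true;
            }
        }
        return false;
    }

    Set<String> search(String term) {
        Set<String> ids = invertedIndex.get(term.toLowerCase());
        return ids == null ? Collections.<String>emptySet() : ids;
    }

    String get(String id) {
        return storedSource.get(id);
    }

    static ToySourceVsIndex demo() {
        ToySourceVsIndex toy =
                new ToySourceVsIndex(Collections.singleton("tweet.person.name"));
        String raw = "{\"tweet\":{\"person\":{\"name\":{\"first_name\":\"Shay\","
                + "\"last_name\":\"Banon\"},\"sid\":\"12345\"},"
                + "\"message\":\"This is a tweet!\"}}";
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("tweet.person.name.first_name", "Shay");
        fields.put("tweet.person.name.last_name", "Banon");
        fields.put("tweet.person.sid", "12345");
        fields.put("tweet.message", "This is a tweet!");
        toy.index("1", raw, fields);
        return toy;
    }

    public static void main(String[] args) {
        ToySourceVsIndex toy = demo();
        System.out.println("search(shay)  -> " + toy.search("shay"));
        System.out.println("search(12345) -> " + toy.search("12345"));
        System.out.println("get(1)        -> " + toy.get("1"));
    }
}
```

In this model a search for the disabled name finds nothing, while a get still returns the full original JSON including the name, which matches the behaviour described in this thread.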

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Thank you.
Now I see that my mapping is being applied to the index, because a search
on the field that has "enabled" set to false in the mapping returns no
hits. So this means that this part is not indexed? Do I understand
correctly?

But here is another question.
I asked for a field to be loaded and returned in the search response with
addField("myfield"), but the requested field does not seem to be returned.
Here is my code and a screenshot of the debug result.

    SearchResponse searchResponse = node.client().prepareSearch("pbl")
            .setTypes("testpbl")
            .setQuery(QueryBuilders.matchQuery("sid", "12345"))
            .addField("sid")
            .setExplain(true)
            .setSize(10)
            .execute().actionGet();
[image: inline image 1]

I think I found the problem. addField() can only return a field that has
been stored, so I added field("store","yes") to the field I'd like to
return.
But I noticed that if I set up my own mapping, index a simple sample
document, and then use addField(), it works without specifying store in my
mapping. In my case, I load a JSON file and set up a mapping for it in order
to index only parts of the file; when I search this index and want to get
the values of a certain field, I have to set field("store","yes") on it.
Why does it work this way? Could anybody help me understand this? Thanks
very much!
I'm very new to Elasticsearch, so please forgive my stupid questions! :)

Heya,
This is a pretty good explanation:

To summarize, there are two ways to store fields in Elasticsearch:

  • store = yes, where the field is actually stored by Lucene on its own
  • Stored as part of _source. Here the single _source field is stored, and
    individual values are extracted from its JSON when requested.

There are some nuances to both, but in general I recommend storing only
_source rather than individual fields.
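
The second option can be sketched in plain Java. Assuming the `_source` of a hit has already been parsed into nested maps (how you obtain that map depends on your client version), a single value can be pulled out by walking a dotted path. `SourceFieldExtractor`, `extract` and `sampleSource` are hypothetical names used only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reads one value out of a parsed _source document.
class SourceFieldExtractor {

    // Walks a dotted path through nested maps; returns null if any step is missing.
    static Object extract(Map<String, Object> source, String dottedPath) {
        Object current = source;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // Sample data shaped like test.js, without the name object.
    static Map<String, Object> sampleSource() {
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("sid", "12345");
        Map<String, Object> tweet = new LinkedHashMap<>();
        tweet.put("person", person);
        tweet.put("message", "This is a tweet!");
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("tweet", tweet);
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> source = sampleSource();
        System.out.println(extract(source, "tweet.person.sid"));
        System.out.println(extract(source, "tweet.person.name.first_name"));
    }
}
```

Because the value comes out of the one stored `_source` document, no per-field store setting is involved in this approach.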

Best Regards,
Paul
