Documents and property naming


(James Cook) #1

I saw a post a couple days ago on the list where someone was saying that the
same property name reused across multiple types must have the same mapping.

I guess I just wanted someone to confirm whether this is true or not. It
seems like this would have a huge impact on most document designs if it is
accurate.

I thought the type name was actually used behind the scenes as a prefix for
each attribute. Consider this example:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elastic Search"
}'

Isn't the user property actually indexed at the Lucene layer as
'tweet.user'? Wouldn't this mean that each property name within a type has a
unique mapping?


(Shay Banon) #3

Heya,

Yes, that is the case (and they are not stored in Lucene using a type prefix). Or, more specifically, if the same field is mapped differently in different types, you would need to prefix the field name with the type of the doc you want to search on (for example: tweet.user as the field name in queries). In general though, you want to keep the mappings the same.

-shay.banon


(James Cook) #4

Are "client.username" and "supplier.username" considered the same or
different field names from the perspective of mapping?


(Shay Banon) #5

If client and supplier are types, then the field name that will be indexed in Lucene will be username. If username is in one case a string, and in one case a long, and you query on username, you will get failures. If you query on client.username, you will not (and it will get filtered automatically to only match on client documents).
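
A toy model of the behavior described above, written in Python purely for illustration. It is not Elasticsearch code: the function name `find_mapping_conflicts` and the dictionary layout are assumptions made for this sketch. It drops the type name from each field, as described for the Lucene field name, and reports any field whose mapping type differs between document types.

```python
from collections import defaultdict


def find_mapping_conflicts(type_mappings):
    """Return {field: {doc_type: mapping_type}} for fields mapped inconsistently.

    type_mappings is a hypothetical structure used only in this sketch:
    {doc_type: {field_name: mapping_type}}.
    """
    # Group by bare field name, since the type is not part of the indexed name.
    by_field = defaultdict(dict)
    for doc_type, fields in type_mappings.items():
        for field_name, mapping_type in fields.items():
            by_field[field_name][doc_type] = mapping_type

    # A field conflicts when more than one distinct mapping type is seen for it.
    return {
        field: per_type
        for field, per_type in by_field.items()
        if len(set(per_type.values())) > 1
    }


mappings = {
    "client": {"username": "string", "age": "long"},
    "supplier": {"username": "long"},
}
conflicts = find_mapping_conflicts(mappings)
print(conflicts)  # {'username': {'client': 'string', 'supplier': 'long'}}
```

In this model, querying on the bare `username` touches both mappings at once, which is the failure case; qualifying it as `client.username` is what narrows the query to one type.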


(James Cook) #6

Sorry I wasn't clear.

In my example, I meant that supplier.username and client.username were part
of the same type.

So, document =
{
  "supplier": {
    "username": "user1"
  },
  "client": {
    "username": "joe@cool.com"
  }
}

I would want to use two different mappings for this case. Are there any
issues with this?

Thanks


(Shay Banon) #7

No issues in this case.
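
One way to picture why this case is safe, as a minimal Python sketch rather than anything taken from Elasticsearch itself. It assumes that fields of inner objects are addressed by their full dotted path within a type, as the names supplier.username and client.username in this thread suggest. The helper `flatten_field_paths` is hypothetical.

```python
def flatten_field_paths(document, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf value in a nested dict."""
    for key, value in document.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            yield from flatten_field_paths(value, path)
        else:
            yield path, value


document = {
    "supplier": {"username": "user1"},
    "client": {"username": "joe@cool.com"},
}
paths = dict(flatten_field_paths(document))
print(paths)  # {'supplier.username': 'user1', 'client.username': 'joe@cool.com'}
```

The two usernames end up under different paths, so under that assumption each can carry its own mapping without colliding.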


(jacorob) #8

This caught me by surprise last month and forced me to redesign my JSON to
account for it. For the benefit of anyone who finds this discussion through
search, I tracked down the original thread that made me realize this. Shay
provides additional information there that you may find useful:

"The mappings are recommended to be the same. It's kind of in the middle now:
I started (way back) to try to support it better when the type is explicitly
specified in the relevant queries and filters, but it does not work when
they have different analyzers. And, in any case, you would need to
explicitly specify the type, for example:

{ "term" : { "my_type.my_field" : "value" }}

It does not work, though, when you execute a search on that type alone
(/my_index/my_type/_search), i.e., the typeness is not bubbled down.

Those are the two main issues that I can think of that are missing. But, even
with that, things like facets will become problematic (since there can be
only one "type" for that field when faceting on it). So, in any case, even
with those issues "fixed", it's recommended that they have the same
mapping (analyzer, type, and so on)."

The full thread can be found at:
http://elasticsearch-users.115913.n3.nabble.com/Recommended-maximum-fields-per-index-tp2930788p2937852.html

During that thread, opening an issue to document this was also discussed.
It can be found at:
https://github.com/elasticsearch/elasticsearch/issues/927;cid=1306263216528-951

Bob

