Add core type - guid

Vladimir_Khazin · April 3, 2014, 8:11pm

Sample data:

"_source": {
  "Id": "ca23459f-cc96-46cb-8ae8-509368467670",
  "Title": "TPTest Scaling 10:3"
}

Trouble:
Guid is not one of the supported core data
types: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html
As a result, a guid is indexed by default as an analyzed string, and the
dashes '-' break the long string into separate terms.
Searches and aggregations produce poor results unless the Id field is
mapped as not analyzed.
An alternative is to strip the dashes from the document, but that approach
requires custom serialization/deserialization for the Guid data type
in .NET using the popular JSON parser http://james.newtonking.com/json.
The challenge is that Guid is widely used in our application, the field name
is not always 'Id', and mapping every guid field has proven quite
time-consuming. If a mapping is missed, the data needs to be re-indexed.
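
To make the problem concrete, here is a rough Python sketch of what happens to the sample Id. The function approximate_standard_tokens is a hypothetical stand-in, not the real analyzer: it only splits on non-alphanumeric characters and lowercases, whereas the actual default analyzer follows Unicode segmentation rules. The second part shows the dash-stripping alternative using the standard uuid module.

```python
import re
import uuid

def approximate_standard_tokens(text):
    """Hypothetical approximation of the default analyzer:
    split on non-alphanumeric characters and lowercase each piece."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

guid = "ca23459f-cc96-46cb-8ae8-509368467670"

# An analyzed field ends up with one term per dash-separated segment,
# so a search for any single segment can match unrelated documents.
tokens = approximate_standard_tokens(guid)
print(tokens)
# ['ca23459f', 'cc96', '46cb', '8ae8', '509368467670']

# The dash-free form stays a single term even when analyzed,
# but every producer and consumer must then agree on this format.
dashless = uuid.UUID(guid).hex
print(dashless)
# ca23459fcc9646cb8ae8509368467670
```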

Request/Question:
Is there a less labor-intensive approach?
Are there plans to support guid as a core type?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bee7866e-2c0a-4ae1-9b48-0b17da2931ce%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

brian_yoder · April 3, 2014, 9:01pm

Perhaps, index the Id field but do not analyze it? Then it will be indexed
and queried intact as-is.

Brian


Binh_Ly_2 · April 3, 2014, 9:26pm

Brian is right, if you map your field as not_analyzed, then you can do
exact case-sensitive matches on it, as well as term facets/aggregations and
sorts. This applies to fields like IDs, Guids, or anything you can think of
that you don't want tokenized:

{
  "mappings": {
    "doc": {
      "properties": {
        "Id": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
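
Because a not_analyzed field is matched exactly and case-sensitively, it may help to normalize GUID strings before indexing and before querying. The following is a minimal Python sketch under that assumption; canonical_guid and term_query are hypothetical helper names, and the query body is only built here, not sent to a cluster.

```python
import json
import uuid

def canonical_guid(value):
    """Return the lowercase, dashed form of a GUID string.
    Raises ValueError if the input is not a valid GUID."""
    return str(uuid.UUID(value))

def term_query(field, guid_value):
    """Build a term query body for an exact match on a not_analyzed field."""
    return {"query": {"term": {field: canonical_guid(guid_value)}}}

# An upper-case GUID is normalized so it matches the stored lowercase value.
body = term_query("Id", "CA23459F-CC96-46CB-8AE8-509368467670")
print(json.dumps(body, indent=2))
```

This only works if the documents were indexed with the same canonical form.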


Vladimir_Khazin · April 4, 2014, 1:40pm

Thank you Brian and Binh for your comments!

The slight trouble with that approach is the ongoing mapping changes: every
time we add another Id (guid field) to the JSON document we have to update
the mapping, and the document changes quite often.
One of the reasons I fell in love with Elasticsearch was its somewhat
schema-less approach.
The situation with Guid breaks this idealistic model: every new field of
type guid requires a not-analyzed mapping, and if one is missed, the type
has to be reindexed after deleting and re-adding the mapping.

That's why I was wondering whether there are plans to add native support
for guid.


brian_yoder · April 4, 2014, 8:59pm

Vladimir,

Even if ES supported a specific GUID field type, then it could still fail
to be auto-detected.

I would think that you could detect when your data added a new guid field
much more reliably than ES could auto-detect it.

Note that you can easily update the mappings of an existing index in
non-breaking ways, and one of these valid ways is to add a field that
didn't exist before.

I, too, liked ES's schema-less approach, which made it easy to dive
right in and learn. But as time went on, I have finally locked down ES
to never automatically create an index, and to never automatically map a
field that doesn't already have an existing mapping. Combined with the cool
ability to add mappings for new fields to an existing index, these make it
easy to reliably catch new unexpected fields and then add the mappings for
them without the chance of ES dynamically creating an incompatible mapping.

Note that the auto-detection issue is the same whether ES supports a "guid"
field type or whether you need to be a little more wordy and specify a
"string" type that is indexed but not_analyzed. If you make ES guess, it
can still guess wrong and define the new fields as "string" but with the
standard analyzer.

Brian
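
The idea of detecting new GUID fields on the application side and adding explicit mappings for them could be sketched in Python as follows. This is only an illustration under stated assumptions: is_guid and missing_guid_mappings are hypothetical helpers, OwnerId and its value are made-up sample data, only top-level fields are checked, and the resulting body follows the shape of the mapping shown earlier in the thread. It is built but not sent anywhere.

```python
import json
import re

# Canonical dashed GUID form: 8-4-4-4-12 hexadecimal digits.
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

NOT_ANALYZED = {"type": "string", "index": "not_analyzed"}

def is_guid(value):
    """True if value is a string in the dashed GUID format."""
    return isinstance(value, str) and GUID_RE.fullmatch(value) is not None

def missing_guid_mappings(doc, mapped_fields):
    """Return mapping properties for top-level GUID-valued fields
    that are not in the set of already mapped field names."""
    return {
        name: dict(NOT_ANALYZED)
        for name, value in doc.items()
        if name not in mapped_fields and is_guid(value)
    }

doc = {
    "Id": "ca23459f-cc96-46cb-8ae8-509368467670",
    "OwnerId": "00000000-0000-0000-0000-000000000001",  # hypothetical new field
    "Title": "TPTest Scaling 10:3",
}
mapped_fields = {"Id"}  # fields the application knows are already mapped

properties = missing_guid_mappings(doc, mapped_fields)

# Body for adding the new fields to the existing "doc" type mapping.
put_mapping_body = {"doc": {"properties": properties}}
print(json.dumps(put_mapping_body, indent=2))
```

Running a check like this before indexing would let the application add the mapping first, so a new GUID field never gets a dynamically guessed analyzed mapping.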


Vladimir_Khazin · April 4, 2014, 9:32pm

Hard to argue with your comments and experience.

Thank you for the feedback!


--
Sincerely yours,
Vlad Khazin
Email: vlad.khazin@icssolutions.ca
Skype: vladimir.khazin
Cell: 416-802-2771
Fax: 866-425-2660
http://www.linkedin.com/in/vkhazin
