Globally disable analysis (i.e. not via per-field mapping)?

Eran_Duchan · December 19, 2014, 7:15am

I'd like not to use analysis across my schema to save a bit of CPU (I know
the penalty this inflicts on searching). Right now I set "index":
"not_analyzed" per field but this is cumbersome.

I know I can choose between default analyzers
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-analyzers.html but
there's no "none" analyzer to choose from. Short of writing a custom one
that does nothing, is there a way to globally disable analysis?

Eran

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9d75b098-451c-488e-a29f-4e1b116538d1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

warkolm · December 19, 2014, 7:57am

It feels like you're almost defeating the whole purpose of using
Elasticsearch with this approach! Is it really that much of a problem?

Eran_Duchan · December 19, 2014, 9:12am

We use Elasticsearch to index our structured analytics data. We chose it
for a few reasons:

- All fields are indexed, so we can search by any field or combination
  of fields, including nested fields
- Flexible, built-in geospatial searches
- Can scale with our data, which grows at ~100M documents a day

It's pretty much a generic datastore (though not the source of truth).

While we do have quite a few string fields in our data, these are mostly
enumeration values ("connected", "not connected") and in preliminary tests
we've found that disabling analysis (per field) shows savings of ~5% CPU.
Not a huge amount but every bit helps.

jprante · December 19, 2014, 9:51am

The "keyword" analyzer is the "none" analyzer you are looking for.

Example settings:

{
  "index" : {
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "keyword"
        }
      }
    }
  }
}
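
For illustration, here is a minimal Python sketch of what that setting amounts to. It assumes the settings are supplied in an index-creation request body, wrapped in a "settings" object. The function names are hypothetical, and the two token functions are simplified stand-ins rather than Elasticsearch code: they only model the point that the keyword analyzer emits the whole input as a single token, where a tokenizing analyzer would split it.

```python
import json


def keyword_default_settings():
    """Build an index-creation body that makes "keyword" the default analyzer."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        # "default" is used for string fields that do not
                        # name an analyzer of their own.
                        "default": {"type": "keyword"}
                    }
                }
            }
        }
    }


def keyword_tokens(text):
    """Simplified model of the keyword analyzer: the input is one token."""
    return [text]


def whitespace_tokens(text):
    """Simplified stand-in for a tokenizing analyzer, for contrast only."""
    return text.split()


if __name__ == "__main__":
    # Request body that could be sent when creating an index.
    print(json.dumps(keyword_default_settings(), indent=2))
    # An enumeration value stays intact under the keyword model...
    print(keyword_tokens("not connected"))
    # ...but is split into separate terms by a tokenizing analyzer.
    print(whitespace_tokens("not connected"))
```

With the keyword analyzer as the default, an enumeration value such as "not connected" is indexed as one exact term, which is the same effect that "index": "not_analyzed" has on a single field.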

Jörg

Eran_Duchan · December 19, 2014, 10:23am

Ahh, that should probably do the trick. Thanks.

warkolm · December 19, 2014, 10:30am

Ok that makes a bit more sense, but it seems the amount of CPU you will
save isn't worth the effort.

You could create an index template containing a dynamic template that
matches fields with pattern "*" and sets index: not_analyzed; that'd be easiest.
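
A rough sketch of what such a template body might look like, built in Python so it can be inspected before it is sent anywhere. It assumes the Elasticsearch 1.x syntax that was current when this thread was written (the "string" type and "index": "not_analyzed"). The function name and the rule name "strings_not_analyzed" are made up for the example.

```python
import json


def not_analyzed_template(index_pattern="*"):
    """Build an index template body that maps new string fields as not_analyzed.

    Assumes Elasticsearch 1.x syntax; the rule name is illustrative only.
    """
    return {
        # Which indices the template applies to when they are created.
        "template": index_pattern,
        "mappings": {
            # "_default_" applies the rule to every document type.
            "_default_": {
                "dynamic_templates": [
                    {
                        "strings_not_analyzed": {
                            # Match every field name...
                            "match": "*",
                            # ...but only fields detected as strings.
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                            },
                        }
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(not_analyzed_template(), indent=2))
```

The printed body would be stored with a PUT to /_template/<name>. It only affects indices created after the template is stored, and only fields that are mapped dynamically; fields with an explicit mapping keep whatever that mapping says. In Elasticsearch 5.0 and later the "string" type was replaced by "text" and "keyword", so this exact syntax applies only to the older releases.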

