Adding analyzer to dynamic fields

Onur_Aktas · April 29, 2013, 8:08am

Hi,

I'm new to ES and need some advice about mapping and adding analyzers to
dynamic fields. The product documents I want to index in ES have some dynamic
fields (fields starting with "f_") that depend on the product type.

Product A:
{
  "name": "Product A",
  "f_material": ["Material A", "Material B"],
  "f_type": "Type A"
}

Product B:
{
  "name": "Product B",
  "f_gender": "Male",
  "f_size": ["M", "L", "XL", "XXL"]
}

As you can see, fields starting with f_ are dynamic and change based on the
product type. One product has f_material and f_type, while the other has
f_size and f_gender.
So:

How can I add a comma-separated analyzer to dynamic fields? Can we
define analyzers for field names that match a regex (e.g. "^f_.+")?

There may be around 200 to 300 dynamically named fields, and I have read in
other topics that this many fields can cause memory issues in Lucene. If I
named them "f_1", "f_2", ... instead of "f_material", "f_type", would that
prevent the memory issues?

Thanks for your advice,
Onur

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Onur_Aktas · April 29, 2013, 9:26am

Adding this dynamic template to the type mapping resolved my problem.

"dynamic_templates": [
  {
    "template_feature": {
      "mapping": {
        "type": "multi_field",
        "fields": {
          "{name}": {
            "type": "{dynamic_type}",
            "index": "analyzed"
          },
          "org": {
            "type": "{dynamic_type}",
            "index": "not_analyzed"
          }
        }
      },
      "match": "f_*"
    }
  }
]
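
As a rough sketch, the template above can be built as a plain Python dict and serialized to JSON for the mapping request. The second variant uses the `match_pattern: regex` option of dynamic templates to express the "^f_.+" rule from the original question. The type name `product` and the helper `matches` are assumptions made for illustration; `matches` only approximates the field-name matching locally and is not how Elasticsearch implements it.

```python
import fnmatch
import json
import re

# Wildcard form, mirroring the template posted above.
wildcard_template = {
    "template_feature": {
        "match": "f_*",
        "mapping": {
            "type": "multi_field",
            "fields": {
                "{name}": {"type": "{dynamic_type}", "index": "analyzed"},
                "org": {"type": "{dynamic_type}", "index": "not_analyzed"},
            },
        },
    }
}

# Regex form of the same rule; "match" is then read as a regular expression.
regex_template = {
    "template_feature": {
        "match_pattern": "regex",
        "match": "^f_.+",
        "mapping": wildcard_template["template_feature"]["mapping"],
    }
}


def matches(template, field_name):
    """Local approximation of how a template's match rule selects field names."""
    rule = next(iter(template.values()))
    if rule.get("match_pattern") == "regex":
        return re.fullmatch(rule["match"], field_name) is not None
    return fnmatch.fnmatchcase(field_name, rule["match"])


# "product" is a hypothetical type name, not taken from the thread.
mapping_body = {"product": {"dynamic_templates": [wildcard_template]}}

print(json.dumps(mapping_body, indent=2))
for name in ["name", "f_material", "f_size"]:
    print(name, matches(wildcard_template, name), matches(regex_template, name))
```

Both variants select f_material and f_size and leave name alone, so the wildcard form is usually enough for a simple prefix.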

Ivan · April 29, 2013, 2:19pm

If you are worried about memory issues, are you sure you want to have a
multi field on every dynamic property? In your example, the values are
already multi-valued arrays rather than a single comma-delimited string,
which is the correct way to do things. Are you sure you need a custom analyzer?
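
To illustrate the point, here is a minimal sketch of a single-field alternative to the multi_field template, assuming every f_* field holds strings. The template name is reused from the thread, and `exact_terms` is a hypothetical helper that only mimics locally what a not_analyzed field would hold: one unchanged term per array element.

```python
import json

# Assumed alternative: one not_analyzed string field per f_* property,
# instead of a multi_field with an analyzed and a not_analyzed variant.
exact_template = {
    "template_feature": {
        "match": "f_*",
        "mapping": {"type": "string", "index": "not_analyzed"},
    }
}


def exact_terms(value):
    """Terms a not_analyzed field would hold: one per array element, unchanged."""
    return list(value) if isinstance(value, list) else [value]


product_a = {
    "name": "Product A",
    "f_material": ["Material A", "Material B"],
    "f_type": "Type A",
}

print(json.dumps({"dynamic_templates": [exact_template]}, indent=2))
for field, value in product_a.items():
    if field.startswith("f_"):
        print(field, exact_terms(value))
```

Because the values arrive as arrays, no comma handling is involved at all; each element is already its own term.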

--
Ivan

Onur_Aktas · April 30, 2013, 11:32am

With multi-field I was able to get facets on "f_type.org", "f_material.org",
etc. However, they didn't work for comma-separated multi-values, as you said.
Could you please tell me how I can achieve this without creating a multi-field
on every dynamic property?

All I want is to get facets on the dynamic fields without breaking the
values into words, except for the ones containing a comma.

Material:

Material A (5)
Material B (10)

Size:

L (1)
XL (5)
XL (10)

instead of

Material:

Material (5)
A (5)
B (5)
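
The difference between the two facet outputs can be reproduced locally with a small sketch. The two documents are made up for illustration, and `word_terms` is only a crude stand-in for an analyzed field (lowercase, split on whitespace), not the actual Elasticsearch analyzer.

```python
from collections import Counter

# Made-up documents, for illustration only.
docs = [
    {"f_material": ["Material A", "Material B"]},
    {"f_material": ["Material B"]},
]


def word_terms(values):
    """Crude stand-in for an analyzed field: lowercase words."""
    return {word for value in values for word in value.lower().split()}


def exact_terms(values):
    """Stand-in for a not_analyzed field: each value is one term."""
    return set(values)


def facet(documents, field, to_terms):
    """Count, per term, how many documents contain it."""
    counts = Counter()
    for doc in documents:
        counts.update(to_terms(doc.get(field, [])))
    return counts


print(facet(docs, "f_material", exact_terms))
print(facet(docs, "f_material", word_terms))
```

Faceting on exact terms keeps "Material A" and "Material B" whole, while faceting on word terms produces separate counts for "material", "a" and "b".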

Onur

Ivan · April 30, 2013, 4:27pm

To me, it sounds like the processing only needs to happen once, at index time,
unless you need the original value to be returned. Your client-side
indexing process can split the values on commas itself, and you can then
index the resulting tokens as not analyzed. Alternatively, you can set up an
index analyzer that knows how to split the text, and set your search analyzer
to a simple KeywordAnalyzer (equivalent to using not_analyzed).
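
Both options can be sketched as follows. The first part splits comma-delimited strings on the client before indexing; the second builds index settings with a `pattern` tokenizer and a field mapping that pairs it with the built-in `keyword` analyzer at search time. The names `split_commas`, `prepare`, `comma` and `comma_separated` are hypothetical, and `index_analyzer` / `search_analyzer` follow the mapping syntax of the Elasticsearch versions current at the time of this thread, so check them against your version.

```python
import json
import re


def split_commas(value):
    """Split a comma-delimited string into trimmed, non-empty tokens."""
    return [part.strip() for part in value.split(",") if part.strip()]


def prepare(doc, prefix="f_"):
    """Copy doc, turning comma-delimited strings in prefixed fields into arrays."""
    return {
        key: split_commas(val) if key.startswith(prefix) and isinstance(val, str) else val
        for key, val in doc.items()
    }


# Option 2: let the index analyzer split on commas (names are illustrative).
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {"comma": {"type": "pattern", "pattern": "\\s*,\\s*"}},
            "analyzer": {"comma_separated": {"type": "custom", "tokenizer": "comma"}},
        }
    }
}
field_mapping = {
    "type": "string",
    "index_analyzer": "comma_separated",
    "search_analyzer": "keyword",
}

print(prepare({"name": "Product A", "f_material": "Material A, Material B"}))
print(json.dumps(index_settings, indent=2))
# Local approximation of the tokenizer's split pattern, using Python's re module.
pattern = index_settings["settings"]["analysis"]["tokenizer"]["comma"]["pattern"]
print(re.split(pattern, "Material A , Material B"))
```

The client-side route keeps the mapping simple (plain not_analyzed fields), while the analyzer route keeps the indexing client unchanged.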

--
Ivan
