Process string when storing in index

Hi all,

I'm using Elasticsearch as a backend for a website, where documents coming
from a Content Management System are stored in Elasticsearch.

Now I'm looking into a mapping scheme from documents to friendly URLs and
back. For instance, a document having title 'G3PH industrial high-power
solid state relay' should map to
'g3ph-industrial-high-power-solid-state-relay'. That name can then be used
in a URL. Using that URL, the correct document should be fetched from
Elasticsearch and shown on a page.

What would be the best way to implement this?

One solution would be to transform the title field and store it in a
multi_field upon storing a document in the index.
Should I create a custom analyzer for this? Use a scripting field?

Cheers,
Jippe Holwerda

You mean you want to store the field differently? Index it differently? Or,
perhaps, fetch it differently when executing a search?

Well, I want to be able to search on the friendly URL, so it definitely
should be indexed. That's why I think fetching it differently when
executing a search is too late.

Ideally, I would specify a multi_field in the mapping with 3 subfields: one
where the title is analyzed, one where it is not analyzed, and one where I
can specify a custom mapping function (JavaScript?) that takes the title as
input.

The alternative would be to simply make sure the necessary information is
already present in the JSON and not analyze it. But I was wondering whether
I could solve it in Elasticsearch as well.
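
For reference, the alternative mentioned above (a precomputed value that is
indexed without analysis) could look roughly like the sketch below. It is
written as plain Python dictionaries so that it does not depend on any client
library. The type name "page" and the field name "slug" are hypothetical, and
the multi_field / not_analyzed syntax is the 2012-era mapping format as best
recalled; newer Elasticsearch releases use a different syntax.

```python
# Sketch of a legacy (2012-era) mapping. "page" and "slug" are hypothetical
# names chosen for illustration, not taken from the thread.
mapping = {
    "page": {
        "properties": {
            "title": {
                "type": "multi_field",
                "fields": {
                    # Analyzed variant, used for full-text search.
                    "title": {"type": "string", "index": "analyzed"},
                    # Verbatim variant, indexed as a single token.
                    "untouched": {"type": "string", "index": "not_analyzed"},
                },
            },
            # Friendly URL name, supplied by the client and indexed as-is.
            "slug": {"type": "string", "index": "not_analyzed"},
        }
    }
}


def slug_query(slug: str) -> dict:
    """Build an exact-match (term) query body for the given slug."""
    return {"query": {"term": {"slug": slug}}}


query = slug_query("g3ph-industrial-high-power-solid-state-relay")
```

Because the slug is not analyzed, a term query matches it only as a whole,
which is the behaviour wanted when resolving a URL back to one document.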

There isn't an option to define a field that indexes the result of a
script. You could write a plugin that has a custom type, but the simplest
option is to preprocess the JSON before indexing.
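
A minimal sketch of that preprocessing step, assuming it happens in Python on
the client side before the document is sent to Elasticsearch. The helper
names slugify and add_slug and the field name "slug" are made up for this
example; the transformation rule (lowercase, hyphens between words) is
inferred from the single title/URL pair given in the original question.

```python
import json
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated URL name."""
    # Decompose accented characters and drop the non-ASCII remainder,
    # so that "é" becomes "e" instead of vanishing entirely.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with one hyphen,
    # then trim hyphens left over at either end.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def add_slug(document: dict) -> dict:
    """Return a copy of the document with a 'slug' derived from 'title'."""
    enriched = dict(document)
    enriched["slug"] = slugify(document["title"])
    return enriched


doc = add_slug({"title": "G3PH industrial high-power solid state relay"})
body = json.dumps(doc)  # JSON string that would be sent to the index API
```

Note that two different titles can produce the same slug, so this sketch
assumes titles are unique enough for the site, or that the caller resolves
collisions before indexing.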

Ok, that's clear. Thanks for your answer.
