Which index and store attributes for a string field in an explicit mapping?

Hello,
I'm writing this post because I am a new ES user and I am not sure whether I correctly understand the index and store attributes of string fields in an explicit mapping.

I want to have an ES index which contains a list of internet sites.
The document type "site" has 3 fields: url, content, inner_note

I will be searching for documents which contain given phrases in the "content" field.
I will be retrieving a single document which has a particular url in the "url" field.
The field "inner_note" is only for my internal use and I will not use it for searching/retrieving documents.

I prepared the following mapping:

"site" : {
    "properties" : {
        "url" : {"type" : "string", "store" : "no", "index" : "not_analyzed"},
        "content" : {"type" : "string", "store" : "yes", "index" : "analyzed"},
        "inner_note" : {"type" : "string", "store" : "no", "index" : "no"}
    }
}

and I have the following questions:

  1. Have I chosen optimal "store" and "index" attributes for my scenario?
  2. Is retrieving a document by the url field as fast as retrieving it by ID? Compared to the traditional SQL version: if I wanted to retrieve a single row from an SQL table with WHERE url = ?, I would create an SQL index on the url column.

I would be grateful for any help!

Best regards

Hello,

On Thu, Jul 4, 2013 at 1:15 PM, Mithrawnuroudo mojcalyspam@gmail.com wrote:

> Hello,
> I'm writing this post because I am a new ES user and I am not sure whether I
> correctly understand the index and store attributes of string fields in an
> explicit mapping.

The "index" attribute tells ES how to index your field:

  • "analyzed" will break it into terms, so by default you'll be able to
    search, for example, for a single word in that field
  • "not_analyzed" will just index your field as a single term, so only exact
    matches will return results
  • "no" won't index your field at all, making your field unsearchable

"store" tells ES to store the contents of your field or not.
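
To make the three "index" values more concrete, here is a toy Python model of which terms a string field would contribute to the inverted index. It is only a sketch: the function names are made up for illustration, and the lowercase-and-split step is a crude stand-in for a real analyzer, not what ES/Lucene actually runs.

```python
import re

def indexed_terms(value, index_mode):
    """Toy model of the terms a string field contributes to the inverted index."""
    if index_mode == "analyzed":
        # Crude stand-in for an analyzer: lowercase, then split on non-alphanumerics.
        return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]
    if index_mode == "not_analyzed":
        # The whole value is indexed as one term, so only exact matches hit.
        return [value]
    if index_mode == "no":
        # Nothing is indexed, so the field cannot be searched.
        return []
    raise ValueError("unknown index mode: %r" % index_mode)

def matches(term, value, index_mode):
    """True if a single-term lookup would find the field value (hypothetical helper)."""
    return term in indexed_terms(value, index_mode)

print(indexed_terms("Fast search engine", "analyzed"))
print(matches("http://example.com/a", "http://example.com/a", "not_analyzed"))
print(matches("example", "http://example.com/a", "not_analyzed"))
print(matches("note", "internal note", "no"))
```

In this model a url lookup only succeeds on the exact string, which is the behaviour described for "not_analyzed", while a field with "index" : "no" never matches anything.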

However, there are two more things to consider here:

> I want to have an ES index which contains a list of internet sites.
> The document type "site" has 3 fields: url, content, inner_note
>
> I will be searching for documents which contain given phrases in the
> "content" field.
> I will be retrieving a single document which has a particular url in the
> "url" field.
> The field "inner_note" is only for my internal use and I will not use it
> for searching/retrieving documents.

> I prepared the following mapping:
>
> "site" : {
>     "properties" : {
>         "url" : {"type" : "string", "store" : "no", "index" : "not_analyzed"},
>         "content" : {"type" : "string", "store" : "yes", "index" : "analyzed"},
>         "inner_note" : {"type" : "string", "store" : "no", "index" : "no"}
>     }
> }

> and I have the following questions:
>
>   1. Have I chosen optimal "store" and "index" attributes for my scenario?

It should work. I'm not sure about the "optimal" part. I wouldn't store the
"content" field, because it can be retrieved anyway, by using _source.
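
As a sketch of what that suggestion could look like, the snippet below builds the same mapping as a Python dict with "store" set to "no" on "content" and prints it as JSON. It assumes _source is left enabled for the type, so the original document (including "content") still comes back with each hit; the variable name is only illustrative.

```python
import json

# Same three fields as in the question; only "content" changes ("store": "yes" -> "no").
# Assumption: _source stays enabled, so "content" is still returned with the document.
site_mapping = {
    "site": {
        "properties": {
            "url": {"type": "string", "store": "no", "index": "not_analyzed"},
            "content": {"type": "string", "store": "no", "index": "analyzed"},
            "inner_note": {"type": "string", "store": "no", "index": "no"},
        }
    }
}

print(json.dumps(site_mapping, indent=4))
```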

>   2. Is retrieving a document by the url field as fast as retrieving it by ID?

Getting by ID is faster, because it doesn't imply a search (even though
your url searches should be fast enough, because you have not_analyzed so
you have a low number of terms). Plus, getting is realtime, as described
here:

> Compared to the traditional SQL version: if I wanted to retrieve a single row
> from an SQL table with WHERE url = ?, I would create an SQL index on the url column.

In the case of ES, the fields are indexed by default. You have to say
index: no if you want it otherwise.
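
To put the two access paths side by side, here is a rough Python sketch that only builds the request descriptions (it does not talk to a cluster). The index name "sites" and both helper functions are assumptions for illustration; the term query on the not_analyzed "url" field plays the role of WHERE url = ?, while the get-by-ID path skips the search entirely.

```python
import json
from urllib.parse import quote

INDEX = "sites"    # hypothetical index name, not from the original post
DOC_TYPE = "site"  # document type from the mapping above

def get_by_id_request(doc_id):
    """Describe a direct (realtime) get of one document by its ID."""
    return {"method": "GET",
            "path": "/%s/%s/%s" % (INDEX, DOC_TYPE, quote(doc_id, safe=""))}

def search_by_url_request(url):
    """Describe an exact-match search on the not_analyzed url field."""
    return {"method": "POST",
            "path": "/%s/%s/_search" % (INDEX, DOC_TYPE),
            "body": {"query": {"term": {"url": url}}, "size": 1}}

print(json.dumps(get_by_id_request("42")))
print(json.dumps(search_by_url_request("http://example.com/")))
```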

> I would be grateful for any help!

I hope this helps :)

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene
