Term and string


(Jason Wee) #1

Can anybody explain the difference between a term and a string in the
Elasticsearch context?

When we index using the default mapping
(http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html),
the default type is string.

But when we query, we use the word term
(http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html)
instead of string?

I googled the Lucene documentation, where a term is defined as:

A query is broken up into terms and operators. There are two types of
terms: Single Terms and Phrases. A Single Term is a single word such as
"test" or "hello". A Phrase is a group of words surrounded by double quotes
such as "hello dolly".

but it makes no mention of string.

https://lucene.apache.org/core/4_10_3/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description

Thank you.

Jason

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/df898132-f7f8-4476-8a81-21e3891dfb1a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Doug Turnbull) #2

A term in a purely technical sense is an entry in the inverted index.
Technically it is a very low-level entity.

For example, if you tokenized and analyzed doc1: "Dougie Turnbull" using
the English analyzer (which stems words to root forms, lowercases, etc.),
you'd get an inverted index that looks something like:

doug
    document: 1
    position: 0
    freq: 1
turnbul
    document: 1
    position: 1
    freq: 1
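
A minimal Python sketch of how a listing like the one above could be built. The stem table is a hand-written stand-in chosen to reproduce this example, not the real English analyzer, and `toy_analyze` and `build_index` are hypothetical names used only for illustration:

```python
from collections import defaultdict

# Hand-written stand-in for stemming; NOT the real English analyzer.
TOY_STEMS = {"dougie": "doug", "turnbull": "turnbul"}


def toy_analyze(text):
    """Split on whitespace, lowercase, then apply the toy stem table."""
    return [TOY_STEMS.get(tok, tok) for tok in text.lower().split()]


def build_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(toy_analyze(text)):
            index[term].setdefault(doc_id, []).append(position)
    return index


index = build_index({1: "Dougie Turnbull"})
for term in sorted(index):
    for doc_id, positions in index[term].items():
        # freq is the number of times the term occurs in that document
        print(term, "document:", doc_id, "positions:", positions,
              "freq:", len(positions))
```

The point of the sketch is only that the keys of the index are the analyzed terms, not the original words.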

A "term query" therefore directly accesses terms. It's a bit of a low-level
concern. You'd have to query "doug" directly even though the original text
said "dougie".
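
A rough Python sketch of that difference, assuming a toy index and a hand-written stem table rather than anything Elasticsearch actually runs. `term_lookup` and `analyzed_lookup` are hypothetical helpers: the first behaves like a term query (no analysis of the input), the second stands in for a query that analyzes its input before looking terms up:

```python
# Hand-written stand-in for the analyzer; NOT Elasticsearch's English analyzer.
TOY_STEMS = {"dougie": "doug", "turnbull": "turnbul"}

# Term -> ids of documents containing it, matching the example index.
INDEX = {"doug": {1}, "turnbul": {1}}


def toy_analyze(text):
    """Lowercase, split on whitespace, apply the toy stem table."""
    return [TOY_STEMS.get(tok, tok) for tok in text.lower().split()]


def term_lookup(term):
    """Look the term up exactly as given, with no analysis."""
    return INDEX.get(term, set())


def analyzed_lookup(text):
    """Analyze the input first, then collect documents for each term."""
    hits = set()
    for term in toy_analyze(text):
        hits |= INDEX.get(term, set())
    return hits


print(term_lookup("dougie"))      # set()  (no such entry in the index)
print(term_lookup("doug"))        # {1}
print(analyzed_lookup("Dougie"))  # {1}
```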

However, people loosely use the phrase "search term" to mean the words
someone enters into a search bar.

"string" is a concept that just reflects the text being analyzed, e.g.
"Dougie Turnbull". This type lives at the Elasticsearch level and is a peer
of integers, floats, doubles, etc. The type dictates how Elasticsearch
understands the value passed from the client and converts it to the
inverted index structure above. A string type will be analyzed, picked
apart into terms, etc., based on the associated analyzer. Other types, like
numeric types, have other low-level magic that helps convert them to the
inverted index data structure.
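
To make the "type lives in the mapping" point concrete, here is a small sketch written as a Python dict in the style of the mappings from the Elasticsearch versions current at the time of this thread. The field names `name` and `age` are made up, and `analyzed_fields` is a hypothetical helper that only inspects the dict; it deliberately ignores per-field options that switch analysis off:

```python
# Illustrative mapping; field names are assumptions, not from the thread.
mapping = {
    "properties": {
        "name": {"type": "string", "analyzer": "english"},
        "age": {"type": "integer"},
    }
}


def analyzed_fields(mapping):
    """Return the names of fields whose values would go through an analyzer.

    Simplification: every "string" field is treated as analyzed.
    """
    return sorted(
        field
        for field, spec in mapping["properties"].items()
        if spec["type"] == "string"
    )


print(analyzed_fields(mapping))  # ['name']
```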

Hope that helps,
-Doug

--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections,
LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Taming Search http://manning.com/turnbull from Manning
Publications
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.



(Jason Wee) #3

Yep, that helps, thanks Doug! :)



(Jason Wee) #4

Some of the terminology is explained in the glossary at this link:
http://www.elastic.co/guide/en/elasticsearch/reference/0.90/glossary.html


