No terms generated for trigram analyzer


(Andreas Falk) #1

Hey,

I'm trying to get a trigram analyzer working, but I'm fairly sure I'm doing
something wrong: as far as I can tell, it doesn't generate any terms at all
for my document. I've put a complete log of the curl commands I'm running
here: https://gist.github.com/luuse/cb707b85c73f8e82cd8d

  1. I create the index and, in the same request, add the analyzer and a
    mapping for all fields in my document. The body I send is in
    mapping.json and the response is in create.json.
  2. I index the document in url.json and get the response in index.json.
  3. I fetch the term vector; the response is in termvector.json.
  4. I query with "jen" and the trigrams analyzer, expecting it to match
    "jenkins", but get no results.
  5. I query with "jen*", still with the trigrams analyzer, and get the
    jenkins result.

So I have two questions:

a. When I fetch the term vector it looks empty. Is this correct?
b. Have I missed some detail, or what am I doing wrong? Why isn't it working?

I can provide more details if you want. I'm running v1.2.1 in a Docker
container.

Cheers
Andreas

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e4fef398-0941-4471-8efa-a97878fcb210%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Cédric Hourcade) #2

Hello,

You are performing a URI search, which by default searches the _all
field. In your case that field doesn't use your trigrams analyzer
at all.

You could either pass an explicit query: {"query": {...} }, or
specify which field you want to match:
curl -XGET 'http://localhost:9200/urls/_search?q=jen&analyzer=trigrams&pretty=true&df=title'
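
As a rough sketch of those two options, the Python below only builds the
requests and does not send them. The host, the index "urls", the field
"title" and the analyzer "trigrams" come from this thread; the helper names
are made up for illustration, and the match query body is an assumption
about what the explicit {"query": {...}} form might look like here.

```python
import json
from urllib.parse import urlencode

# Host and index as used in the curl commands in this thread.
BASE = "http://localhost:9200/urls/_search"


def uri_search_url(text, field, analyzer="trigrams"):
    """Option 1: URI search with an explicit default field (df)."""
    params = {"q": text, "analyzer": analyzer, "df": field, "pretty": "true"}
    return BASE + "?" + urlencode(params)


def match_query_body(text, field, analyzer="trigrams"):
    """Option 2: explicit query body that targets one field (assumed shape)."""
    return {"query": {"match": {field: {"query": text, "analyzer": analyzer}}}}


print(uri_search_url("jen", "title"))
print(json.dumps(match_query_body("jen", "title"), indent=2))
```

The printed body could then be sent to the same _search endpoint with
curl -d. Either way, the point is that the query names the field instead of
falling back to _all.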

I think it works for "jen*" because it's converted into a wildcard query.

For the termvectors, you have to enable them in your mapping:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
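
A minimal sketch of what enabling term vectors on a string field might look
like, written as a Python dict that prints the JSON mapping body. The type
name "url" is hypothetical (the thread only names the index "urls", the
field "title" and the analyzer "trigrams"), and the analyzer is assumed to
be defined in the index settings already.

```python
import json

# Hypothetical type name "url"; "title" and "trigrams" are taken from the thread.
mapping = {
    "mappings": {
        "url": {
            "properties": {
                "title": {
                    "type": "string",
                    "analyzer": "trigrams",
                    # Store term vectors so they can be fetched for this field.
                    "term_vector": "yes",
                }
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```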

Cédric Hourcade
ced@wal.fr



(Andreas Falk) #3

Hey,

Yeah, that was it. Thanks!

Andreas



(Andreas Falk) #4

I understand that my fields aren't in the _all field and that's why my
query fails. But shouldn't they be included in _all by default? According to
the documentation, "index" defaults to "analyzed" and "include_in_all"
defaults to "true":
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string

In any case, when I set both of those explicitly it still doesn't return
anything when I query with:
curl -XGET 'http://localhost:9200/urls/_search?q=jen&analyzer=trigrams&pretty=true'

I updated the gist with the explicit index and include_in_all settings, as
reported back by Elasticsearch:
https://gist.github.com/luuse/cb707b85c73f8e82cd8d#file-new_mapping-json



(Cédric Hourcade) #5

In fact they are in the _all field, but not analyzed with your
trigrams analyzer.
Cédric Hourcade
ced@wal.fr
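
To illustrate the point about _all, here is a simplified model in plain
Python. It is not Elasticsearch's actual analysis chain: trigrams() is a
rough stand-in for a trigram analyzer and whole_words() for whatever
word-level analyzer _all uses. Both function names are made up, and the
document text "jenkins" is taken from the original question.

```python
import re


def whole_words(text):
    """Lowercased whole-word terms, a stand-in for a word-level analyzer."""
    return set(re.findall(r"\w+", text.lower()))


def trigrams(text):
    """Character 3-grams of each word, a stand-in for a trigram analyzer."""
    return {w[i:i + 3] for w in whole_words(text) for i in range(len(w) - 2)}


doc = "jenkins"
query_terms = trigrams("jen")  # the query side is analyzed with trigrams

# Field indexed with trigrams: the query term "jen" is among the indexed terms.
print(query_terms & trigrams(doc))

# _all indexed as whole words: the only term is "jenkins", so nothing matches.
print(query_terms & whole_words(doc))

# A prefix/wildcard lookup such as "jen*" still finds the whole-word term.
print(any(term.startswith("jen") for term in whole_words(doc)))
```

Under this model the query-time analyzer alone is not enough: the terms it
produces have to exist in the index of the field being searched.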



(Andreas Falk) #6

Ok, thanks again for the help!



(system) #7