I have data stored in ES (1.4.2) as follows.
1st document:
{
"col1":"123","col2":"tag1,tag2,tag4"
}
2nd document:
{
"col1":"333","col2":"tag1,tag4,tag5"
}
3rd document:
{
"col1":"111","col2":"tag1,tag1,tag5,tag5"
}
Now, when I search for tag1 through the search API, it returns a count
of 3, whereas I am looking for 4, since the 3rd document contains tag1
twice.
I am generating the stats data via Kibana (3.1.2), so that is the client
making the calls to ES.
Is there an analyzer or tokenizer that I should be using when creating the
index?
I am not using any special tokenizer or other settings when creating the
index; both col1 and col2 are indexed.
I don't think you can aggregate on substrings within a single field, since
a normal aggregation counts matching documents, not the number of times a
term occurs inside one document. With nested (or parent/child) documents,
you have a separate document for each tag that you can aggregate against.
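To make the nested suggestion concrete, here is a minimal sketch of what it might look like. It only builds the request bodies and checks the expected counts locally; it does not talk to a cluster. The type name "doc" and the field names "tags" and "name" are assumptions made for illustration and are not from the original post. The `nested` mapping type and the `nested` and `terms` aggregations exist in ES 1.x, but treat the exact bodies as a starting point to verify against your own index.

```python
import json
from collections import Counter

# Sample documents from the question.
SAMPLE_DOCS = [
    {"col1": "123", "col2": "tag1,tag2,tag4"},
    {"col1": "333", "col2": "tag1,tag4,tag5"},
    {"col1": "111", "col2": "tag1,tag1,tag5,tag5"},
]

# Assumed mapping: "doc", "tags" and "name" are hypothetical names.
# not_analyzed keeps each tag as a single exact term (ES 1.x string syntax).
NESTED_MAPPING = {
    "mappings": {
        "doc": {
            "properties": {
                "col1": {"type": "string", "index": "not_analyzed"},
                "tags": {
                    "type": "nested",
                    "properties": {
                        "name": {"type": "string", "index": "not_analyzed"}
                    },
                },
            }
        }
    }
}

# Assumed search body: a terms aggregation inside a nested aggregation
# counts nested tag objects rather than top-level documents.
TAG_COUNT_QUERY = {
    "size": 0,
    "aggs": {
        "tags": {
            "nested": {"path": "tags"},
            "aggs": {"by_name": {"terms": {"field": "tags.name"}}},
        }
    },
}


def to_nested(source):
    """Split the comma-separated col2 string into one nested object per tag."""
    tags = [t.strip() for t in source["col2"].split(",") if t.strip()]
    return {"col1": source["col1"], "tags": [{"name": t} for t in tags]}


def simulate_nested_counts(docs):
    """Local stand-in for the aggregation: count nested tag objects per name."""
    return Counter(tag["name"] for doc in docs for tag in doc["tags"])


if __name__ == "__main__":
    nested_docs = [to_nested(s) for s in SAMPLE_DOCS]
    print(json.dumps(NESTED_MAPPING, indent=2))
    print(json.dumps(TAG_COUNT_QUERY, indent=2))
    print(simulate_nested_counts(nested_docs))
```

The duplicates are kept on purpose when splitting col2, so the third document contributes two tag1 objects and the per-tag total for tag1 comes out as 4 instead of 3. Whether Kibana 3 can display a nested aggregation is a separate question that this sketch does not address.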
On Wednesday, January 14, 2015 at 11:35:07 AM UTC-8, Bhumir Jhaveri wrote:
[original post quoted in full; see above]
Is there any alternative, since this particular field will be accessed
from Kibana? I don't think Kibana supports such an aggregation, or, to put
it better, maybe I have not explored it enough.
On Wednesday, January 14, 2015 at 12:00:17 PM UTC-8, Ed Kim wrote:
[reply and original post quoted in full; see above]