Hi folks, I've been trying to figure out the default analyzer for
'_all'. At first I assumed it would be the standard analyzer, but my
testing shows that's not the case (stop words are kept?!). From what I
can tell, it appears to be using the standard tokenizer with a
lowercase filter.
I would have tried to find out more myself, but you can't query the
mapping and see the result for yourself (nothing shows up for '_all').
Does anybody have any more info? Is it using one of the provided
analyzers?
thanks,
-jf
--
He who settles on the idea of the intelligent man as a static entity
only shows himself to be a fool.
Mensan / Full-Stack Technical Polymath / System Administrator
12 years over the entire web stack: Performance, Sysadmin, Ruby and Frontend
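For what it's worth, the token stream can be inspected directly with the _analyze API instead of going through the mapping. A minimal sketch, assuming a node reachable on localhost:9200 and the query-string form of the API (the exact parameter names may differ by version):

```shell
# Analyze sample text with the index's default analyzer:
curl -s 'localhost:9200/_analyze?pretty' -d 'The Quick Brown Fox'

# Compare against an explicit standard tokenizer + lowercase filter:
curl -s 'localhost:9200/_analyze?tokenizer=standard&filters=lowercase&pretty' \
  -d 'The Quick Brown Fox'
```

If the two token streams match (including stop words being kept), that would be consistent with the behavior described above; note that Elasticsearch's standard analyzer defaults to an empty stopword list, unlike Lucene's StandardAnalyzer.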
My cluster has one combined master/data node and three data nodes.
Each has a 6 GB heap (nearly half of the machine's RAM). I have 250
shards plus replicas and about 600 GB of data in total. After I start
the cluster, I can use it for about a week without any problem. Then
it begins to fail due to low free memory (below 10%). When I restart
all the nodes, everything is fine again and free memory goes back up
to 40%, but it fails again a week after the restart.
I think some data is staying in memory for a long time even when it is
not being used. Is there any configuration to optimize this? Do I need
to flush indices or clear caches periodically?
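One thing worth checking before restarting is which caches are actually holding the heap. A quick sketch, assuming a node on localhost:9200 (endpoint names from the REST API of that era):

```shell
# Inspect per-node index memory usage (field data, filter cache, etc.):
curl -s 'localhost:9200/_nodes/stats/indices?pretty'

# Clear the caches across all indices and re-check the stats:
curl -s -XPOST 'localhost:9200/_cache/clear?pretty'
```

If the field-data cache dominates, that would explain memory growing with query traffic rather than with indexing; clearing it is a stopgap, and bounding it in configuration is usually the longer-term fix.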