Analyzing SpanNearQuery

Niko · June 29, 2011, 8:27am

Hi everyone,

I have a question regarding SpanNearQueries. As far as I can see, term
queries are not analyzed before search, but are SpanNearQueries? Is there a
way to analyze the SpanTermQueries inside a SpanNearQuery using the
analyzer defined in the field mapping?

Thanks in advance.

Cheers,
Niko

Clinton_Gormley · June 29, 2011, 10:25am

Hi Niko

I have a question regarding SpanNearQueries. As I saw Term Queries
were not analyzed before search, but are SpanNearQueries? Is there a
way to analyze the SpanTermQueries inside of SpanNearQueries using the
analyzer defined in the field mapping.

Span queries are not analyzed either.

clint

Niko · June 29, 2011, 10:28am

Hi Clint,

Okay, is there a workaround? We would like to have analyzed terms
within the SpanQuery. Or is the only way to analyze by hand and then put
the terms back into the SpanQuery?

Niko


Clinton_Gormley · June 29, 2011, 10:36am

Hi Niko

okay, is there a way to workaround? We would like to have analyzed
terms within the SpanQuery, or is the only way to analyze by hand and
put the terms then back into the SpanQuery?

Yes, I'm afraid that's the only way.

One way that you could approach it is to use the 'analyze' API. So you
pass your text to ES, get the analyzed results back, then use those.

Not sure it is the most efficient way of doing it though.
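To illustrate the second half of that workaround, here is a minimal sketch that takes tokens that have already been analyzed (for example, the tokens returned by the analyze API) and puts them back into a span_near query body, one span_term clause per token. The class name, the field name "body", and the slop and in_order values are assumptions chosen for the example; the call to the analyze API itself is not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SpanNearFromTokens {

    // Escape a string so it can be placed inside a JSON string literal.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Build a search body whose span_near query has one span_term clause
    // per pre-analyzed token. Nothing here analyzes the tokens; they are
    // used exactly as given.
    public static String spanNear(String field, List<String> tokens, int slop, boolean inOrder) {
        String clauses = tokens.stream()
                .map(t -> "{\"span_term\":{\"" + jsonEscape(field) + "\":\"" + jsonEscape(t) + "\"}}")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"span_near\":{\"clauses\":[" + clauses + "],"
                + "\"slop\":" + slop + ",\"in_order\":" + inOrder + "}}}";
    }

    public static void main(String[] args) {
        // Hypothetical tokens, as an analyzer might return them for the input text.
        List<String> analyzed = Arrays.asList("jeff", "qit", "fish");
        System.out.println(spanNear("body", analyzed, 0, true));
    }
}
```

The cost is one extra round trip per query for the analysis step, which is the efficiency concern raised above.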

clint

Niko · June 29, 2011, 10:39am

Hi,

OK, thanks. That's what we thought we would have to do.

Niko


xam · December 16, 2016, 10:36am

Hi all,

I just found this forum thread which is related to my problem.
Is there anything new in Elasticsearch that avoids the workaround proposed by Clinton? As far as I know, there is not.
Maybe an improvement on the workaround would be a plugin on the ES side that does the analysis, which would avoid a call to the analyze API from the client side. Would that be a correct approach?

Thanks,

Ivan · December 16, 2016, 10:39pm

If you are using Java, I used to use the analysis service locally using the
TransportClient. Not sure that is even possible anymore, since so many useful
workarounds have been closed off. The issue does not stem from Elasticsearch
but from Lucene, since that is where the limitation is.

xam · December 29, 2016, 8:45am

Thank you,

I'll try this approach.

Ivan · December 30, 2016, 8:40pm

It should be clear that when I said the issue is in Lucene, I mean the need
to pre-analyze tokens.
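A toy illustration of why the tokens have to be pre-analyzed: a term-level clause is compared against the indexed tokens exactly as written, so a raw term only matches if it already looks like what the analyzer produced at index time. The sketch below does not use Lucene or Elasticsearch at all; the whitespace-and-lowercase analyzer and the class name are stand-ins assumed for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PreAnalyzeDemo {

    // Toy analyzer (an assumption for this sketch): split on whitespace, lowercase.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Exact comparison against the indexed tokens, with no analysis of the term.
    public static boolean termMatches(List<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Jeff Quit Phish");

        // The raw term keeps its capital letter and finds nothing.
        System.out.println(termMatches(indexed, "Quit"));                 // false

        // The same term, run through the same analyzer first, matches.
        System.out.println(termMatches(indexed, analyze("Quit").get(0))); // true
    }
}
```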

It used to be easy to create your own AnalysisService locally (if using the
Java client), but it might have gotten more difficult with each new version of
Elasticsearch. I have not used SpanQueries since 1.x. Just look at any unit
test for the analysis chain, such as CharFilterTests:

2.4:

https://github.com/elastic/elasticsearch/blob/2.4/core/src/test/java/org/elasticsearch/index/analysis/CharFilterTests.java#L60

            // ... (excerpt begins partway through the Settings builder chain)
            .putArray("index.analysis.analyzer.custom_with_char_filter.char_filter", "my_mapping")
            .put("path.home", createTempDir().toString())
            .build();
    Injector parentInjector = new ModulesBuilder().add(new SettingsModule(settings), new EnvironmentModule(new Environment(settings))).createInjector();
    Injector injector = new ModulesBuilder().add(
            new IndexSettingsModule(index, settings),
            new IndexNameModule(index),
            new AnalysisModule(settings, parentInjector.getInstance(IndicesAnalysisService.class)))
            .createChildInjector(parentInjector);

    AnalysisService analysisService = injector.getInstance(AnalysisService.class);

    NamedAnalyzer analyzer1 = analysisService.analyzer("custom_with_char_filter");

    assertTokenStreamContents(analyzer1.tokenStream("test", "jeff quit phish"), new String[]{"jeff", "qit", "fish"});

    // Repeat one more time to make sure that char filter is reinitialized correctly
    assertTokenStreamContents(analyzer1.tokenStream("test", "jeff quit phish"), new String[]{"jeff", "qit", "fish"});
}

5.0:

https://github.com/elastic/elasticsearch/blob/5.0/core/src/test/java/org/elasticsearch/index/analysis/CharFilterTests.java#L44

public void testMappingCharFilter() throws Exception {
    Settings settings = Settings.builder()
            .put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT)
            .put("index.analysis.char_filter.my_mapping.type", "mapping")
            .putArray("index.analysis.char_filter.my_mapping.mappings", "ph=>f", "qu=>q")
            .put("index.analysis.analyzer.custom_with_char_filter.tokenizer", "standard")
            .putArray("index.analysis.analyzer.custom_with_char_filter.char_filter", "my_mapping")
            .put(Environment.PATH_HOME_SETTING.getKey(), createTempDir().toString())
            .build();
    IndexSettings idxSettings = IndexSettingsModule.newIndexSettings("test", settings);
    AnalysisService analysisService = createAnalysisService(idxSettings, settings);
    NamedAnalyzer analyzer1 = analysisService.analyzer("custom_with_char_filter");

    assertTokenStreamContents(analyzer1.tokenStream("test", "jeff quit phish"), new String[]{"jeff", "qit", "fish"});

    // Repeat one more time to make sure that char filter is reinitialized correctly
    assertTokenStreamContents(analyzer1.tokenStream("test", "jeff quit phish"), new String[]{"jeff", "qit", "fish"});
}

master:
https://github.com/elastic/elasticsearch/blob/master/core/src/test/java/org/elasticsearch/index/analysis/CharFilterTests.java#L42

In 2.x, all you needed was the injector. Now it seems that you cannot even
create the AnalysisService (have not fully explored the new code).

Cheers,

Ivan
