I have two documents in my index: "I want to play football" and "I want to play basketball".
Using a shingle token filter, I want to analyze these documents into tokens like this:
I
I want
I want to
Want
Want to
Want to play etc.
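To make the expected token list concrete, here is a minimal sketch of how shingles could be built from an already tokenized sentence. It is not the Elasticsearch or Lucene implementation; `ShingleSketch` and `Shingles` are hypothetical names used only for illustration, and the parameters mirror the `MinShingleSize`, `MaxShingleSize`, `OutputUnigrams` and `TokenSeparator` settings from the index definition further down. It assumes a minimum shingle size of at least 2.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of NEST or Elasticsearch.
public static class ShingleSketch
{
    // Builds word n-grams ("shingles") of minSize..maxSize tokens for each start position.
    // Assumes minSize >= 2; single tokens are emitted only when outputUnigrams is true.
    public static List<string> Shingles(
        IList<string> tokens, int minSize, int maxSize,
        string separator, bool outputUnigrams)
    {
        var result = new List<string>();
        for (int start = 0; start < tokens.Count; start++)
        {
            if (outputUnigrams)
            {
                result.Add(tokens[start]);
            }

            // Grow the shingle until maxSize or the end of the token list is reached.
            for (int size = minSize; size <= maxSize && start + size <= tokens.Count; size++)
            {
                var parts = new string[size];
                for (int i = 0; i < size; i++)
                {
                    parts[i] = tokens[start + i];
                }
                result.Add(string.Join(separator, parts));
            }
        }
        return result;
    }
}
```

Calling `ShingleSketch.Shingles("i want to play football".Split(' '), 2, 3, "-", true)` in this sketch produces tokens in the order i, i-want, i-want-to, want, want-to, want-to-play, and so on. Note that the parts are joined with the separator passed in, so with a separator of "-" the tokens contain "-" rather than a space.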
I then want to search with a query string such as play\ foot*.
I expected this query to return one record, but it returned two, because Elasticsearch executed the query string as play OR foot*. The escape character does not seem to work in this situation.
I am using the NEST library on the .NET Framework. My code is:
var node = new Uri("http://localhost:9200");
const string indexName = "exampleforshingleanalyzer";
var settings = new ConnectionSettings(
    node,
    defaultIndex: indexName
);
Console.WriteLine("Program Starting...");
var client = new ElasticClient(settings);
client.DeleteIndex(t => t.AllIndices());
var analyzerSettings = new CustomAnalyzer();
analyzerSettings.Filter = new List<string>();
analyzerSettings.Filter.Add("shingle_token_filter");
analyzerSettings.Tokenizer = "standard";
client.CreateIndex(indexName, t =>
    t.Analysis(a =>
        a.TokenFilters(token => token.Add("shingle_token_filter", new ShingleTokenFilter
        {
            MaxShingleSize = 3,
            MinShingleSize = 2,
            OutputUnigrams = true,
            TokenSeparator = "-"
        }))
        .Analyzers(an => an.Add("shingle_analyzer", analyzerSettings)))
    .NumberOfReplicas(1)
    .NumberOfShards(1)
    .AddMapping<FtsResult>(c => c.MapFromAttributes()
        .IgnoreConflicts()
        .Type("ftsresult")
        .Index(indexName)
        .Properties(p => p
            .String(s => s.Name(f => f.AgentText).Analyzer("shingle_analyzer").SearchAnalyzer("shingle_analyzer").IndexAnalyzer("shingle_analyzer"))
            .String(s => s.Name(f => f.CustomerText).Analyzer("shingle_analyzer").SearchAnalyzer("shingle_analyzer").IndexAnalyzer("shingle_analyzer"))
        )
    ));
var sample = new FtsResult
{
    Id = 1,
    AgentText = "i want to play football"
};
var sample2 = new FtsResult
{
    Id = 2,
    AgentText = "i want to play basketball"
};
client.Index(sample, t => t.Index(indexName).Type("ftsresult"));
client.Index(sample2, t => t.Index(indexName).Type("ftsresult"));
client.Refresh(r => r.Index(indexName));
// Getting two records
var result = client.Search<FtsResult>(s => s
    .Index(indexName)
    .Query(q => q.QueryString(qs => qs.Query("play\\ foot*").AnalyzeWildcard()))
);
// Getting two records
var result2 = client.Search<FtsResult>(s => s
    .Index(indexName)
    .Query(q => q.QueryString(qs => qs.Query(@"play\ foot*").AnalyzeWildcard()))
);
// Getting nothing
var result3 = client.Search<FtsResult>(s => s
    .Index(indexName)
    .Query(q => q.QueryString(qs => qs.Query("\"play foot*\"").AnalyzeWildcard()))
);
Console.ReadLine();
And my FtsResult class is:
[Serializable]
public class FtsResult
{
    public long Id { get; set; }
    [ElasticProperty(Index = FieldIndexOption.No)]
    public int Rank { get; set; }
    [ElasticProperty(Index = FieldIndexOption.Analyzed)]
    public string AgentText { get; set; }
    [ElasticProperty(Index = FieldIndexOption.No)]
    public string AgentTextTimes { get; set; }
    [ElasticProperty(Index = FieldIndexOption.Analyzed)]
    public string CustomerText { get; set; }
    [ElasticProperty(Index = FieldIndexOption.No)]
    public string CustomerTextTimes { get; set; }
    [ElasticProperty(Index = FieldIndexOption.NotAnalyzed)]
    public string EsIndiceName { get; set; }
}
I have noted the result of each search in the comment above it. I expect only one record from this query. Could anyone point me in the right direction?