Elasticsearch speed comparison issue with sql

I compare the speed of Sql with ElasticSearch. I have 1.5 million data. But sql query runs faster than elastic search. I don't understand the problem. Why is the word like query coming faster in sql?

My code for Sql
<

public static List<Sales> GetAllRecords(string itemType)
        {
            List<Sales> salesReports = new List<Sales>();

            string sqlQuery = String.Format(@"SELECT * FROM dbo.Sales  where Region like '%{0}%'", itemType);
            using (SqlConnection connection = new SqlConnection(CONNECTION_STRING))
            {
                var result = connection.Query<Sales>(sqlQuery);
                foreach (var item in result)
                {
                    Sales global = new Sales()
                    {
                        Region = item.Region,
                        Country = item.Country,
                        Item_Type=item.Item_Type,
                        Order_Date=item.Order_Date,
                        Order_ID = item.Order_ID,
                        Order_Priority=item.Order_Priority,
                        Sales_Channel=item.Sales_Channel,
                        Ship_Date = item.Ship_Date,
                        Total_Cost=item.Total_Cost,
                        Total_Profit=item.Total_Profit,
                        Total_Revenue=item.Total_Revenue,
                        Units_Sold=item.Units_Sold,
                        Unit_Cost=item.Unit_Cost,
                        Unit_Price = item.Unit_Price
                    };
                    salesReports.Add(global);
                   
                }
                return result.ToList();
            }
        }

/>

My code for ElasticSearch. I search the data that I indexed before with elasticsearch here.

<

public static List<Sales> ConfigureES(string inputText)
            {
                List<Sales> salesReports = new List<Sales>();
    
                // 1. Connection URL's elastic search
                var listOfUrls = new Uri[]
                {
                    // here we can set multple connectionn URL's...
                     new Uri("http://localhost:9200/")
                };
    
                StaticConnectionPool connPool = new StaticConnectionPool(listOfUrls);
                ConnectionSettings connSett = new ConnectionSettings(connPool);
                ElasticClient eClient = new ElasticClient(connSett);
    
                //  var see = eClient.DeleteIndex(INDEX_NAME);
    
                // check the connection health
                var checkClusterHealth = eClient.ClusterHealth();
                if (checkClusterHealth.ApiCall.Success && checkClusterHealth.IsValid)
                {
                    // 2. check the index exist or not 
                    var checkResult = eClient.IndexExists(INDEX_NAME);
                    if (!checkResult.Exists)
                    {
                        // Raise error to Index not avaliable
                    }
                    // Search particular text field 
                    var searchResponse = eClient.Search<Sales>(s => 
                   s.Index(INDEX_NAME).From(0).Size(5000).Scroll("10m")
                   .Query(q => q.Match(m => m.Field(f => f.Region).Query(inputText))));
    
                    //var results = eClient.Scroll<Salesreport>("10m", searchResponse.ScrollId);
                    while (searchResponse.Documents.Any())
                    {
                        var res = searchResponse.Documents;
                        var sds = res.Cast<Sales>();
                        salesReports.AddRange(sds);
                        searchResponse = eClient.Scroll<Sales>("10m", searchResponse.ScrollId);
                    }
                }
                else
                {
                    // fail log the exception further use
                    var exception = checkClusterHealth.OriginalException.ToString();
                    var debugException = checkClusterHealth.DebugInformation.ToString();
                }
    
                return salesReports;
            }

/>

where I index data to elasticsearch.

<

public static string CONNECTION_STRING = string.Empty;
            public static string INDEX_NAME = "elastic";
            public static string INDEX_TYPE = "report4";
            private static ElasticClient eClient;
            static void Main(string[] args)
            {
                try
                {
                    // read the config file ...
                    var configuration = new ConfigurationBuilder()
                    .SetBasePath(@"C:\Users\Celal\Desktop\ElasticSearch-master\ElasticSearch-master\ElasticSearchBGPJob\ElasticSearchBGPJob\ElasticSearchBGPJob")
                    .AddJsonFile("appsettings.json", false)
                    .Build();
    
                    CONNECTION_STRING = configuration.GetSection("DefaultConnection").Value;
    
                    if (string.IsNullOrEmpty(CONNECTION_STRING))
                        throw new ArgumentException("No connection string in appsettings.json");
    
                    // 1. Connection URL's elastic search
                    var listOfUrls =
                    // here we can set multple connectionn URL's...
                     new Uri("http://localhost:9200/");
                    ConnectionSettings connSett = new ConnectionSettings(listOfUrls);
                     eClient = new ElasticClient(connSett);
                    
                    //  var see = eClient.DeleteIndex(INDEX_NAME);
                    var createIndexDescriptor = new CreateIndexDescriptor(INDEX_NAME).Mappings(ms => ms.Map<Sales>(m => m.AutoMap()));
                    
                    // check the connection health
                    var checkClusterHealth = eClient.ClusterHealth();
                    if (checkClusterHealth.ApiCall.Success && checkClusterHealth.IsValid)
                    {
                        // 2. check the index exist or not 
                        var checkResult = eClient.IndexExists(INDEX_NAME);
                        if (!checkResult.Exists)
                        {
                            var createIndexResponse = eClient.CreateIndex(createIndexDescriptor);
                            if (createIndexResponse.ApiCall.Success && createIndexResponse.IsValid)
                            {
                                // index is created successfully....
                            }
                            else
                            {
                                // fail log the exception further use
                                var exception = createIndexResponse.OriginalException.ToString();
                                var debugException = createIndexResponse.DebugInformation.ToString();
                            }
                        }
    
                        // 3. get the last documet id of index
                        var lastRecordResponse = eClient.Search<Sales>(s => s
                            .Index(INDEX_NAME)
                            .Type(INDEX_TYPE)
                            .From(0)
                            .Size(1).Sort(sr => sr.Descending(f => f.Order_ID)));
    
                        if (lastRecordResponse.ApiCall.Success && lastRecordResponse.IsValid)
                        {
                            Console.WriteLine("Start " + DateTime.Now);
                            long salesRecordId = 0;
                            var listofrecords = new List<Sales>();
                            if (lastRecordResponse.Documents.Count >= 1)
                            {
                                var obj = lastRecordResponse.Documents;
                                foreach (var item in obj)
                                {
                                    salesRecordId = item.Order_ID;
                                }
    
                                listofrecords = GetAllRecords(salesRecordId);
    
                            }
                            else
                            {
                                listofrecords = GetAllRecords(salesRecordId);
                            }
    
                            Console.WriteLine("END " + DateTime.Now);
    
                            //   Insert the data into document format corresponding index...
                            if (listofrecords.Count > 0)
                            {
                                Console.WriteLine("===== START========= " + DateTime.Now);
                                BulkInsertData(listofrecords, eClient).Wait();
                                Console.WriteLine("===== END========= " + DateTime.Now);
                            }
                        }
                        else
                        {
                            // fail log the exception further use
                            var exception = lastRecordResponse.OriginalException.ToString();
                            var debugException = lastRecordResponse.DebugInformation.ToString();
                        }
                    }
                    else
                    {
                        // fail log the exception further use
                        var exception = checkClusterHealth.OriginalException.ToString();
                        var debugException = checkClusterHealth.DebugInformation.ToString();
                    }
                    Console.WriteLine("Hello World!");
                    Console.ReadLine();
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.ToString());
                    Console.ReadLine();
                }
            }
            public static List<Sales> GetAllRecords(long LastSalesId)
            {
                List<Sales> salesReports = new List<Sales>();
                string sqlQuery = String.Format(@"SELECT * FROM dbo.Sales  where Order_ID > {0} ", LastSalesId);
                using (SqlConnection connection = new SqlConnection(CONNECTION_STRING))
                {
                    connection.Open();
                    using (SqlCommand command = new SqlCommand(sqlQuery, connection))
                    {
                        command.CommandTimeout = 1000;
                        using (SqlDataReader dataReader = command.ExecuteReader())
                        {
    
                            if (dataReader.HasRows)
                            {
                                while (dataReader.Read())
                                {
    
                                    Sales global = new Sales()
                                        {
                                            Order_ID=Convert.ToInt32(dataReader["Order_ID"]),
                                            Region=Convert.ToString(dataReader["Region"]),
                                            Country = Convert.ToString(dataReader["Country"]),
                                            Total_Cost = (decimal)Convert.ToDouble(dataReader["Total_Cost"]),
                                            Total_Revenue = Convert.ToString(dataReader["Total_Revenue"]),
                                            Item_Type = Convert.ToString(dataReader["Item_Type"])
                                        };
                                       
                                    salesReports.Add(global);
                                }
                            }
                        }
                    }
                    connection.Close();
                }
    
                return salesReports;
            }
            static async Task BulkInsertData(List<Sales> ListofData, ElasticClient Eclient)
            {
                try
                {
                    var splitTheLargeList = ChunkBy(ListofData);
                    var test = splitTheLargeList.LastOrDefault();
    
                    foreach (var item in splitTheLargeList)
                    {
                        var bulkResponse = await Eclient.BulkAsync(b => b
                                            .Index(INDEX_NAME)
                                            // .Type(INDEX_TYPE)
                                            .IndexMany(item));
    
                        if (bulkResponse.ApiCall.Success && bulkResponse.IsValid)
                        {
                            // success fully inserted...
                        }
                       
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.InnerException.ToString());
                }
            }
    
            
    
            

/>

You have a quite small data set and are running a wildcard query with leading wildcard, which is the slowest type of query you can use in Elasticsearch. I would expect a relational database to be quite well optimized for this so am not surprised it is faster.

what should I do

Are you currently using a RDBMS? If so, what is the driver behind considering switching to Elasticsearch?

No,I dont use RDBMS right now.I'm throwing a "like" query over only one table to sql.

I would recommend either changing how you index data so you do not need to use wildcard queries or change your mappings to make use of the wildcard field type if you are not able to get around the use of wildcard queries.

What is your use case? How much data do you expect to need to search when live?

I think there will be approximately 1 million data in live.Can you show me an example on the code?

The question is about the use case. What are you using wildcards for?

Let me ask you the question in another way. Do you use wildcards when you search with Google? Probably not.

The same principle applies to elasticsearch. You basically need to know:

  • what are the type of objects my users are going to search for?
  • on which fields they want to search?
  • what are the typical inputs they will enter?

Once you have that, it's easier to build the right index to solve your use case. As an example, here is a way to solve some common problems for a given use case.

That does not mean this will work for you.

Also, retrieving 5000 documents can be slow. Do you really want to show 5000 results in one page to your users?

I did it this way. But only 10 records came. What should I do to bring other records? I am very new to Elasticsearch and I am not familiar with the subject.

var searchResponse = eClient.Search<Sales>(s => s.Index(INDEX_NAME)
   .Query(q => q                    
  .Bool(b => b                     
    .Should(m => m
      .Wildcard(c => c
        .Field(f=>f.Region).Value(inputText.ToLower() + "*")
      )
    )
  )
));

You can use:

  • the size and from parameters to display by default up to 10000 records to your users. If you want to change this limit, you can change index.max_result_window setting but be aware of the consequences (ie memory).
  • the search after feature to do deep pagination.
  • the Scroll API if you want to extract a resultset to be consumed by another tool later.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.