Choosing Between elastic/mcp-server-elasticsearch and Elasticsearch API for Natural Language Querying in ASP.NET Core

I'm looking to implement a "chat with Elasticsearch index" feature in my ASP.NET Core app, where users can query an index using natural language. Here are my key requirements:

  1. Each chat is limited to a single index—no need for tools like "list_indices" or "get_shards."
  2. "esql" is also not needed.
  3. Index mappings won't change dynamically.
  4. Final results should be a list of raw Elasticsearch documents (not natural language responses), as I'll display them in a UI list.

I've been exploring the elastic/mcp-server-elasticsearch for this. My plan is to use a Microsoft Semantic Kernel agent that integrates the MCP server's tools as plugins: it uses "get_mappings" to understand the index structure, translates the user's natural language query into an Elasticsearch DSL query, and then runs it via the "search" tool.

As an alternative, I could skip the MCP server and use the Elasticsearch API directly. I'd create a custom class with methods for getting mappings and searching, then add those as plugins to the Semantic Kernel agent.

Some concerns with the MCP server approach:

  • It calls "get_mappings" for every query, and since mappings can be large, this spikes the LLM's input token usage. Given that the mappings don't change dynamically, this seems wasteful; it would be better to pass the mapping once as context to the agent.
  • Using MCP for the actual search adds more token costs. Since I just need the raw document list (not text), it might be more efficient to have the agent generate the DSL query, then execute the search myself via the Elasticsearch API.
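The second bullet can be sketched as follows. This is a minimal illustration (sketched in Python for brevity; the same shape carries over to C#), and the `ES_URL`, index name, and `build_search_request` helper are all hypothetical: the agent only emits the DSL body, and the app composes and executes the `_search` call itself.

```python
import json

# Hypothetical endpoint and index name, for illustration only.
ES_URL = "http://localhost:9200"

def build_search_request(index: str, dsl_body: dict) -> tuple[str, bytes]:
    """Build the (url, payload) the app would POST to _search directly,
    keeping the raw-document response out of the LLM entirely."""
    url = f"{ES_URL}/{index}/_search"
    payload = json.dumps(dsl_body).encode("utf-8")
    return url, payload

url, payload = build_search_request("products", {"query": {"match_all": {}}})
# The hits come back as raw documents in hits.hits[*]._source,
# ready to bind to the UI list without another LLM round-trip.
```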

Based on these requirements, is there any real advantage to using the MCP server over building it directly with the Elasticsearch API? Which approach would you recommend for efficiency?

Hey @NavaneethaKannan,

In your specific use case, the Elasticsearch MCP server doesn't provide any clear advantage, especially if efficiency is your target: it adds token overhead and extra execution steps.

You could tweak the system prompt to include the mappings (since they won't change) and instruct the agent not to call "get_mappings", but even then you'd still incur some additional token cost.
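That idea might look something like this minimal sketch (Python for brevity; the mapping, prompt wording, and `build_system_prompt` helper are illustrative, not from your actual setup): the static mapping is serialized into the system prompt once, so no per-query "get_mappings" call is needed.

```python
import json

# Illustrative mapping; in practice this would be fetched once at startup.
INDEX_MAPPING = {
    "properties": {
        "title": {"type": "text"},
        "price": {"type": "float"},
        "created_at": {"type": "date"},
    }
}

def build_system_prompt(mapping: dict) -> str:
    """Embed the static index mapping once, so the agent never needs
    a per-query mapping-lookup tool call."""
    return (
        "You translate user questions into Elasticsearch Query DSL.\n"
        "The target index has this mapping (it never changes):\n"
        f"{json.dumps(mapping, indent=2)}\n"
        "Respond with a single JSON query body only."
    )

prompt = build_system_prompt(INDEX_MAPPING)
```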

Since LLM outputs are non-deterministic and can change with different models, you'd want to implement some deterministic guardrails, like validating the generated DSL against a schema, especially if this is going to run in production.
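A minimal sketch of such a guardrail (again in Python for brevity; the allowlists, field names, and `validate_dsl` helper are hypothetical, with the field set assumed to be derived from your mapping): parse the LLM output as JSON and reject anything outside an allowlist before it ever reaches Elasticsearch.

```python
import json

# Illustrative allowlists; ALLOWED_FIELDS would be derived from the mapping.
ALLOWED_TOP_LEVEL = {"query", "size", "from", "sort", "_source"}
ALLOWED_FIELDS = {"title", "price", "created_at"}

def validate_dsl(raw: str) -> dict:
    """Deterministic guardrail: parse the generated DSL and reject
    unknown top-level keys or unmapped field names."""
    body = json.loads(raw)
    unknown = set(body) - ALLOWED_TOP_LEVEL
    if unknown:
        raise ValueError(f"disallowed top-level keys: {unknown}")

    def check_fields(node):
        # Walk the query tree and check every referenced field name.
        if isinstance(node, dict):
            for key, value in node.items():
                if key in {"match", "term", "range"} and isinstance(value, dict):
                    bad = set(value) - ALLOWED_FIELDS
                    if bad:
                        raise ValueError(f"unknown fields: {bad}")
                check_fields(value)
        elif isinstance(node, list):
            for item in node:
                check_fields(item)

    check_fields(body.get("query", {}))
    return body
```

Rejected queries can then be retried with the error fed back to the agent, rather than sent to the cluster.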

I would go with your alternative approach of using the Elasticsearch API directly, since you're only targeting a single index with static mappings. This gives you much more control over token usage, mapping context, and outputs.

Hope this helps!