I'm looking to implement a "chat with Elasticsearch index" feature in my ASP.NET Core app, where users can query an index using natural language. Here are my key requirements:
- Each chat is limited to a single index—no need for tools like "list_indices" or "get_shards."
- The "esql" tool is also not needed.
- Index mappings won't change dynamically.
- Final results should be a list of raw Elasticsearch documents (not natural language responses), as I'll display them in a UI list.
I've been exploring elastic/mcp-server-elasticsearch for this. My plan is to use a Microsoft Semantic Kernel agent that integrates the MCP server's tools as plugins: the agent calls "get_mappings" to learn the index structure, translates the user's natural-language question into an Elasticsearch query DSL request, and then runs it via the "search" tool.
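To make the translation step concrete, here is the kind of output I expect from the agent. The index name and fields below are hypothetical: for a user question like "recent orders over $100", the agent would produce a DSL body such as:

```json
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "total_amount": { "gt": 100 } } },
        { "range": { "created_at": { "gte": "now-7d/d" } } }
      ]
    }
  },
  "size": 50,
  "sort": [{ "created_at": "desc" }]
}
```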
As an alternative, I could skip the MCP server and use the Elasticsearch API directly. I'd create a custom class with methods for getting mappings and searching, then add those as plugins to the Semantic Kernel agent.
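For the direct-API alternative, this is roughly what I have in mind. It's a sketch, not a final implementation: the class name, index name, and cluster endpoint are my own placeholders, and I'm using plain `HttpClient` against the REST API rather than the official .NET client to keep it minimal.

```csharp
using System.ComponentModel;
using System.Text;
using Microsoft.SemanticKernel;

// Hypothetical plugin exposing only the two operations the agent needs.
public sealed class ElasticsearchPlugin
{
    private readonly HttpClient _http;  // BaseAddress points at the cluster
    private readonly string _index;     // the single index this chat is bound to

    public ElasticsearchPlugin(HttpClient http, string index)
    {
        _http = http;
        _index = index;
    }

    [KernelFunction("get_mappings")]
    [Description("Returns the field mappings of the index as JSON.")]
    public Task<string> GetMappingsAsync()
        => _http.GetStringAsync($"/{_index}/_mapping");

    [KernelFunction("search")]
    [Description("Executes a query DSL body against the index and returns the raw response JSON.")]
    public async Task<string> SearchAsync(
        [Description("The full _search request body as JSON.")] string dslQuery)
    {
        var response = await _http.PostAsync(
            $"/{_index}/_search",
            new StringContent(dslQuery, Encoding.UTF8, "application/json"));
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

Registration would then be the standard Semantic Kernel pattern, e.g. `builder.Plugins.AddFromObject(new ElasticsearchPlugin(http, "orders"), "elasticsearch")`.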
Some concerns with the MCP server approach:
- It calls "get_mappings" for every query, and since mappings can be large, this inflates the LLM's input token usage on every turn. Given that the mappings don't change dynamically, this seems wasteful; it would be better to fetch the mapping once and pass it to the agent as static context.
- Routing the search itself through MCP adds further token cost, because the result documents flow back through the LLM. Since I only need the raw document list (not a natural language answer), it might be more efficient to have the agent generate just the DSL query and then execute the search myself via the Elasticsearch API.
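The optimized flow I'm imagining from the two points above would look something like this. Again a sketch under assumptions: the index name, model id, and the `apiKey`/`userQuestion` variables are illustrative, and the prompt wording is mine.

```csharp
using System.Text;
using Microsoft.SemanticKernel;

// 1. Fetch the mapping once at startup, not per query.
var http = new HttpClient { BaseAddress = new Uri("http://localhost:9200") };
string mapping = await http.GetStringAsync("/orders/_mapping");

// 2. Bake the mapping into the prompt so no get_mappings tool call is needed.
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-4o", apiKey)   // any chat model works here
    .Build();

string prompt = $"""
    You translate user questions into Elasticsearch query DSL for the index below.
    Respond with the JSON request body only, no prose.

    Index mapping:
    {mapping}

    User question: {userQuestion}
    """;

// 3. The model returns only the DSL; the app runs the search itself,
//    so the result documents never pass through the LLM.
string dsl = (await kernel.InvokePromptAsync(prompt)).ToString();
var response = await http.PostAsync("/orders/_search",
    new StringContent(dsl, Encoding.UTF8, "application/json"));
string rawHits = await response.Content.ReadAsStringAsync(); // documents for the UI list
```

With this shape, the LLM only ever sees the mapping (once per prompt) and the user's question, and only ever emits a DSL body, which should keep token usage close to the minimum.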
Based on these requirements, is there any real advantage to using the MCP server over building it directly with the Elasticsearch API? Which approach would you recommend for efficiency?