I run a Morse Code translator website that allows users to convert plain text into Morse code, decode Morse messages, browse educational articles, and use interactive learning tools. As the website has grown, I have also started collecting operational data such as application logs, API requests, search queries, performance metrics, and aggregated usage events to better understand how visitors interact with the platform. I recently introduced the Elastic Stack to centralize logging and improve observability, but I am encountering several challenges related to index design, data ingestion, and search performance as the amount of data continues to increase. I would appreciate guidance on whether my current approach aligns with Elasticsearch best practices or if I should rethink the overall architecture.
One of the biggest challenges is designing an indexing strategy for multiple types of data. The website generates application logs, web server logs, frontend event data, backend API metrics, and (optionally) search activity, each with different structures and retention requirements. Initially, I created separate indices for each data source, but as traffic increased, managing mappings, templates, and lifecycle policies became more complicated. I am unsure whether maintaining separate indices is still the recommended approach or if using data streams or another indexing strategy would simplify long-term maintenance while preserving efficient querying.
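To make the data stream option concrete, here is a minimal sketch of one composable index template per source, each backing its own data stream. The `morse-<source>-*` pattern, the policy names, and the two mapped fields are placeholders for illustration, not my real configuration. Each returned body is the kind of JSON that would be sent to `PUT _index_template/<name>`.

```python
# Hypothetical sources mapped to hypothetical ILM policy names.
SOURCES = {
    "app": "morse-logs-30d",
    "web": "morse-logs-30d",
    "events": "morse-events-90d",
}


def build_index_template(source: str, ilm_policy: str) -> dict:
    """Return an index template body that backs a data stream for one source."""
    return {
        # Pattern chosen so it does not overlap the built-in logs-*-* templates.
        "index_patterns": [f"morse-{source}-*"],
        # The presence of this (empty) object makes matching names data streams.
        "data_stream": {},
        "priority": 200,
        "template": {
            "settings": {
                "index.lifecycle.name": ilm_policy,
                "number_of_shards": 1,
            },
            "mappings": {
                "properties": {
                    # Data streams require a date-typed @timestamp field.
                    "@timestamp": {"type": "date"},
                    "message": {"type": "text"},
                }
            },
        },
    }


templates = {name: build_index_template(name, policy) for name, policy in SOURCES.items()}
```

The idea is that mappings and retention stay separate per source, while the backing indices and rollover are handled by the data stream rather than by hand.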
Another issue involves ingesting large numbers of small events generated by the interactive translator. Every translation request, audio playback action, copy operation, and page interaction can produce telemetry when it is enabled, resulting in many lightweight events throughout the day. I want to capture enough information to troubleshoot problems and analyze usage patterns without overwhelming the cluster or creating unnecessary storage overhead. I would appreciate advice on balancing event granularity, ingestion throughput, index size, and long-term storage costs while maintaining meaningful observability.
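One way to keep many small events from turning into many small requests is to buffer them and send them through the `_bulk` API in batches. The sketch below only builds the newline-delimited request body and makes no network calls. The stream name `morse-events-default`, the event fields, and the batch size are assumptions for illustration.

```python
import json
from typing import Iterator


def chunk(events: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive batches of at most `size` events."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def to_bulk_ndjson(events: list[dict], data_stream: str) -> str:
    """Build a _bulk request body: one action line followed by one source line per event."""
    lines = []
    for event in events:
        # Data streams are append-only, so the bulk action must be "create".
        lines.append(json.dumps({"create": {"_index": data_stream}}))
        lines.append(json.dumps(event))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical telemetry events from the translator.
sample = [
    {"@timestamp": "2024-01-01T00:00:00Z", "action": "translate"},
    {"@timestamp": "2024-01-01T00:00:01Z", "action": "play_audio"},
    {"@timestamp": "2024-01-01T00:00:02Z", "action": "copy"},
]
bodies = [to_bulk_ndjson(batch, "morse-events-default") for batch in chunk(sample, 2)]
```

Each string in `bodies` would be posted to `_bulk` with the `application/x-ndjson` content type. What a sensible batch size is for a cluster of my size is part of what I am asking.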
Search performance is also becoming a concern. In addition to operational logs, I would eventually like to provide search functionality for Morse Code learning articles, documentation, and educational content. As the content library expands, I want search results to remain fast and relevant while avoiding excessive query latency. I am interested in recommendations regarding analyzers, mappings, relevance tuning, pagination strategies, and caching techniques that work well for educational content alongside operational datasets within the same Elastic deployment.
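For the article search, this is a rough sketch of the kind of request body I have been considering: a `multi_match` query with field boosts, plus `search_after` for deep pagination instead of large `from` offsets. The field names (`title`, `summary`, `body`, `article_id`) and the boost values are hypothetical, and `article_id` is assumed to be a unique `keyword` field used as the sort tiebreaker.

```python
from typing import Optional


def build_article_query(text: str, page_size: int = 10,
                        search_after: Optional[list] = None) -> dict:
    """Build a search body for educational articles with stable pagination."""
    body = {
        "size": page_size,
        "query": {
            "multi_match": {
                "query": text,
                # Boosts are illustrative guesses, not tuned values.
                "fields": ["title^3", "summary^2", "body"],
            }
        },
        # search_after needs a deterministic sort, hence the tiebreaker field.
        "sort": [
            {"_score": {"order": "desc"}},
            {"article_id": {"order": "asc"}},
        ],
    }
    if search_after is not None:
        # Sort values copied from the last hit of the previous page.
        body["search_after"] = search_after
    return body


first_page = build_article_query("prosigns and abbreviations")
next_page = build_article_query("prosigns and abbreviations",
                                search_after=[4.2, "article-0042"])
```

Whether boosts like these are reasonable for short educational articles is exactly the kind of relevance question I would like input on.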
Another challenge is managing retention policies and cluster resources. Some operational logs are only useful for a few weeks, while aggregated analytics and educational content may need to remain searchable for much longer. I have started exploring Index Lifecycle Management, rollover policies, and tiered storage, but I am not yet confident that my configuration is appropriate for a relatively small but steadily growing web application. I would like to understand how experienced Elasticsearch users decide when to roll over indices, archive older data, and optimize storage without negatively affecting search performance or operational visibility.
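As a reference point for the lifecycle question, here is a minimal sketch of an ILM policy body with a hot phase that rolls over by age or primary shard size and a delete phase. All thresholds and policy names are placeholder values rather than recommendations. The returned dict is the kind of JSON sent to `PUT _ilm/policy/<name>`.

```python
def build_ilm_policy(max_age: str, max_primary_shard_size: str, delete_after: str) -> dict:
    """Return an ILM policy body: roll over in the hot phase, then delete."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Rollover triggers when either condition is met.
                        "rollover": {
                            "max_age": max_age,
                            "max_primary_shard_size": max_primary_shard_size,
                        }
                    }
                },
                "delete": {
                    # For rolled-over indices, min_age counts from the rollover time.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


# Hypothetical policies: short-lived logs versus longer-lived aggregated events.
short_logs = build_ilm_policy("7d", "10gb", "30d")
long_events = build_ilm_policy("30d", "10gb", "365d")
```

With low daily volume the age condition would almost always fire before the size condition, which is part of why I am unsure these thresholds make sense for a small deployment.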
Finally, I would greatly appreciate advice from the Elastic community regarding the overall architecture for an interactive educational website like my Morse Code platform. For a project that combines real-time application logs, performance monitoring, search functionality, and analytical event data, what would be the recommended Elasticsearch design in terms of indexing strategy, ingestion pipeline, lifecycle management, and query optimization? Any practical recommendations, common pitfalls to avoid, or examples of similar deployments would be extremely helpful as I continue scaling the platform. Sorry for the long post!