ELK Stack on AWS: Elasticsearch, Logstash and Kibana Setup
The ELK stack has gotten complicated, with endless configuration options, scaling considerations, and managed-vs-self-hosted debates flying around. Having deployed ELK stacks that handle billions of log entries across multiple production environments, I've learned a great deal about building a log analytics pipeline that actually works. Today, I'll share it with you.
My first ELK deployment was a disaster. I set up a single Elasticsearch node, pointed Logstash at it, opened Kibana, and thought I was done. Two weeks later, the disk filled up, Elasticsearch went read-only, and we lost visibility into our entire application at the worst possible time — during a production incident. That experience taught me to respect ELK’s complexity and plan properly from the start.
What the ELK Stack Actually Is

ELK is three open-source tools that work together: Elasticsearch for storage and search, Logstash for data ingestion and transformation, and Kibana for visualization. Together, they give you a complete pipeline for collecting, processing, storing, and analyzing log data. Most people also add Beats (lightweight data shippers) and call it the Elastic Stack, but everyone still says “ELK.”
Elasticsearch: The Heart of the Stack
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It’s where your data lives and where queries run. Here’s what makes it special:
- Distributed by design: Elasticsearch scales horizontally by adding more nodes. Data gets automatically sharded and replicated across the cluster.
- Near real-time search: Documents become searchable within about one second of being indexed. For log analytics, that’s close enough to real-time.
- Schema-free JSON: You can throw JSON documents at it without defining a schema first. It’ll figure out the field types (though I recommend defining mappings explicitly for production).
- RESTful API: Every operation is an HTTP request. Easy to interact with, easy to script, easy to integrate.
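To make the "everything is an HTTP request" point concrete, here's what indexing and searching look like against a hypothetical local cluster (the `app-logs` index name and `localhost:9200` endpoint are assumptions; these commands need a running cluster):

```
# Index a log document — the index is created automatically on first write
curl -X POST "localhost:9200/app-logs/_doc" \
  -H 'Content-Type: application/json' \
  -d '{"timestamp": "2024-01-15T10:00:00Z", "level": "ERROR", "message": "connection refused"}'

# Search for error-level entries
curl -X GET "localhost:9200/app-logs/_search" \
  -H 'Content-Type: application/json' \
  -d '{"query": {"match": {"level": "ERROR"}}}'
```

Everything else — cluster health, mappings, index lifecycle — works through the same kind of HTTP calls.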
On AWS, you have two choices: run Elasticsearch yourself on EC2, or use Amazon OpenSearch Service (the managed version). I’ll get into that decision later.
Logstash: The Data Pipeline
Logstash sits between your data sources and Elasticsearch. It does three things: input (collect data from various sources), filter (parse, transform, and enrich the data), and output (send it to Elasticsearch or other destinations).
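Those three stages map directly onto the structure of a Logstash pipeline file. A minimal sketch might look like this (the hostname, port, and index pattern are placeholder assumptions, not a recommended production config):

```conf
input {
  beats { port => 5044 }                   # receive events from Filebeat
}
filter {
  grok {                                   # parse raw lines into structured fields
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["https://my-es-host:9200"]   # hypothetical endpoint
    index => "app-logs-%{+YYYY.MM.dd}"     # daily indices for easy retention
  }
}
```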
Honestly, I probably should have led with this section, because Logstash configuration is where most ELK deployments go wrong. A poorly configured Logstash pipeline can bottleneck your entire stack.
The filter stage is where Logstash earns its keep. It can parse unstructured log lines into structured fields using grok patterns, enrich data with geolocation or DNS lookups, drop noisy events you don’t care about, and mutate fields to standardize formats. I’ve written grok patterns that turned messy application logs into beautifully structured documents that made our dashboards actually useful.
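Under the hood, a grok pattern is essentially a library of named regular expressions. A rough Python equivalent of what grok does to an access-log line looks like this (the log format and field names are assumptions for illustration):

```python
import re

# Named groups play the role of grok's %{PATTERN:field} captures.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_line(line: str) -> dict:
    """Turn one unstructured access-log line into a structured document."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else {}

line = '10.0.0.5 - - [15/Jan/2024:10:00:00 +0000] "GET /api/users HTTP/1.1" 200 512'
doc = parse_line(line)
# doc["status"] is "200", doc["path"] is "/api/users"
```

The difference in practice: grok ships with hundreds of pre-built patterns (`%{IP}`, `%{COMBINEDAPACHELOG}`, and so on), so you rarely write raw regex from scratch.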
A word of caution: Logstash is resource-hungry. It’s a JVM application that needs decent CPU and memory, especially at high throughput. For lighter use cases, Filebeat can ship logs directly to Elasticsearch without Logstash in the middle.
Kibana: Making Sense of It All
Kibana is the visualization layer. It connects to Elasticsearch and lets you search, filter, and visualize your data through dashboards, charts, and maps. Some things I use Kibana for daily:
- Log exploration: Searching through millions of log entries with complex queries. The Discover tab is incredibly powerful for this.
- Dashboards: Real-time dashboards showing error rates, response times, request volumes, and system metrics. I have dashboards for every production service.
- Alerting: Setting up rules that notify me when error rates spike or specific log patterns appear.
- Visualizations: Line charts for trends, bar charts for comparisons, heatmaps for time-based analysis, pie charts for distribution.
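For the log-exploration use case, queries in Discover are typically written in KQL (Kibana Query Language). A couple of hypothetical examples (the field names assume a standard structured-logging schema):

```
# Errors from one service
service.name: "checkout" and log.level: "error"

# Slow requests, excluding health checks
http.response.time_ms >= 1000 and not url.path: "/health*"
```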
Running ELK on AWS
You’ve got two main paths here, and the right choice depends on your team and requirements.
Amazon OpenSearch Service (Managed)
This is Elasticsearch as a managed service (AWS forked Elasticsearch into OpenSearch, but functionally it’s the same for most use cases). AWS handles cluster provisioning, patching, backups, and scaling. You get Kibana (called OpenSearch Dashboards) included.
That's what makes OpenSearch Service appealing to us cloud engineers — it removes the operational burden while giving you the same analytical capabilities. I use managed OpenSearch for teams that don't have dedicated Elasticsearch expertise. The setup takes about 15 minutes, versus days for a self-managed cluster.
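As a sketch of how little is involved, a basic domain can be created from the AWS CLI in one command (the domain name, engine version, and volume size here are assumptions — adjust to your needs):

```
aws opensearch create-domain \
  --domain-name app-logs \
  --engine-version OpenSearch_2.11 \
  --cluster-config InstanceType=r6g.large.search,InstanceCount=3 \
  --ebs-options EBSEnabled=true,VolumeType=gp3,VolumeSize=100
```

Access policies, VPC placement, and fine-grained access control still need to be configured before the domain is production-ready.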
Self-Managed on EC2
Running Elasticsearch on EC2 gives you full control over the configuration, version, and plugins. It’s more work, but some organizations need specific versions, custom plugins, or want to avoid vendor lock-in. I’ve run self-managed clusters on R-family instances (memory optimized) with EBS gp3 volumes for storage.
Architecture Best Practices
Whether managed or self-managed, these principles apply:
- Separate node types: Use dedicated master nodes (3 minimum), data nodes (scale based on volume), and coordinating nodes for large clusters.
- Right-size your shards: Aim for shards between 10 and 50 GB. Too many small shards waste resources. Too few large shards hurt search performance.
- Index lifecycle management: Automatically roll over, shrink, and delete old indices. I keep hot data (last 7 days) on fast storage, warm data (7-30 days) on cheaper storage, and delete anything older than 90 days unless compliance says otherwise.
- Use Filebeat over Logstash when possible: Filebeat is lighter weight and handles basic log shipping well. Reserve Logstash for when you need complex transformations.
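The shard-sizing guideline above reduces to simple arithmetic. A quick sanity-check, with assumed figures (50 GB/day, 7-day hot retention, 30 GB target shard size):

```python
import math

def primary_shard_count(daily_gb: float, retained_days: int,
                        target_shard_gb: float = 30.0) -> int:
    """Estimate how many primary shards keep each shard near the target size."""
    total_gb = daily_gb * retained_days
    # Round up so no shard ends up larger than the target.
    return max(1, math.ceil(total_gb / target_shard_gb))

# 50 GB/day kept for 7 days = 350 GB, so about 12 primary shards
shards = primary_shard_count(daily_gb=50, retained_days=7)
```

With daily rolling indices, the same logic applies per index: 50 GB/day fits comfortably in 1–2 primary shards each, which is why rollover-based lifecycle management pairs so well with this guideline.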
Common ELK Challenges
- Disk management: Elasticsearch loves disk space. Monitor disk usage religiously and set up alerts at 80% capacity. When Elasticsearch hits its disk watermark, it goes read-only, and your logging pipeline stops.
- Mapping explosions: If your logs contain dynamic field names, Elasticsearch creates a mapping for each one. Thousands of fields destroy performance. Control your mappings.
- JVM heap pressure: Elasticsearch runs on the JVM, and garbage collection pauses can cause cluster instability. Set heap to 50% of available RAM, and keep it below the ~32 GB compressed object pointer threshold.
- Ingestion backpressure: When Elasticsearch can’t index fast enough, queues build up in Logstash. Monitor queue sizes and add data nodes when you see sustained backpressure.
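The heap rule above is mechanical enough to express as a helper (capping at 31 GB is a common conservative reading of the ~32 GB compressed-pointer threshold — an assumption on my part, not an official constant):

```python
def es_heap_gb(ram_gb: float) -> float:
    """Half of RAM for the JVM heap, capped below the ~32 GB compressed-oops limit."""
    return min(ram_gb / 2, 31.0)

# A 16 GB node gets an 8 GB heap; a 128 GB node is still capped at 31 GB,
# leaving the rest of its RAM for the OS filesystem cache — which
# Elasticsearch relies on heavily for search performance.
```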
Getting Started: A Practical Setup
If you’re starting fresh on AWS, here’s the path I recommend:
- Start with Amazon OpenSearch Service. Create a domain with 3 data nodes (r6g.large is a good starting point).
- Install Filebeat on your EC2 instances or ECS containers to ship logs to OpenSearch.
- Create index patterns in OpenSearch Dashboards (Kibana) to make your data searchable.
- Build dashboards for the metrics that matter: error rates, latency percentiles, request volumes.
- Set up index lifecycle policies to manage data retention automatically.
- Configure alerting for critical conditions.
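For step 2, a minimal Filebeat configuration is only a few lines. This sketch assumes a hypothetical log path and domain endpoint — substitute your own, and note that OpenSearch Service additionally requires authentication (IAM role or basic auth) that's omitted here:

```yaml
filebeat.inputs:
  - type: filestream
    paths:
      - /var/log/app/*.log    # hypothetical application log path

output.elasticsearch:
  hosts: ["https://my-domain.us-east-1.es.amazonaws.com:443"]  # assumed endpoint
```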
This setup handles most use cases and can scale from thousands to billions of log entries per day. Start simple, add complexity as your needs grow, and always keep an eye on storage costs — they’re the biggest ongoing expense with any ELK deployment.
The ELK stack isn’t just a logging tool — it’s a window into what your applications are actually doing. Once you have good log analytics in place, troubleshooting becomes faster, capacity planning becomes data-driven, and production incidents become less scary. It’s one of those investments that pays for itself quickly.