AWS OpenSearch Serverless — When It Makes Sense and When to Avoid It
OpenSearch Serverless has gotten complicated with all the “just go serverless” noise flying around. As someone who’s helped teams spin up both configurations across wildly different workloads — scrappy startup logging pipelines, enterprise search backends grinding through millions of queries daily — I’ve seen firsthand how fast that advice falls apart. This isn’t a feature tour. AWS docs handle that. What follows is a practical guide to when serverless earns its place and when it’ll quietly hollow out your budget before anyone notices.
Serverless vs Provisioned — The Decision Matrix
But what is OpenSearch Serverless, really? It’s a fundamentally different architecture from OpenSearch Service — compute and storage are separated, there are no nodes to manage, and nothing to size upfront. Capacity scales automatically through OCUs (OpenSearch Compute Units), while provisioned means you pick instance types, node counts, and storage yourself — paying for that capacity whether traffic shows up or not.
That tradeoff is the whole game. Here’s how I actually think about it:
- Unpredictable, spiky traffic — Serverless wins. If your workload looks like a heartbeat monitor during an earthquake, provisioned has you sizing for peaks and bleeding money through the valleys. Serverless handles the swings without anyone touching a config.
- Dev, test, and staging environments — Serverless wins, with caveats. Nobody’s running these at load around the clock, so paying for idle provisioned nodes is just waste. The cost floor on serverless is real — more on that shortly — but manageable at low utilization.
- Steady, predictable production traffic — Provisioned wins. Full stop. Consistent query volume means you can right-size instances, lean on Reserved Instances for up to 36% savings, and actually predict what your invoice looks like. Serverless OCU pricing at steady-state almost always costs more.
- Cost optimization at scale — Provisioned wins again. Teams moving from serverless to provisioned at moderate scale — 50+ GB of active index data, consistent patterns — routinely cut their OpenSearch bill by 40 to 60 percent. I’ve watched it happen more than once.
- Index management complexity — This is where serverless surprises people badly. No Index State Management policies. No fine-grained rollover automation. No hot-warm-cold tiering. Most logging use cases depend on lifecycle management — serverless yanks that tooling out from under you.
Honest version: use serverless when load is genuinely unpredictable and operational simplicity has real dollar value for your team. Use provisioned when traffic is consistent or you need the full OpenSearch feature surface. That’s it.
Hidden Costs of Serverless
Probably should have opened with this section, honestly. The pricing model is where most people actually get burned.
OpenSearch Serverless charges in OCUs — OpenSearch Compute Units — at $0.24 per hour in us-east-1 as of 2024. Here’s the part that stings: OCUs are billed separately for indexing and search, with a minimum of 2 OCUs for each. Zero traffic to a collection? You’re still paying for 2 indexing OCUs and 2 search OCUs.
Do the math on that floor: 4 OCUs × $0.24 × 24 hours × 30 days = $691.20 per month at absolute zero traffic. Per collection. That number has killed more “let’s just use serverless for everything” proposals than I can count — including a few I’ve sat through in person.
Storage runs separately on top — $0.024 per GB-month. At 500 GB of index data, that’s another $12/month, which barely registers. The OCU cost is what dominates at low-to-moderate scale.
Compare that to provisioned: a single-node t3.medium.search instance runs roughly $0.068 per hour — about $49/month. Not production-grade, obviously, but a two-node r6g.large.search cluster — adequate for plenty of non-critical workloads — lands around $340/month with comparable storage. Cheaper, and you get more control.
Where serverless math finally tips in your favor is genuinely bursty, high-peak workloads — specifically when you’d otherwise be provisioning for a traffic spike that only materializes 10% of the time. High peak-to-average ratios make the case. Low ones don’t.
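The floor math and the provisioned comparison above fit in a small calculator. A minimal sketch using the us-east-1 rates quoted in this article; treat the constants as illustrative, since AWS pricing changes.

```python
# Sketch: serverless cost floor vs. a fixed-size provisioned cluster,
# using the us-east-1 numbers from this article (illustrative, not a live quote).

OCU_RATE = 0.24          # $/OCU-hour (us-east-1, 2024)
MIN_OCUS = 4             # 2 indexing + 2 search: the per-collection floor
STORAGE_RATE = 0.024     # $/GB-month
HOURS_PER_MONTH = 720

def serverless_monthly(avg_ocus: float, storage_gb: float) -> float:
    """Monthly serverless cost; OCU usage never bills below the floor."""
    ocus = max(avg_ocus, MIN_OCUS)
    return ocus * OCU_RATE * HOURS_PER_MONTH + storage_gb * STORAGE_RATE

def provisioned_monthly(hourly_rate: float, nodes: int, storage_cost: float = 0.0) -> float:
    """Monthly cost for a fixed-size provisioned cluster."""
    return hourly_rate * nodes * HOURS_PER_MONTH + storage_cost

# Zero-traffic collection: 4 OCUs * $0.24 * 720 hours = $691.20
print(f"serverless floor: ${serverless_monthly(0, 0):.2f}/month")
# Single t3.medium.search node at the rate quoted above
print(f"provisioned t3.medium: ${provisioned_monthly(0.068, 1):.2f}/month")
```

Plug in your own average OCU consumption and storage footprint; the crossover point is what the decision matrix above is really about.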
OCU Allocation Behavior
One thing worth knowing — AWS scales OCUs in whole units and holds capacity for a period after a burst before scaling back down. You won’t see immediate cost reductions when traffic drops off. Cost estimates built on average traffic end up slightly optimistic compared to what actually lands on the invoice. Not a dealbreaker, just a gap to account for.
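That gap between average-based estimates and the actual invoice can be simulated. A sketch under stated assumptions: OCUs round up to whole units, never drop below the floor, and linger for a hold window after a burst — AWS doesn’t publish the exact scale-down timing, so `hold_hours` here is a hypothetical knob, not a documented value.

```python
# Why average-based cost estimates undershoot: whole-unit OCU scaling plus a
# post-burst hold window. hold_hours is an illustrative assumption.

import math

def billed_ocu_hours(demand, floor=2, hold_hours=2):
    """demand: per-hour fractional OCU demand for one workload type
    (indexing or search). Returns billable OCU-hours."""
    billed = 0
    held_until = -1
    level = floor
    for hour, d in enumerate(demand):
        need = max(math.ceil(d), floor)
        if need >= level:
            level = need
            held_until = hour + hold_hours   # capacity lingers after the burst
        elif hour > held_until:
            level = need                     # hold expired: scale back down
        billed += level
    return billed

demand = [0.5] * 20 + [6.0] * 2 + [0.5] * 2      # one two-hour spike in a day
naive = sum(max(d, 2) for d in demand)           # what an averaged estimate assumes
print(billed_ocu_hours(demand), "billed vs", naive, "estimated OCU-hours")
```

Even with a modest hold window, the billed figure lands above the naive estimate — which is the slight optimism the paragraph above warns about.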
Performance Characteristics
Cold starts are real. Not catastrophic — but real. When a serverless collection sits idle for even a few minutes with no queries, the first request after that gap will be slow. I’ve measured cold start latency anywhere from 1 to 8 seconds depending on collection size and query complexity. For a user-facing API, that’s a problem. For a background analytics job running every hour, probably not.
Surprised by that range the first time it showed up in production. We had an internal reporting dashboard — scheduled query every 30 minutes — and the first query of each cycle was consistently landing at 4 to 6 seconds. The provisioned equivalent answered in under 200ms. Don’t make my mistake of assuming “close enough” before you’ve actually measured it under your conditions.
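Measuring it is straightforward: time the first query after an idle gap against warm follow-ups. A minimal probe sketch — `run_query` is whatever issues a real (SigV4-signed) search against your collection; it’s stubbed here with a sleep so the example is self-contained.

```python
# Cold-start probe: compare first-query latency after idle against the
# warm median. The query function is a stub standing in for a real
# signed search request to your collection endpoint.

import time

def probe_cold_start(run_query, warm_queries=5):
    """Return (first_latency, median_warm_latency) in seconds."""
    start = time.perf_counter()
    run_query()
    first = time.perf_counter() - start

    warm = []
    for _ in range(warm_queries):
        start = time.perf_counter()
        run_query()
        warm.append(time.perf_counter() - start)
    warm.sort()
    return first, warm[len(warm) // 2]

# Stub simulating a collection that is slow once, then fast.
state = {"cold": True}
def fake_query():
    time.sleep(0.2 if state.pop("cold", False) else 0.01)

first, warm = probe_cold_start(fake_query)
print(f"first: {first:.2f}s, warm median: {warm:.3f}s")
```

Run something like this against your actual collection after a realistic idle gap — the first/warm ratio is the number that decides whether cold starts matter for your workload.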
Under sustained load, serverless latency is actually reasonably competitive with provisioned — assuming OCUs have scaled to meet demand. That said, the S3-backed storage architecture introduces inherent read amplification compared to provisioned clusters running on local NVMe attached directly to nodes. Heavy aggregations on large datasets will show up as elevated p99 latencies. That’s structural, not a tuning problem.
Throughput Limits
Serverless carries hard service limits that provisioned doesn’t. Default maximum indexing throughput sits at 2 GB per minute per collection — search concurrency is managed by AWS but not fully documented. Limit increases are requestable, but you’re operating inside a managed ceiling regardless. Provisioned throughput limits are essentially determined by your instance sizes and shard configuration — real control, not a request process. For high-volume log ingestion pushing 10+ GB per minute, serverless isn’t in the conversation.
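A quick fit check against that default ceiling. The 2 GB/min figure is from the limit described above; the peak-to-average ratio is an input you estimate from your own pipeline — this is a sanity check, not a capacity plan.

```python
# Back-of-envelope check against the default 2 GB/min serverless
# indexing ceiling. Inputs are your own estimates.

LIMIT_GB_PER_MIN = 2.0   # default per-collection limit (increases requestable)

def fits_default_limit(daily_ingest_gb: float, peak_to_avg: float = 3.0) -> bool:
    avg_gb_per_min = daily_ingest_gb / (24 * 60)
    return avg_gb_per_min * peak_to_avg <= LIMIT_GB_PER_MIN

print(fits_default_limit(500))    # ~1.04 GB/min at peak: under the ceiling
print(fits_default_limit(2000))   # ~4.2 GB/min at peak: over it
```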
Migration Path — Moving Between Models
This question comes up constantly: can you start on serverless and migrate to provisioned later? Yes — but it’s not automatic, and there’s no AWS-native tool that moves data between the two. They’re different services with different APIs and different index management models.
What migration actually looks like:
- Snapshot your serverless collection to S3 using the OpenSearch snapshot API.
- Restore from that snapshot into your provisioned domain.
- Verify index compatibility — serverless runs OpenSearch 2.x, and provisioned domains span various versions. Version mismatches can keep index features from translating cleanly.
- Update application config to point at the new provisioned endpoint.
- Run both in parallel through a validation window before decommissioning serverless.
The snapshot-and-restore approach works fine. Tedious, not dangerous — but the main gotcha is index mappings and custom analysis configurations, which need to be recreated on the provisioned side before restoring data. Budget a full engineering sprint for any non-trivial migration. Not an afternoon.
Going the other direction — provisioned to serverless — is essentially the same process. No magic button either way.
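The snapshot-and-restore steps above map to a handful of OpenSearch REST calls. A sketch of the paths and request bodies involved — the repository name, bucket, and IAM role ARN are placeholders, and you’d send these with your usual SigV4-signed HTTP client against your own endpoints.

```python
# Snapshot/restore request shapes for the migration steps above.
# All names (repo, bucket, role ARN, index pattern) are placeholders.

import json

REPO = "migration-repo"   # hypothetical snapshot repository name

# 1. Register an S3 repository on the source: PUT /_snapshot/{repo}
register_body = {
    "type": "s3",
    "settings": {
        "bucket": "my-migration-bucket",                            # placeholder
        "region": "us-east-1",
        "role_arn": "arn:aws:iam::123456789012:role/SnapshotRole",  # placeholder
    },
}

# 2. Take the snapshot: PUT /_snapshot/{repo}/{snapshot}?wait_for_completion=true
snapshot_path = f"/_snapshot/{REPO}/migration-1"

# 3. Restore on the provisioned domain: POST /_snapshot/{repo}/{snapshot}/_restore
restore_body = {
    "indices": "logs-*",             # restrict to the indices you validated
    "include_global_state": False,   # mappings/analyzers are recreated manually
}

print(snapshot_path)
print(json.dumps(restore_body))
```

Excluding global state in the restore body matters here: as noted above, mappings and custom analysis configs should already exist on the provisioned side before the data lands.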
One thing to flag before committing to serverless early: if your use case involves ISM policies, rollup jobs, or transforms, none of those exist in serverless. When you migrate to provisioned, you’ll rebuild that operational logic from scratch. Worth designing for that possibility upfront if migration is a realistic future state.
When to Avoid Serverless Entirely
Let me be direct here — the AWS documentation won’t be.
High-volume log ingestion. Running a centralized logging stack for production — application logs, VPC flow logs, CloudTrail events — serverless will either hit throughput limits or cost significantly more than provisioned at scale. Teams doing this well run OpenSearch Service with dedicated master nodes, UltraWarm for older data, and ISM policies handling lifecycle automatically. Serverless removes all of that.
Steady production search. E-commerce product search, document retrieval, any customer-facing search feature with predictable traffic — go provisioned. Size the cluster, tune shards, use Reserved Instances for long-term workloads. The operational overhead serverless removes is overhead worth keeping here, in exchange for predictable costs and better performance guarantees.
Cost-sensitive workloads at any reasonable scale. Past roughly 100 GB of active index data with consistent query patterns, serverless economics rarely win. Run the numbers for your specific workload. The $691/month floor per collection is the baseline — OCU consumption above that depends entirely on your traffic shape.
Workloads requiring advanced OpenSearch features. No ISM. No fine-grained rollover. No transforms. Security customization is more limited than provisioned — no resource-based access policies at the index level in the same way. AWS adds features over time, but these gaps are real as of now and worth checking before you commit.
Latency-sensitive applications. Cold starts and S3-backed storage make serverless a poor fit for anything where p99 latency matters and traffic isn’t constant enough to keep compute warm. Real-time autocomplete, session-sensitive search, anything with a user waiting on the other end — provisioned with appropriately sized instances will serve you better every time.
Burned by the minimum OCU cost on a client project once — four serverless collections across different environments, nobody ran the numbers beforehand. First invoice came in at $2,764 for collections that were barely touched. We moved three of them onto a single shared provisioned domain within the week. Lesson learned, and honestly a good argument for doing the math before you deploy anything.
OpenSearch Serverless is a genuinely useful service for the right workloads — removes real operational overhead, scales automatically, and makes sense for teams that prioritize simplicity with genuinely unpredictable traffic. That’s what makes it endearing to us infrastructure folks who’ve been manually resizing clusters at 2am. But it’s not a default choice. Know your traffic patterns, run the cost comparison, and make the decision deliberately rather than because serverless sounded easier in the moment.