# Amazon EBS

Amazon EBS is AWS’s block storage for EC2. It is fast, durable, and flexible, but it is also where surprise costs creep in from **over-provisioned IOPS/throughput** and **forgotten snapshots**. This page covers what you’re using, what you’re paying, what you should do next, and which AWS-native tools help you get there.

***

### 🚀 What is EBS?

Amazon **Elastic Block Store (EBS)** provides persistent, low-latency **block volumes** for EC2. Volumes are **zonal** (attach within the same AZ), behave like disks for filesystems and databases, support **online resize/type/perf changes (Elastic Volumes)**, and offer **incremental snapshots** for backup/DR.

**Features**

* Provisioned performance: choose **volume types**, **IOPS**, and **throughput** to match workloads
* **Snapshots**: incremental, copy/share across accounts/Regions; **Archive** for long-term retention
* **Encryption by default** with KMS
* **Multi-Attach** (io1/io2) for clustered designs
* Broad regional availability; designed for high durability & availability

***

### ⚙️ Volume Types — pick the right drive

| Type                               | Primary Use Case                            | Notes                                                                                                                            |
| ---------------------------------- | ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
| **gp3 (General Purpose SSD)**      | Boot volumes, app servers, most DB/dev/test | Default choice. Baseline \~**3,000 IOPS / 125 MiB/s**. Independently scale IOPS/throughput. Great **gp2 → gp3** upgrade path.    |
| **io2 Block Express (PIOPS SSD)**  | Mission-critical, latency-sensitive DBs     | Highest single-volume ceilings (e.g., **up to 256k IOPS / 4,000 MiB/s**). Supports **Multi-Attach**; best with Nitro instances.  |
| **io2 (PIOPS SSD)**                | High-IOPS OLTP, consistent latency          | **Up to 64k IOPS / 1,000 MiB/s**. Durable PIOPS; **Multi-Attach**.                                                               |
| **st1 (Throughput-optimized HDD)** | Large, sequential I/O (ETL, logs, scans)    | **High throughput** (hundreds of MiB/s). Not for random I/O or boot.                                                             |
| **sc1 (Cold HDD)**                 | Cold, infrequently accessed sequential data | Lowest **$/GiB** HDD. Not for boot; long minimums and lower performance.                                                         |

> **Rule of thumb:** Random/low-latency → **SSD (gp3/io2)**. Large sequential → **HDD (st1/sc1)**. Start with **gp3** unless you have measured PIOPS needs.

***

### 🧬 Performance & advanced features

| Enhancement                     | What it does                                | Where to use it                     | Notes                                                                           |
| ------------------------------- | ------------------------------------------- | ----------------------------------- | ------------------------------------------------------------------------------- |
| **Elastic Volumes**             | Online **resize/type/perf** changes         | Routine rightsizing                 | No detach for supported ops; ideal for **gp2→gp3** or gp3 tuning.               |
| **Multi-Attach**                | Attach one volume to **multiple instances** | Clustered apps (e.g., RAC, quorum)  | **io1/io2 only**, same AZ, up to 16 Nitro instances; cluster-aware FS required. |
| **Fast Snapshot Restore (FSR)** | **Instant** full-speed restores             | Cutovers, DR drills, fleet rollouts | Billed while enabled per snapshot/AZ; toggle only where RTO needs it.           |
| **Snapshot Archive**            | Low-cost snapshot storage                   | Long-term retention                 | Much cheaper; slower retrieval & minimum duration.                              |
| **Recycle Bin / Snapshot Lock** | Prevents accidental/malicious deletion      | Compliance, ransomware resilience   | Time-based recovery and **WORM** governance.                                    |

***

### 🏛️ Attachment & deployment patterns

| Pattern              | When to use          | Notes                                                                                                |
| -------------------- | -------------------- | ---------------------------------------------------------------------------------------------------- |
| **Single-Attach**    | Most EC2 workloads   | Default; pair with **EBS-optimized** instances for consistent bandwidth/latency.                     |
| **Multi-Attach**     | Shared-disk clusters | **io1/io2**, same AZ; application coordinates writes.                                                |
| **RAID0 (striping)** | Very high throughput | Prefer **io2 Block Express** if one big volume can meet needs; stripe only when documented/required. |

> EBS is **zonal**—use **snapshots** (and/or **AWS Backup**) for cross-AZ/Region copy and recovery patterns.

***

### 🧠 EBS optimization strategy (FinOps + reliability)

| Strategy                        | Actions                                                                     | Tools / Notes                                                                          |
| ------------------------------- | --------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
| **Migrate gp2 → gp3**           | Convert in place with Elastic Volumes; keep size; raise perf only as needed | Typical **15–20% $/GiB** savings vs gp2; independent IOPS/throughput knobs             |
| **Right-size IOPS/throughput**  | Align to **p95/p99** + headroom (not “worst day ever”)                      | CloudWatch (**IOPS**, **queue depth**, **throughput%**); Compute Optimizer volume recs |
| **Tune by access pattern**      | Sequential → **st1/sc1**; Random/latency → **gp3/io2**                      | Don’t boot from HDD; validate burst/credit behavior on HDD                             |
| **Use EBS-optimized instances** | Ensure EC2 EBS bandwidth isn’t the bottleneck                               | Many Nitro families include it; check instance caps before chasing volume limits       |
| **Snapshot hygiene**            | **Lifecycle** policies; archive or delete stale snaps                       | **DLM** or **AWS Backup**; tag + expire                                                |
| **FSR only where needed**       | Enable for migrations/drills; disable afterwards                            | Avoid background metering                                                              |
| **Find & delete orphans**       | Remove **detached volumes** & old snaps                                     | Tags + Resource Explorer/Config/Scripts to list and clean                              |

**Common bill-busters**

* Paying for **PIOPS you don’t use** (io1/io2) or gp3 add-ons while utilization is low
* **Forgotten snapshots** (including cross-Region copies)
* **FSR left on** after cutovers
* **Detached volumes** with no owner tags

***

### 💸 Pricing model & gotchas

* **Volumes**: billed per **provisioned GiB-month**
  * **gp3** includes a baseline; **extra IOPS/throughput** billed separately
  * **io1/io2** bill **PIOPS** in addition to GiB
* **Snapshots**: incremental, billed per **GiB-month**; **Archive** is cheaper with slower retrieval and a minimum duration
* **FSR**: metered per **snapshot/AZ** while enabled
* **No Savings Plans/RIs** for EBS itself — storage savings come from **rightsizing + lifecycle + cleanup**

> Keep **regional prices** out of docs; link to pricing and model with your actual metrics.
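
The gp3 billing rule above (GiB plus anything over the included baseline) can be sketched as a small cost model. This is a minimal illustration, not a pricing calculator: the `Gp3Prices` class and the unit prices used with it are hypothetical placeholders, so plug in your Region's published rates.

```python
from dataclasses import dataclass

GP3_BASELINE_IOPS = 3000    # included with every gp3 volume
GP3_BASELINE_MIBPS = 125    # included with every gp3 volume


@dataclass(frozen=True)
class Gp3Prices:
    """Hypothetical unit prices; replace with your Region's actual rates."""
    per_gib_month: float
    per_extra_iops_month: float
    per_extra_mibps_month: float


def gp3_monthly_cost(size_gib, iops, throughput_mibps, prices):
    """Monthly cost = storage + IOPS above baseline + throughput above baseline."""
    extra_iops = max(0, iops - GP3_BASELINE_IOPS)
    extra_mibps = max(0, throughput_mibps - GP3_BASELINE_MIBPS)
    return (
        size_gib * prices.per_gib_month
        + extra_iops * prices.per_extra_iops_month
        + extra_mibps * prices.per_extra_mibps_month
    )
```

Feeding it the provisioned values from your inventory makes the "add-on" share of each volume's bill explicit, which is usually where the savings are.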

***

### ⏱️ Automation patterns

* **Amazon DLM**: automate snapshot/AMI create-retain-delete (incl. cross-Region/account copies)
* **AWS Backup**: centralized policies, **vault lock**, cross-account protections, compliance reporting
* **EventBridge + Lambda**: detect/remediate unencrypted volumes, **public snapshots**, or lingering **FSR**

***

### 🔒 Security & compliance

* Turn on **EBS encryption by default**; scope **KMS** keys per env/app; rotate & audit key usage
* **Snapshot Lock** (governance/compliance) and **Backup Vault Lock** for immutability
* Block **public snapshots** at account/Region level; share explicitly and temporarily
* Enforce least-privilege IAM on `ec2:CreateVolume`, `ec2:CreateSnapshot`, `ec2:ModifyVolume`, and KMS actions

***

### 📊 Monitoring & tools

* **CloudWatch (EBS)**: `VolumeReadOps`/`VolumeWriteOps`, `VolumeReadBytes`/`VolumeWriteBytes`, `VolumeThroughputPercentage` (Provisioned IOPS volumes), `VolumeQueueLength`, `BurstBalance` (gp2, st1, sc1), latency where available
* **EC2 instance metrics**: confirm you’re not saturating instance **EBS bandwidth/IOPS caps**
* **Compute Optimizer (EBS)**: per-volume type/size/IOPS/throughput recommendations
* **Cost Explorer / CUR**: tag volumes/snapshots; watch **gp3 add-ons**, snapshot growth, **archive vs standard** tiers
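
CloudWatch reports the ops and bytes metrics as totals per period, so they need converting before you compare them with provisioned IOPS or throughput. A rough sketch of that arithmetic follows; the function names are illustrative, and it assumes you pass in the `Sum` statistic for one period.

```python
def average_iops(read_ops_sum, write_ops_sum, period_seconds):
    """Average IOPS over one period from the Sum of VolumeReadOps and VolumeWriteOps."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_ops_sum + write_ops_sum) / period_seconds


def average_throughput_mibps(read_bytes_sum, write_bytes_sum, period_seconds):
    """Average MiB/s over one period from the Sum of VolumeReadBytes and VolumeWriteBytes."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_bytes_sum + write_bytes_sum) / period_seconds / (1024 * 1024)


def utilization(observed, provisioned):
    """Fraction of provisioned performance actually used (1.0 means fully used)."""
    if provisioned <= 0:
        raise ValueError("provisioned must be positive")
    return observed / provisioned
```

For example, 180,000 total operations in a 300-second period is 600 IOPS, which is 20% of the gp3 baseline.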

***

### 🧪 Practical selection cheat-sheet

* **Default**: **gp3** for most; raise IOPS/throughput only with evidence
* **High IOPS / tight latency**: **io2** or **io2 Block Express** (prefer a single large Block Express volume over many striped gp3 volumes when feasible)
* **Big sequential reads/writes**: **st1**; **cold sequential**: **sc1**
* **Shared-disk clusters**: **io1/io2** with **Multi-Attach** and a cluster-aware filesystem
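
The cheat-sheet above can be encoded as a tiny decision helper, for example to flag volumes in an inventory whose type does not match their declared workload. This is only a sketch of the rules as listed; the function name and its flags are made up for illustration.

```python
def suggest_volume_type(*, sequential=False, cold=False,
                        latency_critical=False, shared_disk=False, boot=False):
    """Return a starting-point EBS volume type for a workload description."""
    # Shared-disk clusters and tight-latency databases need Provisioned IOPS SSD.
    if shared_disk or latency_critical:
        return "io2"
    # HDD types suit large sequential I/O, but never boot volumes.
    if sequential and not boot:
        return "sc1" if cold else "st1"
    # Everything else starts on gp3 until metrics say otherwise.
    return "gp3"
```

Treat the result as a default to challenge with CloudWatch data, not as a final answer.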

***

### ✅ EBS FinOps Checklist

* [ ] Standardize on **gp3**; migrate **gp2 → gp3** at scale using **Elastic Volumes** (in-place), then dial IOPS/throughput only where metrics prove the need.
* [ ] **Right-size performance** to **p95/p99** demand + modest headroom; avoid “worst-day-ever” provisioning. Re-check monthly.
* [ ] Verify **instance EBS bandwidth/IOPS caps** (EBS-optimized Nitro where possible) before paying for more volume IOPS you can’t use.
* [ ] Put **snapshot lifecycle** everywhere (create → retain → delete); **Archive** long-term snapshots; purge orphans and stale cross-Region copies.
* [ ] Enable **Fast Snapshot Restore (FSR)** only for migrations/DR drills; **disable** immediately afterward.
* [ ] Enforce **encryption by default**; scope **KMS** keys per env/app; block **public snapshots**; use **Snapshot Lock**/**Backup Vault Lock** for immutability.
* [ ] Prefer **io2 Block Express** when single-volume ceilings matter; avoid unnecessary striping unless documented and tested.
* [ ] Tag **owners/teams** on volumes & snapshots; build **Budgets/alerts** for snapshot growth, gp3 add-ons, FSR metering.
* [ ] Monitor **VolumeThroughput%**, **QueueLength**, **IOPS (provisioned vs. used)**; watch for detached volumes; clean up regularly.
* [ ] Use **Compute Optimizer (EBS)** to surface type/size/IOPS/throughput downsizing opportunities.

***

### 🧠 AWS EBS Cost Optimization Challenges

EBS looks inexpensive until **PIOPS over-provisioning**, **snapshot sprawl**, **FSR left on**, and **instance bottlenecks** ambush the bill. Here are the non-trivial traps teams hit—and fixes that actually move the needle.

***

**Q1: “gp2 everywhere” is simple—why switch?**

gp2 ties performance to size and encourages oversizing.

**✅ Solution**

* Migrate **in place** to **gp3**; decouple size from IOPS/throughput.
* Start at gp3 baseline; add IOPS/throughput only where CloudWatch shows a need.
* Track before/after $/GiB and latency to prove savings.
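
One way to plan the in-place change is to build the `ModifyVolume` parameters per volume so that gp3 IOPS never drops below what the gp2 volume was entitled to. The sketch below assumes the gp2 rule of 3 IOPS per GiB (minimum 100, maximum 16,000); the helper names are illustrative. Throughput defaults to the gp3 baseline, which can be lower than what a large gp2 volume delivered, so check observed throughput first.

```python
def gp2_baseline_iops(size_gib):
    """gp2 baseline: 3 IOPS per GiB, clamped to the 100..16,000 range."""
    return max(100, min(3 * size_gib, 16000))


def gp3_modify_request(volume_id, size_gib, throughput_mibps=125):
    """Build keyword arguments for an EC2 ModifyVolume call that converts gp2 to gp3.

    IOPS is set to the larger of the gp3 baseline (3,000) and the gp2 entitlement,
    so the volume does not lose IOPS during migration.
    """
    return {
        "VolumeId": volume_id,
        "VolumeType": "gp3",
        "Iops": max(3000, gp2_baseline_iops(size_gib)),
        "Throughput": throughput_mibps,
    }
```

With boto3 the result could be passed as `boto3.client("ec2").modify_volume(**request)`. After migration, dial IOPS back toward the baseline where CloudWatch shows the headroom is unused.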

***

**Q2: We’re paying for PIOPS (io1/io2) we barely use.**

Provisioned IOPS add up when utilization is low.

**✅ Solution**

* Compare **provisioned vs. observed** IOPS/Throughput%/QueueLength in CloudWatch.
* Reduce IOPS on gp3 or move io1/io2 → **gp3** where latency permits.
* Use **Compute Optimizer (EBS)** to identify downsizing candidates.
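
Sizing to a percentile plus headroom, rather than to the worst observed spike, can be sketched as follows. This uses a simple nearest-rank percentile over IOPS samples you have already pulled from CloudWatch; the 20% headroom and 3,000 IOPS floor are assumptions chosen to match the gp3 baseline, not AWS recommendations.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 1..100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommended_iops(samples, pct=95, headroom_pct=20, floor=3000):
    """Provisioning target: chosen percentile plus headroom, never below the floor."""
    target = percentile(samples, pct) * (100 + headroom_pct) / 100
    return max(floor, math.ceil(target))
```

Comparing this target with the currently provisioned value gives a concrete, defensible number for each downsizing ticket.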

***

**Q3: We bought more IOPS but performance is still capped.**

Often the **instance** EBS bandwidth/IOPS limit—not the volume—is the bottleneck.

**✅ Solution**

* Check your instance’s **EBS-optimized caps**; move to a Nitro class with more headroom.
* Consolidate to **io2 Block Express** if you need higher single-volume ceilings.
* Watch **QueueLength** and **VolumeThroughput%** to confirm relief.
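
The bottleneck logic is simple enough to check before buying anything: the instance can never deliver more than its own EBS cap, no matter how much the attached volumes are provisioned for. A minimal sketch, with hypothetical function names and caps you would look up for your instance type:

```python
def effective_iops(volume_iops, instance_iops_cap):
    """IOPS the instance can actually drive across all attached volumes."""
    return min(sum(volume_iops), instance_iops_cap)


def unusable_iops(volume_iops, instance_iops_cap):
    """Provisioned IOPS that sit above the instance cap and cannot be used."""
    return max(0, sum(volume_iops) - instance_iops_cap)
```

For instance, two 16,000 IOPS volumes behind an instance capped at 20,000 IOPS leave 12,000 provisioned IOPS that the instance cannot reach.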

***

**Q4: Snapshot sprawl is eating the bill.**

Long retention, duplicates, and cross-Region copies creep in.

**✅ Solution**

* Enforce **Data Lifecycle Manager (DLM)** or **AWS Backup** policies (create → retain → delete).
* **Archive** compliance/long-term snapshots; delete orphans and stale cross-Region copies.
* Tag owners; alert on growth by team/application.
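
A create → retain → delete policy can be dry-run against an existing snapshot inventory before it is enforced. The sketch below takes dictionaries shaped like `DescribeSnapshots` results (`SnapshotId`, `StartTime`); the 90-day and 365-day thresholds are example values, not defaults of DLM or AWS Backup.

```python
from datetime import datetime, timedelta, timezone


def classify_snapshot(start_time, now, archive_after_days=90, delete_after_days=365):
    """Return 'retain', 'archive', or 'delete' based on snapshot age in days."""
    age_days = (now - start_time).days
    if age_days >= delete_after_days:
        return "delete"
    if age_days >= archive_after_days:
        return "archive"
    return "retain"


def plan_lifecycle(snapshots, now, archive_after_days=90, delete_after_days=365):
    """Group snapshot IDs by the action the policy would take."""
    plan = {"retain": [], "archive": [], "delete": []}
    for snap in snapshots:
        action = classify_snapshot(
            snap["StartTime"], now, archive_after_days, delete_after_days
        )
        plan[action].append(snap["SnapshotId"])
    return plan
```

Reviewing the resulting plan per team tag is a low-risk way to agree on retention before any snapshot is touched.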

***

**Q5: We left Fast Snapshot Restore (FSR) on—oops.**

FSR is billed per snapshot × AZ × hour.

**✅ Solution**

* Enable **only** for migrations/DR tests; **disable** immediately after.
* Add budgets/alerts or an EventBridge rule to auto-turn-off FSR post-window.
* Review monthly for stray enabled FSR entries.
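
The monthly review can be automated by scanning FSR entries and flagging any that have stayed enabled past the agreed window. This sketch works on dictionaries shaped like `DescribeFastSnapshotRestores` results (`SnapshotId`, `AvailabilityZone`, `State`, `EnabledTime`); the 24-hour window is an assumed policy value.

```python
from datetime import datetime, timedelta, timezone


def stale_fsr_entries(entries, now, max_age=timedelta(hours=24)):
    """Return (snapshot_id, availability_zone) pairs enabled longer than max_age."""
    stale = []
    for entry in entries:
        if entry.get("State") != "enabled":
            continue
        enabled_at = entry.get("EnabledTime")
        if enabled_at is not None and now - enabled_at > max_age:
            stale.append((entry["SnapshotId"], entry["AvailabilityZone"]))
    return stale
```

Each pair returned is one billable snapshot × AZ combination; a scheduled Lambda could pass them to the EC2 `DisableFastSnapshotRestores` API once the window has closed.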

***

**Q6: Cross-Region snapshot copies are pricier than expected.**

You pay to copy and then to store in the destination Region.

**✅ Solution**

* Reduce frequency/volume of copies; filter by **tags/prefix** so you only copy what matters.
* **Archive** rarely restored DR copies; document RPO/RTO to justify cadence.
* Periodically expire historical copies beyond policy.

***

**Q7: We oversize volumes because workloads are sequential.**

SSD random performance isn’t needed for big scans/logs.

**✅ Solution**

* Use **st1** (throughput-optimized HDD) for warm sequential workloads and **sc1** for cold sequential data.
* Never boot from HDD; don’t expect good random I/O on st1/sc1.
* Validate throughput needs vs. HDD burst/credit behavior.

***

**Q8: Shared-disk cluster needs are forcing expensive designs.**

Apps sometimes need a shared block device.

**✅ Solution**

* Use **Multi-Attach** on **io1/io2** (same AZ, Linux only) with a **cluster-aware filesystem** or app-level fencing.
* Test failover and fencing thoroughly; monitor write conflicts and latency.
* Compare TCO against non-block patterns if feasible.

***

**Q9: Detached volumes and public/shared snapshots keep showing up.**

Unattached volumes bill forever; overly permissive snapshots add risk and cost.

**✅ Solution**

* Schedule discovery & cleanup (Config/Resource Explorer/automation) of **detached** volumes and stale snapshots.
* Enforce org-level controls to **block public snapshots**; share only temporarily and explicitly.
* Require **owner tags** at creation; audit monthly.
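
Discovery of detached and unowned volumes can start from a plain `DescribeVolumes` listing. The sketch below assumes dictionaries with `VolumeId`, `State`, and `Tags` as that API returns them, where a `State` of `available` means the volume is not attached; the `owner` tag key is an example convention.

```python
def find_cleanup_candidates(volumes, required_tag="owner"):
    """Split volumes into detached ones and ones missing the required tag."""
    detached, untagged = [], []
    for volume in volumes:
        tag_keys = {tag["Key"] for tag in volume.get("Tags", [])}
        if volume["State"] == "available":
            detached.append(volume["VolumeId"])
        if required_tag not in tag_keys:
            untagged.append(volume["VolumeId"])
    return {"detached": detached, "untagged": untagged}
```

Volumes that appear in both lists are the safest first targets: snapshot them if policy requires, then delete.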

***

**Q10: How do we prove savings and performance gains?**

Stakeholders want numbers, not promises.

**✅ Solution**

* Report before/after **$ per GiB**, **PIOPS utilization**, **VolumeThroughput%**, **QueueLength**, and **restore RTO**.
* Tie snapshot growth to lifecycle policy compliance by team/tag.
* Keep a quarterly loop: **Compute Optimizer → changes → Cost Explorer/CUR → verify**.

***

#### ⚙️ Quick Wins

* **Fleet-wide gp2 → gp3** with Elastic Volumes; reduce add-ons where utilization is low.
* Turn on **snapshot lifecycle** + **Archive** for long-term retention; delete orphans.
* Verify **instance EBS caps**; right-size instance families before buying more PIOPS.
* Keep **FSR** off by default; enable only during cutovers/tests and auto-disable after.
* Build **dashboards & alerts** for Throughput%, QueueLength, PIOPS utilization, snapshot growth (by team/tag).

***

### 📚 Handy references

* **EBS pricing**: volumes, snapshots, Archive, FSR, Time-based Copy, Clones. [Amazon Web Services, Inc.](https://aws.amazon.com/ebs/pricing/)
* **Volume types & limits** (gp3, io2, st1/sc1). [AWS Documentation](https://docs.aws.amazon.com/ebs/latest/userguide/ebs-volume-types.html)
* **Compute Optimizer (EBS)**: recommendations by type/size/IOPS/throughput. [AWS Documentation](https://docs.aws.amazon.com/compute-optimizer/latest/ug/view-ebs-recommendations.html)
* **Lifecycle/Backup**: DLM & AWS Backup docs. [AWS Documentation](https://docs.aws.amazon.com/ebs/latest/userguide/snapshot-lifecycle.html)
* **Community how-tos**: gp2→gp3 migration tips and real-world savings. [Amazon Web Services, Inc.](https://aws.amazon.com/blogs/storage/migrate-your-amazon-ebs-volumes-from-gp2-to-gp3-and-save-up-to-20-on-costs/)

***

### Cost View in Cost Explorer

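* **Filter** → `Service = EC2 - Other` (Cost Explorer reports EBS charges there, not under a separate EBS service), then narrow with `Usage Type Group` values such as `EC2: EBS - SSD(gp3)` or `EC2: EBS - Snapshots`
* **Group by** → `Usage type`, `Region`, `Tag (env/app/team)`, `Linked Account`
* The same `EC2 - Other` view also carries related **EBS-Optimized** surcharges and data movement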

#### Cost Explorer: Fast-Triage Usage Types

| Usage Type Pattern              | Likely Meaning                                            |
| ------------------------------- | --------------------------------------------------------- |
| `EBS:VolumeUsage.gp3` / `.io2`  | Baseline $/GiB volume charges                             |
| `EBS:PIOPS.io2`                 | Provisioned IOPS add-on (io1/io2)                         |
| `EBS:Throughput.gp3`            | gp3 throughput add-on                                     |
| `EBS:SnapshotUsage`             | Standard snapshot storage                                 |
| `EBS:SnapshotArchiveStorage`    | Archive snapshot storage                                  |
| `EBS:FastSnapshotRestore`       | FSR metering per snapshot/AZ                              |
| `EC2: EBSOptimized` (EC2-Other) | Instance-side EBS optimization surcharge (older families) |

**Action tips**

* Sort volumes by **low utilization** vs **high provisioned IOPS/throughput**
* Tag & surface **detached volumes** and **stale snapshots**
* Trend **snapshot growth** by team/project and enforce lifecycle targets

***

### Deep Dive with CUR

Columns to keep handy:

* `line_item_resource_id` (volume/snapshot IDs)
* `product_volume_api_name` (`gp3`, `io2`, `st1`, …)
* `line_item_usage_type` (VolumeUsage, PIOPS, SnapshotUsage, FSR)
* `resource_tags_user_*` in Athena, `resourceTags/user:*` in CSV exports (team, app, env, owner)
* `line_item_unblended_cost` / `line_item_blended_cost`

Example prompt: **“Find gp3 volumes with extra IOPS/throughput provisioned but <30% utilization over last 14 days”** → candidates to dial down.
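
As a starting point for this kind of analysis, the columns above can be rolled up from a CUR extract in a few lines. The sketch below assumes rows already loaded as dictionaries keyed by the CUR column names (for example via `csv.DictReader`) and keeps only usage types containing `EBS:`; the function name and the sample values in the usage note are made up.

```python
from collections import defaultdict
from decimal import Decimal


def ebs_cost_by(rows, key):
    """Sum unblended EBS cost per value of the given CUR column."""
    totals = defaultdict(Decimal)
    for row in rows:
        # Usage types may carry a Region prefix, so match on the EBS marker anywhere.
        if "EBS:" not in row["line_item_usage_type"]:
            continue
        totals[row[key]] += Decimal(row["line_item_unblended_cost"])
    return dict(totals)
```

Calling `ebs_cost_by(rows, "line_item_resource_id")` gives spend per volume or snapshot, while `ebs_cost_by(rows, "line_item_usage_type")` separates storage from IOPS, throughput, and snapshot charges. Joining the per-volume result with CloudWatch utilization is what turns the example prompt into a candidate list.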

***

### 📚 References

* EBS **pricing** (volumes, PIOPS, snapshots, archive, FSR)
* EBS **volume types & limits**; Multi-Attach docs
* **Elastic Volumes**, **DLM**, **AWS Backup**, **Snapshot Lock**, **Recycle Bin**
* CloudWatch metrics & **EBS-optimized** instances
* **Compute Optimizer for EBS**; CUR field guide for EBS spend

> *Features and limits evolve. Validate in your Region before production changes.*

