# Amazon EC2

### 🔗 **Quicklinks (Bookmark):**

* Cost Explorer: [AWS EC2 by Instance type and Running hours](https://us-east-1.console.aws.amazon.com/costmanagement/home?region=us-east-1#/cost-explorer?chartStyle=STACK\&costAggregate=unBlendedCost\&endDate=2025-08-31\&excludeForecasting=false\&filter=%5B%7B%22dimension%22:%7B%22id%22:%22UsageTypeGroup%22,%22displayValue%22:%22Usage%20type%20group%22%7D,%22operator%22:%22INCLUDES%22,%22values%22:%5B%7B%22value%22:%22EC2:%20Running%20Hours%22,%22displayValue%22:%22EC2:%20Running%20Hours%22%7D%5D%7D%5D\&futureRelativeRange=CUSTOM\&granularity=Daily\&groupBy=%5B%22InstanceType%22%5D\&historicalRelativeRange=LAST_MONTH\&isDefault=true\&reportMode=STANDARD\&reportName=New%20cost%20and%20usage%20report\&showOnlyUncategorized=false\&showOnlyUntagged=false\&startDate=2025-08-01\&usageAggregate=usageQuantity\&useNormalizedUnits=false)
* Reservation Coverage: [AWS EC2 RI coverage](https://us-east-1.console.aws.amazon.com/costmanagement/home?region=us-east-1#/ri/coverage?chartStyle=LINE\&timePeriod=%7B%22timePeriodStart%22%3A%222025-06-01%22%2C%22timePeriodEnd%22%3A%222025-09-01%22%2C%22historicalDateRangeOptionId%22%3A%22LAST_3_MONTHS%22%7D\&granularity=DAILY\&target=100\&service=Amazon+Elastic+Compute+Cloud+-+Compute)
* Savings Plan Coverage: [AWS EC2 SP coverage](https://us-east-1.console.aws.amazon.com/costmanagement/home?region=us-east-1#/savings-plans/coverage?chartStyle=LINE\&timePeriod=%7B%22timePeriodStart%22%3A%222025-06-01%22%2C%22timePeriodEnd%22%3A%222025-09-01%22%2C%22historicalDateRangeOptionId%22%3A%22LAST_3_MONTHS%22%7D\&granularity=DAILY\&target=100\&service=Amazon+Elastic+Compute+Cloud+-+Compute)
* Compute Rightsizing: [AWS Compute Optimizer Rightsizing](https://us-east-1.console.aws.amazon.com/compute-optimizer/home?region=us-east-1#/resources-lists/ec2)
* Idle Compute: [AWS Compute Optimizer Idle](https://us-east-1.console.aws.amazon.com/compute-optimizer/home?region=us-east-1#/resources-lists/idle)
* EC2 Pricing table: [AWS EC2 Pricing](https://cloudprice.net/aws/ec2)
* EC2 CUR Queries: [Query CUR on Athena](https://catalog.workshops.aws/cur-query-library/en-US/queries/compute)

**Amazon EC2** is the backbone of AWS compute: scalable, customizable, and dangerously easy to overspend on.

Let’s break it down by:

→ What you’re using\
→ What you’re paying\
→ What you should be doing\
→ And the **AWS-native tools** to make it happen.

***

### 🚀 What is EC2?

Amazon Elastic Compute Cloud (EC2) provides resizable virtual servers in the cloud.

* Available in **every AWS region**
* Billed by the **second** or **hour**
* Can run Linux, Windows, or custom AMIs
* Comes in dozens of **instance families** across generations

***

### ⚙️ Instance Families — Pick the Right Hammer

| Family                        | Use Case                      | Notes                              |
| ----------------------------- | ----------------------------- | ---------------------------------- |
| `t` (Burstable)               | Dev, test, low-traffic apps   | Low average CPU with short bursts  |
| `m` (General Purpose)         | Web apps, small services      | Good default starting point        |
| `c` (Compute Optimized)       | High CPU workloads            | Perfect for encoding, ML inference |
| `r` / `x` (Memory Optimized)  | DBs, caches, SAP              | Watch memory:cost ratio            |
| `i` / `d` (Storage)           | OLTP, NoSQL, logs, IOPS-heavy | High local instance-store I/O      |
| `g`, `inf`, `p` (Accelerated) | AI/ML, HPC                    | GPUs or Inferentia, very expensive |

***

### 🧬 Instance Generations

<table><thead><tr><th width="276.62890625">Generation</th><th width="136.75390625">Architecture</th><th width="139.68359375">OS Support</th><th>⚠️ Caveats</th></tr></thead><tbody><tr><td><strong>Graviton</strong> (<code>m6g</code>, <code>t4g</code>, etc.)</td><td>ARM</td><td>✅ Linux only</td><td>❌ No Windows, may need recompiled apps</td></tr><tr><td><strong>x86 Intel/AMD</strong> (<code>m5</code>, <code>c6a</code>, etc.)</td><td>x86</td><td><p>✅ Linux, </p><p>✅ Windows</p></td><td>More costly, but universal compatibility</td></tr></tbody></table>

> Run **Linux**? Try **Graviton**.\
> Run **Windows** or legacy binaries? Stick to **x86**.

***

### 🏛️ Tenancy Options

<table><thead><tr><th width="203.65234375">Tenancy Type</th><th width="176.1875">Use When</th><th>Notes</th></tr></thead><tbody><tr><td><strong>Shared</strong></td><td>Default</td><td>✅ Best for 90% of workloads</td></tr><tr><td><strong>Dedicated Instance</strong></td><td>You need isolation</td><td>⚠️ Slightly more expensive</td></tr><tr><td><strong>Dedicated Host</strong></td><td>BYOL licensing</td><td>💸 Most expensive; billed per host, with socket/core visibility for per-socket or per-core licenses</td></tr></tbody></table>

***

### 🧠 EC2 Rightsizing Strategy

| Strategy                   | What to Do                                                             | Tools / Notes                                                                                                                                                                |
| -------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ✅ **Quick Wins**           | Find underutilized instances (e.g. CPU < 10%)                          | [Compute Optimizer](https://console.aws.amazon.com/compute-optimizer/home), [Rightsizing in Cost Explorer](https://console.aws.amazon.com/cost-management/home#/rightsizing) |
| 🔁 **Same-Family Resize**  | Downsize within current instance family (e.g. `m5.2xlarge → m5.large`) | No re-architecture needed                                                                                                                                                    |
| 🔄 **Cross-Family Change** | Migrate to cost-effective families (e.g. `m5 → t3` or `m5 → m6g`)      | Use Graviton for Linux (⚠️ no Windows support)                                                                                                                               |
| 💤 **Shut Down Idle**      | Stop non-prod or idle EC2s automatically during off-hours              | Use tags + [Instance Scheduler](https://docs.aws.amazon.com/solutions/latest/instance-scheduler/)                                                                            |

> 💡 Review and adjust sizing monthly — usage changes, so should your provisioning.
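
The "Quick Wins" row can be approximated offline once CPU metrics are exported. The sketch below is illustrative only: the instance IDs and samples are made up, and the 10% threshold simply mirrors the example in the table. It looks at averages alone, so it will miss short peaks that a tool like Compute Optimizer would account for.

```python
from statistics import mean


def flag_underutilized(cpu_by_instance, threshold_pct=10.0):
    """Return sorted instance IDs whose average CPU is below threshold_pct."""
    return sorted(
        instance_id
        for instance_id, samples in cpu_by_instance.items()
        if samples and mean(samples) < threshold_pct
    )


# Hypothetical daily average CPUUtilization values, in percent.
sample = {
    "i-aaa": [3.0, 4.5, 2.0],
    "i-bbb": [45.0, 60.0, 52.0],
    "i-ccc": [9.0, 8.0, 12.0],
}
print(flag_underutilized(sample))  # ['i-aaa', 'i-ccc']
```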

***

### 💸 Purchase Model Optimization

<table><thead><tr><th width="185.15625">Model</th><th width="126.6328125">Savings</th><th>Best For</th><th>Risk</th></tr></thead><tbody><tr><td><strong>On-Demand</strong></td><td>0%</td><td>Dev/test, unpredictable workloads</td><td>💸 High cost</td></tr><tr><td><strong>Savings Plans</strong></td><td>30–66%</td><td>Steady-state compute</td><td>⚠️ 1- or 3-year commitment</td></tr><tr><td><strong>Reserved Instances</strong></td><td>30–72%</td><td>Predictable, type-specific workloads</td><td>⚠️ Less flexibility</td></tr><tr><td><strong>Spot</strong></td><td>70–90%</td><td>Fault-tolerant, stateless apps</td><td>⚠️ Can be interrupted anytime</td></tr></tbody></table>

➡ Use **Savings Plans** for baseline.\
➡ Use **Spot** for scale-out workers.

***

### ⏱ Scheduled Usage

Stop non-prod resources when not in use.

Tools:

* [AWS Instance Scheduler](https://docs.aws.amazon.com/solutions/latest/instance-scheduler/)
* Lambda + EventBridge + Tags
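
A minimal sketch of the Lambda + EventBridge + Tags option, assuming a hypothetical `Schedule=office-hours` tag and an EventBridge rule (not shown) that invokes the function in the evening. The EC2 client is passed in so the helper can be exercised without AWS access; a matching "start" function would mirror it with `start_instances`.

```python
def stop_tagged_instances(ec2, tag_key="Schedule", tag_value="office-hours"):
    """Stop running instances carrying the given tag; return the IDs stopped.

    `ec2` is a boto3 EC2 client (or a stand-in with the same methods).
    The tag key and value are illustrative, not an AWS convention.
    """
    filters = [
        {"Name": f"tag:{tag_key}", "Values": [tag_value]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
    instance_ids = []
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate(Filters=filters):
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                instance_ids.append(instance["InstanceId"])
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
    return instance_ids


def lambda_handler(event, context):
    # Imported lazily so the helper above stays testable without boto3.
    import boto3

    return {"stopped": stop_tagged_instances(boto3.client("ec2"))}
```

As a rough guide, an instance that runs 12 hours on weekdays is up for 60 of the 168 hours in a week, which is about 64% fewer instance hours than running 24/7.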

***

### 🔍 EC2 FinOps Toolbox

| Tool                      | Purpose                            | Link                                                                           |
| ------------------------- | ---------------------------------- | ------------------------------------------------------------------------------ |
| **Cost Explorer**         | Analyze trends, tags, reservations | [Open](https://console.aws.amazon.com/cost-management/home#/cost-explorer)     |
| **Compute Optimizer**     | Rightsize + Graviton tips          | [Open](https://console.aws.amazon.com/compute-optimizer/home)                  |
| **Trusted Advisor**       | Idle EC2, EBS, Elastic IPs         | [Open](https://console.aws.amazon.com/trustedadvisor/home)                     |
| **Savings Plans Console** | Commit to usage, save up to 66%    | [Open](https://console.aws.amazon.com/savingsplans/home)                       |
| **Reserved Instances**    | Buy fixed-term EC2 savings         | [Open](https://console.aws.amazon.com/ec2/v2/home?#ReservedInstances)          |
| **CUR + Athena**          | Deep cost analytics                | [CUR Guide](https://docs.aws.amazon.com/cur/latest/userguide/what-is-cur.html) |

***

### 📉 Cost View in Cost Explorer

In **Cost Explorer**:

* Filter → `Service = EC2-Instances` (add `EC2-Other` for EBS, data transfer, and Elastic IP charges)
* Group by → `Instance Type`, `Region`, `Tag`, or `Linked Account`
* Use **RI Coverage**, **SP Utilization**, and **Forecasting**

🔗 [Open Cost Explorer](https://console.aws.amazon.com/cost-management/home#/cost-explorer)

***

**Cost Explorer: Fast-Triage Usage Types 🔍**

When you load EC2 in Cost Explorer or in CUR, watch for these usage types and what they often indicate:

<table><thead><tr><th width="243.75">Usage Type Pattern</th><th>Likely Meaning</th></tr></thead><tbody><tr><td><code>BoxUsage:*</code></td><td>Base EC2 instance hours, the main compute cost bucket</td></tr><tr><td><code>CPUCredits:*</code></td><td>Surplus CPU credits charged when T-family instances in unlimited mode burst above their baseline</td></tr><tr><td><code>EBSOptimized:*</code></td><td>EC2-Other surcharge for EBS optimization on instance types where it is not included</td></tr><tr><td><code>DataTransfer-*</code></td><td>Network transfer (inter-AZ, inter-region, internet egress)</td></tr><tr><td><code>ElasticIP:*</code></td><td>Idle or unattached Elastic IPs, incurring cost</td></tr></tbody></table>

**Action Tips:**

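* Cross-check instances that show steady `BoxUsage` hours against low CloudWatch `CPUUtilization` to find idle instances.
* Recurring `CPUCredits` charges mean a T-family instance keeps bursting above its baseline; consider a larger size or a fixed-performance family such as `m` or `c`.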
* Use tag filters (project, team) to group and triage waste quickly.
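
For a first pass over exported line items, the usage-type patterns in the table can be turned into a small bucketing helper. This is a sketch, not an exhaustive mapping: it only knows the five patterns listed above, matches by substring so region-prefixed values such as `USE2-BoxUsage:m5.large` still land in the right bucket, and the sample rows are invented.

```python
# (marker, bucket) pairs taken from the usage-type table above.
USAGE_TYPE_BUCKETS = [
    ("BoxUsage", "instance hours"),
    ("CPUCredits", "T-family CPU credits"),
    ("EBSOptimized", "EBS optimization surcharge"),
    ("DataTransfer", "data transfer"),
    ("ElasticIP", "Elastic IP"),
]


def classify_usage_type(usage_type):
    """Map a usage type string to a coarse bucket, or 'other' if unknown."""
    for marker, bucket in USAGE_TYPE_BUCKETS:
        if marker in usage_type:
            return bucket
    return "other"


def cost_by_bucket(rows):
    """Sum cost per bucket for (usage_type, cost) pairs."""
    totals = {}
    for usage_type, cost in rows:
        bucket = classify_usage_type(usage_type)
        totals[bucket] = totals.get(bucket, 0.0) + cost
    return totals


# Hypothetical line items.
rows = [
    ("USE2-BoxUsage:m5.large", 10.0),
    ("BoxUsage:t3.micro", 2.5),
    ("CPUCredits:t3", 1.0),
    ("EBS:VolumeUsage.gp3", 4.0),
]
print(cost_by_bucket(rows))
```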

***

### 📊 Deep Dive with CUR

When querying CUR for EC2 insights, these are your go-to columns:

* `line_item_resource_id`: the EC2 instance ID
* `product_instance_type`: the instance family and size
* `line_item_usage_type`: e.g. BoxUsage, CPUCredits, DataTransfer
* `line_item_operation`: the billed API operation, e.g. `RunInstances` (suffixes encode the platform)
* `resourceTags/*` (exposed as `resource_tags_user_*` columns in Athena): your team/project tag dimensions
* `line_item_unblended_cost` / `line_item_blended_cost`: cost values

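**Example Query Prompt:**\
*Find t3 instances with recurring `CPUCredits` charges (candidates for a larger size or a fixed-performance family) and t3 instances with steady `BoxUsage` but low CloudWatch CPU (candidates for downsizing or retirement).*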

🔗 [CUR Setup](https://docs.aws.amazon.com/cur/latest/userguide/what-is-cur.html)
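
As one hedged starting point for prompts like the one above, the helper below builds an Athena SQL string that ranks T3 instances by what they cost in `CPUCredits` line items. The table name `cur_db.cur_table` is a placeholder for your own CUR database and table, and the query deliberately skips date or partition filters, which you would normally add to limit the data scanned.

```python
def build_t3_credit_query(table="cur_db.cur_table", limit=50):
    """Build an Athena SQL string ranking T3 instances by CPUCredits cost.

    `table` is a placeholder; point it at your own CUR table.
    """
    return f"""
SELECT
  line_item_resource_id,
  SUM(line_item_unblended_cost) AS credit_cost
FROM {table}
WHERE line_item_product_code = 'AmazonEC2'
  AND line_item_usage_type LIKE '%CPUCredits:t3%'
GROUP BY line_item_resource_id
ORDER BY credit_cost DESC
LIMIT {int(limit)}
""".strip()


print(build_t3_credit_query())
```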

***

### **⚠️ Data Transfer & EBS Callouts**

* **Inter-AZ traffic** between EC2 instances is billable in each direction; traffic within the same AZ over private IPs is free (still monitor).
* **Cross-region transfers** and internet egress can dominate cost in chatty applications.
* **EBS is tightly coupled** — most storage cost lives under EBS volumes and snapshots. Migrate `gp2 → gp3`, right-size throughput/IOPS, clean up orphaned volumes.
* Co-locate high-traffic tiers (API + DB, worker + storage) in the same AZ, or use private connectivity such as VPC endpoints, to reduce transfer cost.

***

### 🔮 Advanced Tactics

| Strategy                      | Why It Matters                      |
| ----------------------------- | ----------------------------------- |
| **Graviton Migration**        | Up to 40% better price-performance for Linux workloads |
| **Mixed-Instance ASG**        | Use cheapest family type across AZs |
| **Spot + On-Demand fallback** | Scale with resilience               |
| **Instance Scheduler**        | Shut down dev/test nights/weekends  |
| **Tagging**                   | Enables showback by team/project    |
| **Convertible RIs**           | Switch types during term            |
| **Auto Scaling Right**        | Prevent zombie capacity             |
| **Forecasting via CE**        | Plan future RI/SP purchases         |

**Spot Strategy — next level**

* Use **MixedInstancesPolicy** in ASGs with multiple families and sizes to increase availability.
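* Define an **interruption budget** (e.g. allow 10% of capacity to be interrupted) to trade lower cost against reliability.
* If you use dynamic **max price caps** (e.g. 70–90% of On-Demand), fall back to On-Demand when Spot is reclaimed. Without a cap you pay the current Spot price, never more than On-Demand, and a low cap makes interruptions more frequent.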
* Monitor **spot interruption events** and automate instance drainage/shutdown gracefully.
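
The first bullet can be sketched as the dictionary that boto3's `create_auto_scaling_group` accepts for its `MixedInstancesPolicy` parameter. The launch template name, instance types, and the On-Demand base and percentage are illustrative assumptions, not recommendations.

```python
def mixed_instances_policy(launch_template_name, instance_types,
                           on_demand_base=1, on_demand_pct_above_base=20):
    """Build a MixedInstancesPolicy dict mixing On-Demand and Spot capacity."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # Several interchangeable types give Spot more pools to draw from.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


# Hypothetical launch template name and instance types.
policy = mixed_instances_policy("web-lt", ["m5.large", "m5a.large", "m6i.large"])
print(policy["InstancesDistribution"])
```

The result would be passed as `MixedInstancesPolicy=policy` alongside the usual group settings (name, sizes, subnets); setting `CapacityRebalance=True` on the same call asks the group to replace Spot instances that are at elevated risk of interruption.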

***

### **🚨 Security & Compliance for EC2**

* Keep a regular **AMI patching cadence** and automate image refresh.
* Enforce **IMDSv2** usage and disable IMDSv1 to mitigate SSRF risks.
* Limit **public IP access**; use NAT/Load Balancers + security groups.
* Use **SCPs / Guardrails** to prevent unapproved instance types or regions.
* Enforce **SSM Patch Manager** and logging agents for visibility and drift detection.

***

### ✅ EC2 FinOps Checklist

* [ ] Rightsize with Compute Optimizer
* [ ] Schedule non-prod instance shutdowns
* [ ] Migrate eligible workloads to Graviton
* [ ] Buy RIs or SPs for steady workloads
* [ ] Track RI/SP coverage & utilization
* [ ] Audit unused EBS volumes and Elastic IPs
* [ ] Set up CUR and run Athena queries
* [ ] Monitor EC2 cost trends monthly

***

### 🧠 EC2 Cost Optimization Challenges

A Q\&A-style deep dive into the most persistent, high-impact AWS EC2 cost problems — and actionable solutions that go beyond “just rightsize it.”

***

#### **Q1: Why do EC2 bills spiral from over-provisioning or bad pricing choices?**

Because workloads evolve, but instance sizes and pricing models don’t. Teams keep on-demand instances running 24/7, even when utilization hovers below 20%.

**✅ Solution:**

* Run **AWS Compute Optimizer** and **Cost Explorer** weekly.
* Shift predictable loads to **Savings Plans / Reserved Instances** (up to 72% off).
* Use **Spot Instances** for fault-tolerant or batch workloads (up to 90% off).
* Implement **instance schedules** to stop non-prod workloads after hours.

***

#### **Q2: Why is committing to Savings Plans or RIs so confusing?**

Because predicting your baseline usage is part science, part art. Misjudging it either locks in waste or misses savings.

**✅ Solution:**

* Default to **Compute Savings Plans** for flexibility.
* Use **Reserved Instances** only where you need guaranteed capacity (zonal RIs reserve capacity; regional RIs do not).
* Monitor **coverage vs utilization** KPIs monthly and rebalance quarterly.
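
The two KPIs are easy to mix up, so here is a small worked sketch. It assumes coverage means the share of eligible usage (at On-Demand-equivalent cost) that a commitment covered, and utilization means the share of the purchased commitment that was actually used; the dollar figures are invented.

```python
def commitment_kpis(eligible_spend, covered_spend, commitment, used_commitment):
    """Return (coverage %, utilization %) rounded to one decimal place."""
    coverage = covered_spend / eligible_spend if eligible_spend else 0.0
    utilization = used_commitment / commitment if commitment else 0.0
    return round(coverage * 100, 1), round(utilization * 100, 1)


# Hypothetical month: $10,000 eligible, $7,000 covered; $5,000 committed, $4,500 used.
print(commitment_kpis(10_000, 7_000, 5_000, 4_500))  # (70.0, 90.0)
```

Read together: low coverage with high utilization suggests room to commit more, while low utilization means part of the commitment is being paid for without being used.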

***

#### **Q3: What’s behind random slowdowns on burstable (T-family) instances?**

**CPU credits.** In standard mode, once the credit balance runs out the instance is throttled to its baseline CPU, silently killing performance.

**✅ Solution:**

* Monitor `CPUCreditBalance` via **CloudWatch alarms**.
* Switch to **Unlimited mode** (with awareness of extra cost) or scale out horizontally.
* Move sustained loads to **M/C/R/Graviton** families.
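
To see why sustained load drains a burstable instance, the sketch below simulates the credit balance hour by hour in standard (non-unlimited) mode. It relies on one CPU credit being one vCPU at 100% for one minute; the defaults (2 vCPUs, 10% baseline, 288-credit cap) are assumed to match a `t3.micro` and should be checked against the current AWS tables for your instance size.

```python
def simulate_credit_balance(hourly_cpu_pct, vcpus=2, baseline_pct=10.0,
                            start_balance=0.0, max_balance=288.0):
    """Return the credit balance after each hour.

    hourly_cpu_pct: average utilization per vCPU for each hour, in percent.
    """
    earn_per_hour = vcpus * baseline_pct / 100 * 60
    balance = start_balance
    history = []
    for cpu_pct in hourly_cpu_pct:
        spend = vcpus * cpu_pct / 100 * 60
        # The balance cannot go negative in standard mode or exceed the cap.
        balance = min(max_balance, max(0.0, balance + earn_per_hour - spend))
        history.append(round(balance, 1))
    return history


# Two quiet hours at 5% CPU, then two busy hours at 50% CPU.
print(simulate_credit_balance([5, 5, 50, 50]))  # [6.0, 12.0, 0.0, 0.0]
```

Two quiet hours bank only 12 credits, and a single hour at 50% would need 48 more than the instance earns, so the balance hits zero almost immediately.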

***

#### **Q4: Why do EBS volumes cause unpredictable slowness and high costs?**

Older **gp2** volumes tie IOPS to size, forcing over-provisioning for performance.

**✅ Solution:**

* Migrate to **gp3** (decouples size and performance).
* Allocate precise **IOPS/throughput**.
* For critical workloads, use **io2 / io2 Block Express** and enable **EBS-optimized** instances.
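
The gp2 coupling is simple arithmetic: baseline IOPS is 3 per GiB, with a floor of 100 and a ceiling of 16,000. The sketch below ignores gp2 burst credits and shows how much storage you would have to provision just to reach an IOPS target, whereas gp3 starts at a 3,000 IOPS baseline regardless of size.

```python
import math


def gp2_baseline_iops(size_gib):
    """Baseline IOPS of a gp2 volume: 3 IOPS/GiB, clamped to [100, 16000]."""
    return max(100, min(16000, 3 * size_gib))


def gp2_size_for_iops(target_iops):
    """Smallest gp2 size (GiB) whose baseline meets target_iops."""
    if target_iops > 16000:
        raise ValueError("gp2 baseline IOPS tops out at 16,000")
    if target_iops <= 100:
        return 1  # every gp2 volume gets at least 100 IOPS
    return math.ceil(target_iops / 3)


print(gp2_baseline_iops(100))   # 300
print(gp2_size_for_iops(6000))  # 2000
```

A workload that needs 6,000 IOPS but only 200 GiB of data would need a 2,000 GiB gp2 volume, which is the over-provisioning the gp3 migration removes.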

***

#### **Q5: How does using the wrong instance family burn money?**

Running compute-heavy workloads on general-purpose (M-family) instances or vice versa leads to underutilization or overpayment.

**✅ Solution:**

* Let **Compute Optimizer** recommend the right family.
* Benchmark using **sysbench** or internal metrics.
* Try **Graviton (ARM)** instances for up to 40% better price-performance, after verifying compatibility.

***

#### **Q6: Why does networking architecture silently inflate EC2 costs?**

Cross-AZ chatter, poor placement, and hairpin NAT traffic increase latency and data transfer costs.

**✅ Solution:**

* Group chatty microservices in **cluster Placement Groups**.
* Use **VPC Endpoints** (S3, DynamoDB) to bypass NAT.
* Deploy **Global Accelerator** or **CloudFront** for edge proximity.

***

#### **Q7: Why do memory-heavy workloads (ML/analytics) overrun budgets?**

Memory leaks and over-sized R-instances hide behind “just working” apps.

**✅ Solution:**

* Choose **R-family** or **Graviton memory-optimized** instances.
* Use **CloudWatch memory metrics** to rightsize (they require the CloudWatch agent; EC2 does not publish memory usage by default).
* For AI workloads, add **KV caching**, quantization, or batching.

***

#### **Q8: How can I safely use Spot Instances without chaos from interruptions?**

Spot can save 70–90%, but interruptions kill unprepared apps.

**✅ Solution:**

* Mix **Spot + On-Demand** in **Auto Scaling Groups** using attribute-based selection.
* Implement **checkpointing** and handle **2-minute interruption notices**.
* Enable **capacity rebalancing** for smarter recovery.
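
Handling the 2-minute notice usually means polling the instance metadata path `spot/instance-action`, which returns a small JSON document once an interruption is scheduled (and a 404 otherwise). The sketch below covers only the parsing step, so it runs anywhere; fetching the document with an IMDSv2 token and the actual drain logic are left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(notice_json, now=None):
    """Parse a Spot instance-action document.

    Returns (action, seconds remaining until the action time).
    """
    notice = json.loads(notice_json)
    deadline = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return notice["action"], (deadline - now).total_seconds()


# Example document in the shape the metadata service returns.
doc = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
at = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)
print(seconds_until_interruption(doc, now=at))  # ('terminate', 120.0)
```

A worker loop could call this every few seconds and, once a notice appears, stop taking new work, checkpoint, and deregister from the load balancer within the remaining time.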

***

#### **Q9: Why do self-managed databases on EC2 eat into cost savings?**

DIY databases accumulate inefficiencies: missing indexes, old AMIs, I/O-heavy storage.

**✅ Solution:**

* Audit queries using **pg\_stat\_statements** or slow-query logs (**Performance Insights** becomes available once you are on RDS/Aurora).
* Move to **Amazon RDS/Aurora** when possible.
* For EC2 DBs: use **gp3/io2**, tune auto-vacuum, and monitor **read/write IOPS**.

***

#### **Q10: Why does Auto Scaling waste resources or fail to respond fast enough?**

Bad scaling signals or cooldowns cause over-provisioning or late scaling events.

**✅ Solution:**

* Use **Target Tracking** policies with metrics like CPU, queue depth, or requests/sec.
* Mix instance types with **attribute-based selection** and **capacity rebalancing**.
* Add **warm pools** for near-instant scale-out.
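
A target tracking policy is mostly configuration. The sketch below builds the keyword arguments for boto3's Auto Scaling `put_scaling_policy` call using the predefined average-CPU metric; the group name, policy name, and 50% target are placeholders to tune per workload.

```python
def target_tracking_policy(asg_name, target_cpu_pct=50.0):
    """Build put_scaling_policy kwargs for an average-CPU target tracking policy."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_pct,
        },
    }


# Hypothetical group name.
kwargs = target_tracking_policy("web-asg")
print(kwargs["PolicyName"])  # web-asg-cpu-target
```

Applying it would look like `boto3.client("autoscaling").put_scaling_policy(**kwargs)`. Queue depth or requests per second need a customized metric specification instead of the predefined one used here.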

***

### ⚙️ Quick Wins

* Migrate all **gp2 → gp3** volumes.
* Cover steady baselines with **Savings Plans**.
* Implement **instance scheduling** for non-prod.
* Pilot **Graviton instances** for up to 40% better price-performance.
* Add **Spot diversification** and cost alarms for accountability.

***

### 📚 References

* [Graviton Instances](https://aws.amazon.com/ec2/graviton/)
* [Cost Optimization Hub](https://aws.amazon.com/aws-cost-management/cost-optimization-hub/)
* [Savings Plans FAQ](https://aws.amazon.com/savingsplans/faq/)
* [Reserved Instances Docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/reserved-instances-types.html)
* [CUR Documentation](https://docs.aws.amazon.com/cur/latest/userguide/what-is-cur.html)

***

