EC2 Troubleshooting

Hello World!

What is the best way to troubleshoot a server bottleneck in memory, storage, CPU, or network? Coming from a Linux background, I am limited to checking the server from inside the OS, and I am not sure how to approach this on AWS.

Any help is appreciated! Thanks :)

Topics: Compute
Tags: Linux Amazon EC2 Compute
Language: English

rePost-User-Debadri

asked 2 months ago · 47 views

2 Answers

This recommendation was generated using AWS Generative AI capabilities. You are responsible for evaluating the recommendation in your specific context and implementing appropriate oversight and safeguards.

When troubleshooting server bottlenecks on AWS EC2 instances, you can combine your Linux expertise with AWS-specific tools for a comprehensive approach:

System-level monitoring: Continue using familiar Linux tools like top, htop, free -m, vmstat, iotop, and iftop to monitor CPU, memory, disk I/O, and network usage from within the instance.
CloudWatch metrics: Amazon CloudWatch provides metrics for EC2 instances, including CPU utilization, network throughput, and disk operations. However, memory usage isn't included by default.
CloudWatch agent: Install the CloudWatch agent to collect and monitor detailed system-level metrics, including memory usage, which isn't available in the default EC2 metrics. This is particularly useful for identifying memory constraints.
Log analysis: Check system and application logs (/var/log/syslog on Debian/Ubuntu or /var/log/messages on Amazon Linux/RHEL, kernel messages from dmesg, and application-specific logs) for clues about performance issues.
Resource constraints: For T-series instances (like t2.micro, t3.micro), monitor CPU credit usage as these are burstable instances with baseline performance.
Instance sizing: Verify if your workload is appropriate for the instance type. Running resource-intensive applications on smaller instances can lead to performance issues.
EC2Rescue tool: For more serious issues, AWS provides EC2Rescue which can help diagnose and troubleshoot problems. This can be run manually or automatically using AWS Systems Manager Automation with the AWSSupport-ExecuteEC2Rescue runbook.
Swap file: If memory is the bottleneck, consider adding a swap file, though this is a temporary solution and might impact performance.
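Since memory usage is not in the default EC2 metrics, a quick check from inside the instance can confirm whether memory is the constraint before you set anything else up. The following is a minimal sketch, assuming a Linux kernel that exposes MemAvailable in /proc/meminfo (3.14 or later); the function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value}.

    Most values are in kB; a few fields (such as HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Percent of RAM in use, treating MemAvailable as reclaimable headroom."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file on a Linux instance."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

On the instance, `memory_used_percent(read_meminfo())` gives the current figure, and comparing `SwapTotal` with `SwapFree` in the same dict shows whether swap is already in use.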

Common bottleneck scenarios include:

Instances appearing to run normally but becoming unresponsive (often memory-related)
Periodic high system load despite normal-looking metrics (could be application-specific tasks or I/O bottlenecks)
Slow response times despite low CPU/memory usage (possibly network latency or application code issues)

If issues persist after your investigation, consider upgrading to a larger instance type or optimizing your application code and resource usage.

re:Post Agent

answered 2 months ago

EXPERT

Riku_Kobayashi

reviewed 2 months ago

Step-by-Step EC2 Bottleneck Troubleshooting Guide

Start with Amazon CloudWatch
CloudWatch provides instance-level metrics even if you don't log in to the instance:
- CPUUtilization: if it stays above 80% for long periods, CPU is likely a bottleneck.
- DiskReadOps / DiskWriteOps: high values might indicate I/O issues. These two metrics cover instance store volumes only; for EBS volumes, use the EBS volume metrics such as VolumeReadOps and VolumeWriteOps.
- NetworkIn / NetworkOut: check for bandwidth saturation.
- StatusCheckFailed: flags failed status checks (host hardware, networking, or instance-level problems).
Note: Enable detailed monitoring (1-minute granularity) if it's disabled.
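To make the "above 80% for long periods" check concrete, here is a rough sketch that pulls CPUUtilization and measures the longest run of consecutive datapoints over a threshold. It assumes boto3 is installed and credentials are configured; the helper names and the default of 3 hours at 5-minute periods are illustrative choices, not AWS recommendations.

```python
from datetime import datetime, timedelta, timezone


def longest_run_above(datapoints, threshold=80.0, stat="Average"):
    """Length of the longest run of consecutive datapoints above threshold.

    `datapoints` is a list of dicts shaped like the "Datapoints" entries
    returned by CloudWatch GetMetricStatistics (keys: Timestamp, <stat>).
    """
    longest = current = 0
    # CloudWatch does not guarantee ordering, so sort by timestamp first.
    for point in sorted(datapoints, key=lambda p: p["Timestamp"]):
        if point[stat] > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def fetch_cpu_datapoints(instance_id, hours=3, period=300):
    """Fetch average CPUUtilization for one instance (needs boto3 + credentials)."""
    import boto3  # imported lazily so the helper above works without it

    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period,
        Statistics=["Average"],
    )
    return response["Datapoints"]
```

With 5-minute periods, a result of 6 or more from `longest_run_above(fetch_cpu_datapoints(instance_id))` means at least half an hour of sustained load, which is a reasonable cue to look inside the instance.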
Check OS-level metrics (inside the EC2 Linux instance)
Your Linux background applies directly here:
- top, htop, vmstat, iostat, free -m, df -h: CPU, memory, swap, and disk usage.
- netstat, ss, iftop, nethogs: network traffic analysis.
Example:
    top -o %MEM   # Sort by memory usage
    iotop         # Real-time I/O usage (if installed)
    dstat         # All-in-one overview (needs to be installed)
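If you want the same numbers that top and vmstat show but in a script, one option is to diff two samples of the aggregate `cpu` line in /proc/stat. This sketch assumes a kernel that reports at least the first eight counters (user through steal); the function names are hypothetical. Steal time is worth watching on EC2 because it shows CPU time the hypervisor withheld from the instance.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    fields = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(fields, map(int, parts[1:9])))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Percent of time non-idle, in iowait, and stolen between two samples.

    "busy" counts everything except idle and iowait, so it includes steal.
    """
    delta = {key: after[key] - before[key] for key in before}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    idle = delta["idle"] + delta["iowait"]
    return {
        "busy": 100.0 * (total - idle) / total,
        "iowait": 100.0 * delta["iowait"] / total,
        "steal": 100.0 * delta["steal"] / total,
    }
```

Read /proc/stat twice a few seconds apart, parse each read with `parse_cpu_line`, and pass both results to `cpu_percentages`. High iowait points at storage, while high steal points at the instance itself being throttled.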
Enable EC2 instance-level diagnostics
- Install the CloudWatch agent (on Amazon Linux) to push memory and disk metrics to CloudWatch:
    sudo yum install amazon-cloudwatch-agent
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
- Set up log stream monitoring with CloudWatch Logs (optional but recommended).
- Consider SSM Agent for access without SSH.
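If you would rather skip the interactive wizard, the agent also accepts a hand-written JSON config. The sketch below generates a minimal one that collects only memory and disk usage. The 60-second interval and the choice of metrics are assumptions for illustration, so check the agent documentation before relying on it.

```python
import json


def build_agent_config(interval_seconds=60):
    """Build a minimal CloudWatch agent config for memory and disk usage."""
    return {
        "metrics": {
            # Tag each metric with the instance ID so it can be filtered per instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": ["*"],
                    "metrics_collection_interval": interval_seconds,
                },
            },
        }
    }


def render_agent_config(interval_seconds=60):
    """Return the config as a JSON string ready to be written to a file."""
    return json.dumps(build_agent_config(interval_seconds), indent=2)
```

Write the output of `render_agent_config()` to a file and load it with `amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:<path>`. The metrics then show up under the CWAgent namespace rather than AWS/EC2.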

Review EC2 instance type vs. workload
If resource usage is high:
- Are you using the right instance family (compute-optimized, memory-optimized, storage-optimized)?
- Could burstable (T-series) behavior be a limiting factor? Check the CPUCreditBalance metric.
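For burstable instances, a back-of-the-envelope estimate shows how quickly a credit balance drains. One CPU credit equals one vCPU running at 100% for one minute. The sketch below is only that arithmetic: the earn rate is a parameter you look up for your instance type, and the function name is made up.

```python
import math


def hours_until_credits_exhausted(balance, earn_per_hour, vcpus, avg_utilization_pct):
    """Estimate hours until a burstable instance runs out of CPU credits.

    An instance spends vcpus * (utilization / 100) * 60 credits per hour.
    Returns math.inf when credits are earned at least as fast as they are spent.
    """
    spend_per_hour = vcpus * (avg_utilization_pct / 100.0) * 60.0
    net_drain = spend_per_hour - earn_per_hour
    if net_drain <= 0:
        return math.inf
    return balance / net_drain
```

For example, with an assumed balance of 144 credits, an earn rate of 12 credits per hour, and 2 vCPUs averaging 50%, the instance spends 60 credits per hour and the balance lasts about 3 hours. In unlimited mode the instance keeps bursting and accrues extra charges instead of being throttled, so the same drain shows up on the bill instead.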
Check EBS performance
If your app is I/O heavy:
- Is the EBS volume gp2 or gp3? gp2 has burst behavior, so check VolumeReadOps, VolumeWriteOps, and BurstBalance.
- Move to gp3, io1, or io2 for more consistent IOPS.
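The gp2 burst behavior is easier to reason about with numbers. The sketch below uses the gp2 figures as I understand them from the EBS documentation (3 IOPS per GiB baseline with a floor of 100 and a ceiling of 16,000, bursts up to 3,000 IOPS, and a bucket of 5.4 million I/O credits). Treat those constants as assumptions and verify them against the current docs.

```python
import math

GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP2_BURST_IOPS = 3_000
GP2_BUCKET_CREDITS = 5_400_000


def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume of the given size."""
    return min(max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib), GP2_MAX_IOPS)


def gp2_burst_seconds(size_gib, demand_iops=GP2_BURST_IOPS):
    """Seconds a full credit bucket lasts at a sustained demand.

    Demand is capped at the burst limit; returns math.inf when the demand
    does not exceed the baseline, so no credits are consumed.
    """
    baseline = gp2_baseline_iops(size_gib)
    demand = min(demand_iops, GP2_BURST_IOPS)
    if demand <= baseline:
        return math.inf
    return GP2_BUCKET_CREDITS / (demand - baseline)
```

Under these assumptions, a 100 GiB volume has a 300 IOPS baseline and can sustain a full 3,000 IOPS burst for about 2,000 seconds. After that BurstBalance reaches zero and the volume falls back to its baseline, which often looks like a sudden, unexplained slowdown.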
Use AWS Compute Optimizer (free tool)
It can tell you whether the instance is over- or under-provisioned based on recent metrics.
Capture a performance snapshot
If you are troubleshooting something transient:
- Create a CPU profile (e.g., perf, flame graphs, or py-spy for Python apps).
- Use dstat or sar to log metrics over time.
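If sar or dstat is not installed, a few lines of script can stand in for logging metrics over time. This is a minimal sketch that samples the load averages at a fixed interval and renders them as CSV. It assumes a Unix-like OS where `os.getloadavg()` is available, and the function names are illustrative.

```python
import csv
import io
import os
import time


def sample_load(samples, interval_seconds, sleep=time.sleep, clock=time.time):
    """Collect (timestamp, load1, load5, load15) rows at a fixed interval."""
    rows = []
    for i in range(samples):
        load1, load5, load15 = os.getloadavg()
        rows.append((round(clock(), 3), load1, load5, load15))
        if i < samples - 1:
            sleep(interval_seconds)  # no need to wait after the last sample
    return rows


def to_csv(rows):
    """Render sampled rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["timestamp", "load1", "load5", "load15"])
    writer.writerows(rows)
    return buf.getvalue()
```

Running `to_csv(sample_load(60, 10))` covers ten minutes at 10-second resolution, which is usually enough to line up a transient spike with application logs.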

Manvitha Potluri

answered 2 months ago
