AWS continuously updates its cloud services and releases new instance types, and this leads to the following questions from customers:
- Which instance type is better for my needs?
- Which instance type is cheaper?
- How does CPU/Networking performance differ between instance types?
In this article, I compare several generations of general-purpose EC2 instances of the “M” type. All tests used the “Large” size (2 vCPU, 8 GB RAM), and the comparison covers:
- Pricing
- CPU performance
- RAM performance
- Network performance
The following instance types were tested: M5a, M5zn, M5n, M5, M6a, M6in, M6i, M6g, M7a, M7i, M7i-flex, M7g, and M8g.
Instance specifications and pricing
AWS documentation and the lscpu tool were used to collect the following information:
In general, we have three CPU vendors in this table:
- Intel (M5, M5zn, M5n, M6i, M6in, M7i, M7i-flex)
- AMD (M5a, M6a, M7a)
- AWS Graviton — ARM architecture (M6g, M7g, M8g)
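The lscpu part of this data collection can be scripted so the same fields are captured on every instance. The snippet below is a minimal sketch, not the method used for the article: the helper names (`parse_lscpu`, `cpu_summary`, `read_lscpu`) and the choice of fields are assumptions made for illustration.

```python
import subprocess

def parse_lscpu(text):
    """Turn lscpu's "Key: value" lines into a dictionary."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info

def cpu_summary(info):
    """Keep only the fields useful for comparing instance types (an assumed selection)."""
    wanted = ("Architecture", "Vendor ID", "Model name", "CPU(s)")
    return {key: info.get(key, "unknown") for key in wanted}

def read_lscpu():
    """Run lscpu on the current machine and return the summary."""
    result = subprocess.run(["lscpu"], capture_output=True, text=True, check=True)
    return cpu_summary(parse_lscpu(result.stdout))
```

Calling `read_lscpu()` on each instance returns a small dictionary that can be pasted into a comparison table.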
Prices are for “on-demand” instances in the us-east-1 (N. Virginia) region as of May 2025.
The table shows that M6g is the cheapest one.
Network performance
For this test, I used the speedtest tool. For example (M8g test):
$ curl -s https://raw.githubusercontent.com/sivel/speedtest-cli/master/speedtest.py | python3 -
Retrieving speedtest.net configuration...
Testing from Unknown (13.217.55.225)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Shentel (Ashburn, VA) [1774.52 km]: 1.684 ms
Testing download speed................................................................................
Download: 5921.29 Mbit/s
Testing upload speed......................................................................................................
Upload: 3541.71 Mbit/s
I ran the test several times on every instance to rule out incidental performance degradation. Here is the result:
M8g and M7a showed the best results.
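Repeated runs still have to be reduced to one number per instance. The article does not say how that was done, so the sketch below assumes the median, which is less sensitive to a single degraded run than the mean. The function names are hypothetical; the regular expression matches the `Download:` and `Upload:` lines shown in the speedtest output above.

```python
import re
import statistics

def parse_speedtest(text):
    """Extract download/upload speeds (Mbit/s) from speedtest-cli text output."""
    result = {}
    for label in ("Download", "Upload"):
        match = re.search(rf"{label}:\s*([\d.]+)\s*Mbit/s", text)
        if match:
            result[label.lower()] = float(match.group(1))
    return result

def median_speeds(runs):
    """Aggregate several captured outputs into one median value per direction."""
    parsed = [parse_speedtest(run) for run in runs]
    return {
        key: statistics.median(p[key] for p in parsed if key in p)
        for key in ("download", "upload")
    }
```

Here `runs` is a list of captured outputs, one string per attempt. If no run contains a given line, `statistics.median` raises `StatisticsError`.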
CPU performance
sysbench was used to test CPU and Memory, for example:
$ sysbench cpu run
sysbench 1.1.0-3ceba0b (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second: 3342.29

Throughput:
    events/s (eps): 3342.2875
    time elapsed: 10.0003s
    total number of events: 33424

Latency (ms):
    min: 0.30
    avg: 0.30
    max: 0.34
    95th percentile: 0.30
    sum: 9996.44

Threads fairness:
    events (avg/stddev): 33424.0000/0.00
    execution time (avg/stddev): 9.9964/0.00
Or with 4 threads:
$ sysbench --threads=4 cpu run
sysbench 1.1.0-3ceba0b (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 4
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second: 6790.93

Throughput:
    events/s (eps): 6790.9294
    time elapsed: 10.0003s
    total number of events: 67911

Latency (ms):
    min: 0.29
    avg: 0.59
    max: 20.31
    95th percentile: 0.30
    sum: 39964.88

Threads fairness:
    events (avg/stddev): 16977.7500/8.29
    execution time (avg/stddev): 9.9912/0.01
CPU speed: events per second
sysbench measures raw CPU performance by calculating prime numbers up to a certain value. It shows per-thread CPU capability and latency.
M7a was found to be the leader.
CPU Latency (Average) (ms)
Once again, M7a was at the top.
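The two metrics charted in this section can be read directly from the sysbench text output. The sketch below uses hypothetical helper names and assumes the output layout shown in the samples above. It also compares the 1-thread and 4-thread sample runs: throughput roughly doubles rather than quadruples, which is what one would expect on an instance with 2 vCPUs.

```python
import re

def sysbench_cpu_metrics(text):
    """Pull events per second and average latency (ms) out of `sysbench cpu run` output."""
    eps = re.search(r"events per second:\s*([\d.]+)", text)
    avg = re.search(r"\bavg:\s*([\d.]+)", text)
    if eps is None or avg is None:
        raise ValueError("output does not look like a sysbench cpu report")
    return {
        "events_per_second": float(eps.group(1)),
        "avg_latency_ms": float(avg.group(1)),
    }

def thread_scaling(single_thread_eps, multi_thread_eps):
    """Return how many times faster the multi-threaded run was."""
    return multi_thread_eps / single_thread_eps

# Values taken from the two sample runs above (1 thread vs. 4 threads).
print(round(thread_scaling(3342.29, 6790.93), 2))  # 2.03
```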
Memory performance
sysbench can test memory bandwidth (read/write) and latency. It also helps identify RAM speed and NUMA performance.
$ sysbench memory run
sysbench 1.1.0-3ceba0b (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Running memory speed test with the following options:
    block size: 1KiB
    total size: 102400MiB
    operation: write
    scope: global

Initializing worker threads...

Threads started!

Total operations: 57387660 (5738739.43 per second)

56042.64 MiB transferred (5604.24 MiB/sec)

Throughput:
    events/s (eps): 5738739.4302
    time elapsed: 10.0000s
    total number of events: 57387660

Latency (ms):
    min: 0.00
    avg: 0.00
    max: 0.06
    95th percentile: 0.00
    sum: 3962.93

Threads fairness:
    events (avg/stddev): 57387660.0000/0.00
    execution time (avg/stddev): 3.9629/0.00
In this category, M5zn was top.
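It helps to see how the numbers in the memory report relate to each other. Every operation writes one block of 1 KiB, so dividing the operation counts by 1024 gives the MiB figures that sysbench prints. The short calculation below reproduces both values from the sample output; the function name is only illustrative.

```python
def mib_from_ops(operations, block_size_kib=1):
    """Convert a number of block operations into MiB (1 MiB = 1024 KiB)."""
    return operations * block_size_kib / 1024

# Figures from the sample run above (block size: 1KiB).
print(round(mib_from_ops(5738739.43), 2))  # 5604.24 MiB/sec
print(round(mib_from_ops(57387660), 2))    # 56042.64 MiB transferred
```

The run transferred about 56042 MiB of the configured 102400 MiB because it stopped after 10 seconds, as the `time elapsed` line in the sample shows.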
Mutex performance
sysbench measures pthread mutex lock/unlock performance under contention. It simulates thread synchronization overhead.
$ sysbench mutex run
sysbench 1.1.0-3ceba0b (using bundled LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Initializing worker threads...

Threads started!

Throughput:
    events/s (eps): 5.5396
    time elapsed: 0.1805s
    total number of events: 1

Latency (ms):
    min: 180.46
    avg: 180.46
    max: 180.46
    95th percentile: 179.94
    sum: 180.46

Threads fairness:
    events (avg/stddev): 1.0000/0.00
    execution time (avg/stddev): 0.1805/0.00
Here, M7a was top.
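To make "lock/unlock under contention" concrete, here is a small illustration of the pattern such a benchmark exercises: several threads repeatedly take and release one shared lock. This is only a sketch in Python with hypothetical names, not sysbench's implementation, and because of the Python interpreter's own locking its timings are not comparable to the sysbench numbers.

```python
import threading
import time

def run_mutex_test(num_threads, locks_per_thread):
    """Have several threads contend for one lock; return (total acquisitions, seconds)."""
    lock = threading.Lock()
    counter = 0

    def worker():
        nonlocal counter
        for _ in range(locks_per_thread):
            with lock:  # acquire, do a tiny amount of work, release
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter, time.perf_counter() - start
```

Note that the sample sysbench run above used a single thread, so there is little contention in it; passing `--threads`, as in the CPU test, adds competing threads.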
General purpose vs. Compute optimized vs. Memory optimized
The previous tests compared CPU and memory performance across generations of “General purpose” (M type) instances. Logically, though, if CPU performance matters most, you would choose a “Compute optimized” instance, and if memory performance matters most, a “Memory optimized” one.
Let’s compare CPU and Memory performance for the latest generations of:
- General purpose: m8g.large (2 vCPU, 8 GB RAM)
- Compute optimized: c8g.large (2 vCPU, 4 GB RAM)
- Memory optimized: r8g.large (2 vCPU, 16 GB RAM)
First, the price (on-demand instances in the us-east-1 region):
CPU, Memory, and Network performance are nearly the same:
So the only way to see a difference in performance is to use the “optimized” instances for the workloads they are designed for:
- Compute optimized: run 100% CPU-bound real-world apps (e.g., video encoding, compute-heavy tasks) that scale with vCPU count
- Memory optimized: load >50–75% of system memory or run large in-memory DBs (Redis, Memcached, Elasticsearch)
- General purpose: need balanced performance (web apps, microservices, small DBs)
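Since the three “large” instances tested here differ mainly in memory per vCPU (2, 4, and 8 GB), one rough way to turn the guidance above into a rule is to pick the family with the smallest ratio that still fits the workload. The helper below is a simplified heuristic of my own for illustration, not an AWS sizing method, and it ignores price and network differences.

```python
# GB of RAM per vCPU, derived from the three "large" sizes compared above.
FAMILIES = [
    ("c8g (Compute optimized)", 2),
    ("m8g (General purpose)", 4),
    ("r8g (Memory optimized)", 8),
]

def suggest_family(vcpus_needed, memory_gb_needed):
    """Return the family with the smallest memory-per-vCPU ratio that fits the workload."""
    ratio = memory_gb_needed / vcpus_needed
    for name, gb_per_vcpu in FAMILIES:
        if ratio <= gb_per_vcpu:
            return name
    # Workloads needing more than 8 GB per vCPU fall back to the largest ratio.
    return FAMILIES[-1][0]

print(suggest_family(2, 3))   # c8g (Compute optimized)
print(suggest_family(2, 12))  # r8g (Memory optimized)
```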
Conclusion
Is a newer instance type always better? Not necessarily: the cheapest instance in this comparison was M6g, and M5zn led the memory test.
In this post, I benchmarked different EC2 families and generations, tested CPU, Memory, and Network performance, and provided comparison graphs.