As an Amazon Associate I earn money from qualifying purchases.

Thursday, November 24, 2022

GPU Performance Hierarchy and the Best Graphics Cards, November 2022


With graphics cards finally back in stock, I've put together this comprehensive GPU performance hierarchy of the past two (now starting on three) generations of graphics cards. These are the best graphics cards around, but it's also important to know what you're getting. I've benchmarked all of these cards, at multiple resolutions and settings. Here's how they stack up across a test suite of 15 games.

I've also listed current GPU prices, which have improved substantially over the past few months. Most AMD graphics cards are now selling at or below their official MSRPs, and Nvidia's fastest GPUs (RTX 3080 Ti, RTX 3090, and RTX 3090 Ti) are also well below their MSRPs. The RTX 4090 and RTX 4080 have also launched, alongside the various Intel Arc GPUs. Here's the current list, updated regularly as new GPUs arrive.

Monday, November 21, 2022

Crossy Road Medieval Character Unlocks


I'm going to interrupt my normally scheduled GPU talk to go off on a tangent: Crossy Road. It's a stupid game. I've played it waaaaaay too much! Recently, the Medieval update came out with a new map, plus 14 new characters. I've already unlocked every other character, except for the Piggy Bank that requires you to pay $2.99 — I'm cheap, and also addicted. LOL

So, here are the 14 medieval characters, brief notes on each, and what you need to do to unlock them.

  1. Gorgeous Prince: Prize Machine. Laughs knowingly and appears quite full of himself.
  2. Princess: Prize Machine. Sometimes giggles, sometimes sniffs arrogantly, she's apparently the female counterpart to the Gorgeous Prince.
  3. Blacksmith: Prize Machine. Anvils, anvils, everywhere! Lots of the regular objects on the medieval level are replaced with anvils, which the blacksmith can whack. It doesn't do anything other than make a sound.
  4. Peasant: Prize Machine. Laughs like Beavis and Butt-Head, or some evildoer. If you stand still, he'll throw rotten fruit in whatever direction he's facing.
  5. Falconer: Prize Machine. Not much to say, except the bird squawks and when you die it flies away.
  6. Monk: Prize Machine. Carries a bible, which he lifts into the air when you stand still. He's the only character that can find Robin Hood.
  7. Healer: Prize Machine. Carries around a book and a backpack, but otherwise nothing special that I could see.
  8. Noblewoman: Prize Machine. Hums to herself and people bow to her when she passes.
  9. Nobleman: Prize Machine. Mumbles importantly to himself and people bow to him when he passes.

Friday, November 11, 2022

AMD Radeon RX 6700 10GB (Non-XT) Review and Specifications, Featuring Sapphire

We've got something new for today: My first graphics card review for the site. It's mostly because this is a card that hasn't been reviewed much, the AMD Radeon RX 6700 non-XT 10GB model. This GPU came later in the life cycle, and it sits roughly between the RX 6650 XT and RX 6700 XT — in price as well as performance. It's sort of a last gasp for RDNA 2, right as RDNA 3 cards are about to launch, though the RX 7900 XTX and RX 7900 XT will be priced in an entirely different category. You can already find the FPS results in our GPU performance hierarchy, but let's dig into things a bit deeper for this review.

AMD Radeon RX 6700 10GB Specifications
Architecture: Navi 22
Process Technology: TSMC N7
Transistors (Billion): 17.2
Die size (mm^2): 336
Compute Units: 36
GPU Cores (Shaders): 2304
Ray Accelerators: 36
Boost Clock (MHz): 2450
VRAM Speed (Gbps): 16
VRAM (GB): 10
VRAM Bus Width (bits): 160
Infinity Cache (MB): 80
Render Outputs: 64
Texture Mapping Units: 144
FP32 TFLOPS (Single-Precision): 11.3
FP16 TFLOPS (Half-Precision): 22.6
Bandwidth (GB/s): 320
Total Board Power (Watts): 175
Launch Date: March 18, 2021
Launch Price: $369 (unofficial)
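The bandwidth figure in these spec tables follows directly from the bus width and memory speed. Here's a minimal sanity-check sketch (the function name is mine, not from any tool):

```python
def bandwidth_gb_s(bus_width_bits: int, speed_gbps: float) -> float:
    """Theoretical memory bandwidth: (bus width / 8 bits per byte) * per-pin speed in Gbps."""
    return bus_width_bits / 8 * speed_gbps

# RX 6700 10GB: 160-bit bus at 16 Gbps
print(bandwidth_gb_s(160, 16))  # 320.0 GB/s, matching the spec table
```

The same formula reproduces every bandwidth number on this page, e.g. `bandwidth_gb_s(384, 21)` gives the RTX 4090's 1008 GB/s.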

For quite some time, the Radeon RX 6700 XT was the only desktop card to use AMD's Navi 22 GPU. Later, the RX 6750 XT joined the fun, and both are fully enabled variants of the chip. Given there are always defective die from a wafer, and most of those can be harvested by disabling the affected functional units (CUs, SMs, memory controllers, etc.), we long expected there would be an RX 6700 non-XT card. But it didn't come, and it didn't come, and then it suddenly stealth-launched from a couple of AMD's AIC (add-in card) partners: Sapphire and XFX. Maybe there are others making the 6700 10GB, but in the US at least these are the only ones available.

Friday, November 4, 2022

AMD Radeon RX 7900 XT Specifications

The AMD Radeon RX 7900 XT is the slightly smaller sibling of the just-announced Radeon RX 7900 XTX. Using the same RDNA 3 architecture and GPU chiplets, but with one fewer MCD (Memory Cache Die), it will presumably land just a bit lower down on our GPU performance hierarchy when it arrives on December 13.

We have to admit that the naming certainly represents a bit of an annoyance. The only difference between the RX 7900 XTX and the 7900 XT is one letter, and yet the latter will definitely deliver less performance. For all the people who were up in arms over the RTX 4080 16GB and RTX 4080 12GB shenanigans, this doesn't really feel any better. Regardless, these are the names AMD has chosen and we'll have to see how they match up. Here's the rundown of the RX 7900 XT specs.

AMD Radeon RX 7900 XT Specifications
Architecture: Navi 31
Process Technology: TSMC N5 + N6
Transistors (Billion): 58
Die size (mm^2): 300 + 185
Compute Units: 84
GPU Cores (Shaders): 10752
Ray Accelerators: 84
Boost Clock (MHz): 2400
VRAM Speed (Gbps): 20
VRAM (GB): 20
VRAM Bus Width (bits): 320
Infinity Cache (MB): 80
Render Outputs: 192
Texture Mapping Units: 336
FP32 TFLOPS (Single-Precision): 51.6
FP16 TFLOPS (Half-Precision): 103.2
Bandwidth (GB/s): 800
Total Board Power (Watts): 300
Launch Date: December 13, 2022
Launch Price: $899

AMD Radeon RX 7900 XTX Specifications

The AMD Radeon RX 7900 XTX has officially been announced, the first in a salvo of RDNA 3 architecture GPUs that will compete with the best graphics cards. It's a radical new approach to GPU designs, using chiplets — in a similar fashion to how AMD uses chiplets on its Zen 3 and Zen 4 CPUs. Except here the use of chiplets has been tuned and tweaked to work best for graphics rather than CPUs. 

The specifications for the Radeon RX 7900 XTX are mostly known at this point, though there are a few missing pieces. For example, AMD provided details on its "Game Clock" but not on the Boost Clock, and it appears to have rounded things off to the nearest 100 MHz. Maybe. We do have a compute teraflops figure that seems to be based on the Boost Clock, however, which means we mostly have what we need.
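Since FP32 throughput on these GPUs works out to shaders × 2 ops per clock (one FMA) × clock speed, the boost clock can be back-computed from AMD's teraflops figure. A rough sketch (the helper is mine):

```python
def implied_boost_mhz(tflops_fp32: float, shaders: int) -> float:
    """Invert TFLOPS = shaders * 2 ops/clock * clock to recover the clock in MHz."""
    return tflops_fp32 * 1e12 / (2 * shaders) / 1e6

# RX 7900 XTX: 61.4 FP32 TFLOPS over 12288 shaders
print(round(implied_boost_mhz(61.4, 12288)))  # 2498, i.e. essentially a 2500 MHz boost clock
```

Running the same math on the 7900 XT's 51.6 TFLOPS and 10752 shaders lands right at 2400 MHz, which is where the Boost Clock figures in these tables come from.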

AMD Radeon RX 7900 XTX Specifications
Architecture: Navi 31
Process Technology: TSMC N5 + N6
Transistors (Billion): 58
Die size (mm^2): 300 + 222
Compute Units: 96
GPU Cores (Shaders): 12288
Ray Accelerators: 96
Boost Clock (MHz): 2500
VRAM Speed (Gbps): 20
VRAM (GB): 24
VRAM Bus Width (bits): 384
Infinity Cache (MB): 96
Render Outputs: 192
Texture Mapping Units: 384
FP32 TFLOPS (Single-Precision): 61.4
FP16 TFLOPS (Half-Precision): 122.8
Bandwidth (GB/s): 960
Total Board Power (Watts): 355
Launch Date: December 13, 2022
Launch Price: $999

Tuesday, November 1, 2022

Nvidia GeForce RTX 4090 Specifications

The Nvidia GeForce RTX 4090 currently reigns as the king of graphics cards — to see how performance stacks up, check our extensive list of GPU benchmarks and the best graphics cards. It uses Nvidia's latest Ada Lovelace graphics architecture, with the largest of the Ada chips: AD102. The combination of lots of cores and high clock speeds results in incredible levels of performance — the reference Founders Edition has a 2,520 MHz boost clock, and often runs at over 2.7 GHz in testing.

Note that the RTX 4090 doesn't use a fully enabled AD102 chip, as it has 128 of the potential 144 SM (Streaming Multiprocessor) blocks enabled, along with 72MB of the potential 96MB of L2 cache. We'll probably see a GeForce RTX 4090 Ti in the future, as there's certainly room for it.
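The shader count falls straight out of the SM count: each Ada Lovelace SM (like Ampere before it) carries 128 FP32 CUDA cores. A quick check, with the helper being my own illustration:

```python
CORES_PER_SM = 128  # FP32 CUDA cores per SM on Ada Lovelace (and Ampere)

def cuda_cores(sms: int) -> int:
    """Total shader count from the number of enabled Streaming Multiprocessors."""
    return sms * CORES_PER_SM

print(cuda_cores(128))  # 16384 cores for the RTX 4090's 128 enabled SMs
print(cuda_cores(144))  # 18432 cores if a future card enabled the full AD102
```

That hypothetical fully enabled figure is why an RTX 4090 Ti seems plausible: there's a 16-SM gap left in the silicon.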

Nvidia GeForce RTX 4090 Specifications
Architecture: AD102
Process Technology: TSMC 4N
Transistors (Billion): 76.3
Die size (mm^2): 608.4
Streaming Multiprocessors: 128
GPU Cores (Shaders): 16384
Tensor Cores: 512
RT Cores: 128
Boost Clock (MHz): 2520
VRAM Speed (Gbps): 21
VRAM (GB): 24
VRAM Bus Width (bits): 384
L2 Cache (MB): 72
Render Outputs: 176
Texture Mapping Units: 512
FP32 TFLOPS (Single-Precision): 82.6
FP16 TFLOPS with Sparsity (FP8): 661 (1321)
Bandwidth (GB/s): 1008
Total Board Power (Watts): 450
Launch Date: October 12, 2022
Launch Price: $1,599

Nvidia GeForce RTX 4080 Specifications

Let's take a moment this Thanksgiving season to express our gratitude over the fact that Nvidia caved in to pressure from the PC enthusiast community and canceled the RTX 4080 12GB, which means the RTX 4080 16GB is now the only RTX 4080-class GPU coming down the pipeline. We'll add it to our list of GPU benchmarks and the best graphics cards as soon as it's available.

There will of course be future RTX 4080 Ti / Super / Whatever cards at some point — probably next year — but those will presumably be clearly labeled as such. Really, it's a case of people being pissed about Nvidia trying to charge $899 for a card with only 12GB VRAM on a 192-bit memory interface.

While the RTX 4080 takes the reins from the previous generation RTX 3080, it's also worth pointing out that it inherits the pricing of the RTX 3080 Ti rather than the far more attractively priced RTX 3080 (10GB). The RTX 4080 (formerly 16GB) meanwhile still has the following specs.

Nvidia GeForce RTX 4080 Specifications
Architecture: AD103
Process Technology: TSMC 4N
Transistors (Billion): 45.9
Die size (mm^2): 378.6
Streaming Multiprocessors: 76
GPU Cores (Shaders): 9728
Tensor Cores: 304
RT Cores: 76
Boost Clock (MHz): 2505
VRAM Speed (Gbps): 22.4
VRAM (GB): 16
VRAM Bus Width (bits): 256
L2 Cache (MB): 64
Render Outputs: 112
Texture Mapping Units: 304
FP32 TFLOPS (Single-Precision): 48.7
FP16 TFLOPS with Sparsity (FP8): 390 (780)
Bandwidth (GB/s): 717
Total Board Power (Watts): 320
Launch Date: November 16, 2022?
Launch Price: $1,199

Nvidia GeForce RTX 4070 Specifications

The Nvidia GeForce RTX 4070 launched in April 2023, with reduced specs compared to the higher tier models. Nvidia increased the generational pricing by $100 this round, and performance basically matches the previous generation RTX 3080 10GB card.

Nvidia GeForce RTX 4070 Specifications
Architecture: AD104
Process Technology: TSMC 4N
Transistors (Billion): 32.0
Die size (mm^2): 294.5
Streaming Multiprocessors: 46
GPU Cores (Shaders): 5888
Tensor Cores: 184
RT Cores: 46
Boost Clock (MHz): 2475
VRAM Speed (Gbps): 21
VRAM (GB): 12
VRAM Bus Width (bits): 192
L2 Cache (MB): 36
Render Outputs: 64
Texture Mapping Units: 184
FP32 TFLOPS (Single-Precision): 29.1
FP16 TFLOPS with Sparsity (FP8): 233 (466)
Bandwidth (GB/s): 504
Total Board Power (Watts): 200
Launch Date: April 2023
Launch Price: $599 ($549)

The simple solution is to not launch a $600 RTX 4070 until most of the RTX 3080/3090 inventory has been sold. It's the holiday shopping spree season now, and obviously Nvidia thinks it can get away with this tactic. Delaying the RTX 4070 a couple of months — and building up a larger inventory of such GPUs in the meanwhile — won't really hurt, especially if Nvidia can successfully clear out all of those Ampere GPUs.

And it probably can! Technologically savvy people who read this blog might not be duped into buying an overpriced RTX 30-series card at this point in time, but there are tons of less knowledgeable gamers that just want a new high-end PC or graphics card for Christmas, Hanukah, or whatever. And if Nvidia and its partners can't sell all of those Ampere GPUs directly to gamers, rest assured there are large OEMs like Dell (Alienware), HP (Omen), Lenovo (Legion), etc. who will buy loads of cards at a discount and foist them off on the type of people who can't be bothered to build their own PCs.

Looking at the potential RTX 4070 specifications, it could very well end up with the same amount of VRAM and similar theoretical compute performance to the previous generation RTX 3080 Ti. Of course it has just a bit more than half the memory bandwidth, but the 36MB of L2 cache should make up for much of that. You also get all of the Ada Lovelace architectural upgrades, like support for DLSS 3.

While we await the arrival of the RTX 4070, anyone who wants more performance and is willing to pay for it can step up to the RTX 4080 or RTX 4090, just the way Lord Jensen would like it.

Nvidia GeForce RTX 3090 Ti Specifications

Last generation's ultimate GPU, the GeForce RTX 3090 Ti only launched in late March of 2022 — to see how performance stacks up, check our extensive list of GPU benchmarks and the best graphics cards. It was also a product of the cryptocurrency era, which meant GPU supply still hadn't caught up with demand and Nvidia felt it could get away with an exorbitant launch price of $1,999.

A few months later, demand plummeted and Nvidia dropped the 3090 Ti Founders Edition price to $1,099 at Best Buy, which in turn pissed off some of its add-in card (AIC) partners like EVGA, which ultimately decided to stop making graphics cards altogether. EVGA probably still has a shload of RTX 30-series parts in stock that it's trying to offload as quickly as possible, but don't feel bad for it or the other AIC partners: they made out like bandits throughout all of 2020 and 2021!

As far as the RTX 3090 Ti specifications, this is the fully enabled GA102 chip, the top of the Ampere architecture. It's interesting to compare this chip with the RTX 4090 specifications, to see how much things have changed. Note that the die size on GA102 is slightly larger than the new AD102, and yet AD102 packs nearly three times as many transistors. That's all thanks to the move from Samsung 8N ("8nm Nvidia") that was really just a tweaked version of Samsung's 10nm process, over to TSMC 4N ("4nm Nvidia") that's a tweaked variant of TSMC's 5nm N5 process. Obviously, the latter offers significantly higher transistor density.

Nvidia GeForce RTX 3090 Ti Specifications
Architecture: GA102
Process Technology: Samsung 8N
Transistors (Billion): 28.3
Die size (mm^2): 628.4
Streaming Multiprocessors: 84
GPU Cores (Shaders): 10752
Tensor Cores: 336
RT Cores: 84
Boost Clock (MHz): 1860
VRAM Speed (Gbps): 21
VRAM (GB): 24
VRAM Bus Width (bits): 384
L2 Cache (MB): 6
Render Outputs: 112
Texture Mapping Units: 336
FP32 TFLOPS (Single-Precision): 40.0
FP16 TFLOPS (with Sparsity): 160 (320)
Bandwidth (GB/s): 1008
Total Board Power (Watts): 450
Launch Date: March 29, 2022
Launch Price: $1,999

The maxed out Ampere chip in the RTX 3090 Ti has 84 SMs and 10,752 CUDA cores, along with 336 Tensor cores. Combined with its boost clock, the 3090 Ti offers up to 40 teraflops of FP32, or 320 teraflops of FP16 with sparsity. Unlike the new Ada Lovelace Tensor cores, there's no FP8 support, which means theoretical compute for AI workloads is about one-fourth of what the RTX 4090 can offer, not accounting for other architectural differences.

In order to try and differentiate the RTX 3090 Ti from the existing RTX 3090, Nvidia also bumped up the TBP (Total Board Power) by 100W, which allows the GPU to sustain substantially higher clocks compared to the 3090. It also gave Nvidia and its partners a chance to test the waters for a 450W graphics card and pave the way for the 4090. Ironically, whereas the 12-pin and 16-pin adapters that came with RTX 3090 Ti cards all seemed to work fine, the new 4090 16-pin adapters appear to have used poor soldering that's prone to cracking and can create a fire hazard. Oops.

The RTX 3090 Ti also uses new 2GB (16Gb) 21 Gbps GDDR6X memory, where the original RTX 3090 used 1GB (8Gb) 21 Gbps chips — this is the same 21 Gbps GDDR6X memory as the new RTX 4090, which means they have the same memory bandwidth. However, Ada cards have significantly larger L2 caches — 12X larger for the 4090 versus 3090 Ti — and that in turn means effective memory bandwidth is substantially improved. If the L2 cache hit rate on a 4090 for example is 60% at 4K, that means only 40% of memory accesses actually hit the GDDR6X memory, which means 2.5X more effective bandwidth. Nvidia has not detailed the actual L2 hit rates, which will vary by workload, but 50–70 percent seems likely.
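That 2.5X figure comes from a simple relationship: if some fraction of accesses hit in L2, only the remainder actually touch GDDR6X, so effective bandwidth scales by 1/(1 − hit rate). A minimal sketch (function name is my own):

```python
def effective_bandwidth(raw_gb_s: float, l2_hit_rate: float) -> float:
    """Only cache misses reach VRAM, so effective bandwidth = raw / (1 - hit rate)."""
    return raw_gb_s / (1 - l2_hit_rate)

# RTX 4090: 1008 GB/s raw, assuming a 60% L2 hit rate at 4K
print(effective_bandwidth(1008, 0.60))  # ~2520 GB/s, i.e. 2.5X the raw figure
```

At the 50-70 percent hit rates that seem likely, the multiplier ranges from 2X to about 3.3X, which is why the 4090's "same" 1008 GB/s goes so much further than it did on the 3090 Ti.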

At present, RTX 3090 Ti cards can be found starting at around $1,290 on Amazon. Considering the pending arrival of the GeForce RTX 4080, I can't help but think the RTX 3090 Ti represents a terrible value at this time — or actually any time since launch. At first it was far too expensive, but by the time Nvidia slashed prices and forced the AIC partners to do the same, Ada Lovelace was right around the corner and there was no reason to even consider buying a 3090 Ti. It's generally the fastest graphics card from the Nvidia Ampere and AMD RDNA 2 era — at least at 4K and in professional workloads — but it arrived too late, provided too little of a performance uplift (only about 10% faster than the 3090), and also required substantially more power.

If you're even thinking about buying an RTX 3090 Ti card, used or otherwise, you should instead look at the RTX 4090. Well, unless you can pick one up for under $700. And don't forget that RTX 4070 is also on the way.

Nvidia GeForce RTX 3090 Specifications

It's hard to believe that Nvidia's GeForce RTX 3090 has been around for over two years now, though it has now been displaced by the new RTX 4090. You can see where the RTX 3090 ranks among other cards in our GPU performance hierarchy.

Originally launched in September 2020 alongside the RTX 3080, people complained about the price — how it was a Titan-class price without the Titan-class features like improved professional application drivers and support. And then the cryptocurrency mining boom of 2020 through 2022 hit, and suddenly $1,500 for an RTX 3090 that could potentially earn (at its highest point) over $30 per day seemed like a steal.

Naturally, prices shot up to compensate, scalpers got super involved with graphics cards, and the rest is sort of history. Painful history, at least for gaming enthusiasts — miners loved the RTX 3090. This was the fastest consumer graphics card throughout 2021, packed with features to make 4K gaming truly viable. And yet most of these cards probably spent all of 2021 chipping away in the Ethereum mines. RIP, Ethereum mining! But let's look at the specifications.

Nvidia GeForce RTX 3090 Specifications
Architecture: GA102
Process Technology: Samsung 8N
Transistors (Billion): 28.3
Die size (mm^2): 628.4
Streaming Multiprocessors: 82
GPU Cores (Shaders): 10496
Tensor Cores: 328
RT Cores: 82
Boost Clock (MHz): 1695
VRAM Speed (Gbps): 19.5
VRAM (GB): 24
VRAM Bus Width (bits): 384
L2 Cache (MB): 6
Render Outputs: 112
Texture Mapping Units: 328
FP32 TFLOPS (Single-Precision): 35.6
FP16 TFLOPS (with Sparsity): 142 (285)
Bandwidth (GB/s): 936
Total Board Power (Watts): 350
Launch Date: September 24, 2020
Launch Price: $1,499

RTX 3090 uses the same GA102 chip as several other Nvidia GPUs: the RTX 3080, RTX 3080 12GB, RTX 3080 Ti, and RTX 3090 Ti are all based on GA102. Back in 2020, there probably weren't that many nearly fully functional GA102 chips available, so supplies of the 3090 were quite limited and Nvidia pushed the narrative that these were "professional cards, not just for gamers." That's despite having GeForce branding right in the card name, so you can guess how well received such claims were.

With 82 of the available 84 SMs enabled, and the full 384-bit memory interface matched up with 24 1GB GDDR6X chips, this was a beast of a card. Power consumption on the reference model was rated at 350W, but third-party AIC vendors pushed things well into the 400W and higher range on factory overclocked cards.

One of the big issues with the RTX 3090, particularly for cryptocurrency miners, is that the GDDR6X chips ran hot. Fire up Ethereum mining and many 3090 cards, including the Founders Edition, would quickly reach 110C on the memory before starting to throttle. Dismantling the cards to replace the original thermal pads with better variants was common, and even in gaming workloads you could easily hit more than 100C on the memory.

The root of the problem was that, because Micron was only making 1GB (8Gb) capacity GDDR6X chips at the time, the RTX 3090 required half of the 24 chips to reside on the back of the PCB, with the other half on the same side as the GPU. Take a look at that chunky Founders Edition up top: Its triple-slot cooler helps keep all the chips cool... except that's only for the "front" of the PCB; the back just has a metal cover with thermal pads and no active cooling. I can state from experience that the back of the card (the side with the RTX 3090 logo seen above, which usually faces upward in most computer cases) can get extremely toasty under load!

Theoretical compute performance for the RTX 3090 tips the scales at 35.6 teraflops, and the Tensor cores can do up to 285 teraflops of FP16 with sparsity enabled. All of that number crunching prowess is put to good use in ray tracing games, and Nvidia's DLSS feature can leverage the Tensor cores for image upscaling that looks almost as good as native, at least in Quality mode. Fortunately, almost all of the most demanding ray tracing games support DLSS, which helps bring 4K gaming into reach.