Introduction
Apart from announcing faster Athlon processors, which seems to happen on a monthly basis, AMD has been pretty quiet lately. Even though the higher speed processors are exciting, everyone is still anxiously waiting for the next generation of Athlon chipsets. AMD’s current 750 chipset is a little behind the times. For one, it lacks AGP 4X, which will be pretty important with the current and upcoming Transform and Lighting (T&L) based video boards. It also cannot supply the higher memory bandwidth that must accompany AGP 4X. Many of us are also still waiting for the final launch of an Athlon chipset with support for SMP (multiprocessor setups). AMD is working closely with VIA and Samsung, to name a few, on these next generation chipsets, and if companies like VIA can find some time to develop the next generation Athlon chipset while competing with Intel’s i820, we might even see something in the next millennium. Besides cooperating with third-party chipset makers, AMD has also been improving its own 750 chipset. Quietly, a new Northbridge feature called “Super Bypass” was finally enabled.
What is “Super Bypass”?
Basically, super bypass removes some unnecessary latency between main memory and the CPU. How much of the memory latency does it remove? AMD documents claim that super bypass reduces the latency by no less than 25%! One of the primary goals for any chipset designer is to let the chipset talk to its different buses with as little latency per transaction as possible. Take a look at the AMD 750 chipset diagram below.
The AMD-751 (north bridge) handles all the communication between the CPU, SDRAM (main memory), the AGP bus and the PCI bus. Amongst other things, the 751 chip contains the memory request organizer (MRO). The MRO controls all the traffic between the CPU, PCI, AGP, and memory buses. When super bypass is disabled, the MRO has to go through numerous stages before getting data to and from main memory. When it is enabled, the MRO can skip some of these stages, eliminating unnecessary clocks, which in turn speeds up the transaction. Certain conditions have to be met on both the AGP and PCI buses for the super bypass feature to function. I am told by AMD that during normal system operation super bypass is able to function 90-95% of the time. That sounds like a pretty good hit rate to me. You are probably asking, “Why did AMD wait till now to add this new feature?” Actually, super bypass was slated to work in AMD’s 750 chipset from the get-go. Unfortunately, the feature was broken in the earlier revision of the chipset.
Reflecting back on the Athlon’s bus
I wanted to quickly touch on the Athlon’s CPU and memory buses. There is a lot of confusion out there regarding the memory that Athlon platforms use. A lot of people are confusing the front side bus (FSB) with the memory bus. The Athlon’s FSB communicates with the north bridge at an effective 200MHz, while the memory interface only runs at 100MHz. The FSB is clocked at 100MHz but runs at a double data rate (100MHz X 2). That’s how the CPU is able to run with an FSB of 200MHz. The memory only runs at 100MHz! So if you are looking to purchase memory for your Athlon system, all you need is PC100 SDRAM. If you would like more information on the Athlon processor and chipset, please check out the article The New Athlon Processor – AMD Is Finally Overtaking Intel.
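The difference between the two buses is easy to see with a little arithmetic. The following is only a rough sketch: it assumes both the FSB and the SDRAM interface are 64 bits wide (an assumption that is consistent with the 1.6GB/sec and 800MB/s figures quoted in this article), and the function name is made up for illustration.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits=64):
    """Theoretical peak bandwidth in MB/s (decimal megabytes).

    bus_width_bits=64 is an assumption, not a figure taken from the article.
    """
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer


# Athlon FSB: 100MHz clock, two transfers per clock (double data rate)
FSB_MB_S = peak_bandwidth_mb_s(100, 2)

# PC100 SDRAM: 100MHz clock, one transfer per clock
MEMORY_MB_S = peak_bandwidth_mb_s(100, 1)

print(f"FSB:    {FSB_MB_S} MB/s")
print(f"Memory: {MEMORY_MB_S} MB/s")
```

Under these assumptions the FSB has twice the peak bandwidth of the memory bus, even though both are clocked at 100MHz.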
The good, the bad, and the ugly
Well, I’ve covered the good, which is that super bypass will provide better performance. What could possibly be bad? The bad is that you cannot physically identify whether a board is equipped with a super bypass capable AMD 751 chip. That’s right! There are no markings on the chip that show whether it has super bypass or not. The only way to identify a super bypass capable chipset is through the PCI registers. The BIOS can check a special register to see whether super bypass is available. The Gigabyte 7IX platform that I received had a setting in the Award BIOS under the “Advanced Chipset Features” section called “Bypass enable”. I seriously doubt that production motherboard BIOSes will have this option; most likely the feature will be transparent to the user (set automatically). If you do try to enable super bypass on a non-supported motherboard, you will see no benefit since the feature is broken or non-existent. I guess this is why AMD has been so quiet about promoting this new feature.
Performance Expectations
Given the 90-95% hit rate of the super bypass feature and the touted 25% reduction in memory latency, I guessed that we should see a pretty reasonable gain in performance across the board. I expected about a 3% to 5% gain throughout our test suite. Obviously, memory intensive applications should benefit the most. Another reason I expected a pretty good gain is that one of the Athlon platform’s biggest bottlenecks is its memory bus. While the CPU is cruising with a whopping 200MHz front side bus (1.6GB/sec), the memory is slowly moving at 100MHz (800MB/s). Any increase in the platform’s memory performance should have a big impact.
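As a back-of-the-envelope check, the hit rate and the claimed latency reduction can be combined into an average latency reduction. This is a simplified sketch, not AMD’s model: it assumes the full 25% reduction applies whenever super bypass is active and that nothing is gained the rest of the time. The function name is hypothetical.

```python
def average_latency_reduction(hit_rate, reduction_when_active):
    """Average memory latency reduction, assuming the reduction applies
    only while super bypass is active (a simplifying assumption)."""
    return hit_rate * reduction_when_active


CLAIMED_REDUCTION = 0.25  # AMD's claimed latency reduction

low = average_latency_reduction(0.90, CLAIMED_REDUCTION)
high = average_latency_reduction(0.95, CLAIMED_REDUCTION)

print(f"Average latency reduction: {low:.2%} to {high:.2%}")
```

That works out to an average reduction of roughly 22.5% to 23.75% in memory latency. Since only part of an application’s run time is spent waiting on memory, the gain in overall performance should be a good deal smaller than that.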
Benchmark Setup
System Settings | |
Processor | AMD Athlon (750MHz & 800MHz) |
Motherboard | Gigabyte GA-7IX revision 1.1; BIOS revision 1.2a (11/23/99); AGP Driver v4.45 (AMD); IDE Bus Mastering Driver v1.22rc (AMD) |
Memory | 128MB SDRAM CAS 2 / RAS 2 |
Video Board | NVIDIA GeForce 32MB DDR; Core / Memory Clock 120MHz / 300MHz; Video Driver v3.62 |
Environment Settings | |
OS Versions | Windows 98 SE 4.10.2222 A; Windows NT 4.0 w/Service Pack 6 |
DirectX Version | 7.0 |
Quake 3 Arena | Retail Version; DEMO001 (NORMAL); command line = +set cd_nocd 1 +set s_initsound 0 |
Descent 3 | Retail Version SECRET2 |
Half-Life | v1.0.0.9 SMOKIN |
Shogo | v2.214 FORTRESS |
Quake 2 | AMD 3DNow! V3.20 |
Expendable | Downloadable Demo -TIMEDEMO |
SYSMark98 | 1024x768x16x85Hz |
3DMark2000 | Build 335 |
TreeMark | SIMPLE 35,000 polygons/frame w/4 lights |
Unreal Tournament | Retail Version UTBENCH |
How I tested
The platform I tested was a Gigabyte 7IX revision 1.1 motherboard. This motherboard was outfitted with AMD’s super bypass capable north bridge (AMD 751). Gigabyte isn’t the only manufacturer shipping this new north bridge; other Athlon motherboard manufacturers are shipping it as well. The Gigabyte board looked like a typical production 7IX motherboard. I am told that the newer north bridges are already being shipped on motherboards. Here’s a picture of the Gigabyte 7IX platform I used.
To get an idea of the performance impact on typical business applications, I used SYSmark98 and tested it under Windows 98 and NT 4.0. Business application performance is important, but these applications spend most of their time waiting on the hard disk and not the CPU. 3D games, on the other hand, are probably some of the most CPU intensive applications around, and they depend heavily on CPU and video performance. If there isn’t enough horsepower, game play suffers badly. To measure the impact that super bypass has on gaming, I chose a large suite of 3D applications.
3D Application | Description |
3Dmark2000 | Mad Onion’s newly released 3DMark 2000 benchmark, which uses the Max Payne game engine from Remedy/3Drealms. It provides overall 3D performance tests in various game environments. The game engine has CPU optimized pipelines for Intel x87, Intel KNI, AMD Athlon and AMD 3DNow! This benchmark was run at the lower 640x480x16 resolution to keep the video board from becoming the bottleneck. |
TreeMark | NVIDIA’s TreeMark is a benchmark tool that they use to promote their T&L based GeForce 256 product. This benchmark relies heavily on the video board’s T&L. For a given CPU, the scores usually stay the same no matter what clock frequency it runs at. |
Descent 3 | Interplay’s Descent 3 is a very popular 3D game title. This game was tested in D3D mode at 640x480x16 to keep the video board from becoming the bottleneck. |
Shogo | D3D game title that is a popularly used application for benchmarking. Makes a great CPU/platform benchmark when used at lower resolutions. |
Expendable Demo | D3D game title that is a popularly used application for benchmarking. Makes a great CPU/platform benchmark when used at lower resolutions. |
Half-Life | Half-Life is a great CPU/platform test given that its bottleneck isn’t with the video board. |
Unreal Tournament | The Unreal Tournament 3D engine is very CPU intensive much like Half-Life, which makes it a great CPU/platform test. |
Quake 3 Arena | Extremely popular OpenGL based game that is a great tool for extracting CPU/platform performance when it is tested at lower resolutions. |
All of the charts are set up the same way. The red lines indicate that super bypass is enabled (on), while the green lines indicate that it is disabled (off). Each test was run on both a 750MHz and an 800MHz Athlon.
Business Application Performance under Windows 98 SE
Looking at the SYSMark results under Windows 98, the 2% gain given by super bypass doesn’t seem like much. However, if you compare it to the difference between 750MHz and 800MHz, you’ll see that super bypass gives you almost half a speed grade.
Business Application Performance under Windows NT
Windows NT, on the other hand, seems to enjoy the benefit of super bypass. A 5% gain is achieved. This is pretty impressive given that going from 750MHz to 800MHz only yields a 4% gain. Super bypass is giving us a free speed grade in performance.
3D Gaming Performance under Windows 98 SE
The 3Dmark2000 results look very similar to SYSMark under Windows 98: only a 2% gain with super bypass enabled.
Using NVIDIA’s TreeMark, super bypass gives the Athlon platform a huge 15% performance gain. This is pretty cool given that an 800MHz CPU scores the same as a 750MHz CPU in this benchmark.
Descent 3 gets a 5% boost in frame rate thanks to super bypass. Again, this is pretty awesome since the performance gain from an Athlon 750MHz to 800MHz is only 4%.
Shogo seems to enjoy super bypass. It gets a 5% boost via the new feature. This boost is like a free speed grade given that going from 750MHz to 800MHz only has a 3% difference in frame rate.
Just as with Descent 3, Expendable gets a 5% boost in performance due to super bypass. Once again, super bypass gives as much of a performance gain as going up a CPU speed grade.
Half-Life gets a small 2% boost in performance from super bypass. I guess that’s not too bad since a CPU speed grade is only worth 3% here.
Unreal Tournament enjoys a 4% gain in frame rate matching the gain one would get going from 750MHz to 800MHz. Again, a free speed grade!
The gain under Quake 3 is a whopping 6%! Can you say “2 speed grades”? That’s right! Super bypass provides as much performance gain as one would see going from 700MHz to 800MHz!
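The “free speed grade” comparisons above boil down to a simple ratio: the gain from super bypass divided by the gain from going from 750MHz to 800MHz. Here is a small sketch using only the percentages quoted in the text for tests where both numbers were given. It treats the 50MHz step as one speed grade and assumes the gains can be compared linearly, which is a simplification. The names are made up for illustration.

```python
# (super bypass gain %, gain % from 750MHz to 800MHz), as quoted in the text
QUOTED_GAINS = {
    "SYSmark98 (Windows NT)": (5, 4),
    "Descent 3": (5, 4),
    "Shogo": (5, 3),
    "Half-Life": (2, 3),
    "Unreal Tournament": (4, 4),
}


def speed_grade_equivalent(bypass_gain_pct, grade_gain_pct):
    """How many 50MHz speed grades the super bypass gain is worth,
    assuming a simple linear comparison of the two gains."""
    return bypass_gain_pct / grade_gain_pct


for name, (bypass, grade) in QUOTED_GAINS.items():
    ratio = speed_grade_equivalent(bypass, grade)
    print(f"{name}: {ratio:.2f} speed grades")
```

Anything at or above 1.0 means super bypass is worth at least one full speed grade in that test.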
Conclusion
It is clear that AMD’s “Super Bypass” feature provides a substantial performance improvement. However, it really bothers me that there is no physical way to determine whether a motherboard is outfitted with the new north bridge that supports super bypass. Even if a utility existed that could read back the special super bypass registers, most computer shops would probably frown if you asked them to set up the platform so you could run the program. If you were planning on ordering through the net or a catalog, you wouldn’t even have that option.
This gives me the impression that AMD was trying to hide this transition to super bypass. Why? I think AMD was looking to cover some of the performance loss caused by the external cache becoming an issue. Keep in mind that AMD had to move to a divisor of more than 2 for the core clock/L2 cache clock ratio in Athlon processors above 650 MHz, because there isn’t a cost effective cache solution that will reliably run above 300 MHz. Changing the divisor naturally resulted in a performance loss. This isn’t good news with Intel’s Coppermine already nipping at AMD’s heels. Super bypass is a blessing because it more than balances out the losses AMD took from moving to the new cache setting. Why would you need to keep something like this so quiet? Well, there are thousands of non-super bypass chipsets out there that haven’t been sold yet. Who would want a non-super bypass chipset if they knew about the difference? The chipsets (and motherboards) without super bypass could become much more difficult to sell. It would be better to secretly insert the new feature and, as soon as the old chipsets are flushed from inventories, begin talking about super bypass. This is just a theory of mine, but it makes perfect sense to me.
Looking at the performance benefit the Athlon platform gains when some of the memory latencies are removed makes me anxious to see what the next generation chipsets will have to offer, such as VIA’s KX133 Athlon chipset. Given that AMD’s 750 chipset is quickly falling behind the times with its 800MB/sec memory throughput and 532MB/s AGP 2X performance, VIA’s new chipset should breathe even more performance into the Athlon platform. The KX133 will offer 4X AGP (>1GB/sec) and a 133MHz memory interface (also just over 1GB/sec). It’s important to remember that the motherboard’s chipset is every bit as important to performance as the CPU itself.