Introduction
There once was a little PC 3D-chip company in Silicon Valley, made up of ex-SGI people, that tried a comeback in 1997 after it had come dangerously close to bankruptcy with its first, very unsuccessful product the year before. The new chip in 1997 had the code name ‘NV3’, but today it is better known as the ‘RIVA128’. The year 1997 saw the real birth of mainstream 3D-hardware. AGP had been born and 3D-chip makers had finally learned what it took to design acceptable 3D-accelerators.
What has happened since is history. NVIDIA’s RIVA128 was pretty successful, though it did not perform nearly as well as a chip called ‘Voodoo’ from a company by the name of ‘3Dfx’. It took until spring 1999 before NVIDIA was finally able to produce a chip that was superior to products from 3Dfx, but from then on NVIDIA was unstoppable. 3Dfx lost its capital ‘D’, and the name change to ‘3dfx’ marked the start of its slow but certain demise. Finally, 3dfx died and NVIDIA bought what was left of its technology.
Microsoft Handing Out To NVIDIA
The above short history of NVIDIA is missing one important fact. Although NVIDIA has doubtlessly become the most successful player in the 3D-arena, it wouldn’t be in as good a situation as it is if it weren’t for that most magnificent software company by the name of ‘Microsoft’. I am pretty sure you have heard of it. It had taken NVIDIA quite a long and painful process to finally ‘go public’ (a process that is painful by definition, believe me). Once NVIDIA had made it to the NASDAQ, its stock price remained in the ‘usual’ area for 3D-chip makers. ATi as well as 3dfx used to be in the same stock price area for quite a while. Then came the moment that would change NVIDIA’s history and that should be made into a company-internal memorial day. Microsoft chose NVIDIA as the key designer and supplier of the 3D, core logic and sound hardware of its upcoming Xbox. Since that day NVIDIA’s stock price has gone through the roof and no longer follows the (recently rather disastrous) trend of chipmaker stocks at NASDAQ. NVIDIA had become tremendously rich and powerful overnight and its future was suddenly painted in platinum colors.
Xbox’s Byproducts
Now NVIDIA wouldn’t be NVIDIA if it weren’t able to capitalize on this exceptional situation. Instead of simply designing and supplying Xbox-hardware, it started to release its normal mainstream products with Xbox technology, thus benefiting twice from the deal with Microsoft. The first step was the release of GeForce3 a few months ago. NVIDIA’s latest and greatest 3D-chip includes the ‘vertex and pixel shaders’, both key features of the 3D-chip that will be used in the Xbox. Today marks the second step in NVIDIA’s progress to the Xbox. GeForce3 has got the 3D-technology of Xbox and the now released ‘nForce’ chipset contains the core logic and sound hardware we will find in Xbox.
Conquering New Markets
So far NVIDIA has only been making 3D-hardware. It started with mainstream 3D-chips targeted at the upper scale of 3D-users. Then NVIDIA realized that the most money is made with value 3D-chips and it released TNT2 M64 and later GeForce2 MX. At the same time NVIDIA addressed the professional 3D market with its first workstation product Quadro, which was later replaced by Quadro 2. All of that stuff is 3D hardware and thus not exactly surprising. Today NVIDIA will start to enter two more markets, and this won’t make the current players particularly happy. nForce is a chipset, but it comes with integrated high-tech sound as well, so NVIDIA will not only make VIA and Intel rather concerned, Creative Labs won’t be delighted about nForce either.
The Job Of A PC Chipset
Just like the motherboard, the chipset is one of the least flashy parts of a PC-system. The average guy out there who uses a computer knows about neither motherboards nor chipsets. He dismisses people who talk about motherboards and chipsets as geeks. However, as little publicity as technical PC-components like chipsets or motherboards may get, their impact on system behavior, reliability and performance is huge.
A chipset is the ‘glue’ that keeps the other system components together. It’s the backbone of every PC, since it connects the processor(s), memory, graphics cards, PCI cards, ports, … with each other. Without the chipset, all those components couldn’t communicate with each other and the PC wouldn’t run, it wouldn’t even be a PC. The chipset could be seen as the telecommunication network of a large company or government building. It’s used by the boss to talk to his employees or to other companies and it’s used by the workers to communicate with other departments and so on. Once this network breaks down, the company is not operational anymore. If the network is faulty, things go wrong in the company, if the network is slow, things get delayed, … I guess you are getting the picture.
Basically, the chipset has to provide the fastest and most reliable communication between all the system components. A bad chipset only provides slow communication, thus slowing the system down, and a good chipset can speed up a system significantly. Let’s have a look at the common problems that come with today’s PC chipsets:
The North Bridge
Typical chipsets consist of two chips, thus the name ‘chipset’. The larger of the two chips is the high-speed part, which connects the processor with the system memory and the graphics hardware. It’s commonly referred to as the ‘north bridge’. This chip is not necessarily bigger than the other chip because of its silicon die, but because it requires more pins. The interface to the processor and memory consists of a huge number of signal lines, so the packaging of this chip has to accommodate more pins. This ‘north bridge’ has a significant impact on system performance, since it defines the speed at which data can flow between the processor and the memory as well as the graphics subsystem. As it contains the interface to system memory, it also defines the memory type that can be used with the system. A chipset for RDRAM is different from a chipset for SDRAM. A north bridge can only communicate with the memory type it was designed for.
The most crucial issues for a north bridge that impact its performance are
- memory bandwidth
- overall latency, particularly memory latency
- data bandwidth to the graphics subsystem
- data bandwidth to the south bridge
NVIDIA’s nForce ‘north bridge’, named ‘IGP’ for ‘Integrated Graphics Processor’ addresses each of those issues.
- Memory bandwidth is high due to the IGP’s twin bank memory controller for SDR as well as DDR SDRAM, with a 128-bit wide data path.
- Memory latency is reduced due to the crossbar configuration of the two separate 64-bit wide memory controllers as well as due to the integrated DASP (Dynamic Adaptive Speculative Pre-Processor), which is something like an intelligent small third-level cache for the processor.
- The integrated GeForce2MX-like graphics controller is internally connected to the chipset core with a path that has a bandwidth of 1.5 GB/s, equal to 6xAGP. External graphics cards can take advantage of 4xAGP.
- The connection to the south bridge is realized with AMD’s HyperTransport interface, which offers a much higher bandwidth than the traditionally used PCI bus as well as the lately used ‘hub architecture’ (Intel) or ‘V-Link’ (VIA).
NVIDIA’s nForce IGP connects to AMD Athlon processors, making nForce an Athlon chipset. For the time being, nForce does NOT support any Intel processors.
Integrated Graphics
To reduce overall system costs, some chipsets in today’s market come with an integrated graphics core. The currently available integrated chipsets, such as Intel’s i810 and i815 as well as VIA’s KM133 and others, provide pretty slow 3D-performance, which could be dubbed ‘3D-deceleration’ rather than 3D-acceleration. Intel’s as well as VIA’s solutions are particularly sad examples of this. Systems equipped with those chipsets don’t really allow any reasonable 3D-gameplay.
NVIDIA’s nForce has an internal GeForce2MX-like graphics core that comes with integrated transform and lighting as well as high fill rates, so that 3D-gaming is very well possible as well as enjoyable. It’s easy to understand that it wasn’t difficult for NVIDIA to integrate their own GeForce2 architecture, but nForce was also designed to tackle the crucial graphics memory issue, which is the major reason why current chipsets with integrated graphics are so slow. Usually, a part of the much slower system memory is used for 3D-graphics, slowing everything down. nForce’s twin bank memory controller supplies the internal GeForce2MX unit with enough memory bandwidth to offer good performance.
The South Bridge
The other chip that makes up a complete chipset is the so-called ‘south bridge’. It connects to the north bridge either through the traditionally used PCI-bus or through a dedicated bus. The job of the south bridge is to offer PCI-functionality for PCI-cards that are plugged into the system. It usually comes with an integrated IDE hard drive controller, a floppy controller, serial, parallel and USB-ports, the power and system management and lately also with some hard-wired codecs, such as the AC97 codec for low-end sound.
The biggest issue for the south bridge is its vulnerability to compatibility issues, as it has to communicate with all those components that you can add to a system (PCI cards, ISA cards, USB-devices, printers, modems,…). Here we will see if NVIDIA has done a good job with nForce. Besides that you want the south bridge to have a fast IDE disk interface, many USB-ports and if it is an integrated chipset, it is good if it comes with integrated sound as well as networking capabilities.
nForce’s south bridge, named ‘MCP’ for ‘Media and Communications Processor’ offers the following special features to make it a very special south bridge indeed:
- High bandwidth HyperTransport interface as connection to the IGP.
- Dual ATA100 IDE controller
- NVIDIA APU (Audio Processing Unit) sound device with a huge feature set, including Dolby Digital encoding for AC3-output.
- Full networking feature set, including FastEthernet 100/10 Mbit, HomePNA 2.0 (home phone line networking) and SoftModem
- Six concurrent USB-ports with 2 USB-hubs
- StreamThru, allowing high bandwidth and guaranteed real time memory access of all devices connected to the MCP, as typically required by video or audio broadcasting from disk, CD, DVD, LAN, WAN, IEEE1394 or when burning a CDROM.
Costs!
The final important issue for a chipset’s success is its versatility and the cost it creates for OEMs who implement it on motherboards. NVIDIA’s nForce is supposed to be attractively priced and the board design only requires four layers, which keeps the production costs of motherboards down. nForce can be implemented in many different configurations, from value platforms at very low system costs to high-end systems.
nForce’s Features In Detail
NVIDIA sees nForce as a completely new kind of chipset, and if you look at all the new features you will have no problem believing it. Let’s have a look at those features in more detail.
The Twinbank Memory Controller Of nForce IGP (not nForce220)
Common chipsets, such as AMD760, VIA’s KT133A or Intel’s i815, come with a 64-bit wide memory interface, which is just as wide as the data path of an SDRAM DIMM. Depending on the memory type and clock used, the typical memory bandwidth offered by those chipsets is between 800 MB/s (PC100 SDRAM) and 2100 MB/s (PC2100 DDR SDRAM). The latency of those solutions depends on the latency of the memory that is used.
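The bandwidth figures above follow from simple arithmetic: bus width in bytes, times clock, times transfers per clock. Here is a minimal sketch of that calculation. The function name and the module table are my own illustration, and the clock values are the nominal ones for each memory grade.

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * transfers_per_clock

# Nominal (clock in MHz, transfers per clock) for each memory grade.
# DDR moves data on both clock edges, hence 2 transfers per clock.
modules = {
    "PC100 SDRAM": (100.0, 1),
    "PC133 SDRAM": (133.33, 1),
    "PC1600 DDR SDRAM": (100.0, 2),
    "PC2100 DDR SDRAM": (133.33, 2),
}

for name, (clock, tpc) in modules.items():
    # All of these sit on a 64-bit wide memory interface.
    print(f"{name}: {peak_bandwidth_mb_s(64, clock, tpc):.0f} MB/s")
```

PC100 comes out at 800 MB/s and PC2100 at about 2133 MB/s, which the module name rounds down to 2100. These are peak numbers; sustained bandwidth in a real system is lower.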
Only Intel’s recent high-end RDRAM chipsets i840, i850 and i860 are using two memory banks (in this case called dual channel Rambus). Intel had to do this to ensure reasonable memory latency, because the latency of a single Rambus channel is so high that system performance suffers significantly.
NVIDIA’s nForce combines the best of both worlds. It is using the low-latency SDRAM (SDR or DDR) and offers two memory banks, which doubles the bandwidth and halves the latency. The nForce420 IGP comes with two 64-bit wide memory controllers, adding up to a 128-bit wide memory interface. Each of the two controllers can work independently from the other to ensure low latency. Each bank can have different memory sizes as well as memory types (!!!), which is a major difference from Intel’s less intelligent dual channel Rambus memory controllers of i840/i850/i860, which require that the two banks are equipped with identical RDRAM RIMMs.
The twin-bank crossbar memory controller of nForce420 offers very low memory latency, which speeds up today’s applications and a high memory bandwidth of 4200 MB/s to ensure that the integrated 3D-graphics won’t be slowed down. A highly efficient arbitration logic is able to prioritize memory accesses, depending on their latency sensitivity. This ensures that CPU reads and StreamThru accesses get a higher priority than memory accesses of the integrated graphics chip, which is less latency sensitive.
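To illustrate the idea of latency-sensitive arbitration, here is a toy sketch. It is not NVIDIA’s arbiter, whose internals are not public. The priority classes, the `Arbiter` class and the addresses are all assumptions made up for the example. The only thing taken from the description above is that CPU reads and StreamThru accesses are served before graphics accesses.

```python
import heapq
from itertools import count

# Hypothetical priority classes (lower number = served first).
PRIORITY = {"cpu_read": 0, "streamthru": 0, "graphics": 1}

class Arbiter:
    """Toy memory arbiter: strict priority, FIFO within a priority class."""

    def __init__(self):
        self._queue = []
        self._seq = count()  # tie-breaker keeps arrival order within a class

    def request(self, source, address):
        heapq.heappush(
            self._queue, (PRIORITY[source], next(self._seq), source, address)
        )

    def grant(self):
        """Return the next (source, address) to be served, or None if idle."""
        if not self._queue:
            return None
        _, _, source, address = heapq.heappop(self._queue)
        return source, address

arb = Arbiter()
arb.request("graphics", 0x1000)
arb.request("cpu_read", 0x2000)
arb.request("graphics", 0x1040)
arb.request("streamthru", 0x3000)

order = [arb.grant()[0] for _ in range(4)]
print(order)  # ['cpu_read', 'streamthru', 'graphics', 'graphics']
```

A real arbiter would also have to keep the low-priority requester from starving, for example by aging requests. The sketch leaves that out.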
This feature alone sets nForce apart from any other chipset already. Even the dual-channel Rambus solutions from Intel can only offer 3200 MB/s and this at a much higher latency.
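The comparison can be checked with the same peak-bandwidth arithmetic, extended by the number of channels. This is a sketch under assumptions: I take PC2100 DDR at a nominal 133 MHz for nForce420, and PC800 RDRAM (16-bit channel, 400 MHz, double data rate) for the Intel platforms. The function name is mine.

```python
def peak_mb_s(channels, width_bits, clock_mhz, transfers_per_clock):
    """Theoretical peak bandwidth in MB/s across all channels."""
    return channels * (width_bits // 8) * clock_mhz * transfers_per_clock

# nForce420: two 64-bit controllers with PC2100 DDR SDRAM (assumed 133 MHz).
nforce420 = peak_mb_s(channels=2, width_bits=64, clock_mhz=133, transfers_per_clock=2)

# Dual-channel Rambus: two 16-bit channels of PC800 RDRAM (assumed).
dual_rdram = peak_mb_s(channels=2, width_bits=16, clock_mhz=400, transfers_per_clock=2)

print(nforce420, dual_rdram)             # 4256 3200
print(f"{nforce420 / dual_rdram:.2f}x")  # 1.33x
```

The 4256 MB/s result is what gets rounded to the 4200 MB/s quoted above, since it is simply two PC2100 modules side by side. Peak bandwidth says nothing about latency, which is the other half of the argument.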
DASP – Dynamic Adaptive Speculative Pre-Processor
nForce’s IGP features another nifty thing to improve memory latency and thus processor performance. It’s called ‘DASP’, but I consider it to be some kind of smart read-ahead third-level cache. NVIDIA is pretty tight-lipped about its technology, because the patent is still pending, but what I was told is that its TLB is about as large as that of Palomino’s L1-cache (40 entries) and that the cache itself is probably also around the same size (64 kB). This cache comes with an intelligent pre-fetching mechanism that is supposed to analyze the behavior of different data streams (8-way prediction) and then speculatively read ahead, so that the next access of the CPU can read the data in the cache, thus saving the time it takes to access main memory. NVIDIA claims that Athlon gains up to 20% performance from it, depending on the application.
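Since NVIDIA has not disclosed how DASP works, the following is only a generic textbook stride prefetcher, written to show what ‘speculatively read ahead’ means in practice. The class, the 64-byte line size and the buffer capacity are assumptions for illustration. It tracks a single stream, whereas DASP is said to follow several.

```python
LINE = 64  # assumed cache-line size in bytes

class StridePrefetcher:
    """Generic stride prefetcher; NOT NVIDIA's undisclosed DASP design."""

    def __init__(self, capacity=8):
        self.capacity = capacity  # number of lines the buffer can hold
        self.buffer = []          # prefetched line addresses, oldest first
        self.last = None          # last line address seen
        self.stride = None        # last observed stride between accesses

    def access(self, addr):
        """Return True if the access hits in the prefetch buffer."""
        line = addr - addr % LINE
        hit = line in self.buffer
        if self.last is not None:
            stride = line - self.last
            # Same non-zero stride twice in a row: speculate on the next line.
            if stride != 0 and stride == self.stride:
                self._prefetch(line + stride)
            self.stride = stride
        self.last = line
        return hit

    def _prefetch(self, line):
        if line not in self.buffer:
            self.buffer.append(line)
            if len(self.buffer) > self.capacity:
                self.buffer.pop(0)  # evict the oldest prefetched line

pf = StridePrefetcher()
# A CPU walking sequentially through memory, one line at a time.
hits = [pf.access(a) for a in range(0, LINE * 8, LINE)]
print(hits)
```

The first three accesses miss while the prefetcher learns the pattern, and every access after that is served from the buffer instead of main memory. That is the kind of saving the performance claim is built on, and it also shows why the gain depends on the application: random access patterns give the predictor nothing to work with.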
Integrated GeForce2MX
nForce’s IGP comes with integrated GeForce2 3D-graphics, which can provide much higher 3D-performance than currently-known chipsets with integrated graphics. The features of the integrated 3D-core sound very familiar:
- Integrated transform and lighting engine
- Two pixel rendering pipelines, clocked at 175 MHz
- 350 Mpixel/s pixel fill rate
You can see that those features are identical to NVIDIA’s GeForce2MX chip. We can expect the same performance from nForce’s 3D-graphics, simply depending on nForce’s memory configuration. NVIDIA showed that nForce220 (64-bit wide memory interface only) scores very similarly to a GeForce2MX200 card, while nForce420 (128-bit wide memory interface) scores similarly to GeForce2MX400 cards. In both cases I suppose NVIDIA used DDR-SDRAM for the test runs. While GeForce2MX-performance may not be exactly earth shattering for 3D-enthusiasts, it is a really big deal for the crowd that wants to use chipsets with integrated graphics for budget reasons. nForce easily beats i815 or KM133. Its 3D-performance is about 5-8 times as high as previous integrated graphics solutions.
The GeForce2 core inside nForce IGP communicates with the chipset core at a bandwidth of 1.5 GB/s (6xAGP). Please don’t forget that nForce is also supporting an AGP-slot for add-in graphics cards that runs at the usual 4xAGP.
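The fill rate and the AGP comparison can be checked with a few lines of arithmetic. This sketch assumes one pixel per pipeline per clock and takes AGP 1x as a 32-bit bus at roughly 66.67 MHz. The variable names are mine.

```python
# Pixel fill rate: pipelines x core clock (one pixel per pipeline per clock, assumed).
pipelines = 2
core_clock_mhz = 175
fill_rate_mpixel_s = pipelines * core_clock_mhz

# AGP 1x moves 4 bytes per clock at ~66.67 MHz; 4x quadruples that.
agp_1x_mb_s = 4 * 66.67
agp_4x_mb_s = 4 * agp_1x_mb_s

# Internal link between the graphics core and the chipset core, as quoted.
internal_link_mb_s = 1500
agp_multiple = internal_link_mb_s / agp_1x_mb_s

print(fill_rate_mpixel_s)          # 350
print(round(agp_4x_mb_s))          # 1067
print(round(agp_multiple, 1))      # 5.6
```

So 1.5 GB/s is strictly about 5.6 times AGP 1x, which gets rounded to ‘6xAGP’. Either way, the internal path has roughly 40% more bandwidth than an external 4xAGP card gets.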
HyperTransport
Another nifty feature of nForce is the communication interface between its north and south bridge, the IGP and MCP. Instead of the traditional PCI-bus or the new 266 MB/s interface used by Intel (hub architecture) and VIA (V-Link), nForce is using AMD’s HyperTransport interface, formerly known as LDT (Lightning Data Transport). This interface is also only 8-bit wide to keep motherboard makers happy, but it is clocked at 400 MHz and it is differential (kinda DDR), which gives it a bandwidth of 400 MB/s upstream and 400 MB/s downstream, adding up to 800 MB/s. It ensures high data throughput between the two chips, which is required by many streaming applications, such as audio and video recording or replay. Network throughput is also accelerated, especially in combination with the StreamThru-feature of the MCP.
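Here is the HyperTransport arithmetic as a small sketch. One assumption is important: I read the ‘400 MHz’ figure as 400 million transfers per second per signal line, which is the reading that matches the quoted 400 MB/s per direction. The PCI figure is the standard 32-bit, 33 MHz bus. The variable names are mine.

```python
link_width_bits = 8
transfers_mt_s = 400  # assumed: effective transfers per second, in millions

# One byte per transfer on an 8-bit link, in each direction.
per_direction_mb_s = link_width_bits // 8 * transfers_mt_s
total_mb_s = 2 * per_direction_mb_s  # upstream + downstream run concurrently

pci_mb_s = 4 * 33.33   # classic 32-bit PCI at ~33 MHz, shared by all devices
hub_vlink_mb_s = 266   # Intel hub architecture / VIA V-Link, as quoted above

print(per_direction_mb_s, total_mb_s)            # 400 800
print(round(total_mb_s / hub_vlink_mb_s, 1))     # 3.0
print(round(total_mb_s / pci_mb_s, 1))           # 6.0
```

The total is about three times the hub/V-Link figure and six times plain PCI. Note that the 800 MB/s only applies when traffic flows in both directions at once; a one-way stream tops out at 400 MB/s.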
APU – The Audio Processing Unit
NVIDIA admits it openly: the integrated sound subsystem of nForce is identical to what will make the noise in the Xbox. This however is certainly no bad news, but rather a recommendation. NVIDIA, with its fetish for giving its products names that sound like ‘CPU’, christened this sound unit ‘APU’.
This APU comes with features that blow away every sound card that’s currently on the market, which is even more impressive when you consider that the APU is integrated into nForce’s south bridge ‘MCP’ and not a discrete card. Let’s have a look at its specs:
- Hardware DirectX8 audio processor (the first of its kind)
- Up to 256 different stereo voices, of which 192 are 2D-voices and 64 are 3D-voices
- Dolby Digital 5.1 Encoder
- DLS2-acceleration (Downloadable Sounds Level 2)
- 32 bin mixer, with 8 voice volumes mixed to each bin (to get to 256)
NVIDIA classifies nForce’s APU as a ‘multi-processor audio rendering engine’, as you can see in the picture above. The APU renders completely to system memory, which allows the resulting stream to be sent to an AC97 codec or a USB-speaker system, in case an nForce user fancies that. The voice processor is a fixed function DSP (digital signal processor), while the global processor is a programmable DSP. Together with the setup engine, which sets up all the data and parameters and controls all the resources required by the two DSPs, this unit is the most powerful sound subsystem found in PCs today.
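The voice numbers in the spec list are consistent with each other: 192 2D-voices plus 64 3D-voices make 256, and so do 32 bins times 8 voices. The sketch below shows one possible reading of the bin mixer, in which each group of 8 voices is scaled by its volume and summed into one bin. That grouping is my interpretation, not something NVIDIA has specified, and the sample values are made up.

```python
BINS = 32
VOICES_PER_BIN = 8
VOICES_2D, VOICES_3D = 192, 64

def mix(voice_samples, volumes):
    """Sum each group of 8 consecutive voices, scaled by volume, into one bin."""
    if len(voice_samples) != BINS * VOICES_PER_BIN or len(volumes) != len(voice_samples):
        raise ValueError("expected one sample and one volume per voice")
    bins = [0.0] * BINS
    for i, (sample, volume) in enumerate(zip(voice_samples, volumes)):
        bins[i // VOICES_PER_BIN] += sample * volume
    return bins

# Made-up input: every voice plays a full-scale sample at half volume.
samples = [1.0] * (VOICES_2D + VOICES_3D)
volumes = [0.5] * (VOICES_2D + VOICES_3D)
out = mix(samples, volumes)

print(len(out), out[0])  # 32 4.0
```

In hardware this runs once per output sample for all 256 voices, which is why a dedicated fixed-function voice processor makes sense here.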
The integrated Dolby Digital 5.1 encoder allows the output of an AC3-stream to feed your home theater system.
It is obvious that this powerful sound processor benefits greatly from the HyperTransport interface and the twin bank memory controller and its single-step arbitration logic. The APU is rendering the sound streams directly to system memory, using DMA (direct memory access). This process is a real time event and requires high bandwidth as well as proper priority handling if you want to have good sound quality without any sudden stalls.
Here is a (NVIDIA-made!) comparison list of nForce’s sound features:
nForce’s StreamThru Architecture
Lately it has become much more common to use content or software on the PC that requires real time delivery. I remember when I did my research work in the eighties at the University of Heidelberg, where we required real time data sampling as well as real time data evaluation for probing the intracellular calcium of living mesangial cells. It was very difficult to realize this back then, and the results were limited. Today people are using digital content for video as well as audio, or they play multiplayer games, which all require real time data handling. Most of those tasks are done using DMA to avoid the impact of a possibly overly busy system processor, but the amounts of data that are handled today make it very likely that the memory access suddenly stalls due to other system events and you hear or see a stall in playback or in your multiplayer game.
Today’s system architectures make it very difficult for those applications to get their share of memory access at the right point in time. There are multiple reasons for that.
For one, the memory bandwidth has to be high enough to ensure ‘open slots’ in the first place. Then the bandwidth between the north and the south bridge of the chipset needs to be high enough to accommodate the transfers. Finally, the memory arbiter needs to know what to do with the memory access requests it receives.
nForce’s architecture is capable of handling those kinds of latency sensitive data streams and NVIDIA called this ‘feature’ StreamThru, which allows ‘isochronous’ data transport.
The word ‘isochronous’ consists of two words, ‘iso’ meaning ‘equal, regular, even’ and ‘chronous’, coming from ‘chronos’, which is Greek for ‘time’ (let no one say there are no benefits from a classical education in the computer era). ‘Isochronous data transport’ simply means that each memory access is treated fairly. NVIDIA translates ‘isochronous data transport’ as ‘memory access with a guaranteed latency and bandwidth’.
The ‘StreamThru’ architecture allows multiple virtual channels of those isochronous data streams and this is realized with the HyperTransport protocol as well as the single-step arbiter of nForce’s IGP. I personally don’t see that ‘StreamThru’ is a special feature on its own, but the logical product of nForce’s architecture. I wonder who at NVIDIA had the idea to ‘upgrade’ this technical fact to a feature that even deserves its own name. Nevertheless, it is a good thing and it is another item on the list that sets nForce apart from the rest of the chipsets.
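A guarantee of bandwidth only works if the sum of all guarantees fits on the link. The following sketch shows that bookkeeping as a simple admission check. It is a generic illustration, not NVIDIA’s implementation. The `IsochronousLink` class and the audio and video figures are hypothetical; the 400 MB/s capacity is the per-direction HyperTransport figure from above and 12.5 MB/s is 100 Mbit Ethernet.

```python
class IsochronousLink:
    """Toy bandwidth reservation for isochronous streams on one link."""

    def __init__(self, capacity_mb_s):
        self.capacity = capacity_mb_s
        self.reserved = {}  # stream name -> guaranteed MB/s

    def reserve(self, name, mb_s):
        """Admit the stream only if its guarantee still fits on the link."""
        if sum(self.reserved.values()) + mb_s > self.capacity:
            return False
        self.reserved[name] = mb_s
        return True

    def best_effort_headroom(self):
        """Bandwidth left over for traffic without a guarantee."""
        return self.capacity - sum(self.reserved.values())

link = IsochronousLink(capacity_mb_s=400)
ok_audio = link.reserve("audio", 2)          # hypothetical figure
ok_video = link.reserve("video_capture", 4)  # hypothetical figure
ok_lan = link.reserve("lan", 12.5)           # 100 Mbit/s Ethernet
ok_bulk = link.reserve("bulk_copy", 500)     # more than the link can carry

print(ok_audio, ok_video, ok_lan, ok_bulk)   # True True True False
print(link.best_effort_headroom())           # 381.5
```

The point of the example is the refusal at the end: a stream that cannot be guaranteed is rejected up front instead of being allowed to disturb the streams that were already promised their share.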
Unified Drivers – A Blessing In Disguise
Now we are coming to a feature that is less technical but more of an excellent convenience. We all know that NVIDIA is already delivering on its unified driver idea in the form of its graphics card drivers. All of NVIDIA’s graphics chips from RIVA TNT to GeForce3 are using the same driver. You don’t need to look for a special driver for your chip. This is something we are taking for granted, although neither 3dfx nor ATi has ever supplied anything as convenient as that.
nForce will now bring this convenience to another level:
A motherboard with nForce chipset will be installed with ONE driver! This is a real blessing, or don’t you hate to install a new operating system on your PC with the typical need to install a driver for the chipset-infs, then for the IDE-controller, then for the graphics card, then for the network card and finally for the sound card? nForce contains all the above components and while you still need at least two driver downloads for a simple Intel-motherboard without integrated graphics, network or sound, you need only one driver for an nForce board.
nForce Reference Motherboard
Here you can see NVIDIA’s reference nForce motherboard. NVIDIA chose the MicroATX format intentionally to prove that an nForce motherboard can be small even though it contains a 128-bit memory bus. The board is also manufactured with 4 layers only, which keeps its costs down. You can see that the board has only got 3 DIMM slots, so one DIMM is for one of the two 64-bit memory interfaces and the other two DIMMs are for the other one. This does not mean that one memory controller is only able to handle half the amount of memory as the other one. In fact, each of the three slots can be equipped with any size memory module. The crossbar memory controller of nForce does not require the two banks to be of equal size. NVIDIA told me that most board makers considered 3 DIMMs as plenty, but the design could just as well have 2 or 4 DIMM slots.
Here you can see the two memory interfaces running to the DIMM slots and also the 8-bit (8-line) HyperTransport interface running between IGP and MCP.
The first suppliers of motherboards with nForce chipset will be Taiwan’s top three motherboard makers Asus, MSI and Gigabyte, but also some smaller players, such as Abit and Mitac. Besides that we just learned that Fujitsu Siemens will also be an nForce launch partner. You see that NVIDIA managed to get the crème de la crème of motherboard makers behind it.
nForce Versions
I already said that there are a whole lot of different possible configurations of nForce platforms. Currently NVIDIA offers two different IGP and two different MCP chips. IGP-128 comes with all the goodness of the 128-bit twin-bank crossbar memory controller, while IGP-64 only features a 64-bit wide memory interface and obviously no crossbar configuration either. MCP-D is the south bridge with integrated Dolby Digital 5.1 encoder, while MCP without the ‘D’ does not have the Dolby Digital encoder included. Adding to this the four different types of memory that are supported by nForce opens up a huge amount of different configurations, from low-end to high-end:
Chipset | North bridge | South bridge | Supported memory |
nForce220 | IGP-64 | MCP | PC100 PC133 PC1600 PC2100 |
nForce220D | IGP-64 | MCP-D | PC100 PC133 PC1600 PC2100 |
nForce420 | IGP-128 | MCP | PC100 PC133 PC1600 PC2100 |
nForce420D | IGP-128 | MCP-D | PC100 PC133 PC1600 PC2100 |
Each of those configurations can either come with integrated 3D-graphics only or with an additional AGP-slot for an add-on graphics card. If you add all configurations up you will come to no less than 32 different possible nForce configurations, which cater to the value as well as the high-end motherboard or system.
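The count of 32 is just four chipset versions times four memory types times two graphics options. A short sketch that enumerates them, with list names of my own choosing:

```python
from itertools import product

chipsets = ["nForce220", "nForce220D", "nForce420", "nForce420D"]
memory_types = ["PC100", "PC133", "PC1600", "PC2100"]
graphics_options = ["integrated only", "integrated + AGP slot"]

# Every combination of chipset version, memory type and graphics option.
configs = list(product(chipsets, memory_types, graphics_options))

print(len(configs))  # 32
print(configs[0])    # ('nForce220', 'PC100', 'integrated only')
```

This treats all memory slots as holding the same memory type. Since the twin-bank controller of the 420 versions accepts different memory types per bank, the real number of possible systems is higher still.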
Political Issues
We’ve already posted the article ‘NVIDIA declares war on Intel’ and this should give you a good idea of what the release of nForce really means. NVIDIA has buddied up with AMD to supply the best system platform ever made. If you ask NVIDIA why there ain’t a Pentium III or Pentium 4 version of the chipset, they answer with a blunt ‘we don’t have an Intel front side bus license’. I doubt that NVIDIA is particularly unhappy with this situation. AMD is a partner that is much easier to handle than heavyweight Intel. However, nForce is now putting NVIDIA in a completely new and much more threatening league. Intel can only hope to come up with something competitive to nForce, because otherwise it will lose even more customers than it already has. nForce’s twin-bank DDR/SDR SDRAM memory controller is also threatening Rambus. So far Rambus tried to impress us with dual channel Rambus solutions and their high memory bandwidth. Now NVIDIA proved that dual-channel is just as feasible with DDR-SDRAM and the 4.2 GB/s of nForce’s memory bandwidth combined with the low-latency crossbar controller are now putting Rambus to shame. It’s funny how everything worked against Intel. Microsoft helps NVIDIA to become a platform manufacturer and NVIDIA helps AMD to become the CPU supplier for the fastest PC-platform available. The only win for Intel was when it got the Xbox contract. Other than that Intel stands empty handed now.
VIA, ALi and SiS will also be less than amused about NVIDIA’s bold move, although they should stand in awe of NVIDIA’s technical achievements. How long has VIA been trying to create something similar? How badly have they failed …? The same holds true for Intel. The i810 or i815 chipsets look simply pathetic next to nForce.
Then there is Creative Labs, who used to keep the sound card market tightly in its hands. nForce will be a threat to Creative as well. nForce’s APU is not only much more powerful than anything Creative ever designed, it’s also integrated into a chipset and comes virtually free. There won’t be any nForce-board owner buying a Soundblaster Live.
All in all NVIDIA has not only entered a completely new market today and found a bunch of brand new and highly interested customers, such as Fujitsu Siemens. It has also made a gang of rather unpleasant enemies. I don’t think, however, that Jen-Hsun Huang, NVIDIA’s CEO with the excellent birthday, will feel particularly scared. He is well experienced with fighting nasty enemies. Remember, don’t challenge Jen-Hsun! NVIDIA’s last archenemy finally had to sell its technology to NVIDIA to save its neck.
Summary
Well, well, what a product release! I bet most of you were just expecting the boring release of a lame chipset with integrated slow-motion graphics. You couldn’t have been more wrong.
NVIDIA’s nForce will change a lot in the PC industry. Microsoft handed NVIDIA tools and money and NVIDIA proved worthy, making the very best of it. nForce has all it takes to take the PC-platform market by storm. It provides the best memory system, the best chipset interconnect, the best integrated 3D-graphics, the best audio and a whole lot more features. It also supports the most cost effective processor. The only things that could slow down its success are compatibility issues of its south bridge and maybe manufacturing problems, which always occur when you don’t need them.
AMD had a great day today as well. nForce is the best that could happen to AMD by a long shot. Gone are the days when AMD-platforms were fighting with incompatibilities because the chipsets for AMD-processors came from manufacturers who couldn’t get their act together.
Let’s see how Intel, VIA and Creative Labs will react to nForce. I doubt that they will simply stand by and watch their business take damage. We already heard the rumor of VIA suing NVIDIA because of questionable patent issues. If you don’t know what else to do against your opponent … sue him! However, this idea didn’t help 3dfx either.
nForce is supposed to ship to OEMs by July/August and the first nForce motherboards are expected in Fall 2001.