


Article summary: Equipped with an exciting, brand-new design, Intel's new flagship is getting ready to step out into the open. Get ready for the most controversial x86 processor of all time. Find out whether you are the working-class kind of guy with Athlon ambitions or rather the stylish yuppie who needs a Pentium 4, even if only to be cool. Here it comes: the battle of style against power.

Intel's New Pentium 4 Processor

THG Editorial Staff, November 20, 2000
You are reading page 7 of 27

Hardware Prefetch

Intel has added another nifty feature that I want to bring to your attention in the L1/L2 cache context. If you think back to the Pentium III launch in February 1999, you might remember Intel's introduction of the 'Streaming' SIMD Extensions. The 'streaming' part of 'SSE' is actually represented by Pentium III's prefetch instructions, which enable software to load data into the caches before the processor core requests it.
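To make the idea of software prefetching concrete, here is a minimal sketch in C. It assumes a GCC or Clang compiler, whose `__builtin_prefetch` builtin is typically turned into one of the SSE prefetch instructions on x86 targets. The function name `sum_with_prefetch` and the look-ahead value `PREFETCH_DISTANCE` are made up for this illustration and are not taken from any Intel material.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical look-ahead distance, in elements. A sensible value
   depends on memory latency and on the work done per element. */
#define PREFETCH_DISTANCE 16

/* Sums an array while hinting the cache about data needed soon.
   __builtin_prefetch is a GCC/Clang builtin and only a hint:
   the result is the same with or without it. */
long sum_with_prefetch(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 3 = high temporal locality. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch never changes the computed sum; it only asks the processor to start loading an element a few iterations before the loop actually reads it.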

Those instructions still exist in Pentium 4's instruction set, but with Pentium 4's new hardware prefetch feature a lot of this is done automatically. The new unit is able to recognize the data access patterns of the software that Pentium 4 executes, so that it 'guesses' which data will be needed next and 'pre-fetches' it into the cache.

The procedure might sound familiar to you from the complex cache algorithms of hard drives, and you might also be aware of how much these can speed up hard disk accesses under certain circumstances. Pentium 4's hardware prefetch should be able to significantly accelerate software that works through large data arrays.

Entering The Execution Pipeline - Pentium 4's Trace Cache

[Figure: Intel Pentium 4 instruction decoder]

Our code has now passed the system bus and the L1 and L2 caches, so it's finally time to enter the execution path of Pentium 4. You will remember that Pentium 4 does not use an L1 instruction cache, but a much niftier thing instead. Let me first explain what is bad about an L1 instruction cache.

With Pentium III or Athlon, which both have an L1 instruction cache, code is fetched by this cache and stored until it is time to enter the execution path. It does so by entering the decoder unit, which in the case of Athlon, for example, consists of 3 'direct path' and 3 'vector path' decoders that produce the 'OPs' (as explained above) that the processor's execution units can execute. This arrangement has a few glitches. First of all, some x86 instructions are rather complex and take a long time to be decoded by the slow 'vector path' decoders. In the worst case, all decoder units are busy decoding complex instructions, which stalls the execution pipeline of the processor. Another problem is that x86 instructions that are executed repeatedly (e.g. in small loops) need to be decoded each time they enter the execution path, which wastes a lot of time. Software branches are another wasteful situation for a processor with an L1 instruction cache that starts its pipeline at the decoder level.

Pentium 4's fancy Execution Trace Cache does not suffer from the problems described above. Once you have understood it, the idea of the trace cache is actually rather simple, but it takes quite a bit more silicon and design skill to replace the good old L1 instruction cache with something like Pentium 4's trace cache. Basically, the 'Execution Trace Cache' is nothing but an L1 instruction cache that lies BEHIND the decoders. Obviously it's quite a bit more complex than that, but once you have understood this basic fact you start to realize the benefits of the trace cache.
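The benefit for small loops can be shown with a deliberately simplified model in C that just counts how often the decoder has to run. Everything in it is an assumption made for illustration: the real trace cache stores traces of decoded operations rather than one entry per instruction, and the names `decoded_cache`, `execute`, `run_loop` and the capacity `TRACE_SLOTS` are hypothetical and say nothing about Pentium 4's real organization or size.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TRACE_SLOTS 64  /* hypothetical capacity, not Pentium 4's real size */

typedef struct {
    bool          valid[TRACE_SLOTS];
    unsigned      tag[TRACE_SLOTS];  /* instruction address held in the slot */
    unsigned long decodes;           /* how often the decoder had to run */
} decoded_cache;

/* Executes one instruction at address pc.
   use_cache == false: the cache sits in front of the decoders, so the
   decoder runs on every execution.
   use_cache == true: the cache sits behind the decoders, so the decoder
   runs only when the decoded form is not already stored. */
void execute(decoded_cache *c, unsigned pc, bool use_cache)
{
    assert(c != NULL);
    unsigned slot = pc % TRACE_SLOTS;
    if (use_cache && c->valid[slot] && c->tag[slot] == pc) {
        return;                  /* decoded form reused, decoder stays idle */
    }
    c->decodes++;                /* decode the x86 instruction */
    c->valid[slot] = true;
    c->tag[slot] = pc;
}

/* Runs a loop of body_len instructions for the given number of
   iterations and returns the number of decode operations. */
unsigned long run_loop(unsigned body_len, unsigned iterations, bool use_cache)
{
    decoded_cache c;
    memset(&c, 0, sizeof c);
    for (unsigned it = 0; it < iterations; it++) {
        for (unsigned pc = 0; pc < body_len; pc++) {
            execute(&c, pc, use_cache);
        }
    }
    return c.decodes;
}
```

In this model, a 10-instruction loop that runs 100 times costs 1000 decode operations when every pass goes through the decoders, but only 10 when the decoded form is cached behind them.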