Processor Emporium (UK)
Home Intel AMD Cyrix Motherboards Technical Questions Links

Intel Pentium 4


Unlike recent iterations of the Pentium brand, ie, Pentium Pro, Pentium II and Pentium III, the Pentium 4 is not based upon the five year old P6 architecture, but is based upon a new design which Intel have branded the “NetBurst” architecture (and not P7). The NetBurst architecture boasts a number of new design features which we will now examine.

Hyper-Pipelined Architecture.

The first notable feature of the Pentium 4 is what Intel calls “Hyper-Pipelined Technology”. In practice, this means that the Pentium 4 has a 20-stage pipeline.

The main reason why Intel has decided to implement such a long pipeline for the Pentium 4 centres around the issue of increasing clock speed. As was seen with the recall of the 1.13 GHz Pentium III back in August, the five year old P6 architecture is rapidly reaching the physical limits of its design. It is known that increasing the clock speed of P6 above 1 GHz is not producing any real improvement in performance, but is causing the chip to increase its heat output. The answer to this problem is a whole new design with a longer pipeline: each stage does less work per clock cycle, so the processor can run at higher clock speeds.

The Pentium 4 doubles the pipeline length of the P6, and is already showing results, with the chip able to run at 1.5 GHz on a 0.18 micron process. This contrasts with the 1 GHz limit currently found on the Pentium III. It is known that Intel intends to release a 2 GHz Pentium 4 based upon the 0.18 micron process early next year, which indicates that the chip has the ability to scale to high clock speeds (by today’s standards). With process shrinkage (ie, from a 0.18 micron to a 0.13 micron CMOS process) the Pentium 4 has the ability to reach clock speeds in excess of 3 GHz, which is planned for this time next year.
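
The link between pipeline length and clock speed can be sketched with a simple textbook model: the cycle time is roughly the total logic delay divided by the number of stages, plus a fixed overhead per stage for the latches between them. The figures below (8 ns of logic, 0.2 ns of latch overhead) are hypothetical values chosen for illustration and are not Intel data; the function name is our own.

```python
def max_clock_ghz(total_logic_ns: float, stages: int, latch_overhead_ns: float) -> float:
    """Idealised upper bound on clock frequency (GHz) for a pipeline with `stages` stages."""
    if stages <= 0:
        raise ValueError("stages must be positive")
    # Each stage handles an equal slice of the logic, plus a fixed latch cost.
    cycle_time_ns = total_logic_ns / stages + latch_overhead_ns
    return 1.0 / cycle_time_ns


# Hypothetical figures, for illustration only (not measured from any real chip).
TOTAL_LOGIC_NS = 8.0
LATCH_OVERHEAD_NS = 0.2

ten_stage_ghz = max_clock_ghz(TOTAL_LOGIC_NS, 10, LATCH_OVERHEAD_NS)
twenty_stage_ghz = max_clock_ghz(TOTAL_LOGIC_NS, 20, LATCH_OVERHEAD_NS)

print(f"10 stages: {ten_stage_ghz:.2f} GHz")
print(f"20 stages: {twenty_stage_ghz:.2f} GHz")
```

Under these assumptions, doubling the number of stages raises the achievable clock speed considerably, but does not double it, because the latch overhead does not shrink with the stages.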

Whilst the 20-stage pipeline of the Pentium 4 brings the advantage of the ability to run at high clock-speeds, it does bring with it a number of drawbacks which Intel needed to address.

The first drawback is that it takes longer for an instruction to be processed by the Pentium 4, as it has to pass through double the number of stages it would have done on a P6 based chip. This essentially doubles the number of clock cycles needed to process an instruction in comparison to a Pentium III. The first way Intel has had to address this problem is by running the chip at a higher clock speed (1.5 GHz to get the performance of a 1 GHz Pentium III). The second way is to double-pump the ALU (Arithmetic Logic Unit), that is, to run it at twice the core clock speed.
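
The arithmetic behind this trade-off is easy to check. As a minimal sketch, assuming a 10-stage pipeline for the P6 and treating latency as simply stages divided by clock speed (which ignores stalls, caches and everything else that happens in a real chip):

```python
def instruction_latency_ns(stages: int, clock_ghz: float) -> float:
    """Time for one instruction to pass through every pipeline stage, in nanoseconds."""
    # One stage per clock cycle; one cycle lasts 1 / clock_ghz nanoseconds.
    return stages / clock_ghz


pentium3_ns = instruction_latency_ns(stages=10, clock_ghz=1.0)
pentium4_ns = instruction_latency_ns(stages=20, clock_ghz=1.5)

print(f"Pentium III at 1.0 GHz: {pentium3_ns:.1f} ns")
print(f"Pentium 4 at 1.5 GHz:   {pentium4_ns:.1f} ns")
```

On this simplified view a 1.5 GHz clock claws back some, but not all, of the extra latency of the longer pipeline, which is why raising clock speed is only the first of the two measures.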

The second drawback of a 20-stage pipeline is that it increases the performance penalty of a branch prediction “miss”. As with all modern x86 processors from P6 onwards (the AMD K5 also featured these design traits), the Pentium 4 features speculative execution and out-of-order execution of instructions, which means that instructions can be executed in parallel (via different pipelines) or can be loaded and executed before they are needed. To achieve this, modern x86 processors need a Branch Prediction Unit (BPU) to choose the right branch of instructions for execution based upon recent instruction history.
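
To give a feel for what predicting from “recent instruction history” means, here is a minimal sketch of a two-bit saturating counter, a classic textbook prediction scheme. It is not a description of the Pentium 4's actual BPU, whose internals are far more elaborate; the class and function names are our own.

```python
class TwoBitPredictor:
    """Textbook two-bit saturating counter for a single branch."""

    def __init__(self) -> None:
        # States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
        # Starting at 2 ("weakly taken") is an arbitrary choice for this sketch.
        self.state = 2

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes: list[bool]) -> int:
    """Replay a history of branch outcomes and count wrong predictions."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A loop branch that is taken nine times and then falls through, run three times.
loop_history = ([True] * 9 + [False]) * 3
loop_misses = count_mispredictions(loop_history)
print(f"{loop_misses} mispredictions out of {len(loop_history)} branches")
```

Because the counter needs two wrong guesses in a row before it changes its mind, the single loop exit does not disturb the prediction for the next run of the loop.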

In the event of the BPU mis-predicting an instruction branch, the CPU must discard the instructions it has speculatively loaded into the pipeline and refill the pipeline from the correct branch. As the Pentium 4 has a 20-stage pipeline, this means that a mis-predicted branch can stall a pipeline for up to 20 clock cycles, compared to a stall of up to 10 clock cycles for P6 based chips.
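
The cost of this longer stall can be estimated with a standard back-of-the-envelope formula: the average cycles per instruction (CPI) is the ideal CPI plus the fraction of instructions that are branches, times the mis-prediction rate, times the penalty in cycles. The branch fraction and mis-prediction rate below are hypothetical figures chosen for illustration, and the penalties are the worst-case stalls quoted above.

```python
def effective_cpi(base_cpi: float, branch_fraction: float,
                  mispredict_rate: float, penalty_cycles: int) -> float:
    """Average cycles per instruction once mis-prediction stalls are included."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical workload: 1 in 5 instructions is a branch, 1 in 10 branches is mis-predicted.
BASE_CPI = 1.0
BRANCH_FRACTION = 0.2
MISPREDICT_RATE = 0.1

p6_cpi = effective_cpi(BASE_CPI, BRANCH_FRACTION, MISPREDICT_RATE, penalty_cycles=10)
p4_cpi = effective_cpi(BASE_CPI, BRANCH_FRACTION, MISPREDICT_RATE, penalty_cycles=20)

print(f"10-cycle penalty: {p6_cpi:.2f} cycles per instruction")
print(f"20-cycle penalty: {p4_cpi:.2f} cycles per instruction")
```

With the same workload and the same prediction accuracy, doubling the penalty doubles the cycles lost to mis-predictions, which is why a longer pipeline puts a premium on a more accurate BPU.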


Advanced Dynamic Execution.


© Copyright, Anthony Barrett 1999/2000.