Megahertz myth

The megahertz myth, or less commonly the gigahertz myth, is the misconception that clock rate alone (measured, for example, in megahertz or gigahertz) is enough to compare the performance of different microprocessors. While clock rate is a valid way of comparing different speed grades of the same model and type of processor, other factors such as pipeline depth and instruction set can greatly affect performance when different processors are compared. For example, one processor may take two clock cycles to add two numbers and a third clock cycle to multiply the result by a third number, whereas another processor may do the same calculation in two clock cycles. Comparisons between different types of processors are difficult because performance varies with the type of task. A benchmark is a more thorough way of measuring and comparing computer performance.

The myth dates from around 1984, when the Apple II was being compared with the IBM PC. The argument was that the PC was five times faster than the Apple II, because its Intel 8088 processor had a clock speed roughly five times that of the MOS Technology 6502 used in the Apple. However, what really matters is not how finely divided a machine's instructions are, but how long it takes to complete a given task. Consider the LDA # (Load Accumulator Immediate) instruction. On a 6502 that instruction requires two clock cycles, or 2 μs at 1 MHz. Although the 4.77 MHz 8088's clock cycles are shorter, its equivalent of LDA # needs 4 of them, so it takes 4 / 4.77 MHz = 0.84 μs. That instruction therefore runs only about 2.4 times as fast on the original IBM PC as on the Apple II.
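The arithmetic in this comparison can be checked with a short sketch. The function name instruction_time_us is purely illustrative, and the Apple II clock is rounded to 1 MHz as in the text above.

```python
def instruction_time_us(cycles: int, clock_mhz: float) -> float:
    """Time for one instruction in microseconds: cycles divided by clock rate in MHz."""
    return cycles / clock_mhz


# Figures taken from the paragraph above (Apple II clock rounded to 1 MHz).
apple_ii_us = instruction_time_us(cycles=2, clock_mhz=1.0)   # 6502, LDA #
ibm_pc_us = instruction_time_us(cycles=4, clock_mhz=4.77)    # 8088, equivalent load

speedup = apple_ii_us / ibm_pc_us   # how much faster the PC executes this instruction
clock_ratio = 4.77 / 1.0            # what the clock rates alone would suggest

print(f"Apple II: {apple_ii_us:.2f} us, IBM PC: {ibm_pc_us:.2f} us")
print(f"Measured speedup: {speedup:.3f}x, clock ratio: {clock_ratio:.2f}x")
```

The speedup for this one instruction comes out at roughly half of what the clock ratio suggests, which is the gap the myth overlooks.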

History

Background

The x86 CISC-based CPU architecture, which Intel introduced in 1978, was used as the standard for the DOS-based IBM PC, and developments of it continue to dominate the Microsoft Windows market. An IBM RISC-based architecture was used for the PowerPC CPU, which was released in 1992. In 1994 Apple Computer introduced Macintosh computers using these PowerPC CPUs. Initially this architecture met hopes for performance, and different ranges of PowerPC CPUs were developed, often delivering different performance at the same clock rate. Similarly, at this time the Intel 80486 was selling alongside the Pentium, which delivered almost twice the performance of the 80486 at the same clock rate.[1]

Rise of the myth

The myth arose because the clock rate was commonly taken as a simple measure of processor performance, and was promoted in advertising and by enthusiasts without taking into account other factors. The term came into use in the context of comparing PowerPC-based Apple Macintosh computers with Intel-based PCs. Marketing based on the myth led to the clock rate being given higher priority than actual performance, and led to AMD introducing model numbers giving a notional clock rate based on comparative performance to overcome a perceived deficiency in their actual clock rate.[2]

Modern adaptations of the myth

With the advent of multi-core and multi-threaded processing, the myth has given rise to further misconceptions about measuring the performance of multi-core processors. Many people believe that a quad-core processor running at 3 GHz delivers an overall performance equivalent to 12 GHz of CPU. Others say that the overall performance is in fact 3 GHz, with each core running at 750 MHz. Both ideas are incorrect. Often the user making these comparisons is also comparing multiple brands of CPU, which do not do the same amount of work per cycle in any case. While micro-architecture traits such as pipeline depth play the same role in performance as before, parallel processing brings another factor into the picture: software efficiency.

It is true that a poorly written program will run poorly even on a single-core[3] system, but even a well-written program that was designed in a linear (sequential) fashion will often, if not always, perform better on a single-core system than on a multi-core system whose cores each run at a proportionally lower clock rate.

Take the following instructions, for example:

# Instruction
1 x := (x + 1);
2 goto 1;

A program such as this will actually run faster on a single-core chip with a 4 GHz clock rate than on a dual-core chip that clocks at 2 GHz. This is because the statement x := (x + 1) depends on the previous value of x, so each repetition must wait for the one before it and the work cannot be divided between cores. Every time the instruction repeats (indefinitely, in this case), the new value of x is in effect derived by the same core as the previous value, limiting the process to one core. On a single-core system this has no performance effect, as the entire 4 GHz peak performance is delivered by one lone core, but on the dual-core system the code is confined to one core running at only 2 GHz, cutting the program's speed in half (or worse).

However, if the code is altered:

# Instruction
1 x := (x + 1);
2 y := (y + 1);
3 goto 1;

The values x and y are independent of each other and can therefore be processed on separate cores at the same time, rather than waiting in a queue for one core. (Setting up and writing a multi-threaded program is much more complex than depicted here; these examples have been simplified for readability.) Programs written to take advantage of multi-threading, such as this one, can approach the peak throughput of (clock rate × number of cores). However, because the cores share resources (each core must fetch instructions from a common place), 100% of that peak is never reached. Thus even a well-written multi-threaded program will still run better, however slightly, on a single-core system of the same architecture and twice the advertised clock speed.
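The two cases can be compared with a deliberately idealized sketch. It assumes that every core completes one operation per clock cycle, that a dependency chain (such as the repeated x := x + 1) must run on a single core, and that there is no shared-resource overhead at all. The function run_time_s and the constant OPS are hypothetical names introduced only for this illustration.

```python
import math


def run_time_s(ops_per_chain: int, chains: int, cores: int, clock_hz: float) -> float:
    """Idealized run time in seconds.

    Assumes one operation per core per clock cycle and that each
    dependency chain runs from start to finish on a single core.
    """
    # Chains are spread over the cores; the busiest core sets the finish time.
    chains_on_busiest_core = math.ceil(chains / cores)
    return chains_on_busiest_core * ops_per_chain / clock_hz


OPS = 4_000_000_000  # arbitrary amount of work per chain

# One dependent chain (x only): the second core of the dual-core chip sits idle.
one_chain_single_4ghz = run_time_s(OPS, chains=1, cores=1, clock_hz=4e9)
one_chain_dual_2ghz = run_time_s(OPS, chains=1, cores=2, clock_hz=2e9)

# Two independent chains (x and y): both cores of the dual-core chip are busy.
two_chains_single_4ghz = run_time_s(OPS, chains=2, cores=1, clock_hz=4e9)
two_chains_dual_2ghz = run_time_s(OPS, chains=2, cores=2, clock_hz=2e9)

print(one_chain_single_4ghz, one_chain_dual_2ghz)
print(two_chains_single_4ghz, two_chains_dual_2ghz)
```

In this model the single dependent chain takes twice as long on the dual-core chip, while the two independent chains finish in the same time on both systems. That tie is the best case for the dual-core chip: any real overhead from shared resources tips the result slightly toward the single faster core, as the text describes.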

A system's overall performance cannot be judged simply by comparing the number of processor cores and clock rates; the software running on the system is also a major factor in observed speed. The myth of the importance of clock rate has confused many people as to how they judge the speed of a computer system.

Challenges to the myth

Comparisons between PowerPC and Pentium had become a staple of Apple presentations. At the New York Macworld Expo Keynote on July 18, 2001, Steve Jobs described an 867 MHz G4 as completing a task in 45 seconds while a 1.7 GHz Pentium 4 took 82 seconds for the same task, saying that "the name that we've given it is the megahertz myth".[4] He then introduced senior hardware VP Jon Rubinstein who gave a tutorial describing how shorter pipelines gave better performance at half the clock rate. The online cartoon Joy of Tech subsequently presented a series of cartoons inspired by Rubinstein's tutorial.[5]
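As a rough check on the figures quoted from the keynote, the sketch below counts how many clock cycles each processor spent on the demonstration task. The function name cycles_used is illustrative, and the result applies only to that one task, which was chosen by Apple; it is not a general measure of either processor.

```python
def cycles_used(clock_hz: float, seconds: float) -> float:
    """Total clock cycles elapsed while a task runs."""
    return clock_hz * seconds


g4_cycles = cycles_used(867e6, 45)   # 867 MHz G4, 45 seconds
p4_cycles = cycles_used(1.7e9, 82)   # 1.7 GHz Pentium 4, 82 seconds

time_ratio = 82 / 45                  # how much longer the Pentium 4 took
cycle_ratio = p4_cycles / g4_cycles   # how many more cycles it needed

print(f"Pentium 4 took {time_ratio:.2f}x the time and {cycle_ratio:.2f}x the cycles")
```

On these numbers the Pentium 4 needed more than three times as many clock cycles as the G4 to finish the same task, despite having roughly twice the clock rate.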

Intel reaches its own speed limit

From approximately 1995 to 2005, Intel advertised its mainstream Pentium processors primarily on the basis of clock speed, in comparison with competing products such as those from AMD. Press articles had predicted that computer processors might eventually run as fast as 10 to 20 gigahertz within the next several decades.

This continued up until about 2005, when the Pentium Extreme Edition was reaching thermal dissipation limits running at speeds of nearly 4 gigahertz. The processor could go no faster without requiring complex changes to the cooling design, such as microfluidic cooling channels embedded within the chip itself to remove heat rapidly.

This was followed by the introduction of the Core 2 desktop processor in 2006, which was a major change from previous Intel desktop processors, allowing nearly a 50% decrease in processor clock while retaining the same performance.

Core 2 had its beginnings in the Pentium M mobile processor, where energy efficiency was more important than raw power, and initially offered power-saving options not available in the Pentium 4 and Pentium D.

The speed limit gets raised?

In the years after the demise of the NetBurst microarchitecture and its 3+ GHz CPUs, microprocessor clock speeds kept slowly increasing after initially dropping by about 1 GHz. Several years' advances in manufacturing processes and power management (specifically, the ability to set clock speeds on a per-core basis) allowed for clock speeds as high as or higher than those of the old NetBurst Pentium 4s and Pentium Ds, but with much higher efficiency and performance. As of 2013 to 2014, Intel microprocessors clock as high as 4.1 GHz (Xeon E3-1290 v2) and 4.4 GHz (Core i7-4790K). AMD crossed the 4 GHz barrier in 2011 with the debut of the initial Bulldozer-based AMD FX CPUs, and in June 2013 released the FX-9590, which can reach speeds of up to 5.0 GHz. However, similar issues with power usage and heat output have returned.

References

  1. Lua error in package.lua at line 80: module 'strict' not found.
  2. Lua error in package.lua at line 80: module 'strict' not found.
  3. single-core
  4. Lua error in package.lua at line 80: module 'strict' not found.
  5. Lua error in package.lua at line 80: module 'strict' not found.
