Tech Primer
Multi-Core Processors
A multi-core microprocessor combines multiple independent processor cores on a
single package. It exploits thread-level parallelism (TLP) without requiring
multiple microprocessors on separate packages. Each core usually has its own
cache (though in some designs the cores share a cache), which gives the operating
system enough resources to run most time-consuming applications in parallel.
Multi-core variants are now available from several vendors.
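As a rough illustration of thread-level parallelism from the software side, the sketch below splits one job into independent chunks and hands each chunk to a worker thread. The function names and the sum-of-squares workload are made up for illustration. Note that in standard CPython the global interpreter lock keeps CPU-bound threads from running truly side by side, so this shows only how the work is divided, not a guaranteed speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Sum of squares over the half-open range [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=None):
    """Split range(n) into one chunk per worker and combine the partial sums."""
    if n <= 0:
        return 0
    # os.cpu_count() reports logical processors (cores x hardware threads).
    workers = workers or os.cpu_count() or 1
    step = -(-n // workers)  # ceiling division so every index is covered
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1000, workers=4))
```

Each chunk is independent of the others, which is the property an operating system needs in order to schedule the pieces on separate cores.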
Sun Microsystems
Sun's multi-core processor, the UltraSPARC T1, offers up to eight processing
cores with four threads per core, for up to 32 simultaneous threads in one
low-power, low-heat processor. Typical processor power consumption is 72 watts.
All the cores are connected to memory and the I/O subsystem through a 134 GBps
crossbar switch, which enables fast communication between cores and memory. The
high-bandwidth, shared, four-bank 3 MB L2 cache is sized for multi-threaded
workloads. Four on-chip DDR2 channels deliver 25.6 GBps of processor-to-memory
bandwidth.
IBM
POWER5, IBM's bet for the multi-core development line, allows higher levels
of symmetric multiprocessing, up to 64 real processors. Processor performance
is increased by making each processor core two-way threaded using simultaneous
multithreading (SMT). POWER5 systems can operate in either single-threaded or
simultaneous multithreaded mode. To conserve power, the POWER5 design introduces
a capability that permits power savings without affecting performance.
Each chip carries two processor cores, and each core supports two logical
threads, so to the operating system the chip appears as a four-way symmetric
multiprocessor. A 1.875 MB L2 cache is shared by the two processor cores. The
L2 is split into three partitions, or slices, each of which is ten-way
set-associative, with 512 congruence classes of 128-byte lines. The processor
is manufactured using a 130 nm process. The first-level instruction cache is
64 KB and the first-level data cache is 32 KB.
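The L2 geometry quoted above can be cross-checked against the stated 1.875 MB capacity. The short sketch below multiplies the figures from the text; the constant and function names are only illustrative.

```python
# POWER5 L2 geometry as quoted in the text.
SLICES = 3                 # partitions of the L2
WAYS = 10                  # ten-way set-associative
CONGRUENCE_CLASSES = 512   # sets per slice
LINE_BYTES = 128           # bytes per cache line


def l2_capacity_bytes(slices=SLICES, ways=WAYS,
                      sets=CONGRUENCE_CLASSES, line=LINE_BYTES):
    """Total capacity = slices x ways x sets x line size."""
    return slices * ways * sets * line


capacity = l2_capacity_bytes()
print(capacity)                  # 1966080 bytes
print(capacity / 2**20)          # 1.875 (MB)
```

Each slice therefore holds 640 KB, and the three slices together give the 1.875 MB figure.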
POWER5 systems are able to satisfy L2 misses more frequently with hits in
the 36 MB off-chip L3, because the L3 cache has been moved from the memory side
of the fabric to the processor side. The L3 operates as a victim cache for the
L2, with separate 16-byte-wide buses for reads and writes that operate at half
processor speed. Data is staged to the L3 only when it is replaced from the
L2. Similarly, references to data in the L3 cause that cache line to be reloaded
into the L2. Only modified data that is replaced from the L3 is written back,
or cast out, to memory. Unmodified data that is replaced in the L3 is discarded.
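The victim-cache behaviour described above can be sketched as a toy model. This is not IBM's implementation. The class name, the tiny capacities, the LRU replacement policy and the choice to remove a line from the L3 when it is reloaded into the L2 are all assumptions made to keep the example small.

```python
from collections import OrderedDict


class VictimHierarchy:
    """Toy L2 + victim L3; capacities are in cache lines, LRU assumed."""

    def __init__(self, l2_lines, l3_lines):
        self.l2 = OrderedDict()   # line address -> dirty flag, oldest first
        self.l3 = OrderedDict()
        self.l2_lines = l2_lines
        self.l3_lines = l3_lines
        self.memory_writes = []   # lines cast out to memory

    def access(self, line, write=False):
        """Touch a line and return where it was found."""
        if line in self.l2:
            dirty, source = self.l2.pop(line), "L2"
        elif line in self.l3:
            # L3 hit: the line is reloaded into the L2.
            dirty, source = self.l3.pop(line), "L3"
        else:
            dirty, source = False, "memory"
        self._fill_l2(line, dirty or write)
        return source

    def _fill_l2(self, line, dirty):
        self.l2[line] = dirty
        if len(self.l2) > self.l2_lines:
            # Data reaches the L3 only when it is replaced from the L2.
            victim, victim_dirty = self.l2.popitem(last=False)
            self._fill_l3(victim, victim_dirty)

    def _fill_l3(self, line, dirty):
        self.l3[line] = dirty
        if len(self.l3) > self.l3_lines:
            victim, victim_dirty = self.l3.popitem(last=False)
            if victim_dirty:
                self.memory_writes.append(victim)  # modified: cast out
            # Unmodified victims are simply discarded.


if __name__ == "__main__":
    h = VictimHierarchy(l2_lines=2, l3_lines=2)
    h.access(0, write=True)
    h.access(1)
    h.access(2)              # line 0 is replaced from L2 and staged to L3
    print(h.access(0))       # L3
```

With a two-line L2 and a two-line L3, a modified line is written to memory only once it has been pushed out of both levels, while clean lines disappear without any memory traffic.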
Intel
First up are the Intel Core2 Extreme quad-core processor QX6700 and the new
Quad-Core Intel Xeon 5300 processor for servers. Slated for introduction in
late 2006, these 65 nm quad-core processors feature four complete execution
cores within a single processor and are based on the Intel Core microarchitecture.
The Intel Core2 Extreme quad-core processor provides four hardware threads, with
a 2.66 GHz core speed and a 1066 MHz front side bus (FSB). The Intel Xeon 5300
processor, which is targeted at the server market, will feature core speeds from
1.66 GHz to 2.66 GHz, bus speeds of 1066 to 1333 MHz and a 105 watt thermal
design point (TDP). It also includes Intel Virtualization Technology
for virtualization software. In addition, technologies such as fully buffered
DIMM and Intel I/O acceleration are included in the Xeon 5300.
The Dual-Core Intel Itanium 2 processor 9000 series provides two complete 64-bit
processing cores on one processor, providing double the performance of a 32-bit
processor. EPIC (Explicitly Parallel Instruction Computing) technology is the
cornerstone of the Intel Itanium architecture. It provides a variety of advanced
implementations of parallelism, predication, and speculation, resulting in better
Instruction-Level Parallelism (ILP) to help address the current and future requirements
of high-end enterprise and technical workloads.
The L3 cache ranges from 6 MB (single core) to 24 MB (dual core) across the
different Itanium 2 models, with clock speeds at around 1.6 GHz almost throughout.
The front side bus speed is 400/533 MHz and power consumption stands at 104 W.
AMD
AMD introduced the first multi-core technology for x86-based servers and workstations
with the Dual-Core AMD Opteron processor launch in April 2005. Quad-Core AMD
Opteron processors, planned for 2007, represent the next milestone in AMD's
multi-core roadmap. The currently available Opteron is a dual-core processor
with DDR2 that is designed to offer upgradeability to Quad-Core AMD Opteron
processors. The multi-core processors from AMD are bundled with technologies
such as AMD64 Technology, Direct Connect Architecture, AMD Virtualization and
Enhanced Performance-per-Watt. DDR2 platforms can upgrade to Quad-Core AMD
Opteron processors, when they become available in 2007, within existing thermal
bands for better performance-per-watt.
Direct Connect Architecture connects the processors, integrated memory controller,
and I/O directly to the CPU and communicates at CPU speed. HyperTransport technology
provides a scalable bandwidth interconnect between processors, I/O subsystems,
and other chipsets, with up to three coherent HyperTransport technology links
providing up to 24.0 GBps peak bandwidth per processor. The integrated on-die
DDR2 DRAM memory controller offers available memory bandwidth of up to
10.7 GBps (with DDR2-667) per processor.
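The bandwidth figures above follow from simple arithmetic, sketched below. The 10.7 GBps memory figure is reproduced assuming two 64-bit (8-byte) DDR2 channels at 667 million transfers per second. The channel count is an assumption, not something stated in the text. The per-link HyperTransport number is derived only by dividing the quoted 24.0 GBps by three links.

```python
def memory_bandwidth_gbps(mega_transfers_per_sec, bytes_per_transfer, channels):
    """Peak bandwidth in GBps (10^9 bytes per second)."""
    return mega_transfers_per_sec * bytes_per_transfer * channels / 1000


# DDR2-667: 667 MT/s; 8-byte channel width; two channels (assumed).
ddr2_667 = memory_bandwidth_gbps(667, 8, 2)
print(round(ddr2_667, 1))          # 10.7

# HyperTransport: 24.0 GBps quoted for three coherent links.
HT_TOTAL_GBPS = 24.0
HT_LINKS = 3
print(HT_TOTAL_GBPS / HT_LINKS)    # 8.0 GBps per link
```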
Varun Aggarwal
For more information see:
www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_14286,00.html
www.intel.com/products/processor/itanium2
www-03.ibm.com/chips/power
www.sun.com/processors/UltraSPARC-T1