Huawei API v4.0

Tuesday, March 2, 2010

Intel Atom CPU Review

Interesting and Surprising Choices

Atom CPUs are surprising for more than one reason: they have modern functions (EM64T, SSSE3, etc.) grafted onto an older architecture. The Atom is the first in-order x86 from Intel since the Pentium. Power management and fabrication costs are the two imperatives Intel seems to have been guided by, at the expense (with no attempt made to hide it) of performance. So, no, don’t expect a competitor to Core 2 Duo. But what does the Atom really have to offer? We’ll see that as this piece unfolds.

Intel and Declining Power Consumption

Power consumption and integrating a processor into portable or embedded devices have always caused problems for Intel, and this is not the first time the company has offered processors aimed at those uses. But the Atom is radically different in that it has a new architecture specially created to reduce power use.

A Short History: Before the Pentium-M

As far back as the 80386, Intel offered versions intended for low power and especially mobile use. The 80386EX, for example, had a chipset built into the CPU and consumed significantly less power than standard 386s. And low-power versions of the 486, the Pentium, and the Pentium II (the Dixon, with its 256 kB of built-in cache) were also offered. And yet in every case they essentially used a very similar (if not identical) architecture to the one used in the desktop version of the processor. In practice, these processors were efficient, but the difference between a standard version and a version for portable PCs remained slight.

The Pentium-M

Released in 2003, the Pentium-M was revolutionary in that it used a different architecture from that of the Pentium 4 and consumed much less power, while maintaining high performance. Yet it was still a derivative of the Pentium III, with the same faults, and the successive improvements to the Pentium-M (leading up to the Core 2 processors) have only increased power consumption. Intel has tried to come out with low-power processors (the A1x0, for example), but essentially they were slowed-down versions of the Pentium-M.

Atom Changes All of That
Atom is a different architecture in the sense that it was designed to reduce power consumption and that the processor uses a totally new design. It isn’t an adaptation of an earlier architecture. Concretely, Intel is now able to offer processors that consume very little power – the high-end Atoms consume less power than the (generally very slow) ULV versions of the standard architectures.

Atom Z500 and SCH (Poulsbo)
The first generation of Atoms is the Z5x0, previously known by the code name Silverthorne. The Atom Z500s are dedicated to MIDs (the famous Mobile Internet Devices) and are coupled to a new chipset, the Poulsbo SCH (System Controller Hub).

Atom Z500: Competition for ARM CPUs?

With the MID orientation, it’s clear to see who Intel’s target is – ARM processors. This very popular architecture (it’s used in the great majority of telephones, PDAs, and GPS devices) is offered by many manufacturers (ARM licenses its instruction set) and offers good performance while keeping power consumption low. In the mobility arena, except for a few rare devices using MIPS architecture (the PSP, for example), ARM processors are in the majority. Intel, incidentally, once produced ARM CPUs for consumer applications (the XScale, since sold to Marvell) and still has a line of products used, for example in RAID cards (the IOP333, for example). In practice, moving from an ARM architecture to x86 poses no real problems – Linux is obviously compatible, as are Windows CE (used in many GPSs) and the Windows Mobile OS layer (at least in the older versions). In addition, the x86 can also make use of the latest Windows versions and so benefit from broader software (and technical) support than with ARM CPUs.
The Z500 Processors

Before we analyze the architecture of the Atom, let’s look at the Z500 series. These processors are very small, delivered in a package only 13 x 14 mm. The processors are made up of approximately 47 million transistors (more than the original Pentium 4) and have 56 kB of Level-1 cache (24 kB for data and 32 kB for instructions) and 512 kB of Level-2 cache. They operate on a standard Intel bus, the same one used since the Pentium 4. The frequency of the bus is 400 MHz (QDR) or 533 MHz (QDR). There is also support for SIMD instructions, from MMX to SSSE3, EIST, and HyperThreading (making its comeback). Note that the latter technology is available only on certain models (with the 533 MHz QDR bus).

Chipset for the Atom

The SCH (System Controller Hub) is a chipset that includes the Northbridge and Southbridge in the same chip. Dedicated to Atom processors, it is the only one compatible with certain functions such as using the bus in CMOS mode (we’ll talk about that later). The SCH is complete – it includes a GMA graphics circuit (based on a PowerVR architecture), an HD Audio circuit (simplified, capable of operating only in two channels), a P-ATA controller (Ultra DMA 5, 100 MB/s), and supports two PCI-Express lanes (for a Wi-Fi card, for example). There are also three SDIO/MMC controllers and support for 8 USB ports (with the possibility of using one in client mode). The choice of P-ATA is logical: The controllers used in flash memory are often in this format (used by Compact Flash cards). Three SD controllers might seem strange, but certain types of memory use that connectivity (OneNAND, for example). Also, the DDR2 controller of the SCH supports memory with a voltage of 1.5 V (as opposed to 1.8 V for the JEDEC specifications). This little detail is a way of further reducing power consumption.

The Graphics End of Poulsbo

On the graphics side there’s a new GMA, the GMA 500. It uses a unified architecture and supports 3.0+ Shaders. An interesting point is that it has hardware support for decoding of the H.264, MPEG2, MPEG4, VC1, and WMV9 formats. The frequency of the GMA 500 is 200 MHz or 100 MHz, depending on the chipset version, and it’s DirectX 10 compatible (not really useful, but worth mentioning), even though the drivers only support DirectX 9. Note that the graphics end is not of Intel origin, but uses a PowerVR technology, unlike other GMA models.

An Interesting TDP

The Atom Z500 has a TDP that varies between 0.85 W (for the 800 MHz version without HyperThreading) and 2.64 W (for the 1.86 GHz model with HyperThreading enabled). The SCH consumes approximately 2.3 W in its most evolved version, which brings the SCH + CPU together to under 5 W. By comparison with existing solutions, that’s obviously a big step forward – the Via Nano, for example, is announced at 25 W for the 1.8 GHz version and a Celeron-M ULV at 5 W at 900 MHz.
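As a quick check of the "under 5 W" claim, the TDP figures quoted above can simply be added up. This is a minimal sketch: the dictionary labels and the function name are illustrative, and TDP is a thermal design figure rather than a measured draw.

```python
# TDP figures quoted in the text, in watts.
CPU_TDP_W = {
    "800 MHz, no HyperThreading": 0.85,
    "1.86 GHz, HyperThreading": 2.64,
}
SCH_TDP_W = 2.3  # most complete SCH version, per the text


def platform_tdp(cpu_w, chipset_w=SCH_TDP_W):
    """Sum of CPU and chipset TDP, rounded to two decimals."""
    return round(cpu_w + chipset_w, 2)


totals = {name: platform_tdp(watts) for name, watts in CPU_TDP_W.items()}
for name, total in totals.items():
    print(f"{name}: {total} W for CPU + SCH")
```

Even the fastest model stays just below the 5 W mark once the chipset is included.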

Atom N200 and i945

For Atoms intended for standard PCs, Intel will offer another line of processors (Diamondville). The Atoms of the N200 and 200 series are meant for standard PCs, but more specifically low-cost portable PCs, like the Eee PC and its competitors.

Atom N200 and 200: The Price Is Attractive

The Atom N200s are similar to the Atom Z500, with the only differences being in the management of EM64T (64 bits), present in the N200 and 200 models, and the absence of EIST. The Atom 200s, then, don’t change frequency on the fly. The prices are attractive: An Atom N270, with a frequency of 1.6 GHz (533 MHz bus) and a 2-W TDP costs barely $44. And the 230 version, with a 4-W TDP, costs a mere $29 (at the same frequency).

A Veteran Chipset: The i945

The main problem with the Atom N200 stems from the chipset: Intel offers only variants of the i945. This chipset, already “old” (it dates from 2005), has a major fault: It consumes a lot of power (22 W in the GC version). The i945 chipset supports modern technologies: SATA (2), PCI-Express (1 lane via the ICH7), HD Audio, etc. Obviously it can handle DDR2 memory (on two channels) and includes an IGP, the GMA 950. Still, it’s obvious that using an older chipset (from the Napa platform) with a TDP that’s ten times higher than the processor’s is not the best idea in the world. But it’ll have to do until something better comes along. Portable PCs use the i945GSE, which uses only 5.5 W (4 W for the Northbridge and 1.5 W for the Southbridge). Obviously, the performance is not the same – in 3D, essentially, where Intel has reduced the GMA’s frequency (from 400 to 133 MHz).

The GMA 950

Now let’s take a look at the GMA 950, the IGP used by Intel in the i945 chipset. Compatible with DirectX 9 and capable of running Aero, it is very common in portable PCs equipped with a Core Duo processor. Its performance is weak and it’s incapable of decoding HD formats. What’s more, it’s sensitive to memory bandwidth and its drivers are not optimized. Finally, Intel uses several frequencies for its IGP – from 400 MHz in the i945G versions (for desktop), it goes as low as 250 MHz in portable PCs and 166 MHz in some ultraportables (with the attendant loss of performance). The version used by the Atom (i945GSE) is limited to 133 MHz, whereas the i945GC operates at 400 MHz.

Note that Intel also proposes coupling the Atom to an SiS chipset. This solution, already offered in Intel Mini-ITX boards, uses a SiS 671 coupled with a 968, and consumes only 8 W.

Atom: In-Order and HyperThreading

The Atom uses a new architecture, but with older technologies. It’s the first in-order x86 from Intel since the Pentium, back in 1993. All other Intel processors (since the P6) use an out-of-order architecture.

In-Order: Say what?

To simplify, think of the processor as receiving the instructions one by one and putting them in its pipeline before executing them. In an in-order architecture, the instructions are executed in the order in which they arrive, whereas an out-of-order architecture is capable of changing the order in the pipeline. The advantage is that losses can be limited. If, for example, you have a simple calculation instruction, a memory access, then another simple calculation, an in-order architecture will execute the three operations one after the other, whereas in OoO the processor can execute the two calculations at the same time and then the memory access, with an obvious time saving. Quite surprisingly, whereas in-order architectures generally use a short pipeline, the Atom has a 16-stage pipeline, which can be a disadvantage in certain cases.
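The three-instruction example above can be sketched as a toy model. Everything here is a deliberate simplification: the latencies (1 cycle for a calculation, 3 for the memory access) are hypothetical, the in-order model is strictly serial as in the description above, and the out-of-order model assumes unlimited execution units. Real pipelines, including the Atom's, are more subtle.

```python
# Each instruction: (name, latency in cycles, names it depends on).
# The latencies are made-up values for illustration only.
PROGRAM = [
    ("calc_a", 1, []),
    ("load", 3, []),
    ("calc_b", 1, []),
]


def in_order_cycles(program):
    """Strictly serial model: each instruction waits for the previous one."""
    return sum(latency for _, latency, _ in program)


def out_of_order_cycles(program):
    """Idealized model: an instruction starts once its dependencies finish."""
    finish = {}
    for name, latency, deps in program:
        start = max((finish[d] for d in deps), default=0)
        finish[name] = start + latency
    return max(finish.values())


print("in-order:", in_order_cycles(PROGRAM), "cycles")
print("out-of-order:", out_of_order_cycles(PROGRAM), "cycles")
```

With independent instructions, the idealized out-of-order model is bounded only by the slowest operation, whereas the serial model pays for every latency in turn.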


HyperThreading is a technology that appeared with the Pentium 4. It can process two threads simultaneously using the unused parts of the pipeline. While not as efficient as two true cores, the technology can make the OS think that the CPU can process two threads simultaneously and increase the computer’s overall performance. On the Atom with its long pipeline coupled to an in-order architecture, HyperThreading is very effective, and the technology can significantly increase performance without impacting the TDP. Intel claims an increase in consumption of only 10%.

The processing core

For the rest, the Atom is equipped with two ALUs (units capable of performing integer calculations) and two FPUs (units dedicated to floating-point calculation and very important for gaming, for example). The first ALU manages shift operations, and the second jumps. All multiplication and addition operations, even in integers, are automatically sent to the FPUs. The first FPU is simple and limited to addition, while the second manages SIMD and multiply/divide operations. Note that the first branch is used in conjunction with the second for 128-bit calculation (the two branches are in 64 bits).

Intel Has Optimized the Basic Instructions

If you look at the number of cycles necessary to execute instructions, you realize something: Some instructions are fast and others are (very) slow. A mov or an add, for example, is executed in one cycle, as on a Core 2 Duo, whereas a multiplication (imul) will take five cycles, compared to only three on the Core architecture. Worse, a floating-point division in 32 bits, for example, takes 31 cycles compared to only 17 (or almost half as many) on a Core 2 Duo. In practice – and Intel willingly admits this – the Atom is optimized to execute the basic instructions quickly, meaning that this processor short-changes performance with complex instructions. This can be checked simply using Everest (for example), which includes a tool for measuring the latencies of instructions.
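To put the latency figures above side by side, here is a small sketch that computes how many times longer each instruction takes on the Atom than on the Core 2 Duo. The cycle counts are the ones quoted in the text; the dictionary layout and function name are illustrative.

```python
# Latencies in cycles quoted in the text: (Atom, Core 2 Duo).
LATENCY_CYCLES = {
    "mov / add": (1, 1),
    "imul": (5, 3),
    "32-bit floating-point division": (31, 17),
}


def slowdown(atom_cycles, core2_cycles):
    """How many times longer the instruction takes on the Atom."""
    return round(atom_cycles / core2_cycles, 2)


for name, (atom, core2) in LATENCY_CYCLES.items():
    print(f"{name}: {slowdown(atom, core2)}x")
```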

Atom: Caches and FSB

Intel has chosen a fairly out-of-the-ordinary organization for the Atom, but without sacrificing performance (which is important with a CPU using an in-order architecture).

24 kB + 32 kB: An Asymmetrical Cache

The Atom’s Level-1 cache is 56 kB total: 24 kB for data and 32 kB for instructions. This asymmetry, fairly surprising for Intel, stems from the structure of the cache. Intel uses 8 transistors to store one bit, compared to six transistors in a standard cache. This technique allows the voltage applied to the cache to maintain information to be reduced. It seems that this move to 8-transistor cells was made late in the game, when the design of the processor was fairly advanced, which meant that the size of the cache had to be reduced to fit it in – which explains the 24 kB for the data cache. This unofficial explanation was advanced by AnandTech in their article introducing the Atom in April.

512 kB Level 2, shrinkable

The Level-2 cache has a capacity of 512 kB, and obviously runs at the same frequency as the processor. This 8-way cache is fairly classic and is close in performance to the one used in the Core 2 Duo (its latency is 16 cycles, compared to 14 for the Core 2). One of the new functions can deactivate part of the cache automatically – if a program doesn’t require much cache memory, part of it can be shut down. In practice, the cache goes from 8-way to 2-way (thus from 512 kB to 128 usable kB). This technique is a way of shaving a few precious milliwatts.
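The drop from 512 kB to 128 kB follows directly from the number of ways left active, since each way of a set-associative cache holds an equal share of the total. A minimal sketch (the function name is illustrative):

```python
def usable_cache_kb(total_kb, total_ways, active_ways):
    """Capacity left when only some ways of a set-associative cache are on."""
    if not 0 < active_ways <= total_ways:
        raise ValueError("active_ways must be between 1 and total_ways")
    return total_kb * active_ways // total_ways


print(usable_cache_kb(512, 8, 8), "kB with all ways active")
print(usable_cache_kb(512, 8, 2), "kB with two ways active")
```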

The FSB: Two modes of operation

The Atom’s FSB is the same one used by Intel since the Pentium 4. It operates in Quad Pumped (QDR) mode with GTL signaling. An interesting point: The Atom uses another signaling technology – CMOS mode. GTL is effective (the bus can reach 1,600 MHz), but power-intensive, whereas CMOS allows the bus voltage to be reduced. Technically, GTL uses resistors to improve the quality of the signal, but they aren’t really necessary except at higher frequencies. With the Atom and its bus, limited to 533 MHz, it’s possible to change to CMOS mode – the resistors are deactivated and the bus voltage is reduced by half. At the moment, only the SCH chipset is capable of handling the FSB in CMOS mode.
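For readers unfamiliar with "Quad Pumped" figures, the sketch below shows how the quoted bus speeds relate to the base clock: four transfers per clock cycle. The peak-bandwidth calculation assumes a 64-bit data path, which is not stated in the text, and the function names are illustrative.

```python
def fsb_transfer_rate(base_clock_mhz, pumps=4):
    """Effective transfer rate in MT/s for a quad-pumped (QDR) bus."""
    return base_clock_mhz * pumps


def fsb_peak_bandwidth_mb_s(base_clock_mhz, bus_width_bits=64, pumps=4):
    """Theoretical peak in MB/s; the 64-bit width is an assumption."""
    return fsb_transfer_rate(base_clock_mhz, pumps) * bus_width_bits / 8


# A 100 MHz base clock gives the "400 MHz (QDR)" bus; a base clock of
# about 133.33 MHz gives the "533 MHz (QDR)" bus.
for base in (100, 400 / 3):
    print(round(fsb_transfer_rate(base)), "MT/s,",
          round(fsb_peak_bandwidth_mb_s(base)), "MB/s peak")
```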

Power Management: Tests and Theory

Power consumption is central to this Intel platform, and the company has made a real effort in that department. Aside from the chipset, which consumes a lot of power in comparison to the processor, the Atom itself has many attractive functions.

Bus and cache

As we’ve already said, Intel has put a lot of effort into the bus and the cache: A different mode for the bus was developed (CMOS mode) and the cache can be disabled in part depending on how it’s being used. These functions reduce power consumption, as do the use of an in-order architecture and 8T SRAM for the L1 cache.

C6 power state

In addition to the low voltage (1.05 V) CPU, the Atom also introduces a new standby mode, C6. As a reminder, the C modes (0 to 6) are low-power states, and the higher the number, the less the CPU consumes. In C6 mode, the entire processor is almost totally disabled. Only a cache memory of a few kB (10.5) is kept enabled to store the state of the registers. In this mode, the L2 cache is emptied and disabled, the supply voltage falls to only 0.3 V, and only a small part of the processor remains active, for wake-up purposes. The processor can go into C6 mode in approximately 100 microseconds, which is quick. In practice, Intel claims, C6 mode is used 90% of the time, which limits overall power consumption (obviously, if you launch a program that requires a lot of CPU power or even watch a Flash video you won’t be in that mode). We should point out, though, that the two chipsets to be used with the Atom N200s are power users: the Atom 230s use an i945GC that consumes 22 W (4 W for the CPU) and the Atom N270s ship with an i945GSE that burns 5.5 W (2.4 W for the CPU).

In Practice

So is the Atom really low-power in practice? The processor is, yes. For the platform aimed at NetTops (low-cost desktop computers), the answer is yes, but... Why the “but”? Because the chipset uses a lot of power and the processor is listed at a TDP of 4 W, compared to 2.4 W for the mobile versions. Our test motherboard consumes 59 W in standby, and we reached 62 W under maximum load (with a 3.5" hard disk and a 1 GB DDR2 DIMM). Obviously, these values are what we measured for the complete platform, not only the motherboard, and they don’t take power-supply losses into account (our test power supply has an efficiency of approximately 80%). That’s both a little and a lot: it’s not much for a desktop computer, of course, but it’s a lot in absolute terms. We should add that we recently tested a motherboard based on a 1.5 GHz Via C7, and the configuration drew less power with the same components: 49 W at idle and 59 W under load (always measured at the AC outlet).
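Since the figures above are taken at the AC outlet, a rough estimate of what the components themselves draw can be had by multiplying by the power supply's efficiency. This is only a sketch: it treats the approximate 80% figure as constant across loads, which real power supplies are not, and the names are illustrative.

```python
PSU_EFFICIENCY = 0.80  # approximate figure quoted in the text

# Measurements at the AC outlet from the text, in watts: (idle, load).
WALL_POWER_W = {
    "Atom 230 / i945GC": (59, 62),
    "Via C7 1.5 GHz": (49, 59),
}


def dc_side_power(wall_w, efficiency=PSU_EFFICIENCY):
    """Estimate the power actually delivered to the components."""
    return round(wall_w * efficiency, 1)


for name, (idle_w, load_w) in WALL_POWER_W.items():
    print(f"{name}: about {dc_side_power(idle_w)} W idle, "
          f"{dc_side_power(load_w)} W load, "
          f"+{load_w - idle_w} W at the outlet under load")
```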

Atom Against Pentium E and Sempron

For our tests, we used a Mini-ITX motherboard made by Gigabyte, equipped with an Atom 230/i945GC. The board has a single DIMM (DDR2) slot and a PCI port – which rules out using any modern graphics cards. Amusingly, the chipset (which, remember, consumes 22 W) is actively cooled, whereas the processor makes do with a simple aluminum heat sink.


Since this motherboard is intended for entry-level machines, we tried to compare two current entry-level solutions – a Pentium E2160 (1.8 GHz factory), an entry-level dual-core processor based on the Core architecture, and a Sempron 3400+ (Socket 754 in this case). The two processors were set to the same clock frequency as the Atom (1.6 GHz) for the tests. The motherboard used for the Pentium E was a GA-GM945-S2. It has the advantage of using the same chipset (or almost) as the Atom motherboard – an i945G. The motherboard used with the Sempron is Nforce4-based.

The three boards were tested with the same system – Windows XP Service Pack 2 with all the drivers up to date. We used DDR2-667 memory (1 GB) on the Intel platforms and a 1 GB DDR-400 DIMM on the Sempron. Finally, our test hard disk was a 74 GB Western Digital Raptor.

The Tests

We decided to compare the three platforms at an identical frequency, with a few practical tests and a few synthetic ones.

On Cinebench R10, the Sempron placed between the Atom and the Pentium E, though the Atom-with-HyperThreading combo proved effective (1.53 times faster with HyperThreading). Notice that the increase with the Pentium E, which actually has two cores, is not that much greater: 1.86 times faster.
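The two scaling factors quoted above can be read as the extra throughput contributed by the second thread. The sketch below makes that explicit; the speedups are the Cinebench R10 figures from the text, while the labels and function names are illustrative.

```python
# Multithreaded vs. single-threaded speedups quoted in the text.
SPEEDUP = {
    "Atom with HyperThreading": 1.53,
    "Pentium E (two cores)": 1.86,
}


def gain_percent(speedup):
    """Extra throughput from the second thread, in percent."""
    return round((speedup - 1) * 100)


def per_thread_efficiency(speedup, threads=2):
    """Fraction of ideal linear scaling actually achieved."""
    return round(speedup / threads, 3)


for name, s in SPEEDUP.items():
    print(f"{name}: +{gain_percent(s)}%, "
          f"{per_thread_efficiency(s):.1%} of ideal scaling")
```

Seen this way, a second logical thread on the Atom delivers well over half of what a second physical core delivers on the Pentium E.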

With Sandra, which is a synthetic test, the difference among the three processors was impressive. The Pentium E really was faster. Note that the difference between the Atom and the Sempron may seem slight, but the tests are multithreaded and the Sempron has only one core, whereas the Pentium E has two and the Atom uses HyperThreading, which can produce significant gains.

In the 3DMark 06 and PCMark 06 CPU tests, the Pentium E had a comfortable lead, and the Sempron always placed between the Atom and the Pentium E.

In this test, a favorite with overclockers but fundamentally not really conclusive (the code is dated and not very optimized), the Atom was a lot slower than its competitors.

Finally, we ran a test that consists of compressing approximately one GB of files with WinRAR. Since the Sempron uses a different memory subsystem (DDR) and a real graphics card, it doesn’t show up on this test – the comparison would have been thrown off. In practice, the difference between the two platforms was less than in the synthetic tests, but the Pentium E was still approximately twice as fast.

Atom against C7-M and Celeron

We also decided to compare our Atom platform with two other systems that could find themselves in competition with our test Mini-ITX motherboard. The first is the Via PC3500G mobo; the second is an entry-level processor sometimes used in ultraportables, the Celeron-M (Dothan).

Compared to the C7

The Via PC3500G motherboard is a micro-ATX board with a CN896 chipset coupled with a C7 CPU clocked at 1.5 GHz. For the test, we dropped the clock of the Atom to the same frequency as the C7’s (12 x 125 MHz, or 1.5 GHz). The memory, hard disk, and operating system stayed the same.

On Cinebench R10, as you can see, the Atom was faster than the C7, but not by much – at least when using a single thread. On the other hand, the Atom’s HyperThreading resulted in an impressive performance gain.

On PCMark 05, you can see that the Atom platform, even at an identical frequency, was faster than the C7 platform. There are several reasons why. PCMark 05 is multithreaded, like many current programs, and the Atom’s HyperThreading has an advantage. Also, the Intel chipset is significantly faster (or a little less slow, let’s say) than Via’s.
Finally, we measured the power consumption of the two platforms. Surprise: Thanks to a power-saving chipset, the Via platform was more economical than the Intel platform. At idle, the PC3500G drew 49 W, whereas the GA-GC230D was at 59 W. But notice that under load, the Atom only climbed by 3 W, whereas the Via platform consumed 10 W more (while still staying below the Intel). Note that these values are measured at the AC outlet, without taking power-supply losses into account (our power supply has an efficiency of approximately 80%).

Compared to the Celeron-M

For the comparison with the Celeron-M, we used a portable PC equipped with a model based on the Dothan core. We didn’t run the PCMark tests because, since the machines are very different, the tests wouldn’t have been valid. As for the C7, the Atom was underclocked to the same frequency as the Celeron-M (in this case 1.3 GHz).

On a synthetic test like Cinebench R10, you can see that the Celeron showed approximately twice the performance at an identical frequency. Still, HyperThreading let the Atom gain a few points.
In practical terms, these few tests show that the Atom ranks between the C7 and the Celeron-M at an identical frequency. Given that the two processors are being used in Netbooks (the C7 with frequencies close to the Atom’s and the Celeron-M with significantly lower frequencies), we can conclude that Atom-based computers will have performance that’s more or less identical to that of current machines.

Overclocking and 3D

Finally, we ran tests in two areas that won’t necessarily be of interest with an Atom platform, but that will appeal to some of you – 3D and overclocking.

3DMark with the GMA

Since our motherboard had no PCI-Express or AGP slot (and since PCI graphics cards are getting scarce), we limited ourselves to the GMA 950. To compare it, we used the Gigabyte motherboard, equipped with the same chipset, with the Pentium E 2160 clocked to 1.6 GHz to match the Atom. So the two machines were using an identical IGP (GMA 950 @ 400 MHz) and a CPU at the same frequency (1.6 GHz). Both computers were fitted with a single DDR2-667 DIMM.

As you can see, with 3DMark 06 at 640 x 480 without filters, performance was weak. Above all, the Pentium E remained noticeably faster than the Atom, even outside the world of synthetic benchmarks.
But remember, portable PCs that use Atom will use the i945GSE chipset, and the GMA 950 on this version is clocked only at 133 MHz.

Overclocking Atom

The Gigabyte Mini-ITX motherboard offers a few overclocking options: the FSB can be set anywhere between 100 and 700 MHz (sic). On our model, whose multiplier is locked at 12, the base frequency is 133 MHz. We were able to reach a stable 1.8 GHz (12 x 150) without touching the voltage settings, and as much as 1.86 GHz (bus 153 MHz) by altering the FSB voltage in the board’s BIOS (+0.3 V for the bus). Performance increased in linear fashion, but so did power consumption: from 62 to 65 W between 1.6 and 1.8 GHz, and we measured 67 W for the platform with the Atom overclocked to 1.86 GHz. The difference can be explained by the increase in bus voltage. But note that the increase in power consumption isn’t due only to the CPU; the chipset was also overclocked.
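The clock speeds in this article all come from the same relation: core frequency equals the multiplier times the FSB base clock. The sketch below applies it to the stock setting, the underclock used for the C7 comparison, and the first overclock, then compares the relative rise in clock and in platform power. Function names are illustrative, and the power figures are the whole-platform values measured above.

```python
MULTIPLIER = 12  # locked on the test processor, per the text


def core_clock_mhz(multiplier, fsb_base_mhz):
    """Core frequency as multiplier times FSB base clock."""
    return multiplier * fsb_base_mhz


def percent_increase(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


for fsb in (125, 133, 150):
    print(f"{MULTIPLIER} x {fsb} MHz = {core_clock_mhz(MULTIPLIER, fsb)} MHz")

# Going from 1.6 to 1.8 GHz, platform power went from 62 W to 65 W.
print(f"clock: +{percent_increase(1600, 1800):.1f}%")
print(f"platform power: +{percent_increase(62, 65):.1f}%")
```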

Why no HD test?

Why didn’t we run tests on playing HD? Well, first of all because computers using an Atom processor aren’t intended for that. Intel is targeting NetTops, which are computers intended for Web browsing, not for watching Blu-ray discs. Also, we did try to play an HD DVD, just to see, but Power DVD wouldn’t launch without a modern graphics card capable of handling part of the decompression load. We would have tested some "HD" video, such as the trailers available on the Web, but several problems are associated with doing that. The software player used influences the results, and the quality and complexity of these videos don’t match that of commercial video. Decompressing a DivX 720p stream at a few Mbit/s is one thing, but video in H.264 at 36 Mbit/s on a Blu-ray is something else.


What conclusion should we draw about the Atom platform? We came away with a mixed impression. The processor itself is a success – it’s affordable, consumes very little power, and while its performance is weak, it’s sufficient for its target market (low-cost PCs intended for Web use). In addition, HyperThreading is a good feature and the platform is reactive. But for us the disappointment is the associated chipsets. Intel offers only two choices, and they’re open to criticism. The SCH Poulsbo seems efficient and autonomous, but it’s not viable in a standard PC due to its MID orientation (no SATA, for example), whereas the i945GC and i945GSE chipsets are usable in PCs, but they’re throwbacks – they lack functions, their performance with 3D is disastrous (whereas more and more applications are using it), and they consume significantly more power than the processor itself.
You get the feeling that Atom is only a trial balloon – one that’s a success from some points of view and a failure from others. Will computer manufacturers and the general public go for it? Undoubtedly, and for two reasons – the price and marketing. The platform will make it possible to offer computers at a very low price, and for now Atom has a good brand image. The public’s reasoning might proceed something like this:
“An Eee PC 900 for $450 (good) with a Celeron (not good) at 900 MHz (not good)”
“An Eee PC 901 for $450 (good) with an Atom (good) at 1.6 GHz (good)”
In other words, the Atom version will appeal more to the general public, even if in practice the difference is likely to be pretty slim.

The Intel Atom Platform

A paradoxical platform: The processor is a success (even if its performance is weak in absolute terms), whereas the associated chipsets are not worth their salt. Overall, the gains compared to older platforms remain slim, and we hope that Intel will be offering chipsets that are better suited in the future.


+  The price: $29 for an Atom 230
+  Low power consumption
+  HyperThreading, a good feature on this processor


-  Weak overall performance
-  The chipsets
-  Very poor 3D performance
-  A mismatched platform
