Monday, February 25, 2002
The market has recently seen a trickle of early DDR333 platforms appear on the scene. Although the performance of these platforms has varied, there appears to be a groundswell of enthusiasm building for the new PC2700 memory standard. While we were initially uncertain how VIA’s DDR333 implementation would perform, we have been pleasantly surprised by its performance profile in several key tests. We expect VIA’s new KT333 platform to play a major role in building momentum behind the DDR333 standard.
The introduction of VIA’s KT333 also ushers in a new dynamic for AMD platforms. For the first time, DRAM performance exceeds Athlon’s 266 MHz front side bus. The performance impact of this combination appears to be quite good and we expect OEMs and enthusiasts to welcome this compelling solution.
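The mismatch between memory and front side bus can be made concrete with some simple arithmetic. The sketch below is ours, not VIA's or AMD's: it assumes a 64-bit (8-byte) data path for both the Athlon front side bus and a single DDR channel, double-pumped signalling on both, and nominal base clocks of 133.33 MHz and 166.67 MHz. The function name is purely illustrative.

```python
from fractions import Fraction

def peak_bandwidth_mb_s(base_clock_mhz, transfers_per_clock=2, bus_bytes=8):
    """Theoretical peak transfer rate in MB/s (1 MB = 10**6 bytes).

    Assumes a double-pumped bus (2 transfers per clock) that is
    8 bytes wide; both values are assumptions stated in the text.
    """
    return float(base_clock_mhz * transfers_per_clock * bus_bytes)

# Nominal base clocks, kept as exact fractions to avoid rounding drift.
fsb_266 = peak_bandwidth_mb_s(Fraction(400, 3))   # Athlon "266 MHz" FSB
ddr_333 = peak_bandwidth_mb_s(Fraction(500, 3))   # DDR333 / PC2700

print(f"Athlon FSB : {fsb_266:.0f} MB/s")
print(f"DDR333     : {ddr_333:.0f} MB/s")
print(f"Headroom   : {ddr_333 - fsb_266:.0f} MB/s")
```

Under these assumptions the memory channel can supply roughly 533 MB/s more than the processor bus can carry, so any benefit must come from lower effective latency and from other bus masters such as graphics and I/O, rather than from raw CPU bandwidth alone.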
Despite the apparent enthusiasm for DDR333, some pundits have questioned the viability of DDR333 as a mainstream memory solution for the desktop. We will discuss this matter in some detail below before we attempt to quantify the potential performance impact of PC2700.
Advancing the Athlon Platform
AMD’s Athlon has proven to be remarkably resilient in the face of Intel’s frenzied pace of clock speed increases. When AMD introduced its new processor naming convention, the market was pleasantly surprised to observe the Athlon 1800+ matching strides with the 2GHz Willamette P4 in most benchmarks. Recently, Intel’s Northwood P4 has shifted the balance slightly so that today’s Athlon XP 2000+ falls narrowly behind the 2.2 GHz Northwood P4 in some benchmarks. But this is not bad for a lower-cost processor that runs at a clock speed 500 MHz under the P4!
After close observation under demanding system-level benchmarks, we estimate that Athlon XP with DDR333 can deliver a performance improvement roughly equivalent to one CPU speed grade in key applications. This places the Athlon XP 2000+ on par with, or ahead of, Intel’s 2.2 GHz Northwood P4 platforms using DDR266.
We expect that the KT333 is just the latest phase of a perpetual leapfrogging exercise between AMD and Intel. Next, VIA will likely provide DDR333 support for the P4, which could show a similar performance boost for Intel. But Intel’s recovery is likely to be short-lived in the face of AMD’s faster 0.13 micron Thoroughbred CPU, possibly equipped with a 333 MHz front side bus, providing another exhilarating jump in performance. Then on to Hammer, and so on…
While this article is not about Athlon, we could not resist a peek at the crystal ball.
Main Memory Migration
DDR266 is now at the center of the market. As DRAM manufacturers enjoy excellent yields, some currently over 80% for DDR266 (CL2.5) and over 25% for DDR266 (CL2), prices have decreased to very attractive levels. With Intel finally jumping on the DDR bandwagon (a year behind VIA), there is now a unified front of DDR266 platforms across the market.
This scenario furnishes the perfect backdrop for the next major memory speed grade introduction. DDR333 has been on the roadmap for some time; however, the market has become somewhat distracted by the prospect of DDRII coming on the scene as early as Q1 2003. This would leave a very narrow period in which to yield DDR333 in any serious volume before DRAM makers shift their attention toward DDRII. Some have wondered if this could relegate DDR333 to a short-lived, low-volume, high-end niche product. We don’t think so.
The longevity of DDR as the mainstream standard has been bolstered lately by confusion over the future of DDRII. Several months back, the market seemed on track for a clear and potentially rapid transition to DDRII in early 2003. However, the Intel-led ADT group posed its design requirements to JEDEC as a derivative of the DDRII specification. This specification is sometimes referred to as DDRII-A (A for ADT). It differs from DDRII in terms of I/O voltage, burst length and several other attributes that are intended to allow it to operate at speeds up to 800 MHz.
This has raised the prospect of a prolonged debate inside JEDEC to determine if the industry will move forward aggressively with the original DDRII specification, or if JEDEC will substantially delay DDRII to accommodate Intel’s requirements. While the original schedule for DDRII would have accommodated a rollout in Q1 2003, we are concerned that it might be delayed until late 2003 or perhaps much later. If so, this would create a generous market window for DDR333, as seen in the Memory Migration chart below.
What about DDR400?
With the prospect of a delayed DDRII launch, DRAM makers and chipset makers must seriously consider the development of DDR400 soon after DDR333 gets off the ground. But the extremely high clock speed of DDR400 will demand another set of design trade-offs in order to satisfy system configuration concerns. It seems unacceptable to limit system capacity to just a single DIMM, so other options will have to be considered. Among them are a complete transition to the FBGA package for DRAM ICs, or a change to the SO-DIMM form factor to reduce signal fan-out, paired with a surface-mount SO-DIMM connector on the motherboard. Even with these possible changes, it may be necessary to manufacture the first generation of boards as 6-layer designs until the technology is mastered well enough to cost-reduce to 4-layer designs. Given some combination of these changes, DDR400 may be interesting enough to find its way into volume production some time in 2003.
Performance Analysis
The primary focus of this project is to quantify DDR333’s performance advantage over DDR266. VIA’s KT266A platform, regarded as the highest performing Athlon platform available on the market, provides the basis for our performance comparison. Both systems were configured with an AMD Athlon 2000+ processor, 256MB of Micron memory, a Maxtor ATA 133 hard drive, NVidia GeForce 2 GTS accelerator and Windows 2000 SP2. When testing 3D applications, we switched to a GeForce3 accelerator.
The KT266A platform was configured with CAS Latency 2 memory and we set all BIOS programmable DRAM timings to their fastest settings. VIA’s KT333 reference platform did not have user controllable DRAM settings in the BIOS. We believe that DRAM was set to CAS latency of 2.5 and all other DRAM timings were set conservatively for maximum stability. Other platform configuration information is shown in the table below.
Micron provided us with several of their latest DDR 333 modules using the standard TSOP and the newer FBGA packaging. These two DRAM package types are pictured below. The newer FBGA package is similar in some respects to the packaging used by Rambus, though it is much cheaper to manufacture. It offers superior parasitic characteristics making it a bit easier to achieve higher clock speeds. The two module types can be mixed and interchanged without any problems.
In addition, we successfully conducted basic interoperability tests using Micron, Hynix and Samsung DDR333 modules. We were unable to test high-capacity configurations during this project, but will do so when modules become readily available.
SysMark 2000
SysMark 2000 remains the most comprehensive and revealing application benchmark available and makes an excellent starting point to gauge how DDR333 impacts real-world applications. The bar chart below depicts the relative performance of the KT333 for each application test using the KT266A performance as a baseline.
In nine out of the twelve applications tested, this benchmark reveals a surprising performance enhancement that extends as high as 16% favoring the KT333. However, Photoshop, Paradox and Elastic Reality stand out as exceptions with a 1% slower runtime. One assumption is that these three tests might have benefited from the KT266A’s tweaked DRAM latency settings that were not accessible on the DDR333 system. We can reasonably assume that the inconsistent results shown in these three benchmarks might be turned around with a revision of the BIOS on VIA’s reference platform.
Business / Content Creation Winstone
Winstone provides another simulation of real world workloads using popular Windows applications under a series of scripted activities. The Content Creation Winstone benchmark demonstrates a moderate performance advantage of 2.7% for DDR 333.
ViewPerf – Workstation 3D Performance
This MCAD-style benchmark measures 3D rendering performance of complex object models that are commonly used in CAD workstations. Some of the ViewPerf tests are known to cause severe CPU cache thrashing, resulting in an intense DRAM load. DDR333 allows the Athlon platform to really stretch its legs here with a strong across-the-board performance increase of up to 16%.
Referencing popular hardware review sites, the best reported scores for the Light-04 test have previously been claimed by the 2.2 GHz Northwood P4 platforms. However, in our test of the Athlon XP 2000+ KT333 platform we observed a performance level approximately 3% faster than reported scores for Northwood. Without conducting detailed side-by-side testing, we cannot declare Athlon the winner over Northwood, but we can take this as a good indication of the benefit of DDR333.
Quake III Arena (version 1.11)
No benchmarking project is complete without the venerable Quake III Arena benchmark. In order to show results that are comparable to other public data, we chose version 1.11 of the program and ran Demo1 and Demo2. When testing accelerator performance we prefer the more challenging Demo4 which is available in Q3A version 1.3, but the older test scripts are more frequently used for CPU and memory performance evaluations. The Demo1 and Demo2 scripts each showed a very respectable 8% performance increase for the KT333 platform.
Sphinx 3 Speech Recognition Benchmark
Sphinx3 is a large vocabulary speech recognition system developed at the Carnegie Mellon University Computer Science Department. This powerful application simulation benchmark must traverse a language database of approximately 18.5 MB while simultaneously performing roughly 95.8 million multiplications per minute of speech. It has proven more accurate than humans at transcribing large volumes of random speech; however, its performance is largely bound by memory performance. The output of this test corresponds to execution time, so a lower score in this test indicates better performance. As expected, DDR333’s additional 533 MB/s of available bandwidth allows the KT333 platform to weigh in with a convincing 15% performance advantage.
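Because this benchmark reports execution time, a percentage "advantage" can be computed in two common ways that give different numbers. The sketch below illustrates both. The run times are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not measured Sphinx3 results, and the function names are ours.

```python
def time_reduction_pct(baseline_s, new_s):
    """Percent less wall-clock time than the baseline (lower time is better)."""
    return (baseline_s - new_s) / baseline_s * 100.0

def speedup_pct(baseline_s, new_s):
    """Percent more work completed per unit time than the baseline."""
    return (baseline_s / new_s - 1.0) * 100.0

# Hypothetical run times in seconds, NOT measured data.
baseline_s = 100.0   # stand-in for the DDR266 platform
new_s = 85.0         # stand-in for the DDR333 platform

print(f"time reduction: {time_reduction_pct(baseline_s, new_s):.1f}%")
print(f"speedup       : {speedup_pct(baseline_s, new_s):.1f}%")
```

With these placeholder values the same pair of runs reads as a 15.0% reduction in run time or a 17.6% speedup, so it is worth checking which convention is in use when comparing time-based scores across reviews.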
CineBench 3D Raytracing
CineBench is based on Maxon Computer’s CINEMA 4D XL 3D raytracing and animation tool. This powerful tool performs advanced 3D modeling and animation through complex particle and lighting algorithms, including radiosity, caustics, multipass rendering, metaballs and other methods. This tool further demonstrates the KT333’s ability to give content creators an attractive performance boost when running such advanced, computationally intense applications.
StreamK7 Under Windows
The well-known Stream benchmark is perhaps the best tool to test the available bandwidth of the memory subsystem. Because both test platforms use AMD Athlon processors, we used the 3DNow! optimized version of the Stream benchmark for the K7 processor. The table shows DDR333’s performance advantage using the KT266A as a baseline. The actual benchmark scores are listed for reference in a table below. Given DDR333’s theoretical bandwidth advantage of 25% over DDR266, we were pleased to see this benchmark demonstrating a 10-16% performance boost as seen in the chart below.
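To put the Stream result in proportion, the observed gain can be expressed as a fraction of the theoretical gain. This is a minimal sketch using only the percentages quoted above (25% theoretical, 10% to 16% observed); the helper name is ours.

```python
def realized_fraction(observed_gain_pct, theoretical_gain_pct):
    """Share of the theoretical bandwidth gain that shows up in the benchmark."""
    return observed_gain_pct / theoretical_gain_pct

THEORETICAL_GAIN_PCT = 25.0          # DDR333 over DDR266, as quoted in the text
OBSERVED_RANGE_PCT = (10.0, 16.0)    # Stream results, as quoted in the text

for observed in OBSERVED_RANGE_PCT:
    share = realized_fraction(observed, THEORETICAL_GAIN_PCT)
    print(f"{observed:.0f}% observed = {share:.0%} of the theoretical gain")
```

In other words, between roughly two fifths and two thirds of the added theoretical bandwidth is visible to the processor in this test, which is consistent with the memory now being faster than the 266 MHz front side bus it feeds.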
Summary
The opportunity for DDR333 to become the next mainstream memory standard seems wide open. Motherboard makers have recently predicted that over 40% of their designs will support DDR333 by the end of this year. Also, direct resellers cannot help but observe that a backward compatible DDR333 platform makes for an attractive up-sell or upgrade offer to end users seeking performance.
With DDRII in turmoil, PC2700 should enjoy at least an 18-month reign as the mainstream high-volume memory of choice. If DDR333 is able to reach price parity with DDR266 early in its life, we believe that the PC2700 migration curve could be even steeper than characterized above. We are pleased to see VIA deliver a compelling performance advantage over DDR266 and expect the KT333 chipset to become a key enabler in driving DDR333 into the high-volume center of the market.
By: Bert McComas & Greg Fawson, Inquest Inc. Copyright © 2023 CST, Inc. All Rights Reserved