On the first day, Intel created Granite Bay. On the second day, Canterwood, featuring PAT, saw the light of day. On the third day, Springdale appeared. On the fourth day, ASUS came out with a little modification, and chaos has reigned ever since.
The P4P800, based on Intel's i865PE chipset, employs a simple yet very effective trick to bypass the extra pipeline stages in the address and command decode path that ensure stable operation of the Springdale chipset at 200 / 800 MHz frequency. The top yield of the chipset can do without these pipeline stages, and the separation between i875 and i865 is a somewhat arbitrarily drawn line between pass and fail under extreme conditions. The ASUS concept therefore appears to be only a logical step in breaking some of the iron "Thou shalt not tamper with our pins" policies.
About 5 years ago, Abit did the same with a switch pulling the B21 pin to ground, and everybody overclocked their Celerons happily ever after. Now it is up to ASUS to pull the white rabbit out of the magic hat. We have looked at the Intel data sheets and at the board itself and put it through its paces, concerned with speed but also with stability.
What is the technology behind all the chaos? How does the P4P800 live up to the expectations? Moreover, how can it live with the amount of overdecoration ASUS added in the form of its Ai feature bundle?
PAT, separating Canterwood from Springdale
The current high-end Intel chipset lineup comprises two different lines of chips, that is, the i875P and the i865 family. Within the 865 group, the "G" revision has a unique position in that it integrates the Intel Extreme Graphics 2 engine in the core; the 865P and PE versions are the same die as what is sold as Canterwood, using a different package.
Aside from the different package, there are two main differences between the i875 and the 865. First, Canterwood supports ECC whereas Springdale does not, which is the main difference that allows the lower ball count of the Springdale package. Second, Springdale does not support Intel's new Performance Acceleration Technology or "PAT".
So what exactly is PAT? Performance Acceleration Technology is a means to speed up address and command decoding. Address and command decoding are necessary because each software application expects a contiguous memory space within its virtual address space. The more applications are running, the more contiguous memory spaces are necessary. PC memory falls into the category of Random Access Memory, meaning that data can be written all over the place, depending only on which pages are open at the time that the write command is issued. This pretty much resembles a shotgun approach and, therefore, maintaining any contiguity would be a matter of sheer luck unless some extra measures are taken.
These measures are the translation of the physical addresses into a virtual memory space (not to be mistaken for the virtual memory or swapfile in the Windows environment) where every application maintains a contiguous block of memory, or at least thinks it does, courtesy of the tags stored within the application address tagRAM, more commonly referred to on the processor level as TLBs, short for Translation Lookaside Buffers. The tagged information still needs to be converted into physical memory addresses, which requires some work on the memory controller level. Needless to say, that work requires time, which in this case is called address decode latency. An extra step added on top is the command decode latency, also known as CMD Rate to users of VIA chipsets.
All these latencies add on top of what is generally known as the DRAM access latencies, where tRAC defines the time from Chip Select to the first or critical Quad-Word output to the bus. Just as a short reminder, the DRAM latencies themselves start with the RAS-to-CAS delay, on top of which is added the CAS latency. In case both of those latencies are set to 2, tRAC at a 200 MHz system bus (chipset) frequency would be 4 x 5 ns or 20 ns. However, real world latency measurements show initial system access latencies that are generally on the order of 50-100 ns and encompass all steps from the time the CPU issues the request for the data until the actual data output. At a 200 MHz system bus frequency, the difference of 30-80 ns translates into some 6-16 chipset clock cycles.
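For those who like to check the arithmetic, here is a minimal Python sketch of the latency bookkeeping described above. The function names are our own and purely illustrative; the only inputs are the figures quoted in the text (2-2 DRAM timings, a 200 MHz clock and measured latencies of 50 and 100 ns), and the simple "tRAC = RAS-to-CAS plus CAS" model is the simplification used here, not a full DRAM timing model.

```python
# Back-of-the-envelope latency arithmetic using the figures quoted in the text.

def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def trac_ns(trcd_cycles: int, cas_cycles: int, clock_mhz: float) -> float:
    """Simplified tRAC: RAS-to-CAS delay plus CAS latency, in nanoseconds."""
    return (trcd_cycles + cas_cycles) * cycle_time_ns(clock_mhz)


def extra_cycles(measured_ns: float, dram_ns: float, clock_mhz: float) -> float:
    """Clock cycles of a measured latency that the DRAM timings do not explain."""
    return (measured_ns - dram_ns) / cycle_time_ns(clock_mhz)


CLOCK_MHZ = 200.0
dram_latency = trac_ns(2, 2, CLOCK_MHZ)
print(f"tRAC at 2-2 timings: {dram_latency} ns")

for measured in (50.0, 100.0):
    cycles = extra_cycles(measured, dram_latency, CLOCK_MHZ)
    print(f"measured {measured} ns -> {cycles} cycles outside the DRAM")
```

Whatever is left after subtracting the DRAM part is what the CPU, the bus and the chipset's address and command decoding have to account for, which is where PAT comes in.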
Pipelined MRO Processing
Modern PC architecture uses pipelines: instead of digesting the entire chunk of data in a single process, the data are chopped up into smaller operations that are handled one by one. Each of these processes requires at least one cycle, regardless of whether it is the chipset, CPU or any other system component such as memory. The same principles apply to the memory request organizer, which controls and sorts out all memory requests going to AGP or CPU or else to the various DMA channels in the system. Depending on the efficacy of the pipeline stages, however, it is possible to skip some of them, in particular those used to buffer data in between processing steps. The underlying principles are similar to those used by AMD in the Super Bypass of the 751 North Bridge or the programmable CAS latency in the DRAM output path.
Speedbin and Quality: separating Canterwood from Springdale
The context of CAS latency is actually a good example to explain the chipset latencies. First, at lower frequency, the CAS latency can be lowered, which means that bypass switches are closed or opened depending on the MRS (short for mode register set) commands during device initialization. In other words, if DRAM can work with fewer pipeline stages at lower frequencies, then the same is possible on the chipset level as well. Second, if higher grade memory is capable of maintaining lower latencies at higher frequencies, the same will go for chipset speed bins; on the chipset, lower latencies at higher frequency are called PAT. Third, if lower grade memory is to be run at higher speed, it will be necessary to add extra pipeline buffers; the same holds for chipset bins.
The overclocking community is well aware of the fact that there is always the possibility to run hardware faster than what the specs call for. That means there is always a possibility to run a CAS-2.5 DIMM at CAS 2, with a certain risk for stability included. The problem in this case is that at some point, there will be a brick wall at which the settings will have to be changed back to a more conservative mode of operation, and that will cause a relatively greater performance hit than if the original settings were maintained across the board. Keep in mind that we use the term "relative", meaning that the absolute performance can still be above average; it just does not scale with the expectations derived from the somewhat inflated performance at stock settings.
To summarize the above: At lower frequencies, the chipset has enough time to process all memory requests with relatively few pipeline stages in the memory controller. To warrant operability at high frequencies, extra pipeline stages are added that are activated at the 800 MHz PSB frequency. If these pipeline stages are bypassed by selecting the low frequency / low latency mode while maintaining high bus speed, it will result in reduced address and command decode latencies or PAT.
This, of course, opens up the possibility to activate the bypass by simply faking a lower bus speed to the chipset while cranking up the frequency. The reference signal to the chipset or Core / FSB Strap (BSEL) is a two bit signal that specifies either 100, 133 or 200 MHz at the rising edge of PWROK (power ok during initialization). That is, during initialization, the MCH Configuration Registers are written to offset C6-C7h in the PCI registers (like the above mentioned memory MRS) with bits 1-0 in the register defining the frequency. Keep in mind that the default value is assigned to the registers by the strap assigned to the BSEL pins. Therefore, if the strap can be changed, the frequency assignment will also be changed and if the chipset thinks it is running in 533 MHz mode while receiving an 800 MHz clock input, it will activate the bypass. All that is needed is a BIOS switch to pull up the "wrong" BSEL signal.
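To make the strap trick a bit more concrete, here is a minimal Python sketch of the logic as we understand it. The bit-to-frequency table and all function names are our own placeholders for illustration; the real bit assignments are in Intel's datasheet and are not reproduced here. The model simply assumes that the extra pipeline stages engage only when the chipset believes it is in 200 MHz (800 MHz PSB) mode.

```python
# Toy model of the BSEL strap trick. The encoding below is HYPOTHETICAL;
# it only illustrates a two-bit field selecting 100, 133 or 200 MHz.
STRAP_TO_BASE_CLOCK_MHZ = {0b00: 100, 0b01: 133, 0b10: 200}


def strap_field(config_word: int) -> int:
    """Extract bits 1-0 of the 16-bit configuration word."""
    return config_word & 0b11


def extra_stages_active(config_word: int) -> bool:
    """Assumed rule: extra pipeline stages are used only in 200 MHz mode."""
    return STRAP_TO_BASE_CLOCK_MHZ.get(strap_field(config_word)) == 200


def describe(config_word: int, actual_clock_mhz: int) -> str:
    """Summarize what the chipset believes versus the clock it really gets."""
    believed = STRAP_TO_BASE_CLOCK_MHZ.get(strap_field(config_word))
    if believed is None:
        return "reserved strap value"
    mode = "extra stages active" if believed == 200 else "bypass active"
    return (f"chipset believes {believed} MHz, "
            f"actual clock {actual_clock_mhz} MHz: {mode}")


# Honest strap at 200 MHz versus a "wrong" strap with the same 200 MHz clock.
print(describe(0b10, 200))
print(describe(0b01, 200))
```

The second call is the interesting one: the clock input is unchanged, but because the strap field reports the slower mode, the model keeps the bypass closed, which is the whole point of the BIOS switch.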
Of course, there is a problem with this scheme. Usually, wafers are tested to make sure that the speed bin will be able to withstand the high operating frequency at reduced pipeline stages. Those that pass are binned as Canterwood and packaged using the more elaborate 1005 BGA package. Keep in mind, though, that only a small fraction of the dies is tested, and that qualification means passing speed tests way in excess of the nominal frequency.
This leaves a vast majority of chips that will pass at the nominal value but with small margins, and probably more that just fall through the cracks because they were not tested at all. What it all comes down to is that most Springdale PE chipsets will more or less meet the PAT requirements.
On the legal side of things, PAT is reserved for Canterwood and Canterwood only and, therefore, the name cannot be used for any other chipset. A number of alternative acronyms matching the first names of Intel fellows, like Performance Enhancement TEchnology (McWilliams) or Just A Notch faster (Camps), or of one of ASUS's own employees, like Fu King (fast), come to mind, but I digress.
At One Glance
What You Get
The hardware bundle shipping with the P4P800 is somewhat pedestrian and is limited to three Parallel ATA ribbon cables and two Serial ATA data cables along with one floppy cable and the I/O shield. The bag of extra jumper caps is nice to have but does not really qualify as a bundle. One additional item we mentioned earlier is the "Instant Music" keyboard template that will fit some straight keyboards; users of ergonomic or other non-standard keyboards will have to put stickers (included with the board) on the respective keys. What is missing altogether are the power adapters for SATA drives that are included, e.g., with the ABIT IC7 and with other boards as well. Since these adapters do not ship with SATA drives and are hard to find in the retail channel, this can be a problem.
Documentation is provided in the Artificial intelligence fold-out poster, the Quick Setup Reference, a sticker with the layout and connector configuration and the main manual. ASUS manuals have always been among the best in the business and the P4P800 Deluxe is no exception.
Bundled software is a copy of InterVideo's WinDVD Suite, comprised of WinRip and WinDVD Creator.
Ai for Artificial intelligence
As with the P4C800, the P4P800 features a number of so-called Artificial intelligence features like the Ai Net Solution with autodiagnostics such as the Virtual Cable Tester to find a faulty cable. Quite frankly, a CSA solution would be much more appropriate since the PCI-based 3Com solution can only handle half duplex anyway, and as for the diagnostics, let me put it like this: the SMC BDT9332 already had these diagnostics 6 years ago.
The Ai Audio Solution is not that intelligent either since the auto detection of features is limited to the detection of something being plugged into the port, which then still requires manual identification of the device before the system will generate an error message if that is appropriate. At the same time, the SoundMAX technology enabling these features will only install when four or more speakers are already plugged in. In other words, what's the point of highlighting the configuration detection and help options if you have to figure it out in the first place in order to install the utility?
The Ai CrashFree BIOS2 is another mixed blessing. With the P4C800, our experience has been that the system would randomly start flashing the BIOS during POST if the ASUS CD or a floppy with the binaries was in the drive. The POST reporter is the first feature for me to turn off on any board that has it. Qfan technology, on the other hand, is a nice feature as long as the heatsink used is not at the upper end of the performance and quality scale; in that case, the thermal capacitance will slow down the response time and lead to some rather erratic behavior.
The Ai Overclocking simply increases all frequencies by the selected value and was functional up to a 10% overclock, beyond which the board would not even initialize, even after lowering the CPU multiplier on the unlocked P4. We'll have more details on that later.
Last but not least, we have the ASUS MyLogo2 for customizable boot logos and Instant Music, with the latter not working on the test board.
Overall, my main criticism of Ai is that intelligence in this case is really lower case. Moreover, the entire scheme is too pretentious and creates a shady impression of ASUS, but then, the world wants to be lied to.
At first glance, the P4P800 looks similar to the P4C800 Canterwood board but it only takes a second look to realize that the entire board is a completely different design. Where the CPU socket 478 was in East-West orientation on the P4C800, it is vertically oriented on the P4P800. This, in turn, pushes the MCH down a bit more, making its passive cooler butt flush against the AGP slot, which has moved up one slot on the PCB, too, leaving an empty slot before the five PCI slots start. The intent behind this is not entirely clear; the negative side effects, however, are: it is now impossible to switch memory modules in the first two DIMM slots without removing the AGP card first. The AGP retention mechanism is luckily one of the better kind, using a sliding mechanism that can actually be accessed even after the card has been installed. Underneath the 5th PCI slot is the ASUS WiFi slot, to be used with ASUS wireless modules to come.
The test board also featured a red AGP warning LED that lights up whenever the graphics adapter is not inserted correctly or if an older 3.3 V AGP card, which can damage the mainboard, is inserted. We have learned from our friends at Tomshardware that a production board they picked up in Munich's Red Light district was missing the red LED, which they discovered after getting back to their lab. Coincidence?
Left to right, the CPU socket of the review board contains a few extra decoupling capacitors; the red AGP warning LED is omitted from the production boards, also note the proximity of the AGP slot to the DIMM slots; the ICH5/R supports SATA RAID0; the VIA VT6410 adds additional Parallel ATA RAID functionality.
The four memory slots are color-coded in blue and black to take the guesswork out of identifying the complementary DIMM slots within each channel. The two IDE connectors are underneath the floppy and 20-pin ATX power connector at the upper far right of the PCB, that is, on the far side of the DIMM slots.
The ICH5/R (and in this case, it is the /R version that is being used) sits at the level of the first PCI slot with the two SATA connectors directly adjacent to it. The ICH5/R only supports RAID 0 and UATA 66/100. Therefore, ASUS has added the VIA VT6410 IDE RAID controller with support for UATA 133 as well as for RAID 0, 1, 0+1 and JBOD configurations. Awkward, though, is the placement of the two IDE connectors in that one of them is a 90 degree angled connector whereas the second is a standard, upright type in horizontal orientation.
ASUS relies on the Intel reference design around the ADP3168 programmable multiphase controller, in this case in a three phase configuration, using the ADP3148 driver chips for a Flex PWM architecture with temperature-compensated inductor current sensing, up to 1 MHz per phase operating frequency and a new output disable function for controlled shutdown with no negative current spikes during shutdown. Power is drawn directly from the 12V rail of the power-supply to allow enough juice to go to the processor. In the case of the P4P800, three power MOSFETs are used per phase just like in the Canterwood solution. The position of the auxiliary dual 12V input is directly adjacent to the VRM and once again, there will be some who complain about airflow restrictions but frankly, we don't see that problem.
ASUS has adopted the ADI1985 AC'97 audio CODEC with the SoundMax4 utility that is also featured on Intel's own boards. I am personally still not too thrilled with the sound quality and, moreover, SoundMax4 and the "intelligent" features only work with a minimum of four speakers; otherwise they won't even install correctly. A related issue is that upgrading the audio configuration from two speakers to e.g. a 5.1 setup requires uninstalling the existing drivers and reinstalling them. Needless to say, with two speakers the auto recognition does not work either, which also holds for e.g. plugging a microphone into the wrong jack. Overall, I still prefer the ALC650 or any other solution over the SoundMax4. On the upside, everything, including the microphone, was working as it should.
The audio I/O path can be redirected from the back panel to any front panel interface by resetting the jumpers on the FP_Audio header on the far left next to the fourth PCI slot. In addition, the board features a header for SPDIF_OUT but there are no provisions for digital audio input (as with the ALC650).
Left to right, the I/O back panel with its four USB 2.0 ports, the Gigabit LAN connector and the three multifunction analog audio jacks; the VIA VT6307 FireWire controller is currently one of the most popular IEEE1394 controllers; the 3Com Gigabit controller does not support CSA and is connected to the PCI bus.
Press Review Board vs. Retail Boards
There have been a few changes from the review samples to the retail boards and even more rumors spun around those changes. The red LED that was missing on THG's retail board is one of the omissions. Most other alterations are simplifications of the design, mostly the removal of decoupling capacitors that were found unnecessary. The reason why those were included on the preproduction samples is supposedly that the "Secret" ICH5 samples still had some bugs, especially with respect to the USB functionality, and some stability issues that were partially ameliorated by overloading the boards with capacitors. Frankly, I have no problem believing this. Of course, the real test would be to take a retail board and add those capacitors again to see whether they make a difference in overclocking. As for the Red Light, er, red LED ....
Jumpers and Other Connectors
Although jumper-free for all standard configuration issues, the P4P800 has a number of jumpers, mostly serving the purpose of power-up options. Examples are the keyboard and mouse as well as the USB power-up options. The other jumpers on the board enable or disable SMBus support for the PCI slots and, finally, clear the CMOS.
Three fan headers are for CPU, chassis and PSU fans. In addition, we have the ASUS-typical chassis intrusion header and the actual SMBus interface. The USB 5-6 and 7-8 headers are on the level of the second PCI slot. The COM2 header, the GamePort MIDI interface (module not included) and the IEEE 1394 connector line the bottom edge of the PCB.
Like the P4C800, the P4P800 uses a BIOS from American Megatrends. There are good and bad things about it. The good thing about any AMI BIOS is that it boots up extremely fast; the bad thing is that usually scores of settings are dysfunctional. This probably best describes the BIOS of the P4P800 as well.
In addition, the structure of the AMI BIOS is somewhat counterintuitive, making it difficult to navigate, but in all fairness, a lot simply has to do with what we are used to, and after a few days, finding the ins and outs became just as natural as with the Award Medallion BIOS. There are certainly a number of Phoenix Award BIOSes out there with a much worse configuration.
The Main screen features the usual options like system time and date and IDE configuration. However, the SATA settings are hidden behind a separate IDE Configuration tab in the same listing, which leads to a secondary page offering Enhanced and Compatible Mode, where Compatible means that only a total of four ATA devices are supported, a concession to older operating systems. Enhanced Mode supports all four P-ATA devices along with the two SATA channels for a total of six possible drives. If Compatible Mode is selected, the next tab offers all the different possibilities for distributing the four devices over four of the six channels. Actually, this is a much cleaner way of doing it than the cryptic ways of the Award Phoenix BIOS, which does not offer the custom configuration possibility.
If SATA is enabled, the next tab queries for RAID or non-RAID setup of the SATA drives. It is important here to know that the SATA BOOT ROM needs to be enabled in order to boot off SATA drives.
Back at the top level, the next screen is the Advanced Menu featuring Jumperfree Configuration, CPU Configuration (which, aside from the HT setting, is only a summary statement of the different CPU parameters), Chipset, Onboard Devices, PCI PnP and USB Configuration and, last but not least, the Speech configuration and Instant Music. As mentioned earlier, Speech is the first thing that I disable; I simply can't stand computers talking back at me. Instant Music did not work for whatever reason. The manual states that after enabling Instant Music, the Scroll Lock LED on the keyboard should light up. Well, it did not, at least not with any of the keyboards tried.
The Jumper-Free Configuration contains all voltage and frequency settings that are found e.g. in ABIT's SoftMenu. The Ai overclock tuner allows manual setting of the PSB from 100 to 400 MHz as well as some nondescript overclocking of all parameters by 5, 10, 20 or 30%. Parameters like memory frequency and AGP/PCI frequency are not accessible in that case but at least the memory frequency is overclocked by the same ratio. Quite honestly, this is going back into the stone age of overclocking. A 10% overclock worked, but at 20% the system would not even initialize. This is not surprising, even with an unlocked CPU, since the memory would be clocked at 480 MHz under these conditions. One interesting thing about the Ai Overclocking was the fact that its use was the only way of forcing memory settings other than 2:2:2:5; more details later. Hopefully, the production boards will fix these issues but otherwise, the Ai overclock tuner does not live up to even the lowest expectations.
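As a quick illustration of why the percentage presets run into trouble so fast, here is a minimal Python sketch that scales the nominal clocks by the selected value. The nominal figures (200 MHz bus clock, DDR400 memory) are our assumption for a stock 800 MHz PSB setup and the function name is made up for the example; the 480 MHz memory figure at 20% is the one quoted in the text.

```python
# Percentage overclock applied uniformly to all clocks.
# Nominal values are ASSUMED for a stock 800 MHz PSB / DDR400 configuration.
NOMINAL_BUS_CLOCK_MHZ = 200
NOMINAL_MEMORY_RATE_MHZ = 400  # DDR400 effective data rate


def overclocked(nominal_mhz: float, percent: float) -> float:
    """Scale a nominal frequency up by the given percentage."""
    return nominal_mhz * (100 + percent) / 100


for percent in (5, 10, 20, 30):
    bus = overclocked(NOMINAL_BUS_CLOCK_MHZ, percent)
    memory = overclocked(NOMINAL_MEMORY_RATE_MHZ, percent)
    print(f"+{percent}%: bus {bus} MHz, memory DDR{memory:.0f}")
```

Since the memory is dragged along at the same ratio with no way to pick a divider, the presets are only as good as the weakest component in the chain.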
The DRAM frequencies can be set according to the processor used. The AGP/PCI frequencies can be set to 66/33, 72/36 and 80/40, which is another set of features going completely beyond my understanding since overclocking the AGP and PCI bus does not increase performance by any means but only introduces stability problems and extra wear on the hardware.
The CPU Vre setting allows only the somewhat disappointing range of 1.55V to 1.5875V, which will cause some uproar in the overclocking community, even though we had no problems pushing the 3.0 GHz P4 all the way up to 3.7 GHz without any voltage mods. What probably helped in this respect was that the Vre measured directly on the CPU was 1.62V instead of 1.5875V. The DDR voltage can be increased up to 2.85V and all voltages were close enough to the settings to ignore some minor fluctuations. In stark contrast to those rather limited voltage settings is the AGP VDDQ range, spanning from 1.5V all the way up to 1.8V. I personally have still not figured out what advantage the voltage increase at the AGP I/O buffers should have, neither have I ever seen any effect other than the introduction of stability and heat problems.
The last entry on the page is the Performance tab that can be set to either Auto, Standard or Turbo. At 800 MHz FSB, nothing we did with this setting appeared to change anything with respect to performance. The situation was different, though, at 250 MHz, where the board took the expected performance hit (see first page of this review) if left on Auto or Standard. Setting the value to Turbo, however, appears to force PAT back in, thereby cranking the performance back up to where it is expected. We did not check for the trip point at which the performance setting falls back to the longer pipelines by default.
Skipping the CPU Configuration, the next tab is the Chipset submenu featuring the DRAM latency parameters and a Burst Length selection of 4 or 8 Clocks. The latter is not possible since DDR DRAM chips only support bursts of up to 8 bits per data line (which equals 4 clocks). Most likely ASUS means 4 and 8 QuadWords here, which would be 2 or 4 clocks (please correct with the next BIOS revision).
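The burst length arithmetic is easy to verify. The following Python sketch assumes a standard 64-bit DIMM data bus (one quadword per transfer) and two transfers per clock, which is what "double data rate" means; the function names are ours and only serve the example.

```python
# Burst length arithmetic for DDR memory on a 64-bit data bus.
TRANSFERS_PER_CLOCK = 2  # DDR: one transfer on each clock edge
BYTES_PER_TRANSFER = 8   # 64-bit bus = one quadword per transfer


def burst_clocks(transfers: int) -> float:
    """Clock cycles needed to complete a burst of the given length."""
    return transfers / TRANSFERS_PER_CLOCK


def burst_bytes(transfers: int) -> int:
    """Bytes delivered by a burst of the given length on one channel."""
    return transfers * BYTES_PER_TRANSFER


for length in (4, 8):
    print(f"burst of {length} quadwords: "
          f"{burst_clocks(length)} clocks, {burst_bytes(length)} bytes")
```

So a setting of "4 or 8" only makes sense if the unit is quadwords rather than clocks, which is the reading suggested above.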
Memory Acceleration mode, enabled or disabled, did not change anything as far as we can tell, at least not at 800 MHz PSB. The Idle timer should be set to 16T for best performance or shorter if the board is to be used as a server. As in the case of the P4C800, the MPS revision is set by default to 1.1, which is the version compliant with Windows NT versions older than 3.51; the correct setting for NT 4.0, Windows 2000 or XP would be 1.4.
As we already mentioned, none of the latency settings is functional except when AI overclocking by X% is enabled; the board usually showed a 2:2:2:5 configuration in CPU-Z even though the BIOS showed different settings. After enabling AI overclock and then reverting to the manual mode, some of the latency settings suddenly became workable, at least to the point where they would change something, even though the resulting settings were in no case what we manually entered. Certainly an interesting way of playing memory roulette.
The Onboard Devices menu allows enabling / disabling of all integrated peripherals like AC'97 sound, VT6410 RAID controller, FireWire and the 3Com LAN.
VT6410 RAID Controller
As strange as it may seem I have no pair of Parallel ATA drives left that I could have used for IDE RAID configuration, therefore, I cannot comment on the performance or any other properties of the controller. It appears to be an anachronism to have it on the board but that is only my personal opinion.
The PCIPnP configuration is rather standard, the USB Configuration offers the novelty of USB Mass Storage device configuration. The rest of the BIOS is plain vanilla.
Overall, the entire CMOS setup appeared to be somewhat immature with most of the basic features being functional but there are some major bugs to be weeded out still.
As usual, we take a quick look at SiSoft Sandra. For simplicity, and since both Integer and Floating Point bandwidth are within a fraction of a point of each other, we are only showing the Integer bandwidth with buffering enabled to get some idea about the overall throughput under ideal conditions. Typically, Canterwood boards are running between 4700 MB/s and 5100 MB/s depending on the manufacturer and the BIOS / memory configuration, whereas all Springdale boards we have looked at so far scored below 4600 MB/s buffered bandwidth.
** The original BIOS shipped with the board gave higher memory performance but had some stability problems; all benchmarks shown were run with 1006.final.
Buffered memory bandwidth. At default settings, the P4P800 beats every other Springdale board with respect to raw memory bandwidth and also beats all Canterwood boards with the sole exception of the ASUS P4C800, which still scores marginally higher. At 250 MHz FSB with the memory running in DDR320 mode, the board defaults back to the "standard" setting and takes a performance hit compared to e.g. the Canterwood-based ABIT IC7. Forcing the "PAT" setting back into the chipset by selecting Turbo performance in the BIOS quickly remedies this performance hit and brings the P4P800 almost back to the top. At 300 MHz PSB with the memory running in 3:2 mode, the P4P800 turned in the highest scores we have seen thus far.
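For readers trying to keep track of the memory clocks behind these three configurations, here is a minimal Python sketch of the divider arithmetic. We assume that the "DDR320" BIOS setting corresponds to a 5:4 bus-to-memory ratio (160 MHz memory clock at a 200 MHz bus clock); that mapping and the function names are our assumptions for illustration, while the 3:2 ratio is the one stated in the text.

```python
from fractions import Fraction

# Memory clock derived from the bus clock and a bus:memory ratio.
# The 5:4 ratio for the "DDR320" setting is an ASSUMPTION.


def memory_clock_mhz(bus_clock_mhz: int, bus_part: int, mem_part: int) -> Fraction:
    """Memory clock for a given bus clock and bus:memory ratio."""
    return Fraction(bus_clock_mhz * mem_part, bus_part)


def ddr_rating(mem_clock_mhz: Fraction) -> int:
    """Effective DDR data rate: two transfers per memory clock."""
    return int(mem_clock_mhz * 2)


cases = [
    ("default, 1:1", 200, 1, 1),
    ("DDR320 setting, 5:4 assumed", 250, 5, 4),
    ("3:2 setting", 300, 3, 2),
]
for label, bus, bus_part, mem_part in cases:
    mem = memory_clock_mhz(bus, bus_part, mem_part)
    print(f"{label}: bus {bus} MHz -> memory {float(mem):.0f} MHz "
          f"(DDR{ddr_rating(mem)})")
```

Under these assumptions the memory itself runs at the same 200 MHz in all three cases, so the differences in bandwidth come from the bus and the chipset rather than from the DRAM.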
A Quick Word About Overclocking
Despite the limited CPU voltage range, we had no problems running the P4P800 overclocked to 3.6 or 3.7 GHz processor speed. In terms of overall board stability, the P4P800 was rock solid up to 300 MHz PSB but completely failed above 302 MHz bus speed. Kyle from [H]ardOCP had emailed me about stability problems above 250 MHz that he experienced with his board unless active cooling was added to the MCH. We did not see anything like this; moreover, the MCH heatsink never got anything but lukewarm to the touch. Whether there is a story behind this or whether it is just coincidence is completely beyond our knowledge and we won't comment further.
Memory Access Latencies
The real story unfolding behind the P4P800 has been that ASUS was able to find a way around the Springdale limitations, that is, to avoid activation of the additional pipeline stages in the memory control and address generation path. Essentially, and that is just another angle of looking at the entire situation, PAT is roughly equivalent to an overclocked Granite Bay chipset, but to ensure functionality of lower quality grades, the two additional pipeline stages have been added (with the respective bypass switches).
Take the Canterwood configuration and add an extremely aggressive BIOS with the shortest memory access latencies and the result will be the P4P800 performance. If it works as we assume, the access latencies will be in the ballpark of what we showed for the Canterwood in our "to PAT or not to PAT" article. If it does not, the P4P800 latencies will rather be in the Springdale-typical range. As baseline, we are using the ABIT IC7 with its upper middleclass Canterwood performance.
ABIT IC7 Canterwood vs. ASUS P4P800 Springdale memory access latencies, lower is better. The P4P800 latencies are plotted as solid columns, the IC7 as transparent columns. The results are stunning, to say the least: the P4P800 literally destroys the IC7 in this discipline despite the fact that, nominally, the Canterwood chipset is supposed to be faster.
Running at 300 (1200) MHz
One of our hypotheses from the very beginning of this review was that at some point during overclocking, there has to be a trip point at which the additional pipeline stages need to be activated to allow the higher operating frequencies. At the same time, a valid question was what would happen to memory access latencies if, at higher PSB frequency, the memory bus was run asynchronously at a lower frequency, which, as we know, increases overall bandwidth compared to the same memory frequency with a lower, synchronous PSB. We are using the same IC7 reference data as above.
ABIT IC7 Canterwood vs. ASUS P4P800 Springdale memory access latencies, lower is better. The P4P800 latencies, this time running at a 1200 MHz PSB with the memory bus at 3:2 setting, are plotted as solid columns, the IC7 as transparent columns.
With the memory running at the same latencies on both systems, as well as on the one shown above, the access latencies should still drop quite dramatically as a function of the 50% higher chipset frequency. However, at the higher speed, switching on the additional pipeline stages costs some performance, and there may be an additional performance hit caused by the asynchronous memory operation. Still, overall, the latencies are even lower than those shown above, which is consistent with the higher memory bandwidth scores if the PSB is raised while the memory bus is kept at the same speed.
Keep in mind that what we are looking at are composite latencies consisting of the DRAM and the chipset latencies. The purpose of this graph is not so much to make a science project out of the actual data contained therein but rather to provide some food for thought for those who really are into these issues.
As mentioned earlier, we have almost completely transitioned to SATA drives. Except for a bunch of old IBMs and ATA66 Western Digital drives, there are no parallel drives left to test the functionality of the VIA RAID controller and therefore we would rather skip that part than test with obsolete equipment.
SiSoft Sandra FileSystem
The numbers speak for themselves. Again, we don't care about the "Random" numbers, which depend on too many user-defined parameters to be reproducible.
ATTO Disk Benchmark
As always, only the lower part of the graph really counts, where data transfers outweigh bus occupancy by Frame Information Structures and the result is saturation of the sequential read and write performance. The results are basically identical to those shown with SiSoft Sandra, with a slight offset.
The two sets of Winstone, that is, Business and Content Creation, are still appropriate measures of overall system performance. The Business suite is severely bottlenecked by the system's I/O performance, meaning that increasing CPU speed will not buy much in terms of extra scores. Content Creation is more CPU intensive, and here we see a strong showing of the P4 overclocked to run at 3.6 GHz (12 x 300 MHz). Again, to avoid confusion, the CPU used is an unlocked engineering sample. On the other hand, we also have to stress the fact that the system put away the 300 MHz FSB without any problems whatsoever, which is rather remarkable, particularly in light of the aggressive BIOS.
Even at default settings, the P4P800 zooms right to the top of the list; overclocking to 3.6 GHz just adds a little extra. Keep in mind that the P4C800 was running with the Promise controller instead of the ICH5R, which causes a small performance hit.
In CCWS2003, the P4P800 once again takes a small lead. However, the benchmark is mostly processor limited (if all other bottlenecks are resolved) and a 20% higher CPU speed pays off big time. Certainly, the higher PSB also helps.
Once again, we were tempted to migrate to 3DMark2003. However, since the only meaningful score for system performance is the CPU score, which, with an identical CPU becomes a self-fulfilling prophecy (but still makes for a good waste of bandwidth), we'll just leave it out and stick with 3DMark2001SE.
At default settings, the P4P800 once again destroys the rest of the field. Note, however, that the scores at 250 MHz (250 MHz x 12 = 3.0 GHz) are actually lower than the ones achieved at default speed. Overclocked to 3.6 GHz (300 MHz x 12), the P4P800 makes everybody else just eat dust.
Again, at 12 x 250 MHz for 3.0 GHz total clock speed, with the memory running in DDR320 mode for a 400 MHz data rate, the P4P800 takes a performance hit. That still does not overshadow the fact that at default it is almost as fast as the P4C800.
UT2003 paints exactly the same picture: the P4P800 is marginally slower than the P4C800 but faster than the rest of the field, with the same small performance drop at 250 MHz.
One thing to keep in mind here is that the 200 MHz default PSB frequency is really 202 MHz which translates into 3030 MHz default CPU frequency rather than the 3.0 GHz it is supposed to be running at. The 250 MHz setting, on the other hand, gave us 3000.3 MHz, meaning that there is a 30 MHz delta in clock speed that can account for some of the performance differences.
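The arithmetic behind that delta is simple enough to sketch. The snippet below assumes a 15x multiplier at the default setting (implied by 3030 MHz at 202 MHz) and back-calculates the actual PSB at the 250 MHz setting from the 3000.3 MHz reading; the helper name is hypothetical.

```python
def cpu_clock_mhz(psb_mhz, multiplier):
    """CPU core clock as PSB base clock times multiplier."""
    return psb_mhz * multiplier

# Default setting: nominal 200 MHz PSB actually runs at 202 MHz.
# The 15x multiplier is an assumption inferred from 3030 / 202.
default_clock = cpu_clock_mhz(202, 15)

# 250 MHz setting with the 12x multiplier, as reported in the text.
overclocked_clock = 3000.3
actual_psb = overclocked_clock / 12

delta_mhz = default_clock - overclocked_clock
delta_pct = delta_mhz / default_clock * 100

print(round(actual_psb, 3), round(delta_mhz, 1), round(delta_pct, 2))
```

The gap works out to roughly 1% of core clock in favor of the default setting, which is small but enough to blur comparisons between the two configurations.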
One thing is for certain: there are a whole bunch of conclusions that can be drawn from this review. Foremost, there is no doubt that ASUS were the first to figure out how to reconfigure the 865 MCH to run in Canterwood mode, that is, with the bypass switches closed to avoid the extra buffering stages in the pipeline added for stability at high speed. It appears that, at least with some 865PE samples, the extra-long address and command decode pipeline is not necessary. The next question in this regard is, of course, how ASUS will control the speed bin of the 865PE memory controllers. Without proper binning, this could easily turn into Russian roulette.
Whichever way ASUS (and their followers) play the game, there is one thing to keep in mind: Intel does not condone the bypass mode on the Springdale chipset and, thus, the only ones liable if anything goes wrong are the mainboard manufacturers bypassing Intel's speed stops. On the other hand, Pandora's box has been opened and there is no stepping back. To remain competitive in the Springdale market, every manufacturer will have to do the same. Let the slaughter begin and the RMAs soar; this is called self-regulation of the market. Then again, if it turns out that there are no problems, let Intel bitch as much as they want, but most likely they won't even do that. What better advertising could they get? And then, there is always still the possibility of selling the Canterwood chipsets to the OEMs.
As for the P4P800, the overwhelming impression is that this is easily the fastest board overall that we have laid our hands on thus far. At default, the performance blows away anything else; Canterwood or Springdale, it really does not matter. At the same time, the stability was phenomenal, but there is the caveat that the production boards may have a tough act to follow.
One thing that definitely needs a sizeable amount of attention is the BIOS and its lack of functionality regarding even some of the most basic parameters. We spent one week going through all possible permutations of latency settings. Some would work, others would not or only randomly, and at default, nothing would work at all, so we are left with the impression that there are still some rather serious issues. Granted, WE did not have any problem with DDR400 at 2:2:2:5, but it is foreseeable that the majority of users will run into a brick wall right there.
Artificial intelligence is what ASUS calls their new set of features. It is not clear where that name was derived from. Most of the features are gimmicks or don't work as advertised, and the overall impression is that the marketing people at ASUS must think that the enthusiast community is a bunch of idiots that probably need a bit of artificial intelligence. The feature set is interesting for the OEM market, but if ASUS wants to address the overclocking and enthusiast community, they really need to step back and rethink their strategy, which only lacks a Barbie doll included with the accessories. Keep in mind that this verdict is completely independent of the hardware itself which, at least with respect to the sample we received, is nothing short of a masterpiece of engineering.
Copyright © 2002