PC Authority Benchmarks

Staff Writers
Since the launch of PC Authority magazine 30 issues ago, we have placed an emphasis on the concept of computing in the real world. This means that all the PC hardware that passes through our Labs is tested objectively, using methods that give you an accurate and useful indication of how that hardware would perform in everyday use. A component-based test might tell you how fast a particular hard disk or graphics card is in isolation, and a synthetic test might tell you how many MIPS or FLOPS a CPU is capable of, but if you are interested in purchasing a system you need to know how well all the components work together as a unified whole. Synthetic tests also have the disadvantage of being an abstraction from the real-world function of a component: the number of MIPS or FLOPS a CPU is capable of might not be a reliable indication of how that processor will perform in a spreadsheet or graphics application.


For this reason PC Authority firmly believes in system-wide, application-based benchmarks that treat the PC as a whole and test its performance in the tasks that our readers carry out on a daily basis. We also do not distribute the PC Authority Benchmarks, firstly because we use full versions of retail applications as an integral part of the tests, and secondly because we do not want PC manufacturers tweaking their systems purely for our benchmarks. The important features of any benchmark are that it is consistent, that it is accurate and that it gives a true representation of the PC's performance when it is applied to real-world tasks.


When it is necessary for us to test an individual component, such as a 3D graphics card or motherboard, as part of our second Labs each month, we ensure that we use only the best benchmarks available, and we test them exhaustively to ensure that they meet our rigorous standards for accuracy, consistency and real-world applicability. Here we will cover in detail the new PC Authority application-based Benchmarks, as well as the latest benchmarks that we use for the testing of 3D graphics and DVD playback, both of which are becoming increasingly important in the world of desktop PCs.


Real-world applications

The PC Authority Benchmark suite uses an assortment of off-the-shelf applications that are representative of those used on a day-to-day basis by a majority of our readers. The full version of each application is installed and run through its paces using its own built-in scripting language. The scripts force the application through a complex and demanding series of tasks intended to represent the worst a user could put the program through, and this stresses even the most powerful PCs as much as is possible. The tests are run at 1,024 x 768 resolution at 24-bit colour (or 32-bit colour where 24-bit is unavailable), which is the most common resolution used on PCs sold today.


The applications used in the PC Authority Benchmarks are:
* Adobe Photoshop 5
* CorelDraw 8
* FileMaker Pro 4.1
* Microsoft Access 2000
* Microsoft Excel 2000
* Microsoft Word 2000


Accuracy

The performance testing of PCs and notebooks is a strict discipline: it has to be both accurate and consistent. With the adoption by many manufacturers of very similar system components, results have to be accurate enough to distinguish all but the finest of differences between multiple machines. In addition, we have to know about the accuracy of the tests in advance so that we can tell whether the differences between results are statistically significant or not.


The PC Authority Benchmark results are designed to be accurate to within one per cent. This is achieved by ensuring that the machines being tested operate under standardised Labs conditions. Specifically, we ensure that there are no background tasks running while the tests are being executed. Anti-virus scanners, custom screen savers, display control panels and so on are all disabled, and all entries in the Startup group and in the Run and RunServices sections of the Registry are removed. This ensures that there are no extraneous events interfering with the operation of the tests. Each application test is also run six times, and the system is rebooted between runs in order to negate the effects of caching in RAM and in virtual memory.


Consistency

In order for a benchmark suite to give objective, meaningful and valuable results, the tests must be repeatable. The same measures that are taken to ensure that the benchmark results match or exceed a given accuracy also produce scores that can be repeated on subsequent occasions. Tests run by the Labs have demonstrated that the results generated on one PC can be repeated to within one per cent on another machine with an identical specification.


To achieve this level of consistency and known accuracy, each test in the suite is run six times, with the system being rebooted between iterations of the test. From the six timings that result, the closest four are selected, and only if these scores are within one per cent of each other is the result considered acceptable overall. If the deviation between the scores is greater than one per cent, then the Labs team will investigate what could be interfering with the running of the tests, and then will run them again until an acceptable score is achieved. In practice, most PCs produce six results that are within 0.5 per cent of each other, which is well within our tolerance limits. We believe that there is no other benchmark suite that can guarantee this level of accuracy and consistency.
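The selection rule described above can be sketched in a few lines of Python. This is an illustration only, not the Labs' own tooling: the function names are hypothetical, and it assumes that "within one per cent of each other" means the gap between the fastest and slowest of the four chosen timings, measured as a fraction of the fastest.

```python
def closest_four(timings, keep=4):
    """Return the `keep` timings that lie closest together."""
    if len(timings) < keep:
        raise ValueError("not enough timings to select from")
    ordered = sorted(timings)
    # Once sorted, the tightest cluster is always a run of neighbours,
    # so it is enough to compare each window of `keep` adjacent values.
    windows = [ordered[i:i + keep] for i in range(len(ordered) - keep + 1)]
    return min(windows, key=lambda w: w[-1] - w[0])


def relative_spread(values):
    """Gap between slowest and fastest, as a fraction of the fastest."""
    # Assumption: this is how "within one per cent" is measured.
    return (max(values) - min(values)) / min(values)


def accept_run(timings, tolerance=0.01):
    """Return (accepted, selected) for the six timings of one test."""
    selected = closest_four(timings)
    return relative_spread(selected) <= tolerance, selected


# Hypothetical timings in seconds, not measured results.
timings = [100.0, 100.4, 100.2, 103.0, 100.6, 98.0]
accepted, selected = accept_run(timings)
print(accepted, selected)
```

Here the two outlying timings are discarded and the remaining four sit comfortably inside the one per cent tolerance, so the run would be accepted. A run that failed the check would be investigated and repeated, as described above.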


Calculating the scores

To calculate a PC's final score, the times for each of the tests are compared with a collection of reference times that are used as a yardstick. The overall score generated is an indication of the performance of the system relative to this yardstick; thus a score of 2.00 means that the tested system completed the tests in half the time it took the reference system. The reference machine used was a Compaq Deskpro Pentium II/400 with 64MB of RAM running Windows 98 Second Edition and featuring an ATi 3D Rage Pro graphics card, 512KB of discrete Level 2 cache and a 6.4GB Fujitsu hard disk.
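As a rough illustration of the arithmetic, the sketch below divides the reference time by the measured time. That formula is inferred from the statement that a score of 2.00 means half the reference time; the function name and the timings are hypothetical.

```python
def relative_score(reference_seconds, measured_seconds):
    """Score of a test relative to the reference machine's time."""
    if reference_seconds <= 0 or measured_seconds <= 0:
        raise ValueError("timings must be positive")
    # Assumption: score is the ratio of reference time to measured time,
    # so a machine that takes half as long scores 2.00.
    return reference_seconds / measured_seconds


# Hypothetical timings: the reference took 120 s, the tested PC 60 s.
print(f"{relative_score(120.0, 60.0):.2f}")  # prints 2.00
```

On this reading, a machine that exactly matches the reference system scores 1.00, and anything slower scores below 1.00.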


Once the scores for each test have been calculated, they are weighted and combined into an overall score. The proportional weightings used for the overall score are as follows:
* Adobe Photoshop 5: 4
* CorelDraw 8: 5
* FileMaker Pro 4.1: 4
* Microsoft Access: 4
* Microsoft Excel Business: 7
* Microsoft Excel Scientific: 4
* Microsoft Word: 10
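The weightings above can be combined in more than one way, and the exact method is not spelled out here. The sketch below assumes a simple weighted arithmetic mean of the per-test scores; the function name and the sample scores are hypothetical.

```python
# Proportional weightings as listed above (they total 38).
WEIGHTS = {
    "Adobe Photoshop 5": 4,
    "CorelDraw 8": 5,
    "FileMaker Pro 4.1": 4,
    "Microsoft Access": 4,
    "Microsoft Excel Business": 7,
    "Microsoft Excel Scientific": 4,
    "Microsoft Word": 10,
}


def overall_score(scores, weights=WEIGHTS):
    """Combine per-test scores into one figure.

    Assumption: a weighted arithmetic mean. The actual combining
    method used by the Benchmarks may differ.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight


# Hypothetical per-test scores, each relative to the reference machine.
scores = {name: 1.0 for name in WEIGHTS}
scores["Microsoft Word"] = 2.0
print(f"{overall_score(scores):.2f}")
```

Passing a different `weights` mapping changes the emphasis, so a reader who spends most of the day in Photoshop could give that test a larger share of the total.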


These weightings are based on PC Authority reader research into how you typically spend your time at your work PC. They are a generalisation from a great many specific applications as used by a great number of people, and as such they will not reflect everyone's typical usage patterns. For this reason we will soon be including a roll-your-own application on both our cover CD and Web site. This application will allow you to adjust the weightings of the tests to suit your own requirements and thus tailor the Benchmark results to your specific needs.


Technical Editor, application installation routines: Tim Dean
Project Manager, Front Panel design and programming, CorelDraw 8 and FileMaker Pro 4.1 test design: Derek Cohen
Application installation routines: Dave Mitchell
Microsoft Access, Excel and Word 2000 test design: Simon Jones
Adobe Photoshop 5 test design: Tom Arah
Thanks to Compaq for supplying the reference PC and to the technical support teams at Adobe, Corel, FileMaker and Microsoft.

This article appeared in the May, 2000 issue of PC Authority.