AES Second Round Implementation Experience
Note: a new Serpent result is reported here, showing a 36% speed-up over the previous best Pentium II/III results through the use of MMX instructions.
This code implements and tests the five AES finalist algorithms on the Pentium II/III architecture.
This zip file contains the source code for the five AES finalists. The files are C source code files but they will also compile as C++ in Visual C++ if they are renamed with a 'cpp' file extension rather than 'c'. The compiler used by the author is Visual C++ version 6.
This zip file contains source code files for testing the AES finalists and consists of header files, an auxiliary file, aes_aux.c, and the following:
Again the files have 'c' extensions but can be used as C++ files by changing the extension to 'cpp'. In order to run the code you will need to set up appropriate directory paths in aes_config.h because I make some assumptions about where files are located. I have a master directory with a subdirectory for each algorithm where the source code and its test vectors are stored. The testing files listed above each contain details of the command line switches used to control which algorithms are tested on each run. The header file std_defs.h contains some common definitions. You will also need test vector files if you intend to verify the implementations.
All these files are updated versions of those published previously. There were two errors in the previous source code files: one in the Rijndael code that might cause errors for 256-bit keys, and one in aes_rav.c that meant it did not always test decryption after encryption to check that the original plaintext was correctly recovered.
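The encrypt-then-decrypt check mentioned above can be sketched as follows. This is only an illustration and is not taken from aes_rav.c: the `block_fn` signature, `round_trip_ok`, and the toy functions `toy_xor` and `toy_copy` are hypothetical names, and the toy "cipher" is a stand-in used purely to exercise the check.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_BYTES 16   /* 128-bit block, as used by the AES finalists */

/* Hypothetical block-function signature; the real interfaces in the
   zip files may differ. */
typedef void (*block_fn)(const unsigned char in[BLOCK_BYTES],
                         unsigned char out[BLOCK_BYTES]);

/* Returns 1 if decrypting the ciphertext recovers the plaintext, else 0. */
int round_trip_ok(block_fn encrypt, block_fn decrypt,
                  const unsigned char pt[BLOCK_BYTES])
{
    unsigned char ct[BLOCK_BYTES];   /* ciphertext */
    unsigned char rt[BLOCK_BYTES];   /* recovered plaintext */

    assert(encrypt != NULL && decrypt != NULL);
    encrypt(pt, ct);
    decrypt(ct, rt);
    return memcmp(pt, rt, BLOCK_BYTES) == 0;
}

/* Toy stand-in "cipher" (XOR with a constant); it is its own inverse.
   For demonstration only, with no security value. */
void toy_xor(const unsigned char in[BLOCK_BYTES],
             unsigned char out[BLOCK_BYTES])
{
    int i;
    for (i = 0; i < BLOCK_BYTES; ++i)
        out[i] = (unsigned char)(in[i] ^ 0x5a);
}

/* Deliberately wrong "decryption" that just copies its input, so the
   round-trip check has something to reject. */
void toy_copy(const unsigned char in[BLOCK_BYTES],
              unsigned char out[BLOCK_BYTES])
{
    memcpy(out, in, BLOCK_BYTES);
}
```

With these stand-ins, `round_trip_ok(toy_xor, toy_xor, pt)` succeeds, while `round_trip_ok(toy_xor, toy_copy, pt)` fails, which is the kind of fault a test that skips the decryption step would miss.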
I have been porting these files to ARM and in doing this I have also changed the Pentium code to improve performance in some areas. The figures below are hence not identical to those provided here previously. Some of the key schedule costs have been significantly reduced (my thanks go to Bill and John Worley of HP Labs for one suggestion here in respect of Twofish). DES results are provided for comparison purposes.
The figures in megabits per second (Mb/s) are for the 200MHz reference platform. Two cycle counts are provided for Rijndael key setup because its key schedule cost is different for encryption and decryption (the decryption figure also applies if both encryption and decryption are required).
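The relationship between a cycle count per block and a megabits-per-second figure can be sketched as below. This is a minimal illustration rather than code from the zip files: the function name `mbits_per_second` is hypothetical, 1 Mb is taken as 10^6 bits, and the block size is passed in explicitly (128 bits for the AES finalists, 64 bits for DES).

```c
#include <assert.h>

/* Hypothetical helper (not part of the published source): convert a
   cycles-per-block count into throughput in Mb/s at a given clock rate.
   Assumes 1 Mb = 1,000,000 bits. */
double mbits_per_second(double cycles_per_block,
                        unsigned block_bits,
                        double clock_hz)
{
    double blocks_per_second;

    assert(cycles_per_block > 0.0 && clock_hz > 0.0);
    blocks_per_second = clock_hz / cycles_per_block;
    return blocks_per_second * (double)block_bits / 1.0e6;
}
```

For instance, an illustrative (not measured) cost of 256 cycles per 128-bit block gives `mbits_per_second(256.0, 128, 200.0e6)`, which is 100 Mb/s on a 200MHz clock.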
Because of the higher overheads in C++ (virtual function tables and instance references) this code would be expected to be slower than that in C. This is true in general but the table shows that this overhead varies considerably between algorithms. Rijndael and Serpent both do reasonably well but Twofish suffers more than most, probably because this algorithm is sensitive to register allocation and there is less freedom for the compiler to optimise this in C++.
You are welcome to play with this code provided you attribute its source and comply with any constraints that algorithm designers have imposed.
(6th May 2000) - My thanks to Michael Dwyer of Intel for pointing out that the previous version of aes_tmr.c published on this page contained a bug that resulted in encryption being timed twice, rather than encryption and decryption each being timed once. The zip files below have now been revised to correct this error and to simplify some of the code. I have updated the table below but, fortunately, the changes are small since most algorithms are symmetric in respect of encryption and decryption times (the only significant change is for RC6 decryption in C).
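A timing harness that keeps the two directions separate might look like the sketch below. It is not the code in aes_tmr.c: it uses the standard C `clock()` function in place of whatever cycle counter the real harness uses, and `op_fn`, `timing_result`, `time_op`, `time_both` and the counting stand-ins `count_enc` and `count_dec` are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical signature for one timed operation (one block call). */
typedef void (*op_fn)(void);

typedef struct {
    double encrypt_seconds;
    double decrypt_seconds;
} timing_result;

/* Time 'reps' calls of a single operation using clock(). */
static double time_op(op_fn op, unsigned long reps)
{
    clock_t start, stop;
    unsigned long i;

    start = clock();
    for (i = 0; i < reps; ++i)
        op();
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Time encryption and decryption separately. */
timing_result time_both(op_fn encrypt, op_fn decrypt, unsigned long reps)
{
    timing_result r;

    assert(encrypt != NULL && decrypt != NULL);
    r.encrypt_seconds = time_op(encrypt, reps);
    r.decrypt_seconds = time_op(decrypt, reps);  /* decrypt, not encrypt again */
    return r;
}

/* Stand-in operations that only count calls, so that a test can confirm
   each direction really was exercised. */
unsigned long enc_calls = 0;
unsigned long dec_calls = 0;
void count_enc(void) { ++enc_calls; }
void count_dec(void) { ++dec_calls; }
```

Counting calls in the stand-ins is one simple way to catch the kind of slip described above: if the decryption slot were accidentally given the encryption routine, `dec_calls` would stay at zero.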