
AES Second Round Implementation Experience

Note: a new Serpent result is available here, showing a 36% speed-up over the previous best Pentium II/III results through the use of MMX instructions.

The Code

This code implements and tests the five AES finalist algorithms on the Pentium II/III architecture.

This zip file contains the source code for the five AES finalists.  The files are C source files, but they will also compile as C++ in Visual C++ if they are simply renamed with a 'cpp' file extension rather than 'c'.  The compiler used by the author is Visual C++ version 6.

This zip file contains source code files for testing the AES finalists and consists of header files, an auxiliary file, aes_aux.c, and the following:

  • aes_gav.c - for generating test vector files and optionally checking them against a reference set.
  • aes_rav.c - for running the algorithms and verifying their output against files containing reference test vectors.
  • aes_tmr.c - for testing the speed of the algorithm code.

Again the files have 'c' extensions but can be used as C++ files by changing the extension to 'cpp'.  In order to run the code you will need to set up appropriate directory paths in aes_config.h because I make some assumptions about where files are located.  I have a master directory with a subdirectory for each algorithm where the source code and its test vectors are stored.  The testing files listed above each contain details of the command line switches used to control which algorithms are tested on each run.  The header file std_defs.h contains some common definitions.  You will also need test vector files if you intend to verify the implementations.

All these files are updated versions of those published previously.  There were two errors in the previous source code files: one in the code for Rijndael that might cause errors for 256-bit keys, and one in aes_rav.c that meant that it did not always test decryption operations after encryption to check for the correct recovery of the original plaintext.
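The second of these fixes concerns round-trip verification: after the ciphertext has been checked against the reference vector, it should also be decrypted and compared with the original plaintext. A minimal sketch of such a check follows. The names check_vector, block_fn and the toy_* functions are hypothetical and are not taken from aes_rav.c, and the XOR "cipher" is only a stand-in so that the example is self-contained.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_BYTES 16   /* 128-bit block, as used by the AES finalists */

/* Hypothetical single-block interface, assumed for illustration only. */
typedef void (*block_fn)(const unsigned char in[BLOCK_BYTES],
                         unsigned char out[BLOCK_BYTES]);

/* Returns 1 only if encryption matches the reference ciphertext AND
   decryption of that ciphertext recovers the original plaintext. */
int check_vector(block_fn enc, block_fn dec,
                 const unsigned char pt[BLOCK_BYTES],
                 const unsigned char ref_ct[BLOCK_BYTES])
{
    unsigned char ct[BLOCK_BYTES], rt[BLOCK_BYTES];

    assert(enc != 0 && dec != 0);

    enc(pt, ct);
    if (memcmp(ct, ref_ct, BLOCK_BYTES) != 0)
        return 0;                     /* ciphertext differs from reference */

    dec(ct, rt);                      /* the step that must not be skipped */
    return memcmp(rt, pt, BLOCK_BYTES) == 0;
}

/* Toy stand-ins (NOT real ciphers): XOR every byte with a constant. */
void toy_encrypt(const unsigned char in[BLOCK_BYTES],
                 unsigned char out[BLOCK_BYTES])
{
    int i;
    for (i = 0; i < BLOCK_BYTES; ++i)
        out[i] = (unsigned char)(in[i] ^ 0x5a);
}

void toy_decrypt(const unsigned char in[BLOCK_BYTES],
                 unsigned char out[BLOCK_BYTES])
{
    toy_encrypt(in, out);             /* XOR is its own inverse */
}

/* A deliberately broken "decrypt" that just copies its input. */
void toy_identity(const unsigned char in[BLOCK_BYTES],
                  unsigned char out[BLOCK_BYTES])
{
    memcpy(out, in, BLOCK_BYTES);
}
```

With a correct decryption routine the check passes; substituting toy_identity for toy_decrypt makes it fail, which is the kind of error an encryption-only test would miss.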

I have been porting these files to ARM and in doing this I have also changed the Pentium code to improve performance in some areas.  The figures below are hence not identical to those provided here previously.  Some of the key schedule costs have been significantly reduced (my thanks go to Bill and John Worley of HP Labs for one suggestion here in respect of Twofish).  DES results are provided for comparison purposes.

Speed Results

128 bits

Algorithm  Lang  key_set cycles  encrypt cycles  encrypt Mb/s  decrypt cycles  decrypt Mb/s
MARS       C     2118            364             70.3          371             69.0
MARS       C++   2044            395             64.8          423             60.5
RC6        C     1697            269             95.1          231             110.8
RC6        C++   2200            264             96.9          280             91.4
RIJNDAEL   C     215 / 1334      362             70.7          358             71.5
RIJNDAEL   C++   216 / 1435      371             69.0          373             68.6
SERPENT    C     1300            953             26.8          920             27.8
SERPENT    C++   1363            980             26.1          959             26.6
TWOFISH    C     8520            366             69.9          376             68.0
TWOFISH    C++   8530            417             61.3          439             58.3
DES        C     988             840             30.5          840             30.5

192 bits

Algorithm  Lang  key_set cycles  encrypt cycles  encrypt Mb/s  decrypt cycles  decrypt Mb/s
MARS       C     2132            369             69.3          367             69.7
MARS       C++   2042            393             65.1          417             61.3
RC6        C     2040            271             94.4          226             113.2
RC6        C++   2472            283             90.4          284             90.1
RIJNDAEL   C     215 / 1591      428             59.8          421             60.8
RIJNDAEL   C++   223 / 1678      440             58.1          433             59.1
SERPENT    C     1312            961             26.6          908             28.1
SERPENT    C++   1380            994             25.7          956             26.7
TWOFISH    C     11755           359             71.3          378             67.7
TWOFISH    C++   11740           423             60.5          417             61.3
DES        C     988             1260            30.5          1260            30.5

256 bits

Algorithm  Lang  key_set cycles  encrypt cycles  encrypt Mb/s  decrypt cycles  decrypt Mb/s
MARS       C     2136            369             69.3          375             68.2
MARS       C++   2050            400             64.0          426             60.0
RC6        C     1894            269             95.1          226             113.2
RC6        C++   2400            279             91.7          284             90.1
RIJNDAEL   C     288 / 1913      503             50.8          492             52.0
RIJNDAEL   C++   290 / 1983      505             50.6          504             50.7
SERPENT    C     1306            928             27.5          915             27.9
SERPENT    C++   1362            961             26.6          954             26.8
TWOFISH    C     15700           376             68.0          374             68.4
TWOFISH    C++   15650           417             61.3          428             59.8
DES        C     988             1680            30.5          1680            30.5

The figures in megabits per second (Mb/s) are for the 200MHz reference platform.  Two cycle counts are provided for Rijndael key setup because its key schedule cost is different for encryption and decryption (the decryption figure also applies if both encryption and decryption are required).
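The Mb/s figures appear to follow from the cycle counts as block size times clock rate divided by cycles per block. The sketch below shows that conversion, assuming a 128-bit block and the 200 MHz clock of the reference platform. The function name cycles_to_mbps and the two constants are illustrative and are not part of the published code.

```c
#include <assert.h>

/* Assumed parameters: 128-bit block, 200 MHz reference clock. */
#define BLOCK_BITS 128.0
#define CLOCK_MHZ  200.0

/* Throughput in megabits per second for a given cycles-per-block count.
   Blocks per second = (CLOCK_MHZ * 1e6) / cycles, so
   Mb/s = BLOCK_BITS * CLOCK_MHZ / cycles. */
double cycles_to_mbps(double cycles_per_block)
{
    assert(cycles_per_block > 0.0);
    return BLOCK_BITS * CLOCK_MHZ / cycles_per_block;
}
```

For example, 364 cycles per block works out at roughly 70.3 Mb/s, which agrees with the MARS C encryption entry for 128-bit keys.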

Because of the higher overheads in C++ (virtual function tables and instance references), the C++ code would be expected to be slower than the C code.  This is true in general, but the table shows that this overhead varies considerably between algorithms.  Rijndael and Serpent both do reasonably well, but Twofish suffers more than most, probably because this algorithm is sensitive to register allocation and the compiler has less freedom to optimise this in C++.

You are welcome to play with this code provided you attribute its source and comply with any constraints that algorithm designers have imposed.

Changes

(6th May 2000) - My thanks to Michael Dwyer of Intel for pointing out that the previous version of aes_tmr.c published on this page contained a bug that resulted in the encryption timing being measured twice, instead of encryption and decryption being measured once each.  The zip files have now been revised to correct this error and to simplify some of the code.  I have updated the table but, fortunately, the changes are small since most algorithms are symmetric in respect of encryption and decryption times (the only significant change is for RC6 decryption in C).
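To illustrate the kind of error involved, here is a minimal sketch of a timing harness that measures encryption and decryption separately. It is not the code in aes_tmr.c: the names time_per_call, time_cipher, block_fn and the counting_* functions are hypothetical, and the standard clock() function is used as a portable stand-in for whatever cycle counter the real code uses.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical single-block interface, assumed for illustration only. */
typedef void (*block_fn)(const unsigned char in[16], unsigned char out[16]);

struct timings { double encrypt, decrypt; };   /* CPU seconds per call */

/* Average CPU time per call of fn over reps calls, measured with clock(). */
double time_per_call(block_fn fn, long reps)
{
    unsigned char a[16] = {0}, b[16];
    clock_t start, stop;
    long i;

    assert(fn != 0 && reps > 0);

    start = clock();
    for (i = 0; i < reps; ++i)
        fn(a, b);
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC / (double)reps;
}

/* Times each direction once. Passing enc to both calls would reproduce
   the kind of bug described above. */
struct timings time_cipher(block_fn enc, block_fn dec, long reps)
{
    struct timings t;
    t.encrypt = time_per_call(enc, reps);
    t.decrypt = time_per_call(dec, reps);
    return t;
}

/* Stand-in operations that only count how often they are called. */
long enc_calls = 0, dec_calls = 0;

void counting_encrypt(const unsigned char in[16], unsigned char out[16])
{
    ++enc_calls;
    out[0] = in[0];
}

void counting_decrypt(const unsigned char in[16], unsigned char out[16])
{
    ++dec_calls;
    out[0] = in[0];
}
```

Counting the calls made to each stand-in is a cheap way to confirm that both directions are actually exercised before trusting the timings.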


Back to Brian Gladman's Home Page