
A Password Based File Encryption Utility

Updated on 11th November 2008 to add revised entropy function

Since first publishing an example of how my AES code can be used to encrypt files, I have been surprised by the number of people who want to use it in real situations. This was never my intention, since that example lacks a number of features that are essential for security purposes. But now that I have published other code that can be used with AES to provide a better file encryption utility, I thought it might be worthwhile to invest some time in a more complete file encryption application.

The Password Based File Encryption Utility described here combines the AES code I published earlier with HMAC-SHA1 for authentication and with key derivation from a password and a salt according to RFC 2898.

Although I cannot provide any guarantees of correctness or fitness for purpose, I am not aware of any bugs or errors in this code. I welcome feedback on any problems encountered in its use and on any related security issues. My aim will be to publish fixes here for any issues that are reported.  Users should be aware that this code is not designed to operate in a hostile machine environment (that is, one in which other running processes have to be treated as hostile) and I assume that compiler and library code incorporated when the application is built are benign in security terms.

The AES code used in this application is now very mature and has been independently reviewed by a number of companies who are using it in their products.  WinZip Computing Inc has incorporated the code on this page into WinZip and has received FIPS-197 certification for the AES elements of the resulting product.  You can read more about this implementation on their web page at:

    http://www.winzip.com/aes_info.htm

My overall aim has been to avoid complexity and to ensure as far as possible that existing standards are employed in a conservative way.  At the moment I have not provided plaintext compression before encryption but I may add this later if the other aspects of the design hold up.   I will also consider any other ideas that people suggest for inclusion.

What follows is a description of the utility in outline and then a description of its major components.  All the code needed to compile this application is in this zip file.

Here is a version that adds bzip2 compression to the application. This ZIP file contains my source code and the VC++ 7.1 project files needed to produce the application. It requires the BZIP2 source code that is available here.

Overall Structure

The essential components in the design are:

  • AES in Counter (CTR) mode for encryption

  • HMAC-SHA1 for authentication

  • Key derivation from a password and a salt according to RFC 2898

The encryption utility operates in one of three modes as follows (with all lengths in 8-bit bytes):

Password             Salt     AES Key   HMAC-SHA1    Password          Authentication   File Length
Length               Length   Length    Key Length   Verifier Length   Field Length     Overhead
 8 <= length < 32     8       16        16            2                10               20
32 <= length < 48    12       24        24            2                10               24
48 <= length < 64    16       32        32            2                10               28
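
The table above can be read as a simple lookup keyed on the password length. The following is a minimal sketch in Python, not the utility's C code; the names Mode and select_mode are hypothetical and exist only for this illustration. The overhead is the sum of the salt, password verifier and authentication field lengths.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    """One row of the table; all lengths are in 8-bit bytes."""
    salt_len: int
    aes_key_len: int
    hmac_key_len: int
    verifier_len: int
    auth_len: int

    @property
    def overhead(self) -> int:
        # Bytes added to the file: salt + password verifier + authentication field.
        return self.salt_len + self.verifier_len + self.auth_len


def select_mode(password_len: int) -> Mode:
    """Pick the operating mode from the password length (hypothetical helper)."""
    if 8 <= password_len < 32:
        return Mode(8, 16, 16, 2, 10)
    if 32 <= password_len < 48:
        return Mode(12, 24, 24, 2, 10)
    if 48 <= password_len < 64:
        return Mode(16, 32, 32, 2, 10)
    raise ValueError("password length must satisfy 8 <= length < 64")
```

For example, select_mode(40) selects the 24 byte (192 bit) AES key with a 12 byte salt, giving a file length overhead of 24 bytes.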

The encrypted file format is:

  • the password salt bytes (8, 12 or 16 bytes)

  • the optional password verifier (2 bytes)

  • the encrypted file contents (the same length as the file itself)

  • the authentication field (10 bytes)

Authentication is performed on the encrypted file contents (that is, after encryption). The lengths of the keys used for encryption and authentication are determined by the length of the password supplied by the user.
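
The file layout listed above can be illustrated by splitting an encrypted file back into its four fields. This is a sketch in Python under stated assumptions, not code from fileenc.c: the function name split_encrypted is hypothetical, and the salt length is passed in by the caller because it follows from the password length rather than from anything stored in the file.

```python
def split_encrypted(blob: bytes, salt_len: int, has_verifier: bool = True):
    """Split an encrypted file into (salt, verifier, ciphertext, auth_field)."""
    verifier_len = 2 if has_verifier else 0
    auth_len = 10
    if len(blob) < salt_len + verifier_len + auth_len:
        raise ValueError("file is too short to hold the fixed-size fields")
    body_start = salt_len + verifier_len
    body_end = len(blob) - auth_len
    salt = blob[:salt_len]
    verifier = blob[salt_len:body_start]   # empty when the verifier is omitted
    ciphertext = blob[body_start:body_end] # same length as the original file
    auth_field = blob[body_end:]
    return salt, verifier, ciphertext, auth_field
```

Because a build without the verifier shifts every later field by two bytes, the same file cannot be parsed correctly by both variants.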

Please note that the use of HMAC-SHA1 for key derivation places a limit on the resulting key space that can be as low as 160 bits, the length of a SHA1 output. Hence, although 192 and 256 bit AES keys are supported, users should be aware that these keys will not offer their full theoretical strength against brute force key searches. However, brute force key searches over such huge key spaces are of no practical significance. In consequence, the design is directed instead at preventing searches in the much smaller key spaces that experience suggests are typical for password derived keys.
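
A rough calculation shows why the password, rather than the 160 bit limit, is the practical constraint. The sketch below is in Python and assumes every character is chosen uniformly at random from a fixed alphabet, which is an optimistic upper bound for passwords chosen by people; the function name password_bits is hypothetical.

```python
import math


def password_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on password entropy in bits, assuming uniform random characters."""
    return length * math.log2(alphabet_size)


# Eight random lowercase letters: roughly 37.6 bits.
lowercase_8 = password_bits(26, 8)

# Characters needed from the 94 printable ASCII symbols to reach 160 bits.
chars_for_160_bits = math.ceil(160 / math.log2(94))
```

Under these assumptions a password would need 25 random printable ASCII characters before the 160 bit limit became the weaker link, which is far longer than most passwords in use.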

The Components

The Pseudo Random Data Pool

Pseudo random numbers are needed in the design to provide the bytes of the password salt. However, since the salt is not secret, we are interested in randomness rather than secrecy and this makes our task somewhat simpler than it would otherwise be. 

Even though we do not need a paranoid pseudo random data stream, I have nevertheless implemented a pseudo random data pool using the ideas advocated by Peter Gutmann.  This code is in prng.h and prng.c and is designed to make use of an externally supplied source of entropy. 

In this case I have used the Time Stamp Counter as an entropy source. This is probably good enough for generating randomness without secrecy but would have to be improved if we ever want to use this random data stream for key generation purposes.
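
The general shape of a pool fed by an external entropy source can be sketched as follows. This is a deliberately simplified Python illustration and not the design in prng.c: the class EntropyPool, the SHA1 based mixing and the use of time.perf_counter_ns() as a stand-in for the Time Stamp Counter are all assumptions made for this example. Like the original, it is aimed at non-secret values such as salts and should not be used to generate keys.

```python
import hashlib
import time


def timer_entropy() -> bytes:
    # Stand-in for the Time Stamp Counter: a high resolution timer reading.
    return (time.perf_counter_ns() & (2**64 - 1)).to_bytes(8, "little")


class EntropyPool:
    """Toy pool that mixes samples from a supplied entropy callback."""

    def __init__(self, entropy_source):
        self._source = entropy_source
        self._pool = bytes(20)
        self._counter = 0

    def _stir(self) -> None:
        # Fold a fresh sample into the pool state.
        self._pool = hashlib.sha1(self._pool + self._source()).digest()

    def random_bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            self._stir()
            self._counter += 1
            # Output is a hash of the pool, so the pool itself is never exposed.
            out += hashlib.sha1(
                self._pool + self._counter.to_bytes(8, "little")
            ).digest()
        return out[:n]
```

A salt for the shortest password mode would then be obtained with EntropyPool(timer_entropy).random_bytes(8).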

Password Based Key Derivation

The supplied password and a salt value are converted into two keys (one for AES and one for HMAC-SHA1) and an optional password verification value using the approach set out in RFC 2898.   This is accomplished in the files pwd2key.h and pwd2key.c.

The two byte password verifier is optional and, if present, is placed in the file after the password salt. It is generated from the password and salt value on encryption and stored in the file. On decryption it is generated again from the password and salt and then tested against the stored value to determine whether the password is correct.

The main aim of the password verifier is to cope with huge files where there may be practical reasons for not wanting to decrypt the whole file before discovering that the password is incorrect.  If only short files are being encrypted it will often be better to use the file authentication value to reject incorrect passwords so this password verification feature is optional. Note, however, that there is no interoperability between versions with and without this feature.
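
The derivation and the verifier check described in this section can be sketched with the PBKDF2 function in Python's standard library (PBKDF2 is the key derivation function defined in RFC 2898). This is an illustration and not the code in pwd2key.c: the iteration count of 1000 and the order in which the AES key, the HMAC-SHA1 key and the verifier are cut from the derived bytes are assumptions, and the function names are hypothetical.

```python
import hashlib
import hmac

ITERATIONS = 1000  # assumed value; the text does not state the iteration count


def derive_keys(password: bytes, salt: bytes, key_len: int):
    """Derive (aes_key, hmac_key, verifier) from a password and salt."""
    material = hashlib.pbkdf2_hmac(
        "sha1", password, salt, ITERATIONS, 2 * key_len + 2
    )
    aes_key = material[:key_len]
    hmac_key = material[key_len:2 * key_len]
    verifier = material[2 * key_len:]  # assumed position: the last two bytes
    return aes_key, hmac_key, verifier


def password_matches(password: bytes, salt: bytes, key_len: int,
                     stored_verifier: bytes) -> bool:
    """Quick check of a password before any file contents are decrypted."""
    _, _, verifier = derive_keys(password, salt, key_len)
    return hmac.compare_digest(verifier, stored_verifier)
```

Since the verifier is only 16 bits long, a wrong password passes this check with a probability of about 1 in 65536, so the authentication field remains the definitive test.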

The Advanced Encryption Standard

This code, which is supplied in the files aes.h, aesopt.h, aescrypt.c, aeskey.c and aestab.c, is simply my AES code configured for encryption-only operation, since CTR mode does not require the inverse cipher. The file to be encrypted is split into 16 byte blocks (the last can be a partial block) and the resulting block number is used as the encryption nonce for CTR mode.
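
The structure of CTR mode, and the reason the inverse cipher is not needed, can be shown with a short Python sketch. Python's standard library has no AES, so the block function below is a stand-in built from SHA1 and is NOT AES; it is there only so that the example runs. Counting blocks from 1 and encoding the block number in little-endian order are also assumptions, as are the function names.

```python
import hashlib

BLOCK = 16  # AES block length in bytes


def stand_in_block_cipher(key: bytes, block: bytes) -> bytes:
    # NOT AES: a placeholder for the forward cipher, used only in this sketch.
    return hashlib.sha1(key + block).digest()[:BLOCK]


def ctr_transform(key: bytes, data: bytes,
                  encrypt_block=stand_in_block_cipher) -> bytes:
    """Encrypt or decrypt data; the two operations are identical in CTR mode."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        block_number = offset // BLOCK + 1                       # assumed start
        counter_block = block_number.to_bytes(BLOCK, "little")   # assumed order
        keystream = encrypt_block(key, counter_block)
        chunk = data[offset:offset + BLOCK]  # the last chunk may be partial
        # zip() stops at the shorter input, so a partial block needs no padding.
        out.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(out)
```

Applying ctr_transform twice with the same key returns the original data, which is why only the forward cipher is required. Restarting the block numbers for every file is reasonable only if each file gets a fresh salt and therefore a different derived key.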

Message Authentication (HMAC-SHA1)

The SHA1 hash code is supplied in sha1.h and sha1.c and the message authentication algorithm HMAC-SHA1 is provided in the files hmac.h and hmac.c. 
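
As an illustration of how the 10 byte authentication field could be produced and checked, here is a sketch using Python's standard hmac module rather than the code in hmac.c. Taking the leftmost 10 bytes of the 20 byte HMAC-SHA1 output is an assumption (it is the truncation convention recommended in RFC 2104), and the function names are hypothetical.

```python
import hashlib
import hmac

AUTH_LEN = 10  # authentication field length in bytes


def auth_field(hmac_key: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA1 over the encrypted contents, truncated to AUTH_LEN bytes."""
    return hmac.new(hmac_key, ciphertext, hashlib.sha1).digest()[:AUTH_LEN]


def is_authentic(hmac_key: bytes, ciphertext: bytes, stored: bytes) -> bool:
    """Recompute the field and compare it in constant time."""
    return hmac.compare_digest(auth_field(hmac_key, ciphertext), stored)
```

A single changed byte in the ciphertext, or a key derived from the wrong password, gives a different field and the check fails.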

The File Encryption Layer

The above components are brought together to provide support for file encryption in the files fileenc.h and fileenc.c.

The Command Line Code

The command line input is handled as usual in main.c which uses a very simple input format:

        encfile password infile

where 'encfile' is the name used for the file encryption utility, 'password' is the password to be used and 'infile'  is the name of the file to be processed.  If the last extension on the filename is '.enc', the file is assumed to be an encrypted file and an attempt is made to decrypt it with the given password, writing the output to a file with the same name except that the extension '.enc' is removed.  Otherwise the file is encrypted with the given password and the output is written to a file with the same name except that the extension '.enc' is added.  This interface will do for now but it could obviously be improved.
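
The rule for choosing between encryption and decryption can be written out as a small function. This Python sketch is not taken from main.c; the name plan_operation is hypothetical, and the case-sensitive comparison of the extension is an assumption.

```python
def plan_operation(infile: str):
    """Return (operation, output filename) for a given input filename."""
    suffix = ".enc"
    if infile.endswith(suffix):
        # Encrypted input: decrypt and drop the final '.enc' extension.
        return "decrypt", infile[:-len(suffix)]
    # Anything else: encrypt and append '.enc'.
    return "encrypt", infile + suffix
```

For example, plan_operation("report.doc") returns ("encrypt", "report.doc.enc") and plan_operation("report.doc.enc") returns ("decrypt", "report.doc").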