Thursday, September 28, 2006

Working on SHA-2

Several months ago, it became clear that SHA-1 message digests could be attacked: a collision (two different inputs that produce the same digest) could be found thousands of times faster than by brute-force guessing, though it was still a nontrivial problem. So SHA-1 has become undesirable as a secure message digest, and U.S. federal software security standards now call for software to use SHA-256 (one of the group of algorithms that make up SHA-2).

MySQL currently has a built-in function to produce message digests with the SHA-1 algorithm, but not with SHA-2. Bug #13174 is logged at the MySQL site for this, but it doesn't seem to be a priority for them.

So I thought it would be nice to contribute some code to MySQL. I've been using it for about six years, helping answer questions on newsgroups and forums, and I've also logged a few decent bugs. But I've never contributed code. How hard could it be? I'm no expert on implementing cryptography code, so I don't want to write the algorithms myself and get them wrong.

Fortunately, MySQL uses OpenSSL, which appears on NIST's list of validated FIPS 140-1 and 140-2 cryptographic modules. MySQL already relies on the OpenSSL library for the DES encryption and decryption algorithms.

I've made good progress. I checked out the sources with BitKeeper, and built the MySQL 5.0 and 5.1 sources. I kept an unmodified copy of the tree so I could create diffs after I added my code. I followed the documentation on adding native functions to MySQL. I could have implemented it as a UDF, but I thought the proper fix for the lack of functionality would be a built-in function.

I now have it working. You can apply the patch to the MySQL sources and build the tree. If you configure MySQL with "--with-openssl", the SHA2() function calls into the OpenSSL library for the SHA-224, SHA-256, SHA-384, and SHA-512 message digest algorithms. Here's an example demonstrating the usage:

SELECT SHA2('plaintext string', 256);

The first argument is a string expression. The second argument is one of 224, 256, 384, or 512, according to the bit length of the desired message digest. Other values cause the function to return an error.
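To make the argument handling concrete, here is a rough Python sketch of the same two-argument interface. It is not the C code from the patch; the function name sha2, the UTF-8 encoding of the input, and the use of ValueError for unsupported lengths are all assumptions made for illustration.

```python
import hashlib

# Map each supported bit length to the matching hashlib constructor.
_SHA2_BY_BITS = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}


def sha2(text, bits):
    """Return the hex SHA-2 digest of text for the given bit length.

    Hypothetical stand-in for the SQL call SHA2(str, bits); any bit
    length other than 224, 256, 384, or 512 is treated as an error.
    """
    try:
        constructor = _SHA2_BY_BITS[bits]
    except KeyError:
        raise ValueError("unsupported digest length: %r" % (bits,))
    # Assumption: the string is hashed as its UTF-8 bytes.
    return constructor(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(sha2("plaintext string", 256))
```

The hex digest is always bits / 4 characters long, so a 256-bit digest prints as 64 hex characters.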

I ran tests using sample test vectors available from NIST. This is of course not a certification, but I can at least perform my own unofficial validation. The test vectors include sample strings and the expected hash values for SHA-1 through SHA-512. I wrote some shell scripts to help run through the tests. They submit the strings to SQL statements using "mysql -e" and compare the result to the test vector data. They also run the same data through standard command-line utilities like "sha256sum" as a double-check. Everything passes except the simplest case: a short, 8-bit value "00". I suspect I just have the length of the input set wrong.
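The same comparison can be sketched in Python instead of shell. This is only an illustration, not the scripts I actually used: it assumes a vector file made of "Len = ...", "Msg = ..." and "MD = ..." lines, with Len given in bits and Msg in hex, and it checks against Python's hashlib rather than a MySQL server. The names parse_vectors and check_vectors are made up for the example. Note that it trims the message to the stated length, which is exactly the kind of detail that can trip up the shortest vectors.

```python
import hashlib


def parse_vectors(text):
    """Yield (bit_length, message_bytes, expected_hex) for each record.

    Assumed layout: 'Len = N', 'Msg = hex', 'MD = hex' lines, where Len
    is a bit count that is a multiple of 8. Blank lines, '#' comments
    and '[...]' headers are skipped.
    """
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        record[key] = value.strip()
        if key == "MD":  # MD is taken to close the record
            bits = int(record["Len"])
            # Keep only the first Len bits of the message bytes.
            message = bytes.fromhex(record["Msg"])[: bits // 8]
            yield bits, message, record["MD"].lower()
            record = {}


def check_vectors(text, algorithm="sha256"):
    """Return a list of (bits, msg_hex, expected, actual) mismatches."""
    failures = []
    for bits, message, expected in parse_vectors(text):
        actual = hashlib.new(algorithm, message).hexdigest()
        if actual != expected:
            failures.append((bits, message.hex(), expected, actual))
    return failures


# Two well-known SHA-256 values: the empty message and the string "abc".
SAMPLE = """
Len = 0
Msg = 00
MD = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

Len = 24
Msg = 616263
MD = ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
"""

if __name__ == "__main__":
    for failure in check_vectors(SAMPLE):
        print("MISMATCH", failure)
```

An empty list from check_vectors means every vector matched. To test the MySQL function itself, the hashlib call would be swapped for a query against the server.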

The next step is to figure out how to convert these tests so that they integrate with MySQL's own test framework.