Here is the cache code:
#region don't process the same address twice
if (newEmail.EmailHash == 0)
    newEmail.EmailHash = newEmail.Email.GetSHA1Hash();
if (emailsToAdd.Contains(newEmail.EmailHash))
{
    return false;
}
emailsToAdd.Add(newEmail.EmailHash);
#endregion
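The cache check above can be folded into a single call. This is only a sketch, assuming emailsToAdd is a HashSet<int> (the post does not show its declaration); EmailDedup and TryRegister are hypothetical names used for illustration.

```csharp
using System.Collections.Generic;

public static class EmailDedup
{
    // Assumption: the cache is a HashSet<int> holding e-mail hashes.
    private static readonly HashSet<int> emailsToAdd = new HashSet<int>();

    // Hypothetical helper: returns true the first time a hash is seen,
    // false on every later call with the same hash.
    public static bool TryRegister(int emailHash)
    {
        // HashSet<T>.Add returns false when the item is already present,
        // so a separate Contains check is not needed.
        return emailsToAdd.Add(emailHash);
    }
}
```

With a helper like this, the region body reduces to computing the hash and returning false when TryRegister reports a repeat.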
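For this implementation, I return the hash as a 32-bit integer because I am using it for caching, not security. An int32 key is a little faster to index than the full 160-bit digest (40 hex characters). If avoiding collisions is important, remove the BitConverter.ToInt32 call and return a string: an int32 has only 4,294,967,296 possible values, and by the birthday bound the chance of at least one collision reaches about 50% at roughly 77,000 distinct strings. In this cache, a collision means a new address is wrongly skipped as a duplicate.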
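For reference, here is a minimal sketch of the string-returning variant. SHA1HexHash and GetSHA1HexHash are hypothetical names, and the use of SHA1.Create() and UTF-8 encoding are my assumptions rather than part of the original class.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SHA1HexHash
{
    // Hypothetical variant: returns the full 160-bit digest as 40 uppercase
    // hex characters, or an empty string for null/blank input.
    public static string GetSHA1HexHash(this string stringToHash)
    {
        if (string.IsNullOrWhiteSpace(stringToHash)) return string.Empty;

        // A fresh instance per call avoids sharing hash state between threads.
        using (SHA1 sha1 = SHA1.Create())
        {
            // Assumption: UTF-8, so the digest is the same on every machine.
            byte[] digest = sha1.ComputeHash(Encoding.UTF8.GetBytes(stringToHash));

            // BitConverter.ToString yields "A9-99-3E-..."; strip the dashes.
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}
```

The cache would then need to hold strings (for example a HashSet<string>) instead of ints, trading some speed and memory for a negligible collision risk.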
It needs C# 3.0 or later because GetSHA1Hash is an extension method, and .NET 4.0 because it calls string.IsNullOrWhiteSpace.
Here is the hash class. Call with STRING.GetSHA1Hash().
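using System;
using System.Security.Cryptography;
using System.Text;

public static class SHA1Hash
{
    private static SHA1CryptoServiceProvider _cryptoTransformSHA1;

    // Created on first use and shared afterwards.
    // Note: a shared provider is not safe to call from several threads at once.
    public static SHA1CryptoServiceProvider CryptoProvider
    {
        get { return _cryptoTransformSHA1 ?? (_cryptoTransformSHA1 = new SHA1CryptoServiceProvider()); }
    }

    // Returns 0 for null, empty, or whitespace-only input.
    // Encoding.Default is the system ANSI code page on .NET Framework, so hashes
    // can differ between machines; call Hash with Encoding.UTF8 if that matters.
    public static int GetSHA1Hash(this string stringToHash)
    {
        return string.IsNullOrWhiteSpace(stringToHash) ? 0 : Hash(stringToHash, Encoding.Default);
    }

    // Keeps only the first 4 bytes of the 20-byte SHA-1 digest.
    public static int Hash(string stringToHash, Encoding enc)
    {
        return BitConverter.ToInt32(CryptoProvider.ComputeHash(enc.GetBytes(stringToHash)), 0);
    }
}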