Understanding Digital Signatures: More Than Just a Hash

Digital signatures are a cornerstone of modern security practices, ensuring data integrity and authentication in various online communications. But there's often confusion between the terms "digital signature", "hash", and "digest". Let's delve deep into understanding these terms and their roles.
What is a Digital Signature?
At its core, a digital signature is a mechanism used to verify the authenticity and integrity of a message, software, or digital document. It's like an electronic stamp, proving that the content hasn't been altered since it was signed and verifying the identity of the signer.
Properties of a Digital Signature:
Authentication: Validates the sender's identity.
Data Integrity: Ensures that the content hasn't changed since being signed.
Non-repudiation: Signers cannot deny having signed the content.
Hash vs. Signature
A hash is a function that converts an input (often called a "message") into a fixed-length string of bytes, which appears random. This output is commonly referred to as the hash value or digest.
On the other hand, a digital signature involves more than just hashing and output encoding. It uses a signing algorithm, which often employs a hash function as one of its steps. Importantly, it also incorporates a secret key (a shared symmetric key for HMAC, or a private key for asymmetric schemes such as ECDSA and RSA), so the same message produces a different signature under a different key.
Understanding the Digest
A digest is the encoded output of a hash function. It's the "end result" you get after passing your data through the hash function. The term "digest" is often used interchangeably with "hash value", emphasising the output's nature.
A hexadecimal string is the form in which a digest is most often shown, but it is only one encoding of the chosen hash's output. The same bytes could just as easily be encoded as binary, base64, or anything else; it is still the output of the same hash.
So there's the twist: looking only at a digest, it's impossible to determine whether it resulted from a hash function alone or was part of a digital signature process. Both can produce similar-looking outputs, but their origins are different.
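To make the encoding point concrete, here is a minimal Python sketch using only the standard library. The key `demo-key` is a made-up value for illustration. It shows that hex and base64 are two views of the same digest bytes, and that a plain SHA-256 hash and an HMAC-SHA256 output have the same length, so neither can be told apart by appearance alone.

```python
import base64
import hashlib
import hmac

message = b"Hello, World!"

# Plain hash: anyone can recompute it from the message alone.
raw_hash = hashlib.sha256(message).digest()

# Keyed MAC: recomputing it also requires the (hypothetical) shared key.
raw_mac = hmac.new(b"demo-key", message, hashlib.sha256).digest()

# The same raw hash bytes in two different text encodings.
hex_form = raw_hash.hex()
b64_form = base64.b64encode(raw_hash).decode()

# Decoding either encoding recovers the identical digest bytes.
same_bytes = bytes.fromhex(hex_form) == base64.b64decode(b64_form) == raw_hash

print(hex_form)
print(b64_form)
print(same_bytes)                   # True
print(len(raw_hash), len(raw_mac))  # 32 32
```

Only the way each value is reproduced (with or without a key) reveals which one it is.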
Signature Algorithms: The Heart of Digital Signatures
Every digital signature is rooted in a particular signing algorithm. This algorithm determines how the signature is produced and, consequently, how it will be verified. Here are a few examples:
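HMAC-SHA512: This combines the HMAC (Hash-Based Message Authentication Code) method with the SHA-512 hash function. Strictly speaking, HMAC is a message authentication code rather than a digital signature, because both parties hold the same key; this article treats it as a symmetric signature.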
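ECDSA-MD5: Uses the Elliptic Curve Digital Signature Algorithm with the MD5 hash function. It is listed here only to illustrate the naming; MD5 is broken and should not be used for new signatures.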
RSA-SHA3-512: Combines the RSA (Rivest–Shamir–Adleman) algorithm with the SHA3-512 hash function.
To put it simply:
HMAC-SHA512, ECDSA-MD5, and RSA-SHA3-512 are signatures.
HMAC-SHA256, ECDSA-SHA256, and RSA-SHA256 are also signatures.
SHA256, on its own, is just a hash.
A digest can come from either a hash or a signature; what tells them apart is the method needed to reproduce, and therefore verify, it.
Demo
Here's how you can create an HMAC-SHA512 signature in Python and then verify it in JavaScript.
Python (Sender):
import hmac
import hashlib
import base64

def generate_hmac_sha512_signature(secret_key, message):
    signature = hmac.new(secret_key.encode(), message.encode(), hashlib.sha512).digest()
    return base64.b64encode(signature).decode()

secret_key = "supersecretkey"
message = "Hello, World!"
signature = generate_hmac_sha512_signature(secret_key, message)
print(signature)
This code generates an HMAC-SHA512 signature using Python's hmac and hashlib libraries.
JavaScript (Receiver), verifying the HMAC-SHA512 signature with Node.js's crypto library:
const crypto = require('crypto');

function verifyHMACSHA512Signature(secretKey, message, signature) {
  const hmac = crypto.createHmac('sha512', secretKey);
  hmac.update(message);
  const computed = hmac.digest();
  const received = Buffer.from(signature, 'base64');
  // timingSafeEqual throws on unequal lengths, so check the length first,
  // then compare in constant time to avoid leaking timing information.
  return computed.length === received.length &&
    crypto.timingSafeEqual(computed, received);
}

const secretKey = "supersecretkey";
const message = "Hello, World!";
const receivedSignature = "..."; // This should be the output from the Python script

if (verifyHMACSHA512Signature(secretKey, message, receivedSignature)) {
  console.log("Signature is valid!");
} else {
  console.log("Signature is NOT valid!");
}
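For readers who want to test the round trip without leaving Python, the receiver can be sketched in Python as well. This is a minimal sketch, not the only way to do it: the function name `verify_hmac_sha512_signature` is invented here, and the key and message are the demo values from above. It uses `hmac.compare_digest`, which compares in constant time so that the comparison does not leak how many leading bytes matched.

```python
import base64
import hashlib
import hmac


def verify_hmac_sha512_signature(secret_key: str, message: str, signature_b64: str) -> bool:
    """Recompute the HMAC-SHA512 value and compare it in constant time."""
    expected = hmac.new(secret_key.encode(), message.encode(), hashlib.sha512).digest()
    try:
        received = base64.b64decode(signature_b64, validate=True)
    except ValueError:
        return False  # not valid base64, so it cannot be a valid signature
    return hmac.compare_digest(expected, received)


# Round trip with the demo values used by the sender.
secret_key = "supersecretkey"
message = "Hello, World!"
signature = base64.b64encode(
    hmac.new(secret_key.encode(), message.encode(), hashlib.sha512).digest()
).decode()

print(verify_hmac_sha512_signature(secret_key, message, signature))          # True
print(verify_hmac_sha512_signature(secret_key, "Hello, World?", signature))  # False
print(verify_hmac_sha512_signature("wrongkey", message, signature))          # False
```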
For ECDSA-SHA256 signing in Python, we can use the ecdsa library in a very similar way. The signature can then be verified in JavaScript using the elliptic library in an ECMAScript module.
import base64
import hashlib
import ecdsa

def generate_ecdsa_sha256_signature(secret_key, message):
    # python-ecdsa defaults to SHA-1, so SHA-256 must be requested explicitly.
    sk = ecdsa.SigningKey.from_string(
        bytes.fromhex(secret_key), curve=ecdsa.NIST256p, hashfunc=hashlib.sha256
    )
    # The default signature encoding is the raw concatenation r || s (64 bytes).
    signature = sk.sign(message.encode())
    return base64.b64encode(signature).decode()

secret_key = "your_private_key_in_hex_format"
message = "Hello, World!"
signature = generate_ecdsa_sha256_signature(secret_key, message)
print(signature)
The JavaScript verifier is also not too different:
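import { createHash } from 'node:crypto';
import elliptic from 'elliptic';

function verifyECDSASHA256Signature(publicKey, message, signature) {
  const ecInstance = new elliptic.ec('p256');
  const key = ecInstance.keyFromPublic(publicKey, 'hex');
  // elliptic verifies a precomputed hash, so hash the message with SHA-256 first.
  const msgHash = createHash('sha256').update(message).digest('hex');
  // The Python signer emits raw r || s (32 bytes each); split it for elliptic.
  const raw = Buffer.from(signature, 'base64');
  if (raw.length !== 64) return false;
  const r = raw.subarray(0, 32).toString('hex');
  const s = raw.subarray(32).toString('hex');
  return key.verify(msgHash, { r, s });
}

// Encoded curve point in hex, e.g. '04' followed by the x and y coordinates.
const publicKey = 'your_public_key_in_hex_format';
const message = 'Hello, World!';
const receivedSignature = '...'; // This should be the output from the Python script

if (verifyECDSASHA256Signature(publicKey, message, receivedSignature)) {
  console.log('Signature is valid!');
} else {
  console.log('Signature is NOT valid!');
}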
Make sure you have the necessary libraries installed, e.g.:
Python:
pip install ecdsa
JavaScript:
npm install elliptic
Verification
Hash Verification:
When you hash data, the outcome is a fixed-length string of characters, regardless of the input's size. To verify a hash, you take the original message data and run it through the same hashing algorithm again. If the resultant digest matches the previously produced hash, then the data hasn't been tampered with. Essentially, you're reproducing the hash digest with the message data on both ends to ensure they match.
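As a minimal sketch of that check, assuming SHA-256 as the agreed algorithm and a hex-encoded digest published alongside the data (the helper names `sha256_hex` and `verify_hash` are made up for this example):

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_hash(data: bytes, expected_hex: str) -> bool:
    """Re-hash the data and compare it with the published digest."""
    return sha256_hex(data) == expected_hex.lower()


original = b"Hello, World!"
published_digest = sha256_hex(original)  # what the sender publishes

print(verify_hash(original, published_digest))          # True
print(verify_hash(b"Hello, World?", published_digest))  # False
```

Note that no key is involved: anyone who can change the data can also recompute the digest, so this only helps when the digest arrives through a channel the attacker cannot modify.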
Symmetric Signature Verification (HMAC):
HMAC (Hash-based Message Authentication Code) involves a hash function and a symmetric secret key. When sending a message, you mix the data with this key through the hash function, creating an HMAC signature digest; nothing is encrypted. To verify, the receiver, who also has the symmetric key, reproduces the HMAC signature from the received message data. If the digests match, it verifies the message's integrity and authenticates its origin, since only someone with the shared secret key could produce the same signature.
Asymmetric Signature Verification (ECDSA):
Elliptic Curve Digital Signature Algorithm (ECDSA) involves an asymmetric key pair: a private key and a corresponding public key. The sender uses the private key to create a signature for the message data. The receiver, or any verifier, uses the sender's public key to verify the signature. The beauty of this method is that verification ensures both the data's integrity and the sender's authenticity. Only the holder of the private key could've produced a signature that the public key can verify, yet the private key itself isn't exposed during this process.
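The private-signs, public-verifies shape can be shown with nothing but the standard library. ECDSA itself needs elliptic-curve arithmetic, so this sketch swaps in textbook RSA instead, with toy parameters chosen purely for illustration (two well-known Mersenne primes, no padding). It is not secure and is not how a real library signs; it only shows that verification needs the public values `N` and `E`, while signing needs the private exponent `D`.

```python
import hashlib

# Toy parameters, assumed for illustration only. Real RSA keys use large random
# primes and a padding scheme such as PSS; never use this construction in practice.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def message_hash(message: bytes) -> int:
    """SHA-256 of the message, reduced to fit below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of D can compute this value."""
    return pow(message_hash(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public values N and E can run this check."""
    return pow(signature, E, N) == message_hash(message)


sig = sign(b"Hello, World!")
print(verify(b"Hello, World!", sig))  # True
print(verify(b"Hello, World?", sig))  # False
```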
Verification assures Sender Authentication
Verification is paramount for sender authentication. While a simple hash can guarantee data integrity (i.e., the data hasn't changed), it doesn't confirm who sent the data, because anyone can compute it. HMAC, with its symmetric key, offers an extra layer of authentication, although either holder of the shared key could have produced the signature. ECDSA and other asymmetric methods add an even stronger assurance. Since only the private key holder can sign the message in a way that the corresponding public key can verify, receivers can trust not only the message's content but also its source.
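The gap between integrity and authentication can be sketched as follows. This is an illustrative scenario with made-up names (`SHARED_KEY`, the forged message, the attacker's guessed key): an attacker swaps the message and recomputes whatever they can without knowing the shared key.

```python
import hashlib
import hmac

SHARED_KEY = b"supersecretkey"  # known to sender and receiver, not to the attacker


def receiver_accepts_hash(message: bytes, digest: bytes) -> bool:
    """Receiver that only checks a plain SHA-256 digest."""
    return hashlib.sha256(message).digest() == digest


def receiver_accepts_hmac(message: bytes, tag: bytes) -> bool:
    """Receiver that checks an HMAC-SHA256 value under the shared key."""
    expected = hmac.new(SHARED_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# The attacker replaces the message and recomputes the accompanying value.
forged = b"Pay Mallory 1000"
forged_digest = hashlib.sha256(forged).digest()  # needs no key
forged_tag = hmac.new(b"attacker-guess", forged, hashlib.sha256).digest()

hash_fooled = receiver_accepts_hash(forged, forged_digest)
hmac_fooled = receiver_accepts_hmac(forged, forged_tag)

print(hash_fooled)  # True: the plain hash verifies for the forged message
print(hmac_fooled)  # False: without the key the HMAC does not verify
```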
Attacks and Weakness
I can't talk about secrets and authentication without at least discussing the threat vectors! Here is a mix of known and potential attacks to think about:
Collision (Birthday) Attacks: Two different inputs produce the same hash digest. MD5 and SHA-1 are known to be vulnerable to this attack. SHA-256 is currently considered secure, though its resistance depends on the continuing evolution of computational power and cryptanalysis techniques, and it is unclear whether cryptocurrency mining GPUs or ASIC miners have been used for this kind of research.
Pre-image (rainbow tables / dictionary) and Second Pre-image Attacks: Finding a message that hashes to a specific target hash (pre-image) or a different message with the same hash as the original message (second pre-image).
Length Extension Attacks: Certain hash functions, including MD5, SHA-1, and SHA-256, let an attacker who knows a hash and the length of the original data compute a valid hash for that data with extra content appended, without knowing the data itself. This matters when a secret is simply prepended to the message and hashed; using HMAC avoids the problem.
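Potential Attack Vectors: Quantum computers pose a theoretical threat. Grover's algorithm would speed up brute-force searches against hash functions, roughly halving their effective bit strength against pre-image search, while Shor's algorithm would break RSA and ECDSA on a sufficiently large quantum computer.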
Brute Force Attacks: Attempting all possible keys until the correct one is found.
Chosen-plaintext Attacks: Attackers choose specific plaintexts to be encrypted and analyse the resulting ciphertexts, or the public key, to gather information about the private key. Some block cipher modes, such as ECB, are weak against this.
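Known Plaintext Attacks: Using known parts of the plaintext and ciphertext to derive the key. Keys have also been recovered without any plaintext at all: weak random number generation, most famously in Debian's 2008 OpenSSL bug, has produced predictable or factorable RSA and DSA keys. The later SHAmbles attack (2020) was a chosen-prefix collision against SHA-1 and belongs with the collision attacks above.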
Inherent Trust Attacks: Symmetric schemes rely on both parties holding the secret key. If either party exposes the key, data integrity and authenticity are compromised, and arguably were never assured.
Man-in-the-Middle Attacks: An attacker intercepts and potentially alters the communication between two parties without them knowing.
Replay Attacks: An adversary captures legitimate encrypted data and later resends it, aiming to cause unauthorized actions or reveal information about the key.
Side-channel Attacks: Attackers gain information from the physical system performing the encryption, like power consumption or acoustic emissions.
Private Key Derivation: Theoretically, if an attacker has enough data and computational power, they might derive the private key from public components, although currently, this is practically impossible for strong asymmetric cryptography.
Forward Secrecy Violation: If long-term keys are reused and not regularly rotated, an attacker who obtains the private key can decrypt previously captured data, or forge signatures, for everything that key protected.
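Of the attacks above, replay is one that a valid signature alone does not stop, because the replayed message still verifies. One common mitigation, sketched here under assumptions, is to sign a timestamp and a one-time nonce together with the message. The names `sign_request`, `accept_request`, `MAX_AGE_SECONDS` and the in-memory `seen_nonces` set are all hypothetical; a real service would need shared, expiring storage for nonces.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = b"supersecretkey"  # demo key, shared by sender and receiver
MAX_AGE_SECONDS = 300           # hypothetical freshness window
seen_nonces = set()             # in-memory only; a real service needs shared storage


def sign_request(body: str, nonce: str, timestamp: int) -> str:
    """HMAC-SHA256 over timestamp, nonce and body (the nonce is hex, so it has no dots)."""
    payload = f"{timestamp}.{nonce}.{body}".encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def accept_request(body: str, nonce: str, timestamp: int, signature: str, now=None) -> bool:
    now = int(time.time()) if now is None else now
    expected = sign_request(body, nonce, timestamp)
    if not hmac.compare_digest(expected, signature):
        return False  # forged or altered
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False  # outside the freshness window
    if nonce in seen_nonces:
        return False  # already processed: a replay
    seen_nonces.add(nonce)
    return True


nonce = secrets.token_hex(16)
ts = int(time.time())
sig = sign_request("transfer 10 to Alice", nonce, ts)

first = accept_request("transfer 10 to Alice", nonce, ts, sig)
replayed = accept_request("transfer 10 to Alice", nonce, ts, sig)
print(first)     # True
print(replayed)  # False
```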
In Conclusion
While hashes and digital signatures may seem similar on the surface, they serve different purposes in the realm of security. A hash ensures data integrity, while a digital signature ensures both data integrity and sender authentication.
Understanding this difference is crucial for verification.
Verification is the only way you gain any of the benefits at all.