Web Security: Encoding and Hashing Guide
Encoding, hashing, and encryption are three fundamentally different operations that developers frequently confuse. Using the wrong operation in the wrong context creates security vulnerabilities — storing passwords with Base64 encoding instead of bcrypt hashing, transmitting sensitive data with URL encoding instead of TLS encryption, or using MD5 hashing for data integrity when SHA-256 is required. Each operation has a specific purpose, specific guarantees, and specific limitations that every developer must understand.
This guide provides a comprehensive, practical reference for choosing the right operation for every common security scenario, with code examples in JavaScript and Python, performance benchmarks, and a decision flowchart you can apply immediately.
The Three Operations
Encoding — Reversible Format Conversion
Encoding transforms data from one format to another for compatibility, not security. Anyone can decode encoded data without a key. The encoding algorithm is public, deterministic, and designed to be easily reversed.
Common encoding schemes:
- Base64 — Binary-to-text encoding for embedding images in CSS/HTML, transmitting binary data in JSON, and MIME email attachments. Uses 64 ASCII characters (A-Z, a-z, 0-9, +, /).
- URL Encoding (Percent-Encoding) — Encodes special characters in URLs (spaces become %20, & becomes %26) to comply with RFC 3986 URI syntax.
- HTML Entities — Encodes characters that have special meaning in HTML (< becomes &lt;, " becomes &quot;) to prevent browser interpretation as markup.
- UTF-8 — Encodes Unicode code points into 1-4 byte sequences for text storage and transmission.
// Base64 encoding — anyone can decode this
const encoded = btoa('secret data'); // "c2VjcmV0IGRhdGE="
const decoded = atob(encoded); // "secret data" — no key needed
// URL encoding
const urlEncoded = encodeURIComponent('hello world & more');
// "hello%20world%20%26%20more"
const urlDecoded = decodeURIComponent(urlEncoded);
// "hello world & more"
Important: Base64 is NOT encryption. Learn why.
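The same round trips can be sketched in Python with only the standard library. This is an illustrative sketch rather than part of the original examples, and the variable names are arbitrary. It shows that each decode step needs nothing but the public algorithm.

```python
import base64
from urllib.parse import quote, unquote

# Base64: bytes in, ASCII text out; reversing it needs no key.
raw = b"secret data"
b64_text = base64.b64encode(raw).decode("ascii")
assert base64.b64decode(b64_text) == raw

# Percent-encoding: safe="" also encodes "/" so the value fits
# inside a single query-string component.
url_text = quote("hello world & more", safe="")
assert unquote(url_text) == "hello world & more"

# UTF-8: one code point becomes 1 to 4 bytes.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in ("A", "é", "€", "😀")}

print(b64_text)      # c2VjcmV0IGRhdGE=
print(url_text)      # hello%20world%20%26%20more
print(utf8_lengths)  # {'A': 1, 'é': 2, '€': 3, '😀': 4}
```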
Hashing — One-Way Fingerprint
Hashing produces a fixed-size “fingerprint” (digest) from any input. It cannot be reversed — you cannot recover the original data from the hash. The same input always produces the same hash, but even a tiny change in the input produces a completely different hash (avalanche effect).
Common hash functions:
- SHA-256 — 256-bit digest for data integrity verification, checksums, digital signatures. Part of the SHA-2 family. Used in TLS and Bitcoin; Git still defaults to SHA-1 and offers SHA-256 as an optional object format.
- SHA-3 (Keccak) — Alternative to SHA-2 with a different internal structure. NIST standard since 2015.
- bcrypt — Password-specific hash with built-in salt and configurable cost factor. Deliberately slow to resist brute-force attacks.
- Argon2 — Winner of the 2015 Password Hashing Competition. Memory-hard, making GPU/ASIC attacks expensive. The current best practice for new applications.
- MD5 — 128-bit digest. Broken for security — collisions found in 2004. Only use for non-security checksums (file deduplication).
// SHA-256 hashing — same input always produces the same hash
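const digest = await crypto.subtle.digest('SHA-256',
  new TextEncoder().encode('Hello, World!')
);
// digest() resolves to an ArrayBuffer; convert it to hex for display
const hash = Array.from(new Uint8Array(digest))
  .map((b) => b.toString(16).padStart(2, '0'))
  .join('');
// → dffd6021bb2bd5b0... (64 hex characters, always the same)
// Avalanche effect: a tiny change gives a completely different hash
// "Hello, World!" → dffd6021bb2...
// "Hello, world!" → 315f5bdb76d... (only one letter's case changed)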
Generate hashes instantly with our Hash Generator.
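The three properties described above (fixed size, determinism, avalanche effect) can be checked with Python's `hashlib`. This is a minimal sketch; the helper names `sha256_hex` and `differing_bits` are made up for the illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text (UTF-8 encoded) as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

first = sha256_hex("Hello, World!")
second = sha256_hex("Hello, world!")  # only the case of one letter changed

print(len(first))                            # 64 hex characters = 256 bits
print(first == sha256_hex("Hello, World!"))  # True: deterministic
print(differing_bits(first, second))         # typically close to half of the 256 bits
```

A good hash flips each output bit with probability of about one half when any input bit changes, so the last number should land near 128.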
Encryption — Reversible with a Key
Encryption transforms data into an unreadable format that can only be decoded with the correct key. It provides confidentiality — without the key, the ciphertext is computationally infeasible to reverse.
Common encryption algorithms:
- AES-256-GCM — Symmetric encryption (same key encrypts and decrypts). The gold standard for data-at-rest and data-in-transit encryption. Used in TLS, disk encryption, and cloud storage.
- ChaCha20-Poly1305 — Symmetric encryption optimized for software implementations. Used in TLS 1.3, WireGuard VPN, and mobile devices.
- RSA-2048+ — Asymmetric encryption (public key encrypts, private key decrypts). Used for key exchange, digital signatures, and SSL certificates.
- ECDSA / Ed25519 — Elliptic-curve digital signature schemes. They sign data rather than encrypt it, but they are deployed alongside encryption in protocols such as TLS. Shorter keys than RSA at equivalent security.
// AES-256-GCM encryption — requires a key to decrypt
const key = await crypto.subtle.generateKey(
  { name: 'AES-GCM', length: 256 }, true, ['encrypt', 'decrypt']
);
const iv = crypto.getRandomValues(new Uint8Array(12));
const ciphertext = await crypto.subtle.encrypt(
  { name: 'AES-GCM', iv }, key, new TextEncoder().encode('sensitive data')
);
// ciphertext is unreadable without the key
Comprehensive Comparison
| Property | Encoding | Hashing | Encryption |
|---|---|---|---|
| Reversible? | ✅ Yes (no key needed) | ❌ No (one-way) | ✅ Yes (with correct key) |
| Purpose | Format compatibility | Integrity verification | Confidentiality |
| Security? | ❌ None | ⚠️ Integrity only | ✅ Confidentiality (plus integrity with authenticated modes such as GCM) |
| Requires a key? | ❌ No | ❌ No | ✅ Yes |
| Output size | ~33% larger than input (Base64) | Fixed (32 bytes for SHA-256) | ~Same as input + overhead |
| Deterministic? | ✅ Same input → same output | ✅ Same input → same hash | ❌ Random IV → different output each time |
| Speed | Very fast (100+ MB/s) | Fast (500+ MB/s for SHA-256) | Fast (1+ GB/s for AES-NI) |
| Common algorithms | Base64, URL, HTML entities | SHA-256, bcrypt, Argon2 | AES-256, RSA, ChaCha20 |
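The "Output size" row can be verified with a little arithmetic. Base64 maps every 3 input bytes to 4 output characters, so padded output is 4 × ⌈n / 3⌉ characters long, while a SHA-256 digest stays at 32 bytes for any input. The sketch below checks both claims; `base64_length` is a hypothetical helper name.

```python
import base64
import hashlib
import math

def base64_length(n_bytes: int) -> int:
    """Padded Base64 output length: every 3 input bytes become 4 characters."""
    return 4 * math.ceil(n_bytes / 3)

for size in (1, 3, 100, 3000):
    data = bytes(size)
    assert len(base64.b64encode(data)) == base64_length(size)
    assert len(hashlib.sha256(data).digest()) == 32  # fixed, whatever the input size

print(base64_length(3000))         # 4000
print(base64_length(3000) / 3000)  # 1.333..., the "~33% larger" in the table
```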
When to Use What — Decision Flowchart
What is your goal?
│
├─ STORE PASSWORDS
│ → Use bcrypt (cost 12+) or Argon2id
│ → NEVER use SHA-256, MD5, or Base64
│
├─ VERIFY DATA INTEGRITY (file checksum, commit hash)
│ → Use SHA-256 or SHA-3
│ → Avoid MD5 and SHA-1 (collision-vulnerable)
│
├─ PROTECT DATA FROM UNAUTHORIZED ACCESS
│ → Data in transit: Use TLS 1.3 (HTTPS)
│ → Data at rest: Use AES-256-GCM
│ → Key exchange: Use RSA-2048+ or ECDH
│
├─ EMBED BINARY DATA IN TEXT FORMAT
│ → Use Base64 encoding
│ → For URLs: Use Base64URL variant
│
├─ PREVENT XSS IN HTML OUTPUT
│ → Use HTML entity encoding
│ → Framework auto-escaping (React, Angular, Vue)
│
├─ BUILD SAFE URL QUERY STRINGS
│ → Use URL encoding (percent-encoding)
│
└─ SIGN DATA (prove authenticity without encrypting)
    → Use HMAC-SHA256 (shared secret key) or Ed25519 digital signatures (public/private key pair)
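The Base64URL variant mentioned in the flowchart exists because the standard alphabet contains `+` and `/`, both of which have their own meaning inside URLs. Base64URL substitutes `-` and `_`. A small sketch, with input bytes chosen purely so that the output uses those two characters:

```python
import base64

data = bytes([0xFB, 0xFF, 0xFE])  # chosen so the output uses the two special characters

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)  # +//+
print(url_safe)  # -__-

# Both variants decode to the same bytes.
assert base64.urlsafe_b64decode(url_safe) == base64.b64decode(standard) == data
```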
Practical Scenarios
| Scenario | Use | Algorithm | Tool |
|---|---|---|---|
| Embed image in CSS | Encoding | Base64 | Base64 Encoder |
| Store user passwords | Hashing | bcrypt (cost 12) | bcrypt Generator |
| Verify file integrity | Hashing | SHA-256 | Hash Generator |
| Prevent XSS attacks | Encoding | HTML entities | HTML Entity Encoder |
| Inspect auth tokens | Decoding | Base64URL | JWT Decoder |
| Build URL query strings | Encoding | Percent-encoding | URL Encoder |
| Protect API payloads | Encryption | AES-256-GCM over HTTPS | TLS configuration |
| Digital signatures | Signing | HMAC-SHA256 / Ed25519 | OpenSSL / crypto library |
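The "Inspect auth tokens" row deserves a caution: a JWT payload is only Base64URL-encoded, so anyone can read it, and reading it proves nothing about authenticity. The sketch below builds a throwaway HS256-style token and then decodes its payload without checking the signature. The helper names and the demo secret are hypothetical, and real applications should use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

def b64url_encode(raw: bytes) -> str:
    """Base64URL without padding, as JWTs use it."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def make_demo_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-style token for the demo (hypothetical helper)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"

def inspect_payload(token: str) -> dict:
    """Decode the payload WITHOUT verifying the signature: inspection only."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))

token = make_demo_token({"sub": "user-123", "role": "admin"}, b"demo-secret")
print(inspect_payload(token))  # {'sub': 'user-123', 'role': 'admin'}
```

Because `inspect_payload` never touches the secret, it would happily decode a forged token too. That is the difference between decoding and verifying.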
Algorithm Performance Comparison
Benchmarks on a modern machine (Apple M1, Node.js 20):
| Algorithm | Type | Speed | Output Size | Security Level |
|---|---|---|---|---|
| Base64 | Encoding | 2,500 MB/s | 133% of input | ❌ None |
| MD5 | Hash | 1,800 MB/s | 16 bytes (128 bits) | ❌ Broken |
| SHA-1 | Hash | 1,400 MB/s | 20 bytes (160 bits) | ⚠️ Deprecated |
| SHA-256 | Hash | 800 MB/s | 32 bytes (256 bits) | ✅ Secure |
| SHA-3-256 | Hash | 600 MB/s | 32 bytes (256 bits) | ✅ Secure |
| bcrypt (cost 12) | Password hash | 0.003 MB/s | 60 bytes | ✅ Secure |
| Argon2id | Password hash | 0.001 MB/s | Configurable | ✅ Best practice |
| AES-256-GCM | Encryption | 4,000 MB/s (hardware-accelerated) | Input + 28 bytes (12-byte IV + 16-byte tag) | ✅ Secure |
Note how bcrypt and Argon2 are intentionally slow — this is a feature, not a bug. Slowness makes brute-force password cracking computationally expensive.
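The gap is easy to observe directly. The sketch below times one SHA-256 call against one call to a deliberately slow key-derivation function. As an assumption for this illustration, scrypt stands in for bcrypt and Argon2 only because it ships with Python's `hashlib` (on builds linked against OpenSSL 1.1 or later), and the parameters are a common example setting, not a recommendation. Absolute numbers will vary by machine.

```python
import hashlib
import os
import time

def time_call(fn) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

password = b"correct horse battery staple"
salt = os.urandom(16)

fast = time_call(lambda: hashlib.sha256(salt + password).digest())
# scrypt stands in for bcrypt/Argon2 here only because it ships with hashlib.
slow = time_call(lambda: hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32))

print(f"SHA-256: {fast * 1e6:.1f} microseconds per guess")
print(f"scrypt:  {slow * 1e3:.1f} milliseconds per guess")
```

A legitimate login pays the slow cost once. An attacker pays it for every guess.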
Common Mistakes
Mistake 1: Using Base64 for “Security”
Base64 only changes the format. It provides zero confidentiality. An attacker can decode any Base64 string in microseconds. Read more: Base64 is Not Encryption.
// ❌ WRONG — provides no security
const hidden = btoa(JSON.stringify({ apiKey: 'sk_live_abc123' }));
// ✅ RIGHT — use environment variables + secrets manager
const apiKey = process.env.API_KEY; // Injected at runtime, never in code
Mistake 2: Using SHA-256 for Passwords
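SHA-256 is fast, and that is exactly the problem for password storage. An attacker with a modern GPU can compute on the order of 10 billion SHA-256 hashes per second. At that rate, every 8-character lowercase password (26^8 ≈ 2.1 × 10^11 candidates) can be tried in about 20 seconds, and even the full printable-ASCII space (95^8 ≈ 6.6 × 10^15 candidates) is exhausted in roughly a week.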
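How long an exhaustive search takes depends on the alphabet as well as the length: the keyspace is alphabet size raised to the password length, divided by the guess rate. The sketch below works that out for 8-character passwords, taking the 10 billion hashes per second figure as given. The function name is hypothetical.

```python
GUESSES_PER_SECOND = 10_000_000_000  # the 10 billion SHA-256 hashes/s figure quoted above

def seconds_to_exhaust(alphabet_size: int, length: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate

for label, alphabet in (("lowercase", 26), ("alphanumeric", 62), ("printable ASCII", 95)):
    seconds = seconds_to_exhaust(alphabet, 8)
    print(f"{label:>15}: {seconds:,.0f} s = {seconds / 3600:,.1f} h")
```

This comes to about 21 seconds for lowercase letters, about 6 hours for letters and digits, and about 7.7 days for all 95 printable ASCII characters. None of these is a comfortable margin against a single GPU.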
# ❌ WRONG — SHA-256 is too fast for passwords
import hashlib
password_hash = hashlib.sha256(b"password123").hexdigest()
# ✅ RIGHT — bcrypt with cost factor 12
import bcrypt
password_hash = bcrypt.hashpw(b"password123", bcrypt.gensalt(rounds=12))
Read more: bcrypt vs SHA-256.
Mistake 3: Skipping HTML Entity Encoding
Displaying user input without encoding HTML entities enables Cross-Site Scripting (XSS) attacks:
<!-- ❌ WRONG — user input rendered directly -->
<div>Welcome, ${username}</div>
<!-- If username is: <script>steal(cookies)</script> — XSS! -->
<!-- ✅ RIGHT — HTML entity encoding -->
<div>Welcome, ${escapeHtml(username)}</div>
<!-- <script> becomes &lt;script&gt;, so the browser renders it as text -->
Modern frameworks (React, Angular, Vue, Svelte) auto-escape by default. The risk is when you bypass auto-escaping with dangerouslySetInnerHTML, v-html, or [innerHTML].
Read more: XSS Prevention with HTML Entity Encoding.
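On the server side, the `escapeHtml` step above has a direct standard-library counterpart in Python, `html.escape`. A minimal sketch follows; `render_welcome` is a hypothetical template helper, and a real application would normally rely on its template engine's auto-escaping instead.

```python
import html

def render_welcome(username: str) -> str:
    """Hypothetical template helper: escape untrusted input before interpolation."""
    return f"<div>Welcome, {html.escape(username)}</div>"

print(render_welcome("<script>steal(cookies)</script>"))
# <div>Welcome, &lt;script&gt;steal(cookies)&lt;/script&gt;</div>
```

`html.escape` also converts both quote characters by default, which matters when the value ends up inside an HTML attribute.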
Mistake 4: Using MD5 for Anything Security-Related
MD5 has been cryptographically broken since 2004. Collisions (two different inputs producing the same hash) can be generated in seconds. Never use MD5 for:
- Password storage
- Digital signatures
- Certificate verification
- Any integrity check where an attacker could craft collisions
MD5 is acceptable only for non-security checksums (detecting accidental corruption, file deduplication).
Mistake 5: Hardcoding Encryption Keys
// ❌ WRONG — key visible in source code
const key = 'my-secret-encryption-key-12345';
const encrypted = encrypt(data, key);
// ✅ RIGHT — key from environment/secrets manager
const key = await secretsManager.getSecret('ENCRYPTION_KEY');
const encrypted = encrypt(data, key);
Encryption is only as secure as the key management. Store keys in HSMs (Hardware Security Modules), cloud KMS (AWS KMS, Google Cloud KMS), or secrets managers (HashiCorp Vault) — never in source code, configuration files, or environment variables on shared systems.
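Wherever the key comes from, the loading code should fail closed rather than fall back to a default. The sketch below assumes the key is delivered as a Base64 string under the name `ENCRYPTION_KEY` (as in the snippet above) in a mapping that a secrets manager or deployment platform has populated. The function name and the validation rules are assumptions for illustration.

```python
import base64
import os

KEY_ENV_VAR = "ENCRYPTION_KEY"  # hypothetical variable name, matching the snippet above

def load_key(env=os.environ) -> bytes:
    """Load a Base64-encoded 256-bit key from the environment, failing closed."""
    encoded = env.get(KEY_ENV_VAR)
    if not encoded:
        raise RuntimeError(f"{KEY_ENV_VAR} is not set; refusing to fall back to a default key")
    key = base64.b64decode(encoded, validate=True)
    if len(key) != 32:
        raise ValueError(f"expected a 32-byte key, got {len(key)} bytes")
    return key

# Demo only: a throwaway key generated at runtime rather than written in source.
demo_env = {KEY_ENV_VAR: base64.b64encode(os.urandom(32)).decode("ascii")}
print(len(load_key(demo_env)))  # 32
```

Base64 appears here only as a transport format for binary key material. It adds no protection, so the mapping itself must come from a trusted source.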
Frequently Asked Questions
Can I use SHA-256 with a salt for password storage?
While salting improves SHA-256 over unsalted use (preventing rainbow table attacks), SHA-256 is still too fast for password hashing. An attacker can compute billions of salted SHA-256 hashes per second on a modern GPU. Use bcrypt (cost 12+), Argon2id, or scrypt instead. All three are deliberately slow, and Argon2id and scrypt are also memory-hard.
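As a sketch of what the recommended approach looks like, the example below uses scrypt, since it is the one option of the three available in Python's standard library (on builds linked against OpenSSL 1.1 or later). The work factors are assumed example values, and the function names are hypothetical. bcrypt and Argon2id follow the same store-salt-then-recompute pattern through third-party packages.

```python
import hashlib
import hmac
import os

# Assumed work factors for illustration; tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); store both alongside the parameters."""
    salt = os.urandom(16)  # unique per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)

salt, stored = hash_password("password123")
print(verify_password("password123", salt, stored))  # True
print(verify_password("password124", salt, stored))  # False
```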
What is the difference between symmetric and asymmetric encryption?
Symmetric encryption (AES) uses the same key for both encryption and decryption, so both parties must share the key securely. Asymmetric encryption (RSA) uses a key pair: a public key (shared freely) for encryption and a private key (kept secret) for decryption. In practice, asymmetric cryptography is used to establish a shared symmetric key, and symmetric encryption then handles the actual data. TLS follows this hybrid pattern; TLS 1.3 establishes the key with ephemeral Diffie-Hellman key agreement rather than RSA encryption.
Is HMAC the same as hashing?
HMAC (Hash-based Message Authentication Code) combines a hash function with a secret key to provide both integrity and authenticity. Regular hashing (SHA-256) only provides integrity — anyone can compute the hash. HMAC proves that the message was created by someone who knows the secret key. Use HMAC for API request signing, webhook verification, and message authentication.
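A webhook-style check illustrates the difference. In this sketch the secret, the message body, and the function names are all hypothetical. The parts that carry over to real code are `hmac.new` with SHA-256 and a constant-time comparison.

```python
import hashlib
import hmac

def sign(secret: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of message as hex."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(secret: bytes, message: bytes, received_tag: str) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(secret, message), received_tag)

secret = b"shared-webhook-secret"  # hypothetical; load from a secrets manager in practice
body = b'{"event": "payment.succeeded"}'
tag = sign(secret, body)

print(verify(secret, body, tag))           # True
print(verify(secret, body + b" ", tag))    # False: message was altered
print(verify(b"wrong-secret", body, tag))  # False: signer lacks the key
```

A plain SHA-256 digest would pass only the first two checks. Without a key, an attacker who alters the body can simply recompute the digest.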
When should I use SHA-3 instead of SHA-256?
SHA-256 (SHA-2 family) and SHA-3 are both secure and approved by NIST. Use SHA-256 for compatibility: it has wider library availability and hardware support (the Intel SHA Extensions and the ARMv8 Cryptography Extensions accelerate SHA-256, whereas AES-NI covers only AES). Use SHA-3 when regulations specifically require it, or when you want defense-in-depth against potential future attacks on the SHA-2 construction.
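Switching between the two is usually a one-line change in code. A short Python sketch with `hashlib`, which has included the SHA-3 functions since Python 3.6:

```python
import hashlib

message = b"Hello, World!"
sha2 = hashlib.sha256(message).digest()
sha3 = hashlib.sha3_256(message).digest()

print(len(sha2), len(sha3))  # 32 32
print(sha2 == sha3)          # False: different constructions, unrelated outputs
```

The digests have the same length but are not interchangeable, so any stored hashes must record which algorithm produced them.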