Why KEM-DEM Hybrid Encryption Matters for Post-Quantum Security
You cannot encrypt a file with Kyber768 directly.
Kyber is a Key Encapsulation Mechanism (KEM) — it produces a shared secret, not a ciphertext of your data. To build a complete encryption system, you combine a KEM with a symmetric cipher. This combination is called KEM-DEM (Key Encapsulation Mechanism + Data Encapsulation Mechanism), and it is the standard pattern for building practical encryption from any KEM.
This article explains why the KEM-DEM architecture exists, how Qpher implements it, and why each component is essential.
The KEM-DEM Pattern
A KEM-DEM encryption system has three components, each with a specific and proven role:
KEM (Key Encapsulation): Securely establishes a shared secret between two parties. This is the quantum-safe part: Kyber768 is designed to resist both classical and known quantum attacks.
KDF (Key Derivation): Derives a symmetric encryption key from the shared secret. HKDF-SHA256 extracts uniform key material from the shared secret and expands it under a context-specific info string (domain separation), so the same shared secret never yields the same AES key in two different contexts.
DEM (Data Encapsulation): Encrypts your actual data using the derived symmetric key. AES-256-GCM provides authenticated encryption — both confidentiality and tamper detection.
Why this separation? Each component has a single, well-defined security goal and has been studied extensively on its own. Composing such components through clean interfaces yields a system whose security properties are analyzable and trustworthy. Monolithic encryption designs, where key exchange and data encryption are intertwined, are harder to analyze and more prone to subtle vulnerabilities.
The analogy: KEM is a sealed envelope containing a fresh house key. KDF is cutting a copy from that key for a specific door. DEM is using that copy to lock the house.
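The three roles above can be sketched as a composition of three interchangeable functions. This is a structural illustration only: the type aliases, `HybridCiphertext`, and `hybrid_encrypt` are hypothetical names chosen for this example and do not describe Qpher's internal code or API.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interfaces for illustration; not Qpher's actual API.
Encapsulate = Callable[[bytes], Tuple[bytes, bytes]]  # public_key -> (kem_ciphertext, shared_secret)
DeriveKey = Callable[[bytes, bytes], bytes]           # (shared_secret, info) -> symmetric key
Seal = Callable[[bytes, bytes], bytes]                # (key, plaintext) -> authenticated ciphertext


@dataclass(frozen=True)
class HybridCiphertext:
    kem_ciphertext: bytes  # lets the private-key holder recover the shared secret
    dem_ciphertext: bytes  # the data, encrypted under the derived key


def hybrid_encrypt(
    public_key: bytes,
    plaintext: bytes,
    info: bytes,
    encapsulate: Encapsulate,
    derive_key: DeriveKey,
    seal: Seal,
) -> HybridCiphertext:
    # 1. KEM: establish a fresh shared secret for this message.
    kem_ciphertext, shared_secret = encapsulate(public_key)
    # 2. KDF: turn the shared secret into a context-bound symmetric key.
    key = derive_key(shared_secret, info)
    # 3. DEM: encrypt the actual data with the derived key.
    return HybridCiphertext(kem_ciphertext, seal(key, plaintext))
```

Because each stage is passed in as a plain function, any one of them can be swapped or tested in isolation, which is the practical benefit of the separation described above.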
Qpher's Implementation
When you call POST /api/v1/kem/encrypt, Qpher executes the full KEM-DEM pipeline internally:
Step 1: Kyber768 Encapsulation
Qpher generates a fresh Kyber768 ciphertext and shared secret for every encrypt call. The 32-byte shared secret exists only in memory during the operation. It is never stored, never logged, and never returned in the API response.
Step 2: HKDF-SHA256 Key Derivation
The shared secret is fed into HKDF-SHA256 with a domain-specific info string. This produces a 256-bit AES key. The domain separation ensures that even if the same shared secret were somehow reused (it is not — each call generates a fresh one), the derived keys would differ across contexts.
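To make the domain-separation point concrete, here is a minimal sketch of HKDF-SHA256 (RFC 5869) built from Python's standard library. The info strings are invented for illustration, since the labels Qpher actually uses are not given here, and the random bytes merely stand in for a Kyber768 shared secret.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an absent salt defaults to hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || counter)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


shared_secret = os.urandom(32)  # stand-in for a 32-byte Kyber768 shared secret

# Hypothetical info strings, chosen only to show two distinct contexts.
key_a = hkdf_sha256(shared_secret, info=b"example/aes-256-gcm/v1")
key_b = hkdf_sha256(shared_secret, info=b"example/other-context/v1")
```

Both keys are 32 bytes (256 bits) and come from the same shared secret, yet they differ because the info strings differ. That is the property the text relies on.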
Step 3: AES-256-GCM Encryption
Your plaintext is encrypted with AES-256-GCM using the derived key and a random 12-byte IV (nonce). GCM mode provides authenticated encryption: the ciphertext includes a 16-byte authentication tag that detects any tampering. If even a single bit of the ciphertext is modified, decryption fails.
The resulting ciphertext has this structure:
| Component | Size | Purpose |
|---|---|---|
| KEM ciphertext | 1,088 bytes | Kyber768 encapsulated shared secret |
| IV (nonce) | 12 bytes | AES-GCM initialization vector |
| AES ciphertext | = plaintext size | Your encrypted data |
| GCM auth tag | 16 bytes | Tamper detection |
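As a rough illustration of this layout, the sketch below splits a ciphertext blob into its four parts. It assumes the components are simply concatenated in the order the table lists them, which the table suggests but does not guarantee. `Envelope` and `split_envelope` are hypothetical names, not part of any Qpher SDK.

```python
from typing import NamedTuple

KEM_CT_LEN = 1088  # Kyber768 ciphertext
IV_LEN = 12        # AES-GCM nonce
TAG_LEN = 16       # GCM authentication tag


class Envelope(NamedTuple):
    kem_ciphertext: bytes
    iv: bytes
    aes_ciphertext: bytes
    tag: bytes


def split_envelope(blob: bytes) -> Envelope:
    """Split a blob into its parts, assuming table-order concatenation."""
    if len(blob) < KEM_CT_LEN + IV_LEN + TAG_LEN:
        raise ValueError("blob is shorter than the fixed-size fields")
    iv_start = KEM_CT_LEN
    body_start = iv_start + IV_LEN
    tag_start = len(blob) - TAG_LEN
    return Envelope(
        kem_ciphertext=blob[:iv_start],
        iv=blob[iv_start:body_start],
        aes_ciphertext=blob[body_start:tag_start],  # same length as the plaintext
        tag=blob[tag_start:],
    )
```

Only the AES ciphertext varies in length; the other three fields are fixed, so the plaintext size can be read off as the blob length minus 1,116 bytes.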
The shared secret is ephemeral: once the encrypt operation completes, it is securely erased from memory.
Why Not Just AES?
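AES-256 is considered quantum-resistant: Grover's algorithm at best halves its effective key strength, leaving roughly 128 bits of security. So why not just use AES directly?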
AES is a symmetric cipher — both the encryptor and decryptor must possess the same key. Before you can encrypt anything, you need a way to agree on the key with the other party. This is the key distribution problem, and it is the fundamental reason public-key cryptography exists.
Traditionally, key agreement happens through RSA or ECDH. Your TLS connection does exactly this: ECDH establishes a shared secret, from which the AES session keys are derived. The AES part is fine. The ECDH part is quantum-vulnerable.
KEM-DEM replaces the ECDH step with Kyber768. The AES step stays exactly the same. The architecture is identical to what TLS has always done — only the key exchange mechanism is upgraded to resist quantum attacks.
If you already have a pre-shared key (both parties have the same secret through some out-of-band mechanism), you can use AES directly without KEM. But in practice, pre-shared keys are impractical for most applications. You need a way to establish keys dynamically and securely, which is precisely what KEM provides.
Why Not Just Kyber?
Kyber encapsulates exactly one thing: a fixed-size 32-byte shared secret. That is the mathematical operation defined by the KEM primitive. You cannot feed a 1 MB file into Kyber and get a ciphertext back; the algorithm is not designed for that.
This is not a limitation of Kyber specifically. It is how public-key encryption has worked in practice for decades:
- RSA "encryption" of real data is, in practice, RSA key transport plus a symmetric cipher such as AES.
- ECIES is ECDH key agreement + KDF + authenticated symmetric encryption.
- PGP uses RSA or ECDH to encrypt a session key, then a symmetric cipher such as AES to encrypt the message.
KEM-DEM makes this pattern explicit and clean. Rather than bolting arbitrary-length encryption onto a key exchange primitive (which is error-prone), KEM-DEM cleanly separates the two responsibilities. The KEM handles quantum-safe key establishment. The DEM handles efficient bulk encryption.
Think of KEM as the quantum-safe evolution of the RSA/ECDH key exchange step in TLS. The AES part stays exactly the same. If you understand how TLS works, you already understand KEM-DEM — the only difference is which algorithm negotiates the session key.
Ephemeral Shared Secrets
Each encrypt call generates a fresh shared secret. This has important security properties:
Non-deterministic by default. Encrypting the same plaintext twice produces different ciphertexts, because each call uses a new shared secret and a new random IV. An attacker cannot determine whether two ciphertexts contain the same plaintext.
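Independence between ciphertexts. Each encryption uses its own shared secret, so recovering the secret behind one ciphertext does not help with any other. This is not full forward secrecy, however: anyone who obtains the Kyber768 private key can decapsulate every ciphertext encrypted to the matching public key.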
No secret reuse. The shared secret is never stored or reused. It exists in memory for the duration of a single API call, then is securely erased.
For the rare cases where you need deterministic output (content-addressable storage, deduplication), Qpher supports an opt-in mode=deterministic with an explicit salt parameter. See the Deterministic Encryption guide for details and security considerations.
Practical Implications
Ciphertext Overhead
Each encryption adds a fixed 1,116 bytes of overhead, regardless of plaintext size:
- 1,088 bytes for the KEM ciphertext
- 12 bytes for the IV
- 16 bytes for the GCM authentication tag
For different data sizes, the overhead impact varies dramatically:
| Plaintext Size | Ciphertext Size | Overhead |
|---|---|---|
| 100 bytes | ~1,216 bytes | ~1,116% |
| 1 KB | ~2,140 bytes | ~109% |
| 10 KB | ~11,356 bytes | ~11% |
| 100 KB | ~103,516 bytes | ~1.1% |
| 1 MB | ~1,049,692 bytes | ~0.1% |
For small payloads the fixed overhead dominates: a 1 KB payload roughly doubles in size. From about 10 KB upward the overhead falls to around 11% or less, and for larger data (files, database backups) it is negligible.
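The table follows directly from the fixed 1,116-byte overhead. A short sketch that recomputes it, assuming 1 KB = 1,024 bytes as the table does (the function names are illustrative):

```python
FIXED_OVERHEAD = 1088 + 12 + 16  # KEM ciphertext + IV + GCM tag = 1,116 bytes


def ciphertext_size(plaintext_size: int) -> int:
    """Total output size in bytes for a plaintext of the given size."""
    return plaintext_size + FIXED_OVERHEAD


def overhead_percent(plaintext_size: int) -> float:
    """Fixed overhead as a percentage of the plaintext size."""
    return 100 * FIXED_OVERHEAD / plaintext_size


sizes = [
    ("100 B", 100),
    ("1 KB", 1024),
    ("10 KB", 10 * 1024),
    ("100 KB", 100 * 1024),
    ("1 MB", 1024 * 1024),
]
for label, size in sizes:
    print(f"{label:>7}: {ciphertext_size(size):>9,} bytes, overhead {overhead_percent(size):.1f}%")
```

Plugging in your own typical payload size gives a quick estimate of the storage and bandwidth cost before integrating.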
Performance
The KEM step dominates the latency budget (~8 ms for the Kyber768 encapsulation). AES-256-GCM encryption is hardware-accelerated on modern CPUs and adds negligible time even for large payloads. The total API call (including network round-trip, authentication, KEM, and AES) completes in under 15 ms at the 95th percentile.
Next Steps
You now understand the complete Qpher encryption architecture: Kyber768 for quantum-safe key establishment, HKDF for key derivation, and AES-256-GCM for authenticated data encryption.
The final article in the intermediate path puts everything together with hands-on exercises in the Qpher Playground — encrypting, decrypting, signing, verifying, and rotating keys interactively.
For integration guidance, see the Encrypt Data guide or the Deterministic Encryption guide.