Storage Encryption: A Cryptographer’s View Shai Halevi IBM Research September 11, 2008 SCN 2008 Motivation “You’re working on storage encryption? It must be the most boring thing in the world…” Anonymous Encryption is the most basic task in crypto We know what secure encryption means We have provably-secure schemes CCA-security, Authenticated encryption, … Even efficient ones What is left to research? September 11, 2008 SCN 2008 Cryptographically interesting problems with storage encryption Choosing the encryption scheme Managing keys and nonces “Transparent” vs. authenticated encryption Avoiding nonce re-use, wrapping keys, … Outside the model Circular encryption September 11, 2008 SCN 2008 Typical Storage Architecture Clients Network Storage devices My focus: encryption at the network/devices Several suchbelongs high-profile Typical threat model: data to the incidents in 2005 organization, encrypted to prevent unauthorized disclosure / modification E.g., encrypting tapes, lest they fall off the truck September 11, 2008 SCN 2008 Two Types of Encryption “Transparent” (length-preserving) Used to add encryption to existing data-paths E.g., software hard-disk encryption, or a bump-in-a-wire encryption box Authenticated (length-increasing) Used when the “storage medium” allows records of flexible-length E.g., tape encryption, client-side encryption, etc. September 11, 2008 SCN 2008 Transparent encryption Storage units (“sectors”): just a pile of bits, no semantics (partially) trusted / untrusted Clients Inputs: keys, plaintext, location in storage Output: ciphertext September 11, 2008 Encryption Module Key Mngmnt SCN 2008 Storage Inherent limitations Random access u Each “sector” encrypted separately u Can mix and match ⇒ C1 C2 … Cm is encryption of P1 P2 … Pm C1’ C2’ …Cm’ is encryption of P1’ P2’… Pm’ C1 C2’ …Cm is encryption of P1 P2’ … Pm Length preserving u Deterministic u When re-encrypting a file, we can see what sectors have changed Length preserving u No authentication u Any ciphertext sector is decrypted as “something” September 11, 2008 SCN 2008 The best we can do: Tweakable Encryption [LRW02] Enciphering/deciphering routines: ciphertext = E(key, tweak, plaintext), plaintext = D(key, tweak, ciphertext) ciphertext-length = plaintext-length key is fixed and secret tweak is arbitrary (even adversarially chosen) Should look like A block cipher with block-size = plaintext-length Different tweaks look like independent keys September 11, 2008 SCN 2008 Narrow vs. Wide Blocks Narrow-blocks Wide-block Each 16-byte block is encrypted separately (think ECB) The entire sector is encrypted together Change anywhere effect entire ciphertext Quantitative, not qualitative difference They are the same if you use 16-byte sectors September 11, 2008 SCN 2008 Some Wide-Block Modes September 11, 2008 SCN 2008 CMC [HR03] P1 T P2 P3 P4 EK’ EK PPP1 M EK PPP2 M CCC4 EK PPP3 M CCC3 EK PPP4 M CCC2 CCC1 EK EK EK EK C4 C3 C2 C1 September 11, 2008 EK = AES with key K T – tweak M = 2(PPP1 ⊕PPP4) = 2(CCC1⊕CCC4) SCN 2008 Mult. In GF(2128) EME [HR04] EK = AES with key K L = another key T = tweak P1 L 2L EK P4 4L 8L EK PPP2 EK PPP3 PPP4 MP EK M = MP ⊕ MC P3 EK PPP1 T⊕ΣPPP P2 2M 4M 8M MC T⊕ΣCCC CCC1 EME∗ [H04] is an extension for sectors longer than 2KB September 11, 2008 CCC2 EK L SCN 2008 EK 2L C1 CCC3 EK 4L C2 CCC4 EK 8L C3 C4 XCB [MF04] A B HCTR [WFW05] T A B T HCH [CS06] A B T,len PRF E Hash Hash CTR Hash E CTR E Hash E September 11, 2008 Hash SCN 2008 E CTR Hash x2 Naor-Reingold Modes: TET [H07], HEH [S07] p1 p2 pm-1 pm Invertible “universal hashing” ECB encryption Invertible “universal hashing” c1 c2 cm-1 cm “Universal hashing” ensures no collisions in the input to the ECB layer September 11, 2008 SCN 2008 Microsoft BitLocker [F06] Not quite an AES mode of operation Ad-hoc “mixing” EK’ EK EK EK EK “Block-cipher-like” mixing Detailed analysis of resistance to attacks, but no reduction to the security of AES September 11, 2008 SCN 2008 Some Narrow-Block Modes September 11, 2008 SCN 2008 LRW Mode [LRW02] P1 EK - AES with key K L - another key L×T in GF(2n) L×(T+1) EK EK C1 C2 A handy optimization: L×T P2 Think about using tweaks T, T+1, T+2, … Once L×T is computed, easy to compute L×(T+1), L×(T+2), … IEEE 1619 intended to standardize this mode September 11, 2008 SCN 2008 What’s Wrong with LRW? Fails when “encrypting its own key” L×(T+1) L L×T 0 L×(T+1) EK X C1= X + L×T Extract L = C1−C2 September 11, 2008 SCN 2008 EK C2= X + L×(T+1) (?) Is This a Problem in Practice? Lively argument in the 1619 mailing list “No one in their right mind will ever do that” Turns out that “encrypting own key” can happen, e.g., in Windows Vista™ A driver does sector-level encryption On hibernate, driver itself stored to disk So a different mode (based on Rogaway’s XEX) was chosen for the standard September 11, 2008 SCN 2008 XTS Mode [Ro04] P1 T T* Tweak is (T,i) 2T* P3 4T* EK’ EK EK EK T* C1 C2 C3 Similar handy optimization T∗=EK’(T), Ti∗=2i×T∗ C = Ti∗⊕EK(P⊕Ti∗) P2 (T,0), (T,1), (T,2), … for sequential blocks About as efficient asWe’ll LRW talk later about circular The attack from before doessecurity not work How do we know that there aren’t other attacks in this vein? September 11, 2008 SCN 2008 Remaining problems Narrow vs. wide-block in practice Wide-block is 2-3 times more expensive Limit attacker to more coarse granularity Traffic-analysis/malleability of whole sectors, rather than each 16-byte block Does this add security in practice? Security beyond the birthday bound With big disk-arrays in the petabytes, q2/2128 may get too close for comfort September 11, 2008 SCN 2008 Authenticated Encryption Each record is stored with a nonce (IV), and an authentication tag EncK(P) = <IV, C, tag> DecK(IV, C, tag) = P / fail IVs must be “fresh” Encrypting the same plaintext twice results in a different ciphertexts September 11, 2008 SCN 2008 Many “standard” Encryption Modes Two-Pass Modes Encrypt-then-authenticate (e.g., GCM [MV05]) Authenticate-then-encrypt (e.g., CCM [WHF03]) Choose IV, C=EK(IV, P), tag=MACK’(IV,C) E: AES-based encryption, MAC: HMAC or others Choose IV, t=MACK’(IV,P), C=EK(IV, P, t) One-Pass Modes (IAPM [J01], OCB [R01],…) Compute CTXT & MAC together, more efficient None is used in practice today Due to patent issues September 11, 2008 SCN 2008 Whence Cometh thy Nonce? Re-using the same (key,IV) pair to encrypt different records is a security violation Especially in schemes based on CTR mode Especially2 in GCM mode Re-using (key,IV) is the same as two-time-pad Re-using (key,IV) may leak the authentication key Avoiding nonce re-use may be tricky September 11, 2008 SCN 2008 Common Tape-Encryption Setting Key Same key can be Mngmnt served to several encryption modules They must avoid using the same (key,IV) pair Without much coordination Keys Encryption Module Encryption Module Clients Data September 11, 2008 SCN 2008 Encryption Module tapes tapes tapes Random Nonces? Some modes have 96-bit nonces (GCM) Is this enough? How many times can the same key be served? What if you use just one key for all your corporate tapes? September 11, 2008 SCN 2008 Systematic Nonces? E.g., use the module serial # in the nonce Reduces the IV space further Sensitive to mis-configuration Module must remember “the current nonce” Through reset, power-failures, crashes, … Using encryption modules from several different manufacturers? An organization may have two drives from IBM, one from HP, one from SUN, etc. September 11, 2008 SCN 2008 Better: Wrapped Keys The served key (from key-management) is only used as a key-encrypting-key (KEK) Module generates a “fresh” data key (DK) Use KEK to encrypt DK, store ciphertext on tape Use DK to encrypt data David Wheeler: All problems in computer science can be solved by another level of indirection… … but that usually will create another problem. September 11, 2008 SCN 2008 How to Wrap Keys? Using standard encryption (symmetric/pkey) Using “deterministic encryption” Need to worry again about fresh IVs / randomness E.g., ANS X9.102 draft standard [RS06]: Deterministic Authenticated Encryption Essentially “the strongest security possible with deterministic encryption” Similar to strong PRP, but need not be a bijection SIV mode: IV = PRFk1(DK), C = CTRk2(IV, DK) September 11, 2008 SCN 2008 More on Key-Wrapping [GH08] Some “secure schemes” are not DAE Secure key-wrap is just like secure encryption, except the plaintext is random DAE an overkill for wrapping encryption keys Rather than adversarially chosen Hash-then-Encrypt: “SIV-like” constructions IV = Hash(DK), C = ENC(IV, DK) Hash either keyed or not ENC any “standard encryption mode” September 11, 2008 SCN 2008 Hash-then-Encrypt Hash XOR Linear Universal 2nd preimage Encrypt CTR ECB * CBC ? Masked ECB/CBC XEX September 11, 2008 SCN 2008 * Remaining Problems Authenticated Encryption does not solve: “Replay attacks:” replace current record on medium with a previous one Re-ordering of records No good crypto solutions to either problem Merkel trees work, but they are too expensive Not clear that one can do better [DNRV08] September 11, 2008 SCN 2008 Back to “Key-Dependent Security” Adversary sees encryptions of the secret key Maybe even some functions of this key How to define security in this case? How to achieve it? Aside: The definitional issue was noted already in [GM84], but explicitly scoped out [CL01] had a “key-dependent-secure” public-key encryption in the ROM September 11, 2008 SCN 2008 [BRS01] Definitions Start from the “usual notions” q1 k Answerk(q1) q2 Answerk(q2) … g Answerk(g(k)) Let the attacker specify a function of the key September 11, 2008 SCN 2008 [BRS01] Construction Textbook scheme: Enck(m) = <r, fk(r)⊕m> With fk(x) = H(k|x) and H a random oracle, this is “key-dependent-secure” As usual: in lieu of a true random oracle, we can use, e.g., SHA1 fk(x) = SHA1-Compression(IV=k, M=x) This should be safe… September 11, 2008 SCN 2008 [HK07] Insecurity in Standard Model SHA1 follows the Davis-Meyer approach Roughly Compression(IV,M) = EM(IV)⊕IV E is a “block cipher” (easily invertible given M) SHA1 actually uses + rather than ⊕ But we will ignore that fact We get Enck(m) = <r, Er(k)⊕k⊕m> In particular Enck(k) = <r, Er(k)⊕k⊕k> Given <r,c> recover k = Er-1(c) September 11, 2008 SCN 2008 Key-dependent security w/o ROM? [HH’08]: Unlikely from “general assumptions” [BHHO’08]: But possible from DDH Think ElGamal Encryption: pk=(v,w=va), sk=a, Encpk(m)=<vr, m×wr> So Encpk(sk)=<vr, a×var> Security unlikely to follow from DDH What if we use sk=ua (u≠v)? We get security from DDH, but cannot decrypt… September 11, 2008 SCN 2008 Decrypting with “sk in the exponent”? Use single bits in the exponent for secret key Can recover b from vb pk = (v1 v2 … vm w=Π vibi) sk = (ub1 ub2 … ubm) Encpk(m) = (v1r v2r … vmr m×wr ) So Encpk(ubi) = (v1r v2r … vmr ubi×wr ) Thm: This is CPA-secure against encryptions of any affine function of the secret key [CCS08] build on this to get CCA-security September 11, 2008 SCN 2008 Morals to take away Applying crypto to real-world systems is fun Can even find interesting questions to look at 1st law of commercial crypto: “cryptosystems will be (ab)used beyond their security model” We still do not know everything there is to know about encryption Storage encryption is (a little) special Mostly: harder to get synchronization between encryptor and decryptor September 11, 2008 SCN 2008 Thank you September 11, 2008 SCN 2008