Chapman & Hall/CRC CRYPTOGRAPHY AND NETWORK SECURITY
INTRODUCTION TO MODERN CRYPTOGRAPHY
Second Edition
Jonathan Katz Yehuda Lindell
CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY
Series Editor: Douglas R. Stinson
Published Titles
Lidong Chen and Guang Gong, Communication System Security
Shiu-Kai Chin and Susan Older, Access Control, Security, and Trust: A Logical Approach
M. Jason Hinek, Cryptanalysis of RSA and Its Variants
Antoine Joux, Algorithmic Cryptanalysis
Jonathan Katz and Yehuda Lindell, Introduction to Modern Cryptography, Second Edition
Sankar K. Pal, Alfredo Petrosino, and Lucia Maddalena, Handbook on Soft Computing for Video Surveillance
Burton Rosenberg, Handbook of Financial Cryptography and Security
Forthcoming Titles
Maria Isabel Vasco, Spyros Magliveras, and Rainer Steinwandt, Group Theoretic Cryptography
Jonathan Katz
University of Maryland College Park, MD, USA
Yehuda Lindell
Bar-Ilan University Ramat Gan, Israel
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742
© 2015 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Version Date: 20140915
International Standard Book Number-13: 978-1-4665-7027-6 (eBook – PDF)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com
Contents

Preface

I Introduction and Classical Cryptography

1 Introduction
  1.1 Cryptography and Modern Cryptography
  1.2 The Setting of Private-Key Encryption
  1.3 Historical Ciphers and Their Cryptanalysis
  1.4 Principles of Modern Cryptography
    1.4.1 Principle 1 – Formal Definitions
    1.4.2 Principle 2 – Precise Assumptions
    1.4.3 Principle 3 – Proofs of Security
    1.4.4 Provable Security and Real-World Security
  References and Additional Reading
  Exercises

2 Perfectly Secret Encryption
  2.1 Definitions
  2.2 The One-Time Pad
  2.3 Limitations of Perfect Secrecy
  2.4 *Shannon’s Theorem
  References and Additional Reading
  Exercises

II Private-Key (Symmetric) Cryptography

3 Private-Key Encryption
  3.1 Computational Security
    3.1.1 The Concrete Approach
    3.1.2 The Asymptotic Approach
  3.2 Defining Computationally Secure Encryption
    3.2.1 The Basic Definition of Security
    3.2.2 *Semantic Security
  3.3 Constructing Secure Encryption Schemes
    3.3.1 Pseudorandom Generators and Stream Ciphers
    3.3.2 Proofs by Reduction
    3.3.3 A Secure Fixed-Length Encryption Scheme
  3.4 Stronger Security Notions
    3.4.1 Security for Multiple Encryptions
    3.4.2 Chosen-Plaintext Attacks and CPA-Security
  3.5 Constructing CPA-Secure Encryption Schemes
    3.5.1 Pseudorandom Functions and Block Ciphers
    3.5.2 CPA-Secure Encryption from Pseudorandom Functions
  3.6 Modes of Operation
    3.6.1 Stream-Cipher Modes of Operation
    3.6.2 Block-Cipher Modes of Operation
  3.7 Chosen-Ciphertext Attacks
    3.7.1 Defining CCA-Security
    3.7.2 Padding-Oracle Attacks
  References and Additional Reading
  Exercises

4 Message Authentication Codes
  4.1 Message Integrity
    4.1.1 Secrecy vs. Integrity
    4.1.2 Encryption vs. Message Authentication
  4.2 Message Authentication Codes – Definitions
  4.3 Constructing Secure Message Authentication Codes
    4.3.1 A Fixed-Length MAC
    4.3.2 Domain Extension for MACs
  4.4 CBC-MAC
    4.4.1 The Basic Construction
    4.4.2 *Proof of Security
  4.5 Authenticated Encryption
    4.5.1 Definitions
    4.5.2 Generic Constructions
    4.5.3 Secure Communication Sessions
    4.5.4 CCA-Secure Encryption
  4.6 *Information-Theoretic MACs
    4.6.1 Constructing Information-Theoretic MACs
    4.6.2 Limitations on Information-Theoretic MACs
  References and Additional Reading
  Exercises

5 Hash Functions and Applications
  5.1 Definitions
    5.1.1 Collision Resistance
    5.1.2 Weaker Notions of Security
  5.2 Domain Extension: The Merkle–Damgård Transform
  5.3 Message Authentication Using Hash Functions
    5.3.1 Hash-and-MAC
    5.3.2 HMAC
  5.4 Generic Attacks on Hash Functions
    5.4.1 Birthday Attacks for Finding Collisions
    5.4.2 Small-Space Birthday Attacks
    5.4.3 *Time/Space Tradeoffs for Inverting Functions
  5.5 The Random-Oracle Model
    5.5.1 The Random-Oracle Model in Detail
    5.5.2 Is the Random-Oracle Methodology Sound?
  5.6 Additional Applications of Hash Functions
    5.6.1 Fingerprinting and Deduplication
    5.6.2 Merkle Trees
    5.6.3 Password Hashing
    5.6.4 Key Derivation
    5.6.5 Commitment Schemes
  References and Additional Reading
  Exercises

6 Practical Constructions of Symmetric-Key Primitives
  6.1 Stream Ciphers
    6.1.1 Linear-Feedback Shift Registers
    6.1.2 Adding Nonlinearity
    6.1.3 Trivium
    6.1.4 RC4
  6.2 Block Ciphers
    6.2.1 Substitution-Permutation Networks
    6.2.2 Feistel Networks
    6.2.3 DES – The Data Encryption Standard
    6.2.4 3DES: Increasing the Key Length of a Block Cipher
    6.2.5 AES – The Advanced Encryption Standard
    6.2.6 *Differential and Linear Cryptanalysis
  6.3 Hash Functions
    6.3.1 Hash Functions from Block Ciphers
    6.3.2 MD5
    6.3.3 SHA-0, SHA-1, and SHA-2
    6.3.4 SHA-3 (Keccak)
  References and Additional Reading
  Exercises

7 *Theoretical Constructions of Symmetric-Key Primitives
  7.1 One-Way Functions
    7.1.1 Definitions
    7.1.2 Candidate One-Way Functions
    7.1.3 Hard-Core Predicates
  7.2 From One-Way Functions to Pseudorandomness
  7.3 Hard-Core Predicates from One-Way Functions
    7.3.1 A Simple Case
    7.3.2 A More Involved Case
    7.3.3 The Full Proof
  7.4 Constructing Pseudorandom Generators
    7.4.1 Pseudorandom Generators with Minimal Expansion
    7.4.2 Increasing the Expansion Factor
  7.5 Constructing Pseudorandom Functions
  7.6 Constructing (Strong) Pseudorandom Permutations
  7.7 Assumptions for Private-Key Cryptography
  7.8 Computational Indistinguishability
  References and Additional Reading
  Exercises

III Public-Key (Asymmetric) Cryptography

8 Number Theory and Cryptographic Hardness Assumptions
  8.1 Preliminaries and Basic Group Theory
    8.1.1 Primes and Divisibility
    8.1.2 Modular Arithmetic
    8.1.3 Groups
    8.1.4 The Group Z*_N
    8.1.5 *Isomorphisms and the Chinese Remainder Theorem
  8.2 Primes, Factoring, and RSA
    8.2.1 Generating Random Primes
    8.2.2 *Primality Testing
    8.2.3 The Factoring Assumption
    8.2.4 The RSA Assumption
    8.2.5 *Relating the RSA and Factoring Assumptions
  8.3 Cryptographic Assumptions in Cyclic Groups
    8.3.1 Cyclic Groups and Generators
    8.3.2 The Discrete-Logarithm/Diffie–Hellman Assumptions
    8.3.3 Working in (Subgroups of) Z*_p
    8.3.4 Elliptic Curves
  8.4 *Cryptographic Applications
    8.4.1 One-Way Functions and Permutations
    8.4.2 Constructing Collision-Resistant Hash Functions
  References and Additional Reading
  Exercises

9 *Algorithms for Factoring and Computing Discrete Logarithms
  9.1 Algorithms for Factoring
    9.1.1 Pollard’s p − 1 Algorithm
    9.1.2 Pollard’s Rho Algorithm
    9.1.3 The Quadratic Sieve Algorithm
  9.2 Algorithms for Computing Discrete Logarithms
    9.2.1 The Pohlig–Hellman Algorithm
    9.2.2 The Baby-Step/Giant-Step Algorithm
    9.2.3 Discrete Logarithms from Collisions
    9.2.4 The Index Calculus Algorithm
  9.3 Recommended Key Lengths
  References and Additional Reading
  Exercises

10 Key Management and the Public-Key Revolution
  10.1 Key Distribution and Key Management
  10.2 A Partial Solution: Key-Distribution Centers
  10.3 Key Exchange and the Diffie–Hellman Protocol
  10.4 The Public-Key Revolution
  References and Additional Reading
  Exercises

11 Public-Key Encryption
  11.1 Public-Key Encryption – An Overview
  11.2 Definitions
    11.2.1 Security against Chosen-Plaintext Attacks
    11.2.2 Multiple Encryptions
    11.2.3 Security against Chosen-Ciphertext Attacks
  11.3 Hybrid Encryption and the KEM/DEM Paradigm
    11.3.1 CPA-Security
    11.3.2 CCA-Security
  11.4 CDH/DDH-Based Encryption
    11.4.1 El Gamal Encryption
    11.4.2 DDH-Based Key Encapsulation
    11.4.3 *A CDH-Based KEM in the Random-Oracle Model
    11.4.4 Chosen-Ciphertext Security and DHIES/ECIES
  11.5 RSA Encryption
    11.5.1 Plain RSA
    11.5.2 Padded RSA and PKCS #1 v1.5
    11.5.3 *CPA-Secure Encryption without Random Oracles
    11.5.4 OAEP and RSA PKCS #1 v2.0
    11.5.5 *A CCA-Secure KEM in the Random-Oracle Model
    11.5.6 RSA Implementation Issues and Pitfalls
  References and Additional Reading
  Exercises

12 Digital Signature Schemes
  12.1 Digital Signatures – An Overview
  12.2 Definitions
  12.3 The Hash-and-Sign Paradigm
  12.4 RSA Signatures
    12.4.1 Plain RSA
    12.4.2 RSA-FDH and PKCS #1 v2.1
  12.5 Signatures from the Discrete-Logarithm Problem
    12.5.1 The Schnorr Signature Scheme
    12.5.2 DSA and ECDSA
  12.6 *Signatures from Hash Functions
    12.6.1 Lamport’s Signature Scheme
    12.6.2 Chain-Based Signatures
    12.6.3 Tree-Based Signatures
  12.7 *Certificates and Public-Key Infrastructures
  12.8 Putting It All Together – SSL/TLS
  12.9 *Signcryption
  References and Additional Reading
  Exercises

13 *Advanced Topics in Public-Key Encryption
  13.1 Public-Key Encryption from Trapdoor Permutations
    13.1.1 Trapdoor Permutations
    13.1.2 Public-Key Encryption from Trapdoor Permutations
  13.2 The Paillier Encryption Scheme
    13.2.1 The Structure of Z*_{N^2}
    13.2.2 The Paillier Encryption Scheme
    13.2.3 Homomorphic Encryption
  13.3 Secret Sharing and Threshold Encryption
    13.3.1 Secret Sharing
    13.3.2 Verifiable Secret Sharing
    13.3.3 Threshold Encryption and Electronic Voting
  13.4 The Goldwasser–Micali Encryption Scheme
    13.4.1 Quadratic Residues Modulo a Prime
    13.4.2 Quadratic Residues Modulo a Composite
    13.4.3 The Quadratic Residuosity Assumption
    13.4.4 The Goldwasser–Micali Encryption Scheme
  13.5 The Rabin Encryption Scheme
    13.5.1 Computing Modular Square Roots
    13.5.2 A Trapdoor Permutation Based on Factoring
    13.5.3 The Rabin Encryption Scheme
  References and Additional Reading
  Exercises

Index of Common Notation

Appendix A Mathematical Background
  A.1 Identities and Inequalities
  A.2 Asymptotic Notation
  A.3 Basic Probability
  A.4 The “Birthday” Problem
  A.5 *Finite Fields

Appendix B Basic Algorithmic Number Theory
  B.1 Integer Arithmetic
    B.1.1 Basic Operations
    B.1.2 The Euclidean and Extended Euclidean Algorithms
  B.2 Modular Arithmetic
    B.2.1 Basic Operations
    B.2.2 Computing Modular Inverses
    B.2.3 Modular Exponentiation
    B.2.4 *Montgomery Multiplication
    B.2.5 Choosing a Uniform Group Element
  B.3 *Finding a Generator of a Cyclic Group
    B.3.1 Group-Theoretic Background
    B.3.2 Efficient Algorithms
  References and Additional Reading
  Exercises

References

Index
Preface
The goal of our book remains the same as in the first edition: to present the basic paradigms and principles of modern cryptography to a general audience with a basic mathematics background. We have designed this book to serve as a textbook for undergraduate- or graduate-level courses in cryptography (in computer science, electrical engineering, or mathematics departments), as a general introduction suitable for self-study (especially for beginning graduate students), and as a reference for students, researchers, and practitioners.
There are numerous other cryptography textbooks available today, and the reader may rightly ask whether another book on the subject is needed. We would not have written this book—nor worked on revising it for the second edition—if the answer to that question were anything other than an unequivocal yes. What, in our opinion, distinguishes our book from other available books is that it provides a rigorous treatment of modern cryptography in an accessible manner appropriate for an introduction to the topic.
Our focus is on modern (post-1980s) cryptography, which is distinguished from classical cryptography by its emphasis on definitions, precise assumptions, and rigorous proofs of security. We briefly discuss each of these in turn (these principles are explored in greater detail in Chapter 1):
• The central role of definitions: A key intellectual contribution of modern cryptography has been the recognition that formal definitions of security are an essential first step in the design of any cryptographic primitive or protocol. The reason, in retrospect, is simple: if you don’t know what it is you are trying to achieve, how can you hope to know when you have achieved it? As we will see in this book, cryptographic definitions of security are quite strong and—at first glance—may appear impossible to achieve. One of the most amazing aspects of cryptography is that efficient constructions satisfying such strong definitions can be proven to exist (under rather mild assumptions).
• The importance of precise assumptions: As will be explained in Chapters 2 and 3, many cryptographic constructions cannot currently be proven secure in an unconditional sense. Security often relies, instead, on some widely believed (though unproven) assumption(s). The modern cryptographic approach dictates that any such assumption must be clearly stated and unambiguously defined. This not only allows for objective evaluation of the assumption but, more importantly, enables rigorous proofs of security as described next.
• The possibility of proofs of security: The previous two principles serve as the basis for the idea that cryptographic constructions can be proven secure with respect to clearly stated definitions of security and relative to well-defined cryptographic assumptions. This concept is the essence of modern cryptography, and is what has transformed the field from an art to a science.
The importance of this idea cannot be overemphasized. Historically, cryptographic schemes were designed in a largely ad hoc fashion, and were deemed to be secure if the designers themselves could not find any attacks. In contrast, modern cryptography advocates the design of schemes with formal, mathematical proofs of security in well-defined models. Such schemes are guaranteed to be secure unless the underlying assumption is false (or the security definition did not appropriately model the real-world security concerns). By relying on long-standing assumptions (e.g., the assumption that “factoring is hard”), it is thus possible to obtain schemes that are extremely unlikely to be broken.
A unified approach. The above principles of modern cryptography are relevant not only to the “theory of cryptography” community. The importance of precise definitions is, by now, widely understood and appreciated by developers and security engineers who use cryptographic tools to build secure systems, and rigorous proofs of security have become one of the requirements for cryptographic schemes to be standardized.
Changes in the Second Edition
In preparing the second edition, we have made a conscious effort to integrate a more practical perspective (without sacrificing a rigorous approach). This is reflected in a number of changes and additions we have made:
• We have increased our coverage of stream ciphers, introducing them as a variant of pseudorandom generators in Section 3.3.1, discussing stream-cipher modes of operation in Section 3.6.1, and describing modern stream-cipher design principles and examples in Section 6.1.
• We have emphasized the importance of authenticated encryption (see Section 4.5) and have added a section on secure communication sessions.
• We have moved our treatment of hash functions into its own chapter (Chapter 5), have included some standard applications of cryptographic hash functions (Section 5.6), and have added a section on hash-function design principles and widely used constructions (Section 6.3). We have also improved our treatment of birthday attacks (covering small-space birthday attacks in Section 5.4.2) and have added a discussion of rainbow tables and time/space tradeoffs (Section 5.4.3).
• We have included several important attacks on implementations of cryptography that arise in practice, including chosen-plaintext attacks on chained-CBC encryption (Section 3.6.2), padding-oracle attacks on CBC-mode encryption (Section 3.7.2), and timing attacks on MAC verification (Section 4.2).
• After much deliberation, we have decided to introduce the random-oracle model much earlier in the book (Section 5.5). This allows us to give a proper, integrated treatment of standardized, widely used public-key encryption and signature schemes in later chapters, instead of relegating them to second-class status in a chapter at the end of the book.
• We have strengthened our coverage of elliptic-curve cryptography (Section 8.3.4) and have added a discussion of its impact on recommended key lengths (Section 9.3).
• In the chapter on public-key encryption, we introduce the KEM/DEM paradigm as a form of hybrid encryption (see Section 11.3). We also cover DHIES/ECIES in addition to the RSA PKCS #1 standards.
• In the chapter on digital signatures, we now describe the construction of signatures from identification schemes using the Fiat–Shamir transform, with the Schnorr signature scheme as a prototypical example. We have also improved our coverage of DSA/ECDSA. We include brief discussions of SSL/TLS and signcryption, both of which serve as culminations of everything covered up to that point.
• In the “advanced topics” chapter, we have amplified our treatment of homomorphic encryption, and have included sections on secret sharing and threshold encryption.
Beyond the above, we have also edited the entire book to make extensive corrections as well as smaller adjustments, including more worked examples, to improve the exposition. Several additional exercises have also been added.
Guide to Using This Book
This section is intended primarily for instructors seeking to adopt this book for their course, though the student picking up this book on his or her own may also find it a useful overview.
Required background. We have structured the book so that the only formal prerequisite is a course on discrete mathematics. Even here we rely on very little material: we assume familiarity with basic (discrete) probability and modular arithmetic. Students reading this book are also expected to have had some exposure to algorithms, mainly to be comfortable reading pseudocode and to be familiar with big-O notation. Many of these concepts are reviewed in Appendix A and/or when first used in the book.
Notwithstanding the above, the book does use definitions, proofs, and abstract mathematical concepts, and therefore requires some mathematical maturity. In particular, the reader is assumed to have had some exposure to proofs at the college level, whether in an upper-level mathematics course or a course on discrete mathematics, algorithms, or computability theory.
Suggestions for course organization. The core material of this book, which we recommend should be covered in any introductory course on cryptography, consists of the following (in all cases, starred sections are excluded; more on this below):
• Introduction and Classical Cryptography: Chapters 1 and 2 discuss classical cryptography and set the stage for modern cryptography.
• Private-Key (Symmetric) Cryptography: Chapter 3 on private-key encryption, Chapter 4 on message authentication, and Chapter 5 on hash functions provide a thorough treatment of these topics.
We also highly recommend covering Section 6.2, which deals with block-cipher design; in our experience students really enjoy this material, and it makes the abstract ideas they have learned in previous chapters more concrete. Although we do consider this core material, it is not used in the rest of the book and so can be safely skipped if desired.
• Public-Key (Asymmetric) Cryptography: Chapter 8 gives a self-contained introduction to all the number theory needed for the remainder of the book. The material in Chapter 9 is not used subsequently; however, we do recommend at least covering Section 9.3 on recommended key lengths. The public-key revolution is described in Chapter 10. Ideally, all of Chapters 11 and 12 should be covered; those pressed for time can pick and choose appropriately.
We are typically able to cover most of the above in a one-semester (35-hour) undergraduate course (omitting some proofs and skipping some topics, as needed) or, with some changes to add more material on theoretical foundations, in the first three-quarters of a one-semester graduate course. Instructors with more time available can proceed at a more leisurely pace or incorporate additional topics, as discussed below.
Those wishing to cover additional material, in either a longer course or a faster-paced graduate course, will find that the book is structured to allow flexible incorporation of other topics as time permits (and depending on the interests of the instructor). Specifically, the starred (*) sections and chapters may be covered in any order, or skipped entirely, without affecting the overall flow of the book. We have taken care to ensure that none of the core material depends on any of the starred material and, for the most part, the starred sections do not depend on each other. (When they do, this dependence is explicitly noted.)
We suggest the following from among the starred topics for those wishing to give their course a particular flavor:
• Theory: A more theoretically inclined course could include material from Section 3.2.2 (semantic security); Chapter 7 (one-way functions and hard-core predicates, and constructing pseudorandom generators, functions, and permutations from one-way permutations); Section 8.4 (one-way functions and collision-resistant hash functions from number-theoretic assumptions); Section 11.5.3 (RSA encryption without random oracles); and Section 12.6 (signatures without random oracles).
• Mathematics: A course directed at students with a strong mathematics background—or being taught by someone who enjoys this aspect of cryptography—could incorporate Section 4.6 (information-theoretic MACs in finite fields); some of the more advanced number theory from Chapter 8 (e.g., the Chinese remainder theorem and the Miller–Rabin primality test); and all of Chapter 9.
In either case, a selection of advanced topics from Chapter 13 could also be included.
Feedback and Errata
Our goal in writing this book was to make modern cryptography accessible to a wide audience beyond the “theoretical computer science” community. We hope you will let us know if we have succeeded. The many enthusiastic emails we have received in response to our first edition have made the whole process of writing this book worthwhile.
We are always happy to receive feedback. We hope there are no errors or typos in the book; if you do find any, however, we would greatly appreciate it if you let us know. (A list of known errata will be maintained at http://www.cs.umd.edu/~jkatz/imc.html.) You can email your comments and errata to jkatz@cs.umd.edu and lindell@biu.ac.il; please put “Introduction to Modern Cryptography” in the subject line.
Acknowledgments
For the second edition: We are grateful to the many readers of the first edition who have sent us comments, suggestions, and corrections that helped to greatly improve the book. Discussions with Claude Crépeau, Bill Gasarch, Gene Itkis, Leonid Reyzin, Tom Shrimpton, and Salil Vadhan regarding the content and overall “philosophy” of the book were especially fruitful. We also thank Bar Alon, Gilad Asharov, Giuseppe Ateniese, Amir Azodi, Omer Berkman, Sergio de Biasi, Aurora Bristor, Richard Chang, Qingfeng Cheng, Kwan Tae Cho, Kyliah Clarkson, Ran Cohen, Nikolas Coukouma, Dana Dachman-Soled, Michael Fang, Michael Farcasin, Pooya Farshim, Marc Fischlin, Lance Fortnow, Michael Fuhr, Bill Gasarch, Virgil Gligor, Carmit Hazay, Andreas Hübner, Karst Koymans, Eyal Kushilevitz, Steve Lai, Ugo Dal Lago, Armand Makowski, Tal Malkin, Steve Myers, Naveen Nathan, Ariel Nof, Eran Omri, Ruy de Queiroz, Eli Quiroz, Tal Rabin, Charlie Rackoff, Yona Raekow, Tzachy Reinman, Wei Ren, Ben Riva, Volker Roth, Christian Schaffner, Joachim Schipper, Dominique Schröder, Randy Shull, Nigel Smart, Christoph Sprenger, Aravind Srinivasan, John Steinberger, Aishwarya Thiruvengadam, Dave Tuller, Poorvi Vora, Avishai Yanai, Rupeng Yang, Arkady Yerukhimovich, Dae Hyun Yum, Hila Zarosim, and Konstantin Ziegler for their helpful corrections to the first edition and/or early drafts of the second edition.
For the first edition: We thank Zoe Bermant for producing the figures; David Wagner for answering questions related to block ciphers and their cryptanalysis; and Salil Vadhan and Alon Rosen for experimenting with an early version of our text in an introductory course at Harvard University and for providing us with valuable feedback. We would also like to extend our gratitude to those who read and commented on earlier drafts of this book and to those who sent us corrections: Adam Bender, Chiu-Yuen Koo, Yair Dombb, Michael Fuhr, William Glenn, S. Dov Gordon, Carmit Hazay, Eyal Kushilevitz, Avivit Levy, Matthew Mah, Ryan Murphy, Steve Myers, Martin Paraskevov, Eli Quiroz, Jason Rogers, Rui Xue, Dicky Yan, Arkady Yerukhimovich, and Hila Zarosim. We are extremely grateful to all those who encouraged us to write this book and agreed with us that a book of this sort is badly needed.
Finally, we thank our wives and children for all their support and understanding during the many hours, days, months, and now years we have spent on this project.
Part I
Introduction and Classical Cryptography
Chapter 1 Introduction
1.1 Cryptography and Modern Cryptography
The Concise Oxford English Dictionary defines cryptography as “the art of writing or solving codes.” This is historically accurate, but does not capture the current breadth of the field or its present-day scientific foundations. The definition focuses solely on the codes that have been used for centuries to enable secret communication. But cryptography nowadays encompasses much more than this: it deals with mechanisms for ensuring integrity, techniques for exchanging secret keys, protocols for authenticating users, electronic auctions and elections, digital cash, and more. Without attempting to provide a complete characterization, we would say that modern cryptography involves the study of mathematical techniques for securing digital information, systems, and distributed computations against adversarial attacks.
The dictionary definition also refers to cryptography as an art. Until late in the 20th century cryptography was, indeed, largely an art. Constructing good codes, or breaking existing ones, relied on creativity and a developed sense of how codes work. There was little theory to rely on and, for a long time, no working definition of what constitutes a good code. Beginning in the 1970s and 1980s, this picture of cryptography radically changed. A rich theory began to emerge, enabling the rigorous study of cryptography as a science and a mathematical discipline. This perspective has, in turn, influenced how researchers think about the broader field of computer security.
Another very important difference between classical cryptography (say, before the 1980s) and modern cryptography relates to its adoption. Historically, the major consumers of cryptography were military organizations and governments. Today, cryptography is everywhere! If you have ever authenticated yourself by typing a password, purchased something by credit card over the Internet, or downloaded a verified update for your operating system, you have undoubtedly used cryptography. And, more and more, programmers with relatively little experience are being asked to “secure” the applications they write by incorporating cryptographic mechanisms.
In short, cryptography has gone from a heuristic set of tools concerned with ensuring secret communication for the military to a science that helps secure systems for ordinary people all across the globe. This also means that cryptography has become a more central topic within computer science.
Goals of this book. Our goal is to make the basic principles of modern cryptography accessible to students of computer science, electrical engineering, or mathematics; to professionals who want to incorporate cryptography in systems or software they are developing; and to anyone with a basic level of mathematical maturity who is interested in understanding this fascinating field. After completing this book, the reader should appreciate the security guarantees common cryptographic primitives are intended to provide; be aware of standard (secure) constructions of such primitives; and be able to perform a basic evaluation of new schemes based on their proofs of security (or lack thereof) and the mathematical assumptions underlying those proofs. It is not our intention for readers to become experts—or to be able to design new cryptosystems—after finishing this book, but we have attempted to provide the terminology and foundational material needed for the interested reader to subsequently study more advanced references in the area.
This chapter. The focus of this book is the formal study of modern cryptography, but we begin in this chapter with a more informal discussion of “classical” cryptography. Besides allowing us to ease into the material, our treatment in this chapter will also serve to motivate the more rigorous approach we will be taking in the rest of the book. Our intention here is not to be exhaustive and, as such, this chapter should not be taken as a representative historical account. The reader interested in the history of cryptography is invited to consult the references at the end of this chapter.
1.2 The Setting of Private-Key Encryption
Classical cryptography was concerned with designing and using codes (also called ciphers) that enable two parties to communicate secretly in the presence of an eavesdropper who can monitor all communication between them. In modern parlance, codes are called encryption schemes and that is the terminology we will use here. Security of all classical encryption schemes relied on a secret—a key—shared by the communicating parties in advance and unknown to the eavesdropper. This scenario is known as the private-key (or shared-/secret-key) setting, and private-key encryption is just one example of a cryptographic primitive used in this setting. Before describing some historical encryption schemes, we discuss private-key encryption more generally.
In the setting of private-key encryption, two parties share a key and use this key when they want to communicate secretly. One party can send a message, or plaintext, to the other by using the shared key to encrypt (or “scramble”) the message and thus obtain a ciphertext that is transmitted to the receiver. The receiver uses the same key to decrypt (or “unscramble”) the ciphertext and recover the original message. Note the same key is used to convert the plaintext into a ciphertext and back; that is why this is also known as the symmetric-key setting, where the symmetry lies in the fact that both parties hold the same key that is used for encryption and decryption. This is in contrast to asymmetric, or public-key, encryption (introduced in Chapter 10), where encryption and decryption use different keys.
FIGURE 1.1: One common setting of private-key cryptography (here, encryption): two parties share a key that they use to communicate securely.
As already noted, the goal of encryption is to keep the plaintext hidden from an eavesdropper who can monitor the communication channel and observe the ciphertext. We discuss this in more detail later in this chapter, and spend a great deal of time in Chapters 2 and 3 formally defining this goal.
There are two canonical applications of private-key cryptography. In the first, there are two distinct parties separated in space, e.g., a worker in New York communicating with her colleague in California; see Figure 1.1. These two users are assumed to have been able to securely share a key in advance of their communication. (Note that if one party simply sends the key to the other over the public communication channel, then the eavesdropper obtains the key too!) Often this is easy to accomplish by having the parties physically meet in a secure location to share a key before they separate; in the example just given, the co-workers might arrange to share a key when they are both in the New York office. In other cases, sharing a key securely is more difficult. For the next several chapters we simply assume that sharing a key is possible; we will revisit this issue in Chapter 10.
The second widespread application of private-key cryptography involves the same party communicating with itself over time. (See Figure 1.2.) Consider, e.g., disk encryption, where a user encrypts some plaintext and stores the resulting ciphertext on their hard drive; the same user will return at a later point in time to decrypt the ciphertext and recover the original data. The hard drive here serves as the communication channel on which an attacker might eavesdrop by gaining access to the hard drive and reading its contents. “Sharing” the key is now trivial, though the user still needs a secure and reliable way to remember/store the key for use at a later point in time.
FIGURE 1.2: Another common setting of private-key cryptography (again, encryption): a single user stores data securely over time.
The syntax of encryption. Formally, a private-key encryption scheme is defined by specifying a message space M along with three algorithms: a procedure for generating keys (Gen), a procedure for encrypting (Enc), and a procedure for decrypting (Dec). The message space M defines the set of “legal” messages, i.e., those supported by the scheme. The algorithms have the following functionality:
1. The key-generation algorithm Gen is a probabilistic algorithm that outputs a key k chosen according to some distribution.
2. The encryption algorithm Enc takes as input a key k and a message m and outputs a ciphertext c. We denote by $\mathsf{Enc}_k(m)$ the encryption of the plaintext m using the key k.
3. The decryption algorithm Dec takes as input a key k and a ciphertext c and outputs a plaintext m. We denote the decryption of the ciphertext c using the key k by $\mathsf{Dec}_k(c)$.
An encryption scheme must satisfy the following correctness requirement: for every key k output by Gen and every message m ∈ M, it holds that
$$\mathsf{Dec}_k(\mathsf{Enc}_k(m)) = m.$$
In words: encrypting a message and then decrypting the resulting ciphertext (using the same key) yields the original message.
The set of all possible keys output by the key-generation algorithm is called the key space and is denoted by K. Almost always, Gen simply chooses a uniform key from the key space; in fact, one can assume without loss of generality that this is the case (see Exercise 2.1).
Reviewing our earlier discussion, an encryption scheme can be used by two parties who wish to communicate as follows. First, Gen is run to obtain a key k that the parties share. Later, when one party wants to send a plaintext m to the other, she computes $c := \mathsf{Enc}_k(m)$ and sends the resulting ciphertext c over the public channel to the other party.1 Upon receiving c, the other party computes $m := \mathsf{Dec}_k(c)$ to recover the original plaintext.
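The syntax above can be mirrored in code. The following is a minimal sketch, assuming a hypothetical container `PrivateKeyScheme` that bundles the three algorithms as plain functions; none of these names come from the text. The helper `satisfies_correctness` only samples keys and messages, so it can refute the correctness requirement but cannot prove it for every key.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class PrivateKeyScheme:
    """Hypothetical container for the three algorithms (Gen, Enc, Dec)."""
    gen: Callable[[], Any]          # Gen: outputs a key k
    enc: Callable[[Any, Any], Any]  # Enc: (k, m) -> c
    dec: Callable[[Any, Any], Any]  # Dec: (k, c) -> m


def satisfies_correctness(scheme: PrivateKeyScheme,
                          messages: Iterable[Any],
                          trials: int = 20) -> bool:
    """Check Dec_k(Enc_k(m)) == m for `trials` sampled keys and the given messages."""
    messages = list(messages)
    for _ in range(trials):
        k = scheme.gen()  # the key both parties would share in advance
        for m in messages:
            c = scheme.enc(k, m)       # sender computes the ciphertext
            if scheme.dec(k, c) != m:  # receiver must recover m exactly
                return False
    return True
```

Any concrete scheme, such as the historical ciphers discussed in the next section, can be plugged into this container by supplying its three functions.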
Keys and Kerckhoffs’ principle. As is clear from the above, if an eavesdropping adversary knows the algorithm Dec as well as the key k shared by the two communicating parties, then that adversary will be able to decrypt any ciphertexts transmitted by those parties. It is for this reason that the communicating parties must share the key k securely and keep k completely secret from everyone else. Perhaps they should keep the decryption algorithm Dec secret, too? For that matter, might it not be better for them to keep all the details of the encryption scheme secret?
In the late 19th century, Auguste Kerckhoffs argued the opposite in a paper he wrote elucidating several design principles for military ciphers. One of the most important of these, now known simply as Kerckhoffs’ principle, was:
The cipher method must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience.
That is, an encryption scheme should be designed to be secure even if an eavesdropper knows all the details of the scheme, so long as the attacker doesn’t know the key being used. Stated differently, security should not rely on the encryption scheme being secret; instead, Kerckhoffs’ principle demands that security rely solely on secrecy of the key.
There are three primary arguments in favor of Kerckhoffs’ principle. The first is that it is significantly easier for the parties to maintain secrecy of a short key than to keep secret the (more complicated) algorithm they are using. This is especially true if we imagine using encryption to secure the communication between all pairs of employees in some organization. Unless each pair of parties uses their own, unique algorithm, some parties will know the algorithm used by others. Information about the encryption algorithm might be leaked by one of these employees (say, after being fired), or obtained by an attacker using reverse engineering. In short, it is simply unrealistic to assume that the encryption algorithm will remain secret.
1 We use “:=” to denote deterministic assignment, and assume for now that Enc is deterministic. A list of common notation can be found in the back of the book.
Second, in case the honest parties’ shared, secret information is ever exposed, it will be much easier for them to change a key than to replace an encryption scheme. (Consider updating a file versus installing a new program.) Moreover, it is relatively trivial to generate a new random secret, whereas it would be a huge undertaking to design a new encryption scheme.
Finally, for large-scale deployment it is significantly easier for users to all rely on the same encryption algorithm/software (with different keys) than for everyone to use their own custom algorithm. (This is true even for a single user who is communicating with several different parties.) In fact, it is desirable for encryption schemes to be standardized so that (1) compatibility is ensured by default and (2) users will utilize an encryption scheme that has undergone public scrutiny and in which no weaknesses have been found.
Nowadays Kerckhoffs’ principle is understood as advocating that cryptographic designs be made completely public, in stark contrast to the notion of “security by obscurity” which suggests that keeping algorithms secret improves security. It is very dangerous to use a proprietary, “home-brewed” algorithm (i.e., a non-standardized algorithm designed in secret by some company). In contrast, published designs undergo public review and are therefore likely to be stronger. Many years of experience have demonstrated that it is very difficult to construct good cryptographic schemes. Therefore, our confidence in the security of a scheme is much higher if it has been extensively studied (by experts other than the designers of the scheme) and no weaknesses have been found. As simple and obvious as it may sound, the principle of open cryptographic design (i.e., Kerckhoffs’ principle) has been ignored over and over again with disastrous results. Fortunately, today there are enough secure, standardized, and widely available cryptosystems that there is no reason to use anything else.
1.3 Historical Ciphers and Their Cryptanalysis
In our study of “classical” cryptography we will examine some historical encryption schemes and show that they are insecure. Our main aims in presenting this material are (1) to highlight the weaknesses of an “ad hoc” approach to cryptography, and thus motivate the modern, rigorous approach that will be taken in the rest of the book, and (2) to demonstrate that simple approaches to achieving secure encryption are unlikely to succeed. Along the way, we will present some central principles of cryptography inspired by the weaknesses of these historical schemes.
In this section, plaintext characters are written in lower case and ciphertext characters are written in UPPER CASE for typographical clarity.
Caesar’s cipher. One of the oldest recorded ciphers, known as Caesar’s cipher, is described in De Vita Caesarum, Divus Iulius (“The Lives of the Caesars, the Deified Julius”), written in approximately 110 CE:
There are also letters of his to Cicero, as well as to his intimates on private affairs, and in the latter, if he had anything confidential to say, he wrote it in cipher, that is, by so changing the order of the letters of the alphabet, that not a word could be made out. . .
Julius Caesar encrypted by shifting the letters of the alphabet 3 places forward: a was replaced with D, b with E, and so on. At the very end of the alphabet, the letters wrap around and so z was replaced with C, y with B, and x with A. For example, encryption of the message begin the attack now, with spaces removed, gives:
EHJLQWKHDWWDFNQRZ.
An immediate problem with this cipher is that the encryption method is fixed; there is no key. Thus, anyone learning how Caesar encrypted his messages would be able to decrypt effortlessly.
Interestingly, a variant of this cipher called ROT-13 (where the shift is 13 places instead of 3) is still used nowadays in various online forums. It is understood that this does not provide any cryptographic security; it is used merely to ensure that the text (say, a movie spoiler) is unintelligible unless the reader of a message consciously chooses to decrypt it.
The shift cipher and the sufficient key-space principle. The shift cipher can be viewed as a keyed variant of Caesar’s cipher.2 Specifically, in the shift cipher the key k is a number between 0 and 25. To encrypt, letters are shifted as in Caesar’s cipher, but now by k places. Mapping this to the syntax of encryption described earlier, the message space consists of arbitrary length strings of English letters with punctuation, spaces, and numerals removed, and with no distinction between upper and lower case. Algorithm Gen outputs a uniform key k ∈ {0, . . . , 25}; algorithm Enc takes a key k and a plaintext and shifts each letter of the plaintext forward k positions (wrapping around at the end of the alphabet); and algorithm Dec takes a key k and a ciphertext and shifts every letter of the ciphertext backward k positions.
A more mathematical description is obtained by equating the English alphabet with the set {0,…,25} (so a = 0, b = 1, etc.). The message space M is then any finite sequence of integers from this set. Encryption of the message $m = m_1 \cdots m_\ell$ (where $m_i \in \{0,\ldots,25\}$) using key k is given by
$$\mathsf{Enc}_k(m_1 \cdots m_\ell) = c_1 \cdots c_\ell, \quad \text{where } c_i = [(m_i + k) \bmod 26].$$
(The notation [a mod N] denotes the remainder of a upon division by N, with 0 ≤ [a mod N] < N. We refer to the process mapping a to [a mod N] as reduction modulo N; we will have more to say about this beginning in Chapter 8.) Decryption of a ciphertext $c = c_1 \cdots c_\ell$ using key k is given by
$$\mathsf{Dec}_k(c_1 \cdots c_\ell) = m_1 \cdots m_\ell, \quad \text{where } m_i = [(c_i - k) \bmod 26].$$
2 In some books, “Caesar’s cipher” and “shift cipher” are used interchangeably.
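The shift cipher's three algorithms translate almost directly into code. The sketch below assumes the plaintext has already been reduced to lower-case English letters; the function names `gen`, `enc`, and `dec` are illustrative choices that follow the Gen/Enc/Dec syntax, and the use of Python's `secrets` module for the uniform key is our own assumption.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase  # a = 0, b = 1, ..., z = 25


def gen() -> int:
    """Gen: output a uniform key k in {0, ..., 25}."""
    return secrets.randbelow(26)


def enc(k: int, plaintext: str) -> str:
    """Enc: c_i = [(m_i + k) mod 26]; the ciphertext is written in UPPER CASE."""
    shifted = (ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in plaintext)
    return "".join(shifted).upper()


def dec(k: int, ciphertext: str) -> str:
    """Dec: m_i = [(c_i - k) mod 26]; the plaintext is written in lower case."""
    shifted = (ALPHABET[(ALPHABET.index(ch) - k) % 26] for ch in ciphertext.lower())
    return "".join(shifted)
```

With k = 3 this reproduces Caesar's cipher: `enc(3, "begintheattacknow")` yields the ciphertext EHJLQWKHDWWDFNQRZ given earlier. Python's `%` operator already returns a value in {0, ..., 25} for negative operands, which matches the [a mod N] convention.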
Is the shift cipher secure? Before reading on, try to decrypt the following ciphertext that was generated using the shift cipher and a secret key k:
OVDTHUFWVZZPISLRLFZHYLAOLYL.
Is it possible to recover the message without knowing k? Actually, it is trivial! The reason is that there are only 26 possible keys. So one can try to decrypt the ciphertext using every possible key and thereby obtain a list of 26 candidate plaintexts. The correct plaintext will certainly be on this list; moreover, if the ciphertext is “long enough” then the correct plaintext will likely be the only candidate on the list that “makes sense.” (The latter is not necessarily true, but will be true most of the time. Even when it is not, the attack narrows down the set of potential plaintexts to at most 26 possibilities.) By scanning the list of candidates it is easy to recover the original plaintext.
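The attack just described is short enough to write out in full. This is a sketch under the same conventions as before (upper-case ciphertext, lower-case plaintext); `shift_decrypt` and `brute_force` are hypothetical names. The code only produces the 26 candidates; deciding which one “makes sense” is left to the reader.

```python
import string
from typing import List, Tuple

ALPHABET = string.ascii_lowercase


def shift_decrypt(k: int, ciphertext: str) -> str:
    """Shift every ciphertext letter backward by k positions."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - k) % 26] for ch in ciphertext.lower())


def brute_force(ciphertext: str) -> List[Tuple[int, str]]:
    """Return all 26 (key, candidate plaintext) pairs."""
    return [(k, shift_decrypt(k, ciphertext)) for k in range(26)]


if __name__ == "__main__":
    for k, candidate in brute_force("OVDTHUFWVZZPISLRLFZHYLAOLYL"):
        print(f"{k:2d}  {candidate}")
```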
An attack that involves trying every possible key is called a brute-force or exhaustive-search attack. Clearly, for an encryption scheme to be secure it must not be vulnerable to such an attack.3 This observation is known as the sufficient key-space principle:
Any secure encryption scheme must have a key space that is sufficiently large to make an exhaustive-search attack infeasible.
One can debate what amount of effort makes a task “infeasible,” and an exact determination of feasibility depends on both the resources of a potential attacker and the length of time the sender and receiver want to ensure secrecy of their communication. Nowadays, attackers can use supercomputers, tens of thousands of personal computers, or graphics processing units (GPUs) to speed up brute-force attacks. To protect against such attacks the key space must therefore be very large—say, of size at least $2^{70}$, and even larger if one is concerned about long-term security against a well-funded attacker.
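To get a feeling for these sizes, one can estimate how long an exhaustive search would take. The sketch below is a back-of-the-envelope calculation only: the rate of $10^9$ keys per second is a hypothetical figure chosen for illustration, not a claim about any real attacker, and `years_to_search` is our own helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search(key_bits: int, keys_per_second: float) -> float:
    """Years needed to try all 2**key_bits keys at the given (assumed) rate."""
    return 2 ** key_bits / keys_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1e9  # hypothetical: one billion keys per second
    for bits in (40, 56, 70, 128):
        print(f"2^{bits} keys: about {years_to_search(bits, rate):.3g} years")
```

At the assumed rate, a key space of size $2^{70}$ takes tens of thousands of years to exhaust on a single machine; an attacker with many machines in parallel divides that figure accordingly, which is why larger key spaces are advisable for long-term security.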
The sufficient key-space principle gives a necessary condition for security, but not a sufficient one. The next example demonstrates this.
3 Technically, this is only true if the message space is larger than the key space; we will return to this point in Chapter 2. Encryption schemes used in practice have this property.
The mono-alphabetic substitution cipher. In the shift cipher, the key defines a map from each letter of the (plaintext) alphabet to some letter of the (ciphertext) alphabet, where the map is a fixed shift determined by the key. In the mono-alphabetic substitution cipher, the key also defines a map on the alphabet, but the map is now allowed to be arbitrary subject only to the constraint that it be one-to-one so that decryption is possible. The key space thus consists of all bijections, or permutations, of the alphabet. So, for example, the key that defines the following permutation
abcdefghijklmnopqrstuvwxyz
XEUADNBKVMROCQFSYHWGLZIJPT
(in which a maps to X, etc.) would encrypt the message tellhimaboutme to GDOOKVCXEFLGCD. The name of this cipher comes from the fact that the key defines a (fixed) substitution for individual characters of the plaintext.
Assuming the English alphabet is being used, the key space is of size 26! = 26 · 25 · 24 · · · 2 · 1, or approximately $2^{88}$, and a brute-force attack is infeasible. This, however, does not mean the cipher is secure! In fact, as we will show next, it is easy to break this scheme even though it has a large key space.
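A short sketch of the mono-alphabetic substitution cipher, together with a check of the key-space estimate, is given below. We assume the key is represented as a 26-character upper-case string whose ith character is the image of the ith plaintext letter, as in the permutation displayed above; the function names and the use of `random.SystemRandom` for key generation are our own choices.

```python
import math
import random
import string

ALPHABET = string.ascii_lowercase


def gen() -> str:
    """Gen: a uniform permutation of the alphabet, as 26 upper-case letters."""
    letters = list(ALPHABET.upper())
    random.SystemRandom().shuffle(letters)
    return "".join(letters)


def enc(key: str, plaintext: str) -> str:
    """Replace each plaintext letter by its image under the permutation."""
    return plaintext.translate(str.maketrans(ALPHABET, key))


def dec(key: str, ciphertext: str) -> str:
    """Invert the permutation letter by letter."""
    return ciphertext.translate(str.maketrans(key, ALPHABET))


# log2(26!) gives the key-space size in bits.
KEY_SPACE_BITS = math.log2(math.factorial(26))

if __name__ == "__main__":
    key = "XEUADNBKVMROCQFSYHWGLZIJPT"
    print(enc(key, "tellhimaboutme"))
    print(f"log2(26!) is about {KEY_SPACE_BITS:.1f}")
```

The computed value of log2(26!) is slightly above 88, in line with the estimate in the text.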
Assume English-language text is being encrypted (i.e., the text is grammatically correct English writing, not just text written using characters of the English alphabet). The mono-alphabetic substitution cipher can then be attacked by utilizing statistical patterns of the English language. (Of course, the same attack works for any language.) The attack relies on the facts that:
1. For any key, the mapping of each letter is fixed, and so if e is mapped to D, then every appearance of e in the plaintext will result in the appearance of D in the ciphertext.
2. The frequency distribution of individual letters in the English language is known (see Figure 1.3). Of course, very short texts may deviate from this distribution, but even texts consisting of only a few sentences tend to have distributions that are very close to the average.
FIGURE 1.3: Average letter frequencies for English-language text.
The attack works by tabulating the frequency distribution of characters in the ciphertext, i.e., recording that A appeared 11 times, B appeared 4 times, and so on. These frequencies are then compared to the known letter frequencies of normal English text. One can then guess parts of the mapping defined by the key based on the observed frequencies. For example, since e is the most frequent letter in English, one can guess that the most frequent character in the ciphertext corresponds to the plaintext character e, and so on. Some of the guesses may be wrong, but enough of the guesses will be correct to enable relatively quick decryption (especially utilizing other knowledge of English, such as the fact that u generally follows q, and that h is likely to appear between t and e). We conclude that although the mono-alphabetic substitution cipher has a large key space, it is still insecure.
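The first step of this attack, tabulating ciphertext frequencies and proposing a tentative mapping, can be sketched as follows. The ordering in `ENGLISH_BY_FREQUENCY` is a commonly cited approximate ranking of English letters and is an assumption on our part, since the values of Figure 1.3 are not reproduced here; `initial_guess` is a hypothetical helper whose output is only a starting point that must be corrected by hand.

```python
from collections import Counter
from typing import Dict

# Assumed approximate ranking of English letters, most frequent first.
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def letter_counts(ciphertext: str) -> Counter:
    """Count how often each letter occurs in the ciphertext."""
    return Counter(ch for ch in ciphertext if ch.isalpha())


def initial_guess(ciphertext: str) -> Dict[str, str]:
    """Map the most frequent ciphertext letter to e, the next to t, and so on."""
    ranked = [ch for ch, _ in letter_counts(ciphertext).most_common()]
    return dict(zip(ranked, ENGLISH_BY_FREQUENCY))


def apply_guess(ciphertext: str, guess: Dict[str, str]) -> str:
    """Partially decrypt: unknown letters are shown as '?'."""
    return "".join(guess.get(ch, "?") for ch in ciphertext)
```

In practice one would inspect `apply_guess(c, initial_guess(c))`, spot likely words, and then swap entries of the guessed mapping until the text reads as English.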
It should not be surprising that the mono-alphabetic substitution cipher can be quickly broken, since puzzles based on this cipher appear in newspapers (and are solved by some people before their morning coffee!). We recommend that you try to decipher the following ciphertext—this should convince you how easy the attack is to carry out. (Use Figure 1.3 to help you.)
JGRMQOYGHMVBJWRWQFPWHGFFDQGFPFZRKBEEBJIZQQOCIBZKLFAFGQVFZFWWE OGWOPFGFHWOLPHLRLOLFDMFGQWBLWBWQOLKFWBYLBLYLFSFLJGRMQBOLWJVFP FWQVHQWFFPQOQVFPQOCFPOGFWFJIGFQVHLHLROQVFGWJVFPFOLFHGQVQVFILE OGQILHQFQGIQVVOSFAFGBWQVHQWIJVWJVFPFWHGFIWIHZZRQGBABHZQOCGFHX
An improved attack on the shift cipher. We can use letter-frequency tables to give an improved attack on the shift cipher. Our previous attack on the shift cipher required decrypting the ciphertext using each possible key, and then checking which key results in a plaintext that “makes sense.” A drawback of this approach is that it is somewhat difficult to automate, since it is difficult for a computer to check whether a given plaintext “makes sense.” (We do not claim that it would be impossible, as the attack could be automated using a dictionary of valid English words. We only claim that it would not be trivial to automate.) Moreover, there may be cases—we will see one later—where the plaintext characters are distributed just like English-language text even though the plaintext itself is not valid English, in which case checking for a plaintext that “makes sense” will not work.
We now describe an attack that does not suffer from these drawbacks. As before, associate the letters of the English alphabet with 0, . . . , 25. Let $p_i$, with $0 \le p_i \le 1$, denote the frequency of the ith letter in normal English text (ignoring spaces, punctuation, etc.). Calculation using Figure 1.3 gives
$$\sum_{i=0}^{25} p_i^2 \approx 0.065. \tag{1.1}$$
Now, say we are given some ciphertext and let $q_i$ denote the frequency of the ith letter of the alphabet in this ciphertext; i.e., $q_i$ is simply the number of occurrences of the ith letter of the alphabet in the ciphertext divided by the length of the ciphertext. If the key is k, then $p_i$ should be roughly equal to $q_{i+k}$ for all i, because the ith letter is mapped to the (i + k)th letter. (We use i + k instead of the more cumbersome [i + k mod 26].) Thus, if we compute
$$I_j \stackrel{\text{def}}{=} \sum_{i=0}^{25} p_i \cdot q_{i+j}$$
for each value of j ∈ {0,...,25}, then we expect to find that $I_k \approx 0.065$ (where k is the actual key), whereas $I_j$ for $j \ne k$ will be different from 0.065. This leads to a key-recovery attack that is easy to automate: compute $I_j$ for all j, and then output the value k for which $I_k$ is closest to 0.065.
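The key-recovery attack can be sketched in a few lines. The table `ENGLISH_FREQ` below holds approximate, commonly cited English letter frequencies; these values are an assumption standing in for Figure 1.3, and were chosen so that the sum of their squares is close to 0.065 as in Equation (1.1). The function names are illustrative.

```python
from typing import List

# Assumed approximate frequencies p_0, ..., p_25 for a, ..., z.
ENGLISH_FREQ = [
    0.082, 0.015, 0.028, 0.042, 0.127, 0.022, 0.020, 0.061, 0.070,  # a-i
    0.001, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,  # j-r
    0.063, 0.090, 0.028, 0.010, 0.024, 0.002, 0.020, 0.001,         # s-z
]


def letter_frequencies(ciphertext: str) -> List[float]:
    """Return q_0, ..., q_25: the relative frequency of each letter."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    if not letters:
        raise ValueError("ciphertext contains no letters")
    return [letters.count(chr(ord("a") + i)) / len(letters) for i in range(26)]


def recover_shift_key(ciphertext: str) -> int:
    """Compute I_j for all j and return the j whose I_j is closest to 0.065."""
    q = letter_frequencies(ciphertext)
    scores = [
        sum(ENGLISH_FREQ[i] * q[(i + j) % 26] for i in range(26))
        for j in range(26)
    ]
    return min(range(26), key=lambda j: abs(scores[j] - 0.065))
```

As the text cautions for frequency-based methods in general, the result is only reliable when the ciphertext is long enough for its letter frequencies to resemble those of typical English.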
The Vigenère (poly-alphabetic shift) cipher. The statistical attack on the mono-alphabetic substitution cipher can be carried out because the key defines a fixed mapping that is applied letter-by-letter to the plaintext. Such an attack could be thwarted by using a poly-alphabetic substitution cipher where the key instead defines a mapping that is applied on blocks of plaintext characters. Here, for example, a key might map the 2-character block ab to DZ while mapping ac to TY; note that the plaintext character a does not get mapped to a fixed ciphertext character. Poly-alphabetic substitution ciphers “smooth out” the frequency distribution of characters in the ciphertext and make it harder to perform statistical analysis.
The Vigenère cipher, a special case of the above also called the poly-alphabetic shift cipher, works by applying several independent instances of the shift cipher in sequence. The key is now viewed as a string of letters; encryption is done by shifting each plaintext character by the amount indicated by the next character of the key, wrapping around in the key when necessary. (This degenerates to the shift cipher if the key has length 1.) For example, encryption of the message tellhimaboutme using the key cafe would work as follows:
Plaintext:      tellhimaboutme
Key (repeated): cafecafecafeca
Ciphertext:     VEQPJIREDOZXOE
(The key need not be an English word.) This is exactly the same as encrypting the first, fifth, ninth, . . . characters with the shift cipher and key c; the second, sixth, tenth, . . . characters with key a; the third, seventh, . . . characters with f; and the fourth, eighth, . . . characters with e. Notice that in the above example l is mapped once to Q and once to P. Furthermore, the ciphertext character E is sometimes obtained from e and sometimes from a. Thus, the character frequencies of the ciphertext are “smoothed out,” as desired.
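A minimal sketch of the Vigenère cipher follows, assuming lower-case plaintext and key and upper-case ciphertext as elsewhere in this section; `enc` and `dec` are our own names. The key is repeated with `itertools.cycle`, which implements the “wrapping around in the key” described above.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_lowercase


def enc(key: str, plaintext: str) -> str:
    """Shift each plaintext letter by the amount given by the next key letter."""
    return "".join(
        ALPHABET[(ALPHABET.index(m) + ALPHABET.index(k)) % 26]
        for m, k in zip(plaintext, cycle(key))
    ).upper()


def dec(key: str, ciphertext: str) -> str:
    """Undo the shifts, again cycling through the key."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext.lower(), cycle(key))
    )
```

Calling `enc("cafe", "tellhimaboutme")` reproduces the ciphertext VEQPJIREDOZXOE from the example, and a key of length 1 gives exactly the shift cipher.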
If the key is sufficiently long, cracking this cipher appears daunting. Indeed, it had been considered by many to be “unbreakable,” and although it was invented in the 16th century, a systematic attack on the scheme was only devised hundreds of years later.
Attacking the Vigenère cipher. A first observation in attacking the Vigenère cipher is that if the length of the key is known then attacking the cipher is relatively easy. Specifically, say the length of the key, also called the period, is t. Write the key k as $k = k_1 \cdots k_t$ where each $k_i$ is a letter of the alphabet. An observed ciphertext $c = c_1 c_2 \cdots$ can be divided into t parts where each part can be viewed as having been encrypted using a shift cipher. Specifically, for all j ∈ {1, . . . , t} the ciphertext characters
$$c_j, c_{j+t}, c_{j+2t}, \ldots$$
all resulted from shifting the corresponding characters of the plaintext by $k_j$ positions. We refer to the above sequence of characters as the jth stream. All that remains is to determine, for each of the t streams, which of the 26 possible shifts was used. This is not as trivial as in the case of the shift cipher, because it is no longer possible to simply try different shifts in an attempt to determine when decryption of a stream “makes sense.” (Recall that a stream does not correspond to consecutive letters of the plaintext.) Furthermore, trying to guess the entire key k at once would require a brute-force search through $26^t$ different possibilities, which is infeasible for large t. Nevertheless, we can still use letter-frequency analysis to analyze each stream independently. Namely, for each stream we tabulate the frequency of each ciphertext character and then check which of the 26 possible shifts yields the “right” probability distribution for that stream. Since this can be carried out independently for each stream (i.e., for each character of the key), this attack takes time $26 \cdot t$ rather than time $26^t$.
A more principled, easier-to-automate approach is to use the improved method for attacking the shift cipher discussed earlier. That attack did not rely on checking for a plaintext that “made sense,” but only relied on the underlying frequency distribution of characters in the plaintext.
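The stream decomposition is easy to express in code. In the sketch below, `streams` splits the ciphertext into its t streams (indexed from 0 rather than 1), and `recover_vigenere_key` applies a shift-cipher key-recovery routine to each stream independently. The routine itself is passed in as the parameter `recover_shift`; this is a hypothetical interface, and any frequency-based method for the shift cipher could be supplied.

```python
from typing import Callable, List


def streams(ciphertext: str, t: int) -> List[str]:
    """Stream j consists of the characters c_j, c_{j+t}, c_{j+2t}, ..."""
    return [ciphertext[j::t] for j in range(t)]


def recover_vigenere_key(ciphertext: str,
                         t: int,
                         recover_shift: Callable[[str], int]) -> str:
    """Recover each key letter separately: about 26 * t work instead of 26**t."""
    shifts = [recover_shift(stream) for stream in streams(ciphertext, t)]
    return "".join(chr(ord("a") + k) for k in shifts)
```

Because each stream is handled on its own, the total work grows linearly in t, as noted above. The quality of the result depends entirely on how well `recover_shift` performs on streams that may be much shorter than the full ciphertext.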
Either of the above approaches gives a successful attack when the key length is known. What if the key length is unknown?
Note first that as long as the maximum length T of the key is not too large, we can simply repeat the above attack T times (for each possible value t ∈ {1, . . . , T }). This leads to at most T different candidate plaintexts, among which the true plaintext will likely be easy to identify. So an unknown key length is not a serious obstacle.
There are also more efficient ways to determine the key length from an observed ciphertext. One is to use Kasiski’s method, published in the mid-19th century. The first step here is to identify repeated patterns of length 2 or 3 in the ciphertext. These are likely the result of certain bigrams or trigrams that appear frequently in the plaintext. For example, consider the common word “the.” This word will be mapped to different ciphertext characters, depending on its position in the plaintext. However, if it appears twice in the same relative position, then it will be mapped to the same ciphertext characters. For a sufficiently long plaintext, there is thus a good chance that “the” will be mapped repeatedly to the same ciphertext characters.
Consider the following concrete example with the key beads (spaces have been added for clarity):
Plaintext:  the man and the woman retrieved the letter from the post office
Key:        bea dsb ead sbe adsbe adsbeadsb ead sbeads bead sbe adsb eadsbe
Ciphertext: ULE PSO ENG LII WREBR RHLSMEYWE XHH DFXTHJ GVOP LII PRKU SFIADI
The word the is mapped sometimes to ULE, sometimes to LII, and sometimes to XHH. However, it is mapped twice to LII, and in a long enough text it is likely that it would be mapped multiple times to each of these possibilities. Kasiski’s observation was that the distance between such repeated appearances (assuming they are not coincidental) must be a multiple of the period. (In the above example, the period is 5 and the distance between the two appearances of LII is 30, which is 6 times the period.) Therefore, the greatest common divisor of the distances between repeated sequences (assuming they are not coincidental) will yield the key length t or a multiple thereof.
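Kasiski's method can be sketched as follows. The function `repeated_distances` records, for every pattern of the given length that occurs more than once, the distances between consecutive occurrences; `kasiski_estimate` then takes the greatest common divisor of those distances. Both names are our own, and the sketch makes no attempt to filter out coincidental repetitions, which in a real ciphertext can drive the gcd down to 1.

```python
from collections import defaultdict
from functools import reduce
from math import gcd
from typing import List, Optional


def repeated_distances(ciphertext: str, length: int = 3) -> List[int]:
    """Distances between consecutive occurrences of each repeated pattern."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for pos in positions.values():
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances


def kasiski_estimate(ciphertext: str, length: int = 3) -> Optional[int]:
    """gcd of all distances: the key length t or a multiple of it (if not coincidental)."""
    distances = repeated_distances(ciphertext, length)
    return reduce(gcd, distances) if distances else None
```

Applied to the example ciphertext above (with spaces removed), the only repeated trigram is LII, so the estimate is 30, a multiple of the true period 5. More repetitions from a longer ciphertext would be needed to narrow this down.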
An alternative approach, called the index of coincidence method, is more methodical and hence easier to automate. Recall that if the key length is t, then the ciphertext characters
$$c_1, c_{1+t}, c_{1+2t}, \ldots$$
in the first stream all resulted from encryption using the same shift. This means that the frequencies of the characters in this sequence are expected to be identical to the character frequencies of standard English text in some shifted order. In more detail: let $q_i$ denote the observed frequency of the ith English letter in this stream; this is simply the number of occurrences of the ith letter of the alphabet divided by the total number of letters in the stream. If the shift used here is j (i.e., if the first character $k_1$ of the key is equal to j), then for all i we expect $q_{i+j} \approx p_i$, where $p_i$ is the frequency of the ith letter of the alphabet in standard English text. (Once again, we use $q_{i+j}$ in place of $q_{[i+j \bmod 26]}$.) But this means that the sequence $q_0, \ldots, q_{25}$ is just the sequence $p_0, \ldots, p_{25}$ shifted j places. As a consequence (cf. Equation (1.1)):
$$\sum_{i=0}^{25} q_i^2 \approx \sum_{i=0}^{25} p_i^2 \approx 0.065.$$
This leads to a nice way to determine the key length t. For τ = 1, 2, . . ., look at the sequence of ciphertext characters $c_1, c_{1+\tau}, c_{1+2\tau}, \ldots$ and tabulate $q_0, \ldots, q_{25}$ for this sequence. Then compute
$$S_\tau \stackrel{\text{def}}{=} \sum_{i=0}^{25} q_i^2.$$
When τ = t we expect $S_\tau \approx 0.065$, as discussed above. On the other hand, if τ is not a multiple of t we expect that all characters will occur with roughly equal probability in the sequence $c_1, c_{1+\tau}, c_{1+2\tau}, \ldots$, and so we expect $q_i \approx 1/26$ for all i. In this case we will obtain
$$S_\tau \approx \sum_{i=0}^{25} \left(\frac{1}{26}\right)^2 \approx 0.038.$$
The smallest value of τ for which $S_\tau \approx 0.065$ is thus likely the key length. One can further validate a guess τ by carrying out a similar calculation using the second stream $c_2, c_{2+\tau}, c_{2+2\tau}, \ldots$, etc.
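The index of coincidence method can be sketched as below. The function `s_tau` computes $S_\tau$ for the first stream, and `guess_key_length` returns the smallest τ whose value clears a cutoff. The cutoff `threshold=0.06` is a hypothetical tuning parameter of ours, standing in for the informal test “$S_\tau \approx 0.065$”; it is not prescribed by the method, and a different value may work better for other ciphertexts.

```python
from collections import Counter
from typing import Optional


def sum_of_squares(sequence: str) -> float:
    """Sum of q_i^2, where q_i is the relative frequency of the ith letter."""
    if not sequence:
        return 0.0
    n = len(sequence)
    return sum((count / n) ** 2 for count in Counter(sequence).values())


def s_tau(ciphertext: str, tau: int) -> float:
    """S_tau for the sequence c_1, c_{1+tau}, c_{1+2tau}, ... (0-based slice)."""
    return sum_of_squares(ciphertext[::tau])


def guess_key_length(ciphertext: str,
                     max_tau: int,
                     threshold: float = 0.06) -> Optional[int]:
    """Smallest tau with S_tau at or above the (assumed) threshold, if any."""
    for tau in range(1, max_tau + 1):
        if s_tau(ciphertext, tau) >= threshold:
            return tau
    return None
```

A guess returned by this routine can be cross-checked by evaluating `sum_of_squares(ciphertext[1::tau])` for the second stream, as suggested in the text.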
Ciphertext length and cryptanalytic attacks. The above attacks on the Vigenère cipher require a longer ciphertext than the attacks on previous schemes. For example, the index of coincidence method requires $c_1, c_{1+t}, c_{1+2t}, \ldots$ (where t is the actual key length) to be sufficiently long in order to ensure that the observed frequencies match what is expected; the ciphertext itself must then be roughly t times larger. Similarly, the attack we showed on the mono-alphabetic substitution cipher requires a longer ciphertext than the attack on the shift cipher (which can work for encryptions of even a single word). This illustrates that a longer key can, in general, require the cryptanalyst to obtain more ciphertext in order to carry out an attack. (Indeed, the Vigenère cipher can be shown to be secure if the key is as long as what is being encrypted. We will see a similar phenomenon in the next chapter.)
Conclusions. We have presented only a few historical ciphers. Beyond their historical interest, our aim in presenting them was to illustrate some important lessons. Perhaps the most important is that designing secure ciphers is hard. The Vigenère cipher remained unbroken for a long time. Far more complex schemes have also been used. But a complex scheme is not necessarily secure, and all historical schemes have been broken.
1.4 Principles of Modern Cryptography
As should be clear from the previous section, cryptography was historically more of an art than a science. Schemes were designed in an ad hoc manner and evaluated based on their perceived complexity or cleverness. A scheme would be analyzed to see if any attacks could be found; if so, the scheme would be “patched” to thwart that attack, and the process repeated. Although there may have been agreement that some schemes were not secure (as evidenced by an especially damaging attack), there was no agreed-upon notion of what requirements a “secure” scheme should satisfy, and no way to give evidence that any specific scheme was secure.
Over the past several decades, cryptography has developed into more of a science. Schemes are now developed and analyzed in a more systematic manner, with the ultimate goal being to give a rigorous proof that a given construction is secure. In order to articulate such proofs, we first need formal definitions that pin down exactly what “secure” means; such definitions are useful and interesting in their own right. As it turns out, most cryptographic proofs rely on currently unproven assumptions about the algorithmic hardness of certain mathematical problems; any such assumptions must be made explicit and be stated precisely. An emphasis on definitions, assumptions, and proofs distinguishes modern cryptography from classical cryptography; we discuss these three principles in greater detail in the following sections.
1.4.1 Principle 1 – Formal Definitions
One of the key contributions of modern cryptography has been the recognition that formal definitions of security are essential for the proper design, study, evaluation, and usage of cryptographic primitives. Put bluntly:
If you don’t understand what you want to achieve, how can you possibly know when (or if) you have achieved it?
Formal definitions provide such understanding by giving a clear description of what threats are in scope and what security guarantees are desired. As such, definitions can help guide the design of cryptographic schemes. Indeed, it is much better to formalize what is required before the design process begins, rather than to come up with a definition post facto once the design is complete. The latter approach risks having the design phase end when the designers’ patience is exhausted (rather than when the goal has been met), or may result in a construction achieving more than is needed at the expense of efficiency.
Definitions also offer a way to evaluate and analyze what is constructed. With a definition in place, one can study a proposed scheme to see if it achieves the desired guarantees; in some cases, one can even prove a given construction secure (see Section 1.4.3) by showing that it meets the definition. On the flip side, definitions can be used to conclusively show that a given scheme is not secure, insofar as the scheme does not satisfy the definition. In particular, note that the attacks in the previous section do not automatically demonstrate that any of the schemes shown there is “insecure.” For example, the attack on the Vigenère cipher assumed that sufficiently long English text was being encrypted, but could the Vigenère cipher be “secure” if short English text, or compressed text (which will have roughly uniform letter frequencies), is encrypted? It is hard to say without a formal definition in place.
Definitions enable a meaningful comparison of schemes. As we will see, there can be multiple (valid) ways to define security; the “right” one depends on the context in which a scheme is used. A scheme satisfying a weaker definition may be more efficient than another scheme satisfying a stronger definition; with precise definitions we can properly evaluate the trade-offs between the two schemes. Along the same lines, definitions enable secure usage of schemes. Consider the question of deciding which encryption scheme
to use for some larger application. A sound way to approach the problem is to first understand what notion of security is required for that application, and then find an encryption scheme satisfying that notion. A side benefit of this approach is modularity: a designer can “swap out” one encryption scheme and replace it with another (that also satisfies the necessary definition of security) without having to worry about affecting security of the overall application.
Writing a formal definition forces one to think about what is essential to the problem at hand and what properties are extraneous. Going through the process often reveals subtleties of the problem that were not obvious at first glance. We illustrate this next for the case of encryption.
An example: secure encryption. A common mistake is to think that formal definitions are not needed, or are trivial to come up with, because “everyone has an intuitive idea of what security means.” This is not the case. As an example, we consider the case of encryption. (The reader may want to pause here to think about how they would formally define what it means for an encryption scheme to be secure.) Although we postpone a formal definition of secure encryption to the next two chapters, we describe here informally what such a definition should capture.
In general, a security definition has two components: a security guarantee (or, from the attacker’s point of view, what constitutes a successful attack on the scheme) and a threat model. The security guarantee defines what the scheme is intended to prevent the attacker from doing, while the threat model describes the power of the adversary, i.e., what actions the attacker is assumed able to carry out.
Let’s start with the first of these. What should a secure encryption scheme guarantee? Here are some thoughts:
• It should be impossible for an attacker to recover the key. We have previously observed that if an attacker can determine the key shared by two parties using some scheme, then that scheme cannot be secure. However, it is easy to come up with schemes for which key recovery is impossible, yet the scheme is blatantly insecure. Consider, e.g., the scheme where Enck(m) = m. The ciphertext leaks no information about the key (and so the key cannot be recovered if it is long enough) yet the message is sent in the clear! We thus see that inability to recover the key is not sufficient for security. This makes sense: the aim of encryption is to protect the message; the key is a means for achieving this but is, in itself, unimportant.
• It should be impossible for an attacker to recover the entire plaintext from the ciphertext. This definition is better, but is still far from satisfactory. In particular, this definition would consider an encryption scheme secure if its ciphertexts revealed 90% of the plaintext, as long as 10% of the plaintext remained hard to figure out. This is clearly unacceptable in most common applications of encryption; for example, when encrypting a salary database, we would be justifiably upset if 90% of employees’ salaries were revealed!
• It should be impossible for an attacker to recover any character of the plaintext from the ciphertext. This looks like a good definition, yet is still not sufficient. Going back to the example of encrypting a salary database, we would not consider an encryption scheme secure if it reveals whether an employee’s salary is more than or less than $100,000, even if it does not reveal any particular digit of that employee’s salary. Similarly, we would not want an encryption scheme to reveal whether employee A makes more than employee B.
Another issue is how to formalize what it means for an adversary to “recover a character of the plaintext.” What if an attacker correctly guesses, through sheer luck or external information, that the least significant digit of someone’s salary is 0? Clearly that should not render an encryption scheme insecure, and so any viable definition must somehow rule out such behavior as being a successful attack.
• The “right” answer: regardless of any information an attacker already has, a ciphertext should leak no additional information about the underlying plaintext. This informal definition captures all the concerns outlined above. Note in particular that it does not try to define what information about the plaintext is “meaningful”; it simply requires that no information be leaked. This is important, as it means that a secure encryption scheme is suitable for all potential applications in which secrecy is required.
What is missing here is a precise, mathematical formulation of the definition. How should we capture an attacker’s prior knowledge about the plaintext? And what does it mean to (not) leak information? We will return to these questions in the next two chapters; see especially Definitions 2.3 and 3.12.
Now that we have fixed a security goal, it remains to specify a threat model. This specifies what “power” the attacker is assumed to have, but does not place any restrictions on the adversary’s strategy. This is an important distinction: we specify what we assume about the adversary’s abilities, but we do not assume anything about how it uses those abilities. It is impossible to foresee what strategies might be used in an attack, and history has proven that attempts to do so are doomed to failure.
There are several plausible options for the threat model in the context of encryption; standard ones, in order of increasing power of the attacker, are:
• Ciphertext-only attack: This is the most basic attack, and refers to a scenario where the adversary just observes a ciphertext (or multiple ciphertexts) and attempts to determine information about the underlying plaintext (or plaintexts). This is the threat model we have been implicitly assuming when discussing classical encryption schemes in the previous section.
• Known-plaintext attack: Here, the adversary is able to learn one or more plaintext/ciphertext pairs generated using some key. The aim of the adversary is then to deduce information about the underlying plaintext of some other ciphertext produced using the same key.
All the classical encryption schemes we have seen are trivial to break using a known-plaintext attack; we leave a demonstration as an exercise.
• Chosen-plaintext attack: In this attack, the adversary can obtain plaintext/ciphertext pairs (as above) for plaintexts of its choice.
• Chosen-ciphertext attack: The final type of attack is one where the adversary is additionally able to obtain (some information about) the decryption of ciphertexts of its choice, e.g., whether the decryption of some ciphertext chosen by the attacker yields a valid English message. The adversary’s aim, once again, is to learn information about the underlying plaintext of some other ciphertext (whose decryption the adversary is unable to obtain directly).
None of these threat models is inherently better than any other; the right one to use depends on the environment in which an encryption scheme is deployed. The first two types of attack are the easiest to carry out. In a ciphertext-only attack, the only thing the adversary needs to do is eavesdrop on the public communication channel over which encrypted messages are sent. In a known-plaintext attack it is assumed that the adversary somehow also obtains ciphertexts corresponding to known plaintexts. This is often easy to accomplish because not all encrypted messages are confidential, at least not indefinitely. As a trivial example, two parties may always encrypt a “hello” message whenever they begin communicating. As a more complex example, encryption may be used to keep quarterly-earnings reports secret until their release date; in this case, anyone eavesdropping on the ciphertext will later obtain the corresponding plaintext.
In the latter two attacks the adversary is assumed to be able to obtain encryptions and/or decryptions of plaintexts/ciphertexts of its choice. This may at first seem strange, and we defer a more detailed discussion of these attacks, and their practicality, to Section 3.4.2 (for chosen-plaintext attacks) and Section 3.7 (for chosen-ciphertext attacks).
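To make the known-plaintext threat model concrete, here is a minimal Python sketch for the shift cipher only. It assumes the convention that plaintext is written in lowercase and ciphertext in uppercase; the function names are hypothetical and chosen just for this illustration. A single plaintext/ciphertext character pair already determines the key.

```python
def shift_encrypt(key: int, plaintext: str) -> str:
    """Encrypt lowercase plaintext with the shift cipher; output is uppercase."""
    return "".join(chr((ord(ch) - ord("a") + key) % 26 + ord("A")) for ch in plaintext)


def shift_decrypt(key: int, ciphertext: str) -> str:
    """Decrypt uppercase ciphertext with the shift cipher; output is lowercase."""
    return "".join(chr((ord(ch) - ord("A") - key) % 26 + ord("a")) for ch in ciphertext)


def recover_shift_key(known_plaintext: str, known_ciphertext: str) -> int:
    """Known-plaintext attack: the first character pair reveals the shift."""
    p = ord(known_plaintext[0]) - ord("a")
    c = ord(known_ciphertext[0]) - ord("A")
    return (c - p) % 26
```

For instance, an attacker who learns that "hello" encrypts to "KHOOR" recovers the shift 3 and can then decrypt any other ciphertext produced with the same key.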
1.4.2 Principle 2 – Precise Assumptions
Most modern cryptographic constructions cannot be proven secure unconditionally; such proofs would require resolving questions in the theory of computational complexity that seem far from being answered today. The result of this unfortunate state of affairs is that proofs of security typically rely on assumptions. Modern cryptography requires any such assumptions to be made explicit and mathematically precise. At the most basic level, this is simply because mathematical proofs of security require this. But there are other reasons as well:
1. Validation of assumptions: By their very nature, assumptions are statements that are not proven but are instead conjectured to be true. In order to strengthen our belief in some assumption, it is necessary for the assumption to be studied. The more the assumption is examined and tested without being refuted, the more confident we are that the assumption is true. Furthermore, study of an assumption can provide evidence of its validity by showing that it is implied by some other assumption that is also widely believed.
If the assumption being relied upon is not precisely stated, it cannot be studied and (potentially) refuted. Thus, a pre-condition to increasing our confidence in an assumption is having a precise statement of what exactly is being assumed.
2. Comparison of schemes: Often in cryptography we are presented with two schemes that can both be proven to satisfy some definition, each based on a different assumption. Assuming all else is equal, which scheme should be preferred? If the assumption on which the first scheme is based is weaker than the assumption on which the second scheme is based (i.e., the second assumption implies the first), then the first scheme is preferable since it may turn out that the second assumption is false while the first assumption is true. If the assumptions used by the two schemes are not comparable, then the general rule is to prefer the scheme that is based on the better-studied assumption in which there is greater confidence.
3. Understanding the necessary assumptions: An encryption scheme may be based on some underlying building block. If some weaknesses are later found in the building block, how can we tell whether the encryption scheme is still secure? If the underlying assumptions regarding the building block are made clear as part of proving security of the scheme, then we need only check whether the required assumptions are affected by the new weaknesses that were found.
A question that sometimes arises is: rather than prove a scheme secure based on some other assumption, why not simply assume that the construction itself is secure? In some cases—e.g., when a scheme has successfully resisted attack for many years—this may be a reasonable approach. But this approach is never preferred, and is downright dangerous when a new scheme is being introduced. The reasons above help explain why. First, an assumption that has been tested for several years is preferable to a new, ad hoc assumption
that is introduced along with a new construction. Second, there is a general preference for assumptions that are simpler to state, since such assumptions are easier to study and to (potentially) refute. So, for example, an assumption that some mathematical problem is hard to solve is simpler to study and evaluate than the assumption that an encryption scheme satisfies a complex security definition. Another advantage of relying on “lower-level” assumptions (rather than just assuming a construction is secure) is that these low-level assumptions can typically be used in other constructions. Finally, low-level assumptions can provide modularity. Consider an encryption scheme whose security relies on some assumed property of one of its building blocks. If the underlying building block turns out not to satisfy the stated assumption, the encryption scheme can still be instantiated using a different component that is believed to satisfy the necessary requirements.
1.4.3 Principle 3 – Proofs of Security
The two principles described above allow us to achieve our goal of providing a rigorous proof that a construction satisfies a given definition under certain specified assumptions. Such proofs are especially important in the context of cryptography where there is an attacker who is actively trying to “break” some scheme. Proofs of security give an iron-clad guarantee—relative to the definition and assumptions—that no attacker will succeed; this is much better than taking an unprincipled or heuristic approach to the problem. Without a proof that no adversary with the specified resources can break some scheme, we are left only with our intuition that this is the case. Experience has shown that intuition in cryptography and computer security is disastrous. There are countless examples of unproven schemes that were broken, sometimes immediately and sometimes years after being developed.
Summary: Rigorous vs. Ad Hoc Approaches to Security
Reliance on definitions, assumptions, and proofs constitutes a rigorous approach to cryptography that is distinct from the informal approach of classical cryptography. Unfortunately, unprincipled, “off-the-cuff” solutions are still designed and deployed by those wishing to obtain a quick solution to a problem, or by those who are simply unknowledgeable. We hope this book will contribute to an awareness of the rigorous approach and its importance in developing provably secure schemes.
1.4.4 Provable Security and Real-World Security
Much of modern cryptography now rests on sound mathematical foundations. But this does not mean that the field is no longer partly an art as well. The rigorous approach leaves room for creativity in developing definitions suited to contemporary applications and environments, in proposing new mathematical assumptions or designing new primitives, and in constructing novel schemes and proving them secure. There will also, of course, always be the art of attacking deployed cryptosystems, even if they are proven secure. We expand on this point next.
The approach taken by modern cryptography has revolutionized the field, and helps provide confidence in the security of cryptographic schemes deployed in the real world. But it is important not to overstate what a proof of security implies. A proof of security is always relative to the definition being considered and the assumption(s) being used. If the security guarantee does not match what is needed, or the threat model does not capture the adversary’s true abilities, then the proof may be irrelevant. Similarly, if the assumption that is relied upon turns out to be false, then the proof of security is meaningless.
The take-away point is that provable security of a scheme does not necessarily imply security of that scheme in the real world.⁴ While some have viewed this as a drawback of provable security, we view this optimistically as illustrating the strength of the approach. To attack a provably secure scheme in the real world, it suffices to focus attention on the definition (i.e., to explore how the idealized definition differs from the real-world environment in which the scheme is deployed) or the underlying assumptions (i.e., to see whether they hold). In turn, it is the job of cryptographers to continually refine their definitions to more closely match the real world, and to investigate their assumptions to test their validity. Provable security does not end the age-old battle between attacker and defender, but it does provide a framework that helps shift the odds in the defender’s favor.
References and Additional Reading
In this chapter, we have studied just a few of the known historical ciphers. There are many others of both historical and mathematical interest, and we refer the reader to textbooks by Stinson [168] or Trappe and Washington [169] for further details. The important role cryptography has played throughout history is a fascinating subject covered in books by Kahn [97] and Singh [163].
Kerckhoffs’ principles were elucidated in [103, 104]. Shannon [154] was the first to pursue a rigorous approach to cryptography based on precise definitions and mathematical proofs; we explore his work in the next chapter.
⁴ Here we are not even considering the possibility of an incorrect implementation of the scheme. Poorly implemented cryptography is a serious problem in the real world, but this problem is somewhat outside the scope of cryptography per se.
Exercises
1.1 Decrypt the ciphertext provided at the end of the section on mono-alphabetic substitution ciphers.
1.2 Provide a formal definition of the Gen, Enc, and Dec algorithms for the mono-alphabetic substitution cipher.
1.3 Provide a formal definition of the Gen, Enc, and Dec algorithms for the Vigenère cipher. (Note: there are several plausible choices for Gen; choose one.)
1.4 Implement the attacks described in this chapter for the shift cipher and the Vigenère cipher.
1.5 Show that the shift, substitution, and Vigenère ciphers are all trivial to break using a chosen-plaintext attack. How much chosen plaintext is needed to recover the key for each of the ciphers?
1.6 Assume an attacker knows that a user’s password is either abcd or bedg. Say the user encrypts his password using the shift cipher, and the attacker sees the resulting ciphertext. Show how the attacker can determine the user’s password, or explain why this is not possible.
1.7 Repeat the previous exercise for the Vigenère cipher using period 2, using period 3, and using period 4.
1.8 The shift, substitution, and Vigenère ciphers can also be defined over the 128-character ASCII alphabet (rather than the 26-character English alphabet).
(a) Provide a formal definition of each of these schemes in this case.
(b) Discuss how the attacks we have shown in this chapter can be modified to break each of these modified schemes.
Chapter 2
Perfectly Secret Encryption
In the previous chapter we presented historical encryption schemes and showed how they can be broken with little computational effort. In this chapter, we look at the other extreme and study encryption schemes that are provably secure even against an adversary with unbounded computational power. Such schemes are called perfectly secret. Besides rigorously defining the notion, we will explore conditions under which perfect secrecy can be achieved. (Beginning in this chapter, we assume familiarity with basic probability theory. The relevant notions are reviewed in Appendix A.3.)
The material in this chapter belongs, in some sense, more to the world of “classical” cryptography than to the world of “modern” cryptography. Besides the fact that all the material introduced here was developed before the revolution in cryptography that took place in the mid-1970s and 1980s, the constructions we study in this chapter rely only on the first and third principles outlined in Section 1.4. That is, precise mathematical definitions are used and rigorous proofs are given, but it will not be necessary to rely on any unproven computational assumptions. It is clearly advantageous to avoid such assumptions; we will see, however, that doing so has inherent limitations. Thus, in addition to serving as a good basis for understanding the principles underlying modern cryptography, the results of this chapter also justify our later adoption of all three of the aforementioned principles.
Beginning with this chapter, we will define security and analyze schemes using probabilistic experiments involving algorithms making randomized choices; a basic example is given by communicating parties’ choosing a random key. Thus, before returning to the subject of cryptography per se, we briefly discuss the issue of generating randomness suitable for cryptographic applications.
Generating randomness. Throughout the book, we will simply assume that parties have access to an unlimited supply of independent, unbiased random bits. In practice, where do these random bits come from? In principle, one could generate a small number of random bits by hand, e.g., by flipping a fair coin. But such an approach is not very convenient, nor does it scale.
Modern random-number generation proceeds in two steps. First, a “pool” of high-entropy data is collected. (For our purposes a formal definition of entropy is not needed, and it suffices to think of entropy as a measure of unpredictability.) Next, this high-entropy data is processed to yield a sequence of nearly independent and unbiased bits. This second step is necessary since high-entropy data is not necessarily uniform.
For the first step, some source of unpredictable data is needed. There are several ways such data can be acquired. One technique is to rely on external inputs, for example, delays between network events, hard-disk access times, keystrokes or mouse movements made by the user, and so on. Such data is likely to be far from uniform, but if enough measurements are taken the resulting pool of data is expected to have sufficient entropy. More sophisticated approaches (which, by design, incorporate random-number generation more tightly into the system at the hardware level) have also been used. These rely on physical phenomena such as thermal/shot noise or radioactive decay. Intel has recently developed a processor that includes a digital random-number generator on the processor chip and provides a dedicated instruction for accessing the resulting random bits (after they have been suitably processed to yield independent, unbiased bits, as discussed next).
The processing needed to “smooth” the high-entropy data to obtain (nearly) uniform bits is a non-trivial one, and is discussed briefly in Section 5.6.4. Here, we just give a simple example to give an idea of what is done. Imagine that our high-entropy pool results from a sequence of biased coin flips, where “heads” occurs with probability p and “tails” with probability 1 − p. (We do assume, however, that the result of any coin flip is independent of all other coin flips. In practice this assumption is typically not valid.) The result of 1,000 such coin flips certainly has high entropy, but is not close to uniform. We can obtain a uniform distribution by considering the coin flips in pairs: if we see a head followed by a tail then we output “0,” and if we see a tail followed by a head then we output “1.” (If we see two heads or two tails in a row, we output nothing, and simply move on to the next pair.) The probability that any pair results in a “0” is p · (1 − p), which is exactly equal to the probability that any pair results in a “1,” and we thus obtain a uniformly distributed output from our initial high-entropy pool.
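The pairing procedure just described (often attributed to von Neumann) can be sketched as follows. This is a minimal illustration, assuming the flips are given as a string of 'H' and 'T' characters; the function name is hypothetical. It relies on the same independence assumption made in the text.

```python
from typing import List


def von_neumann_extract(flips: str) -> List[int]:
    """Turn independent, identically biased coin flips into unbiased bits.

    Flips are read in non-overlapping pairs: "HT" yields 0, "TH" yields 1,
    and "HH" or "TT" yield nothing. A trailing unpaired flip is ignored.
    """
    bits = []
    for i in range(0, len(flips) - 1, 2):
        first, second = flips[i], flips[i + 1]
        if first != second:
            bits.append(0 if first == "H" else 1)
    return bits
```

Each pair produces an output bit only with probability 2p(1 − p), so a heavily biased source yields few bits per flip; that is the price paid for uniformity in this simple scheme.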
Care must be taken in how random bits are produced, and using poor random-number generators can often leave a good cryptosystem vulnerable to attack. One should use a random-number generator that is designed for cryptographic use, rather than a “general-purpose” random-number generator, which is not suitable for cryptographic applications. In particular, the rand() function in the C stdlib.h library is not cryptographically secure, and using it in cryptographic settings can have disastrous consequences.
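The same caution applies in other languages. As an illustration in Python (not a language used in the text), the general-purpose random module is not intended for cryptographic use, whereas the standard-library secrets module draws on the operating system's cryptographic randomness source. The helper names and the 16-byte default below are assumptions made for this sketch.

```python
import secrets


def generate_key_bytes(num_bytes: int = 16) -> bytes:
    """Return num_bytes of key material from a cryptographically secure source."""
    return secrets.token_bytes(num_bytes)


def generate_shift_key() -> int:
    """Return a uniform key in {0, ..., 25}, e.g., for the shift cipher."""
    return secrets.randbelow(26)
```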
2.1 Definitions
We begin by recalling and expanding upon the syntax that was introduced in the previous chapter. An encryption scheme is defined by three algorithms Gen, Enc, and Dec, as well as a specification of a (finite) message space M with |M| > 1.¹ The key-generation algorithm Gen is a probabilistic algorithm that outputs a key k chosen according to some distribution. We denote by K the (finite) key space, i.e., the set of all possible keys that can be output by Gen. The encryption algorithm Enc takes as input a key k ∈ K and a message m ∈ M, and outputs a ciphertext c. We now allow the encryption algorithm to be probabilistic (so Enck(m) might output a different ciphertext when run multiple times), and we write c ← Enck(m) to denote the possibly probabilistic process by which message m is encrypted using key k to give ciphertext c. (In case Enc is deterministic, we may emphasize this by writing c := Enck(m). Looking ahead, we also sometimes use the notation x ← S to denote uniform selection of x from a set S.) We let C denote the set of all possible ciphertexts that can be output by Enck(m), for all possible choices of k ∈ K and m ∈ M (and for all random choices of Enc in case it is randomized). The decryption algorithm Dec takes as input a key k ∈ K and a ciphertext c ∈ C and outputs a message m ∈ M. We assume perfect correctness, meaning that for all k ∈ K, m ∈ M, and any ciphertext c output by Enck(m), it holds that Deck(c) = m with probability 1. Perfect correctness implies that we may assume Dec is deterministic without loss of generality, since Deck(c) must give the same output every time it is run. We will thus write m := Deck(c) to denote the process of decrypting ciphertext c using key k to yield the message m.
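As a concrete instance of this syntax, the shift cipher can be written as three functions. The sketch below is only illustrative: it assumes lowercase messages and uppercase ciphertexts, and the function names gen, enc, dec, and is_perfectly_correct are chosen here for convenience.

```python
import secrets


def gen() -> int:
    """Gen: output a key chosen uniformly from K = {0, ..., 25}."""
    return secrets.randbelow(26)


def enc(k: int, m: str) -> str:
    """Enc: shift each lowercase letter of m forward by k (deterministic)."""
    return "".join(chr((ord(ch) - ord("a") + k) % 26 + ord("A")) for ch in m)


def dec(k: int, c: str) -> str:
    """Dec: shift each uppercase letter of c backward by k."""
    return "".join(chr((ord(ch) - ord("A") - k) % 26 + ord("a")) for ch in c)


def is_perfectly_correct(messages) -> bool:
    """Check Dec_k(Enc_k(m)) = m for every key and every given message."""
    return all(dec(k, enc(k, m)) == m for k in range(26) for m in messages)
```

Because Enc is deterministic here, checking every key against every message in a small sample is an exhaustive test of perfect correctness on that sample.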
In the definitions and theorems below, we refer to probability distributions over K, M, and C. The distribution over K is the one defined by running Gen and taking the output. (It is almost always the case that Gen chooses a key uniformly from K and, in fact, we may assume this without loss of generality; see Exercise 2.1.) We let K be a random variable denoting the value of the key output by Gen; thus, for any k ∈ K, Pr[K = k] denotes the probability that the key output by Gen is equal to k. Similarly, we let M be a random variable denoting the message being encrypted, so Pr[M = m] denotes the probability that the message takes on the value m ∈ M. The probability distribution of the message is not determined by the encryption scheme itself, but instead reflects the likelihood of different messages being sent by the parties using the scheme, as well as an adversary’s uncertainty about what will be sent. As an example, an adversary may know that the message will either be attack today or don’t attack. The adversary may even know (by other means) that with probability 0.7 the message will be a command to attack and with probability 0.3 the message will be a command not to attack. In this case, we have Pr[M = attack today] = 0.7 and Pr[M = don’t attack] = 0.3.
K and M are assumed to be independent, i.e., what is being communicated by the parties is independent of the key they happen to share. This makes sense, among other reasons, because the distribution over K is determined by the encryption scheme itself (since it is defined by Gen), while the distribution over M depends on the context in which the encryption scheme is being used.
Fixing an encryption scheme and a distribution over M determines a distribution over the space of ciphertexts C given by choosing a key k ∈ K (according to Gen) and a message m ∈ M (according to the given distribution), and then computing the ciphertext c ← Enck(m). We let C be the random variable denoting the resulting ciphertext and so, for c ∈ C, write Pr[C = c] to denote the probability that the ciphertext is equal to the fixed value c.
¹ If |M| = 1 there is only one message and no point in communicating, let alone encrypting.
Example 2.1
We work through a simple example for the shift cipher (cf. Section 1.3). Here, by definition, we have K = {0,…,25} with Pr[K = k] = 1/26 for each k ∈ K.
Say we are given the following distribution over M: Pr[M = a] = 0.7 and Pr[M = z] = 0.3.
What is the probability that the ciphertext is B? There are only two ways this can occur: either M = a and K = 1, or M = z and K = 2. By independence of M and K, we have
Pr[M = a ∧ K = 1] = Pr[M = a] · Pr[K = 1] = 0.7 · (1/26).
Similarly, Pr[M = z ∧ K = 2] = 0.3 · (1/26). Therefore,
Pr[C = B] = Pr[M = a ∧ K = 1] + Pr[M = z ∧ K = 2] = 0.7 · (1/26) + 0.3 · (1/26) = 1/26.
We can calculate conditional probabilities as well. For example, what is the probability that the message a was encrypted, given that we observe ciphertext B? Using Bayes’ Theorem (Theorem A.8) we have
Pr[M = a | C = B] = (Pr[C = B | M = a] · Pr[M = a]) / Pr[C = B]
= (0.7 · Pr[C = B | M = a]) / (1/26).
Note that Pr[C = B | M = a] = 1/26, since if M = a then the only way C = B can occur is if K = 1 (which occurs with probability 1/26). We conclude that Pr[M = a | C = B] = 0.7. ♦
Example 2.2
Consider the shift cipher again, but with the following distribution over M: Pr[M = kim] = 0.5, Pr[M = ann] = 0.2, Pr[M = boo] = 0.3.
What is the probability that C = DQQ? The only way this ciphertext can occur is if M = ann and K = 3, or M = boo and K = 2, which happens with probability 0.2 · 1/26 + 0.3 · 1/26 = 1/52.
So what is the probability that ann was encrypted, conditioned on observing the ciphertext DQQ? A calculation as above using Bayes’ Theorem gives Pr[M = ann | C = DQQ] = 0.4. ♦
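The calculations in Examples 2.1 and 2.2 can be reproduced mechanically. The following is a minimal sketch, assuming a uniform key in {0,…,25} chosen independently of the message; the function names (shift_encrypt, joint_distribution, and so on) are illustrative choices and do not come from the text.

```python
from collections import defaultdict
from fractions import Fraction

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def shift_encrypt(key, message):
    """Shift each lowercase letter forward by `key`; return an uppercase ciphertext."""
    shifted = (ALPHABET[(ALPHABET.index(ch) + key) % 26] for ch in message)
    return "".join(shifted).upper()


def joint_distribution(prior):
    """Return Pr[M = m and C = c] for a uniform key K that is independent of M."""
    joint = defaultdict(Fraction)
    for message, p_m in prior.items():
        for key in range(26):
            joint[(message, shift_encrypt(key, message))] += p_m * Fraction(1, 26)
    return joint


def ciphertext_probability(prior, c):
    """Pr[C = c], obtained by summing the joint distribution over all messages."""
    return sum(p for (_, c2), p in joint_distribution(prior).items() if c2 == c)


def posterior(prior, m, c):
    """Pr[M = m | C = c]; only meaningful when Pr[C = c] > 0."""
    return joint_distribution(prior)[(m, c)] / ciphertext_probability(prior, c)


# Distribution from Example 2.1.
prior_1 = {"a": Fraction(7, 10), "z": Fraction(3, 10)}
# Distribution from Example 2.2.
prior_2 = {"kim": Fraction(1, 2), "ann": Fraction(1, 5), "boo": Fraction(3, 10)}

print(ciphertext_probability(prior_1, "B"), posterior(prior_1, "a", "B"))
print(ciphertext_probability(prior_2, "DQQ"), posterior(prior_2, "ann", "DQQ"))
```

Exact rational arithmetic (Fraction) is used so that the results can be compared with the values in the examples without rounding error.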
Perfect secrecy. We are now ready to define the notion of perfect secrecy. We imagine an adversary who knows the probability distribution over M; that is, the adversary knows the likelihood that different messages will be sent. This adversary also knows the encryption scheme being used; the only thing unknown to the adversary is the key shared by the parties. A message is chosen by one of the honest parties and encrypted, and the resulting ciphertext transmitted to the other party. The adversary can eavesdrop on the parties’ communication, and thus observe this ciphertext. (That is, this is a ciphertext-only attack, where the attacker gets only a single ciphertext.) For a scheme to be perfectly secret, observing this ciphertext should have no effect on the adversary’s knowledge regarding the actual message that was sent; in other words, the a posteriori probability that some message m ∈ M was sent, conditioned on the ciphertext that was observed, should be no different from the a priori probability that m would be sent. This means that the ciphertext reveals nothing about the underlying plaintext, and the adversary learns absolutely nothing about the plaintext that was encrypted. Formally:
DEFINITION 2.3 An encryption scheme (Gen,Enc,Dec) with message space M is perfectly secret if for every probability distribution over M, every message m ∈ M, and every ciphertext c ∈ C for which Pr[C = c] > 0:
Pr[M = m | C = c] = Pr[M = m].
(The requirement that Pr[C = c] > 0 is a technical one needed to prevent conditioning on a zero-probability event.)
We now give an equivalent formulation of perfect secrecy. Informally, this formulation requires that the probability distribution of the ciphertext does not depend on the plaintext, i.e., for any two messages m, m′ ∈ M the distribution of the ciphertext when m is encrypted should be identical to the distribution of the ciphertext when m′ is encrypted. Formally, for every m, m′ ∈ M, and every c ∈ C,
Pr[EncK(m) = c] = Pr[EncK(m′) = c] (2.1)
(where the probabilities are over choice of K and any randomness of Enc). This implies that the ciphertext contains no information about the plaintext, and that it is impossible to distinguish an encryption of m from an encryption of m′, since the distributions over the ciphertext are the same in each case.
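For small schemes, Equation (2.1) can be checked by brute force. The sketch below assumes a deterministic Enc and a uniformly chosen key; the helper names and the two toy schemes (the shift cipher on one letter and on two letters, with letters encoded as integers 0 to 25) are illustrative assumptions rather than part of the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def ciphertext_distribution(enc, keys, message):
    """Distribution of Enc_K(message) for a uniform key K and deterministic enc."""
    counts = Counter(enc(k, message) for k in keys)
    return {c: Fraction(n, len(keys)) for c, n in counts.items()}


def satisfies_equation_2_1(enc, keys, messages):
    """True iff every message induces the same distribution over ciphertexts."""
    dists = [ciphertext_distribution(enc, keys, m) for m in messages]
    return all(d == dists[0] for d in dists)


def shift_one_letter(k, m):
    """Shift cipher applied to a single letter."""
    return (m + k) % 26


def shift_two_letters(k, m):
    """Shift cipher applied to a pair of letters with the same key."""
    return ((m[0] + k) % 26, (m[1] + k) % 26)


keys = range(26)
one_letter_ok = satisfies_equation_2_1(shift_one_letter, keys, range(26))
two_letter_ok = satisfies_equation_2_1(
    shift_two_letters, keys, list(product(range(26), repeat=2))
)
print(one_letter_ok, two_letter_ok)
```

With two letters the check fails because, for example, an encryption of aa always has two equal characters while an encryption of ab never does.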
LEMMA 2.4 An encryption scheme (Gen,Enc,Dec) with message space M is perfectly secret if and only if Equation (2.1) holds for every m, m′ ∈ M and every c ∈ C.
PROOF We show that if the stated condition holds, then the scheme is perfectly secret; the converse implication is left to Exercise 2.4. Fix a distribution over M, a message m, and a ciphertext c for which Pr[C = c] > 0. If Pr[M = m] = 0 then we trivially have
Pr[M = m | C = c] = 0 = Pr[M = m].
So, assume Pr[M = m] > 0. Notice first that
Pr[C = c | M = m] = Pr[EncK(M) = c | M = m] = Pr[EncK(m) = c],
where the first equality is by definition of the random variable C, and the second is because we condition on the event that M is equal to m. Set δ_c ≝ Pr[EncK(m) = c] = Pr[C = c | M = m]. If the condition of the lemma holds, then for every m′ ∈ M we have Pr[EncK(m′) = c] = Pr[C = c | M = m′] = δ_c. Using Bayes’ Theorem (see Appendix A.3), we thus have
Pr[M = m | C = c] = (Pr[C = c | M = m] · Pr[M = m]) / Pr[C = c]
= (Pr[C = c | M = m] · Pr[M = m]) / (Σ_{m′∈M} Pr[C = c | M = m′] · Pr[M = m′])
= (δ_c · Pr[M = m]) / (Σ_{m′∈M} δ_c · Pr[M = m′])
= Pr[M = m] / (Σ_{m′∈M} Pr[M = m′]) = Pr[M = m],
where the summation is over m′ ∈ M with Pr[M = m′] ≠ 0. We conclude that for every m ∈ M and c ∈ C for which Pr[C = c] > 0, it holds that Pr[M = m | C = c] = Pr[M = m], and so the scheme is perfectly secret.
Perfect (adversarial) indistinguishability. We conclude this section by presenting another equivalent definition of perfect secrecy. This definition is based on an experiment involving an adversary passively observing a ciphertext and then trying to guess which of two possible messages was encrypted. We introduce this notion since it will serve as our starting point for defining computational security in the next chapter. Indeed, throughout the rest of the book we will often use experiments of this sort to define security.
In the present context, we consider the following experiment: an adversary A first specifies two arbitrary messages m0, m1 ∈ M. One of these two messages is chosen uniformly at random and encrypted using a random key; the resulting ciphertext is given to A. Finally, A outputs a “guess” as to which of the two messages was encrypted; A succeeds if it guesses correctly. An encryption scheme is perfectly indistinguishable if no adversary A can succeed with probability better than 1/2. (Note that, for any encryption scheme, A can succeed with probability 1/2 by outputting a uniform guess; the requirement is simply that no attacker can do any better than this.) We stress that no limitations are placed on the computational power of A.
Formally, let Π = (Gen, Enc, Dec) be an encryption scheme with message space M. Let A be an adversary, which is formally just a (stateful) algorithm. We define an experiment PrivK^eav_{A,Π} as follows:
The adversarial indistinguishability experiment PrivK^eav_{A,Π}:
1. The adversary A outputs a pair of messages m0, m1 ∈ M.
2. A key k is generated using Gen, and a uniform bit b ∈ {0, 1} is chosen. Ciphertext c ← Enck(mb) is computed and given to A. We refer to c as the challenge ciphertext.
3. A outputs a bit b′.
4. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise. We write PrivK^eav_{A,Π} = 1 if the output of the experiment is 1 and in this case we say that A succeeds.
As noted earlier, it is trivial for A to succeed with probability 1/2 by outputting a random guess. Perfect indistinguishability requires that it is impossible for any A to do better.
DEFINITION 2.5 Encryption scheme Π = (Gen, Enc, Dec) with message space M is perfectly indistinguishable if for every A it holds that
Pr[PrivK^eav_{A,Π} = 1] = 1/2.
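The four steps of the experiment translate directly into a simulation. The sketch below is illustrative only: it assumes the shift cipher restricted to a single letter (letters encoded as integers 0 to 25) as the scheme Π, and a hypothetical adversary that chooses the messages a and b and guesses from the parity of the ciphertext. A Monte Carlo estimate is not a proof, but it shows how the success probability is defined.

```python
import random


def gen(rng):
    """Gen for the single-letter shift cipher: a uniform key in {0, ..., 25}."""
    return rng.randrange(26)


def enc(key, message):
    """Enc for the single-letter shift cipher."""
    return (message + key) % 26


def choose_messages():
    """Step 1: the (hypothetical) adversary outputs m0 = 'a' and m1 = 'b'."""
    return 0, 1


def guess(ciphertext):
    """Step 3: the adversary outputs a bit, here the parity of the ciphertext."""
    return ciphertext % 2


def privk_eav(rng):
    """Run the experiment once; return 1 if the adversary's guess equals b."""
    m0, m1 = choose_messages()
    key = gen(rng)                 # step 2: generate a key ...
    b = rng.randrange(2)           # ... and choose a uniform bit b
    c = enc(key, (m0, m1)[b])      # challenge ciphertext
    b_guess = guess(c)             # step 3
    return 1 if b_guess == b else 0  # step 4


def estimate_success(trials, seed=0):
    """Estimate the adversary's success probability over many independent runs."""
    rng = random.Random(seed)
    return sum(privk_eav(rng) for _ in range(trials)) / trials


print(estimate_success(20000))
```

For this scheme the ciphertext is uniform regardless of which message is encrypted, so the estimate should stay close to 1/2 whatever guessing rule is substituted for guess.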
The following lemma states that Definition 2.5 is equivalent to Definition 2.3. We leave the proof of the lemma as Exercise 2.5.
LEMMA 2.6 Encryption scheme Π is perfectly secret if and only if it is perfectly indistinguishable.
Example 2.7
We show that the Vigenère cipher is not perfectly indistinguishable, at least for certain parameters. Concretely, let Π denote the Vigenère cipher for the message space of two-character strings, and where the period is chosen uniformly in {1, 2}. To show that Π is not perfectly indistinguishable, we exhibit an adversary A for which Pr[PrivK^eav_{A,Π} = 1] > 1/2.
Adversary A does:
1. Output m0 = aa and m1 = ab.
2. Upon receiving the challenge ciphertext c = c1c2, do the following: if c1 = c2 output 0; else output 1.
Computation of Pr[PrivK^eav_{A,Π} = 1] is tedious but straightforward.
Pr[PrivK^eav_{A,Π} = 1]
= (1/2) · Pr[PrivK^eav_{A,Π} = 1 | b = 0] + (1/2) · Pr[PrivK^eav_{A,Π} = 1 | b = 1]
= (1/2) · Pr[A outputs 0 | b = 0] + (1/2) · Pr[A outputs 1 | b = 1],    (2.2)
where b is the uniform bit determining which message gets encrypted. A outputs 0 if and only if the two characters of the ciphertext c = c1c2 are equal. When b = 0 (so m0 = aa is encrypted) then c1 = c2 if either (1) a key of period 1 is chosen, or (2) a key of period 2 is chosen, and both characters of the key are equal. The former occurs with probability 1/2, and the latter occurs with probability (1/2) · (1/26). So
Pr[A outputs 0 | b = 0] = 1/2 + (1/2) · (1/26) ≈ 0.52.
When b = 1 then c1 = c2 only if a key of period 2 is chosen and the first character of the key is one more than the second character of the key, which happens with probability (1/2) · (1/26). So
Pr[A outputs 1 | b = 1] = 1 − Pr[A outputs 0 | b = 1] = 1 − (1/2) · (1/26) ≈ 0.98.
Plugging into Equation (2.2) then gives
Pr[PrivK^eav_{A,Π} = 1] = (1/2) · (1/2 + (1/2) · (1/26) + 1 − (1/2) · (1/26)) = 0.75 > 1/2,
and the scheme is not perfectly indistinguishable. ♦
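The success probability in Example 2.7 can also be obtained by enumerating every key and both values of b. This is a sketch under the example's assumptions (period uniform in {1, 2}, then a uniform key of that length); letters are encoded as integers 0 to 25, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def vigenere_encrypt(key, message):
    """Add the key (repeated with its period) to the message, letter by letter."""
    return tuple((m + key[i % len(key)]) % 26 for i, m in enumerate(message))


def key_distribution():
    """Period t uniform in {1, 2}, then a uniform key of length t."""
    dist = {}
    for period in (1, 2):
        for key in product(range(26), repeat=period):
            dist[key] = Fraction(1, 2) * Fraction(1, 26 ** period)
    return dist


def adversary_guess(c):
    """Output 0 if both ciphertext characters are equal, and 1 otherwise."""
    return 0 if c[0] == c[1] else 1


def success_probability():
    """Exact probability that the adversary's guess equals the hidden bit b."""
    messages = [(0, 0), (0, 1)]  # m0 = aa, m1 = ab
    total = Fraction(0)
    for key, p_key in key_distribution().items():
        for b in (0, 1):
            c = vigenere_encrypt(key, messages[b])
            if adversary_guess(c) == b:
                total += p_key * Fraction(1, 2)
    return total


print(success_probability())
```

The enumeration agrees with the value 0.75 derived in the example.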
2.2 The One-Time Pad
In 1917, Vernam patented a perfectly secret encryption scheme now called the one-time pad. At the time Vernam proposed the scheme, there was no proof that it was perfectly secret; in fact, there was not yet a notion of what perfect secrecy was. Approximately 25 years later, however, Shannon introduced the definition of perfect secrecy and demonstrated that the one-time pad achieves that level of security.
CONSTRUCTION 2.8
Fix an integer l > 0. The message space M, key space K, and ciphertext space C are all equal to {0,1}^l (the set of all binary strings of length l).
• Gen: the key-generation algorithm chooses a key from K = {0,1}^l according to the uniform distribution (i.e., each of the 2^l strings in the space is chosen as the key with probability exactly 2^−l).
• Enc: given a key k ∈ {0,1}^l and a message m ∈ {0,1}^l, the encryption algorithm outputs the ciphertext c := k ⊕ m.
• Dec: given a key k ∈ {0,1}^l and a ciphertext c ∈ {0,1}^l, the decryption algorithm outputs the message m := k ⊕ c.
The one-time pad encryption scheme.
In describing the scheme we let a ⊕ b denote the bitwise exclusive-or (XOR) of two binary strings a and b (i.e., if a = a1 ··· al and b = b1 ··· bl are l-bit strings, then a ⊕ b is the l-bit string given by a1 ⊕ b1 ··· al ⊕ bl). In the one-time pad encryption scheme the key is a uniform string of the same length as the message; the ciphertext is computed by simply XORing the key and the message. A formal definition is given as Construction 2.8. Before discussing security, we first verify correctness: for every key k and every message m it holds that Deck(Enck(m)) = k ⊕ k ⊕ m = m, and so the one-time pad constitutes a valid encryption scheme.
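Construction 2.8 is short enough to sketch in code. The version below is an illustration, not a vetted implementation: it works on byte strings (so l is a multiple of 8), draws the key from Python's secrets module as a stand-in for the uniform distribution, and uses the function names gen, enc, and dec to mirror the construction.

```python
import secrets


def gen(length):
    """Gen: choose a key of `length` bytes from the operating system's CSPRNG."""
    return secrets.token_bytes(length)


def enc(key, message):
    """Enc: output c := k XOR m; the key must be exactly as long as the message."""
    if len(key) != len(message):
        raise ValueError("key and message must have the same length")
    return bytes(k ^ m for k, m in zip(key, message))


def dec(key, ciphertext):
    """Dec: output m := k XOR c, which is the same operation as Enc."""
    return enc(key, ciphertext)


message = b"attack today"
key = gen(len(message))
ciphertext = enc(key, message)
recovered = dec(key, ciphertext)
print(recovered == message)
```

Rejecting a key of the wrong length, rather than truncating or repeating it, keeps the sketch faithful to the requirement that the key is a uniform string of the same length as the message.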
One can easily prove perfect secrecy of the one-time pad using Lemma 2.4 and the fact that the ciphertext is uniformly distributed regardless of what message is encrypted. We give a proof based directly on the original definition.
THEOREM 2.9 The one-time pad encryption scheme is perfectly secret.
PROOF We first compute Pr[C = c | M = m′] for arbitrary c ∈ C and m′ ∈ M. For the one-time pad,
Pr[C = c | M = m′] = Pr[EncK(m′) = c] = Pr[m′ ⊕ K = c]
= Pr[K = m′ ⊕ c] = 2^−l,
where the final equality holds because the key K is a uniform l-bit string. Fix any distribution over M. For any c ∈ C, we have
Pr[C = c] = Σ_{m′∈M} Pr[C = c | M = m′] · Pr[M = m′]
= 2^−l · Σ_{m′∈M} Pr[M = m′]
= 2^−l,
where the sum is over m′ ∈ M with Pr[M = m′] ≠ 0. Bayes’ Theorem gives:
Pr[M = m | C = c] = (Pr[C = c | M = m] · Pr[M = m]) / Pr[C = c]
= (2^−l · Pr[M = m]) / 2^−l
= Pr[M = m].
We conclude that the one-time pad is perfectly secret.
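The proof can be sanity-checked numerically for a small l. The sketch below computes every posterior probability for the 3-bit one-time pad under an arbitrarily chosen, non-uniform message distribution; the prior and the function name otp_posteriors are assumptions made for illustration, and l-bit strings are represented as integers.

```python
from fractions import Fraction


def otp_posteriors(prior, l):
    """Return {(m, c): Pr[M = m | C = c]} for the l-bit one-time pad."""
    p_key = Fraction(1, 2 ** l)  # each key has probability 2^-l
    joint = {}
    for m, p_m in prior.items():
        for k in range(2 ** l):
            # Distinct keys give distinct ciphertexts, so each pair occurs once.
            joint[(m, m ^ k)] = p_m * p_key
    p_c = {}
    for (_, c), p in joint.items():
        p_c[c] = p_c.get(c, Fraction(0)) + p
    return {(m, c): p / p_c[c] for (m, c), p in joint.items()}


# An arbitrary non-uniform distribution over three 3-bit messages.
prior = {0b000: Fraction(1, 2), 0b101: Fraction(1, 3), 0b111: Fraction(1, 6)}
posteriors = otp_posteriors(prior, 3)
unchanged = all(post == prior[m] for (m, _), post in posteriors.items())
print(unchanged)
```

Every posterior equals the corresponding prior, as Definition 2.3 requires; this checks one distribution only, whereas the theorem covers all of them.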
The one-time pad was used by several national-intelligence agencies in the mid-20th century to encrypt sensitive traffic. Perhaps most famously, the “red phone” linking the White House and the Kremlin during the Cold War was protected using one-time pad encryption, where the governments of the US and USSR would exchange extremely long keys using trusted couriers carrying briefcases of paper on which random characters were written.
Notwithstanding the above, one-time pad encryption is rarely used any more due to a number of drawbacks it has. Most prominent is that the key is as long as the message.2 This limits the usefulness of the scheme for sending very long messages (as it may be difficult to securely share and store a very long key), and is problematic when the parties cannot predict in advance (an upper bound on) how long the message will be.
Moreover, the one-time pad, as the name indicates, is only secure if used once (with the same key). Although we did not yet define a notion of secrecy when multiple messages are encrypted, it is easy to see that encrypting more than one message with the same key leaks a lot of information. In particular, say two messages m, m′ are encrypted using the same key k. An adversary who obtains c = m ⊕ k and c′ = m′ ⊕ k can compute
c ⊕ c′ = (m ⊕ k) ⊕ (m′ ⊕ k) = m ⊕ m′
and thus learn the exclusive-or of the two messages or, equivalently, exactly where the two messages differ. While this may not seem very significant, it is enough to rule out any claims of perfect secrecy for encrypting two messages using the same key. Moreover, if the messages correspond to natural-language text, then given the exclusive-or of two sufficiently long messages it is possible to perform frequency analysis (as in the previous chapter, though more complex) and recover the messages themselves. An interesting historical example of this is given by the VENONA project, as part of which the US and UK were able to decrypt ciphertexts sent by the Soviet Union that were mistakenly encrypted with repeated portions of a one-time pad over several decades.
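The leak from key reuse is easy to demonstrate. In the sketch below the two plaintexts are made-up examples and xor_bytes is a hypothetical helper; the point is only that the key cancels when the two ciphertexts are XORed.

```python
import secrets


def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"attack at dawn"
m2 = b"attack at dusk"
key = secrets.token_bytes(len(m1))  # the same key is (incorrectly) used twice

c1 = xor_bytes(m1, key)
c2 = xor_bytes(m2, key)

# The eavesdropper sees only c1 and c2, yet the key drops out of c1 XOR c2.
leak = xor_bytes(c1, c2)
differing_positions = [i for i, byte in enumerate(leak) if byte != 0]
print(leak == xor_bytes(m1, m2), differing_positions)
```

The nonzero bytes of the result mark exactly the positions at which the two messages differ, without the adversary ever learning the key.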
2 This does not make the one-time pad useless, since it may be easier for two parties to share a key at some point in time before the message to be communicated is known.
2.3 Limitations of Perfect Secrecy
We ended the previous section by noting some drawbacks of the one-time pad encryption scheme. Here, we show that these drawbacks are not specific to that scheme, but are instead inherent limitations of perfect secrecy. Specifically, we prove that any perfectly secret encryption scheme must have a key space that is at least as large as the message space. If all keys are the same length, and the message space consists of all strings of some fixed length, this implies that the key is at least as long as the message. In particular, the key length of the one-time pad is optimal. (The other limitation, namely that the key can be used only once, is also inherent if perfect secrecy is required; see Exercise 2.13.)
THEOREM 2.10 If (Gen, Enc, Dec) is a perfectly secret encryption scheme with message space M and key space K, then |K| ≥ |M|.
PROOF We show that if |K| < |M| then the scheme cannot be perfectly secret. Assume |K| < |M|. Consider the uniform distribution over M and let c ∈ C be a ciphertext that occurs with non-zero probability. Let M(c) be the set of all possible messages that are possible decryptions of c; that is
M(c) ≝ {m | m = Deck(c) for some k ∈ K}.
Clearly |M(c)| ≤ |K|. (Recall that we may assume Dec is deterministic.) If |K| < |M|, there is some m′ ∈ M such that m′ ∉ M(c). But then
Pr[M = m′ | C = c] = 0 ≠ Pr[M = m′],
and so the scheme is not perfectly secret.
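The set M(c) from the proof can be computed explicitly for a toy scheme. The scheme below is a made-up illustration with 5 messages but only 3 keys (addition modulo 5); possible_decryptions is a hypothetical helper name.

```python
def possible_decryptions(dec, keys, c):
    """M(c): every message obtained by decrypting c under some key."""
    return {dec(k, c) for k in keys}


messages = range(5)  # M = {0, ..., 4}
keys = range(3)      # K = {0, 1, 2}, so |K| < |M|


def enc(k, m):
    return (m + k) % 5


def dec(k, c):
    return (c - k) % 5


c = enc(0, 0)  # a ciphertext that occurs with non-zero probability
m_of_c = possible_decryptions(dec, keys, c)
excluded = set(messages) - m_of_c
print(sorted(m_of_c), sorted(excluded))
```

Any message in the excluded set has posterior probability 0 given c, although its prior probability under the uniform distribution is 1/5, which is exactly the contradiction used in the proof.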
Perfect secrecy with shorter keys? The above theorem shows an inherent limitation of schemes that achieve perfect secrecy. Even so, individuals occasionally claim they have developed a radically new encryption scheme that is “unbreakable” and achieves the security of the one-time pad without using keys as long as what is being encrypted. The above proof demonstrates that such claims cannot be true; anyone making such claims either knows very little about cryptography or is blatantly lying.
2.4 *Shannon’s Theorem
In his work on perfect secrecy, Shannon also provided a characterization of perfectly secret encryption schemes. This characterization says that, under certain conditions, the key-generation algorithm Gen must choose the key uniformly from the set of all possible keys (as in the one-time pad); moreover, for every message m and ciphertext c there is a unique key mapping m to c (again, as in the one-time pad). Beyond being interesting in its own right, this theorem is a useful tool for proving (or disproving) perfect secrecy of suggested schemes. We discuss this further after the proof.
The theorem as stated here assumes |M| = |K| = |C|, meaning that the sets of plaintexts, keys, and ciphertexts all have the same size. We have already seen that for perfect secrecy we must have |K| ≥ |M|. It is easy to see that correct decryption requires |C| ≥ |M|. Therefore, in some sense, encryption schemes with |M| = |K| = |C| are “optimal.”
THEOREM 2.11 (Shannon’s theorem) Let (Gen, Enc, Dec) be an encryption scheme with message space M, for which |M| = |K| = |C|. The scheme is perfectly secret if and only if:
1. Every key k ∈ K is chosen with (equal) probability 1/|K| by algorithm Gen.
2. For every m ∈ M and every c ∈ C, there exists a unique key k ∈ K such that Enck(m) outputs c.
PROOF The intuition behind the proof is as follows. To see that the stated conditions imply perfect secrecy, note that condition 2 means that any ciphertext c could be the result of encrypting any possible plaintext m, because there is some key k mapping m to c. Since there is a unique such key, and each key is chosen with equal probability, perfect secrecy follows as for the one-time pad. For the other direction, perfect secrecy immediately implies that for every m and c there is at least one key mapping m to c. The fact that |M| = |K| = |C| means, moreover, that for every m and c there is exactly one such key. Given this, each key must be chosen with equal probability or else perfect secrecy would fail to hold. A formal proof follows.
We assume for simplicity that Enc is deterministic. (One can show that this is without loss of generality here.) We first prove that if the encryption scheme satisfies conditions 1 and 2, then it is perfectly secret. The proof is essentially the same as the proof of perfect secrecy for the one-time pad, so we will be relatively brief. Fix arbitrary c ∈ C and m ∈ M. Let k be the unique key, guaranteed by condition 2, for which Enck(m) = c. Then,
Pr[C = c | M = m] = Pr[K = k] = 1/|K|,
where the final equality holds by condition 1. So
Pr[C = c] = Σ_{m∈M} Pr[EncK(m) = c] · Pr[M = m] = 1/|K|.
This holds for any distribution over M. Thus, for any distribution over M, any m ∈ M with Pr[M = m] ≠ 0, and any c ∈ C, we have:
Pr[M = m | C = c] = (Pr[C = c | M = m] · Pr[M = m]) / Pr[C = c]
= (Pr[EncK(m) = c] · Pr[M = m]) / Pr[C = c]
= (|K|^−1 · Pr[M = m]) / |K|^−1 = Pr[M = m],
and the scheme is perfectly secret.
For the second direction, assume the encryption scheme is perfectly secret; we show that conditions 1 and 2 hold. Fix arbitrary c ∈ C. There must be some message m∗ for which Pr[EncK(m∗) = c] ≠ 0. Lemma 2.4 then implies that Pr[EncK(m) = c] ≠ 0 for every m ∈ M. In other words, if we let M = {m1, m2, ...}, then for each mi ∈ M we have a nonempty set of keys Ki ⊂ K such that Enck(mi) = c if and only if k ∈ Ki. Moreover, when i ≠ j then Ki and Kj must be disjoint or else correctness fails to hold. Since |K| = |M|, we see that each Ki contains only a single key ki, as required by condition 2. Now, Lemma 2.4 shows that for any mi, mj ∈ M we have
Pr[K = ki] = Pr[EncK(mi) = c] = Pr[EncK(mj) = c] = Pr[K = kj].
Since this holds for all 1 ≤ i, j ≤ |M| = |K|, and ki ≠ kj for i ≠ j, this means each key is chosen with probability 1/|K|, as required by condition 1.
Shannon’s theorem is useful for deciding whether a given scheme is perfectly secret. Condition 1 is easy to check, and condition 2 can be demonstrated (or contradicted) without having to compute any probabilities (in contrast to working with Definition 2.3 directly). As an example, perfect secrecy of the one-time pad is trivial to prove using Shannon’s theorem. We stress, however, that the theorem only applies when |M| = |K| = |C|.
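Condition 2 of Shannon's theorem lends itself to a brute-force check on small schemes. The sketch below assumes a deterministic Enc and |M| = |K| = |C|, and leaves condition 1 (uniform keys) to be verified separately; both example schemes and the function name satisfies_condition_2 are illustrative assumptions.

```python
def satisfies_condition_2(enc, keys, messages, ciphertexts):
    """True iff each pair (m, c) has exactly one key k with enc(k, m) == c."""
    for m in messages:
        for c in ciphertexts:
            matching_keys = [k for k in keys if enc(k, m) == c]
            if len(matching_keys) != 1:
                return False
    return True


def otp_2bit(k, m):
    """One-time pad on 2-bit strings, represented as integers 0..3."""
    return k ^ m


def redundant_keys(k, m):
    """A toy scheme over {0, 1, 2} in which keys 0 and 1 encrypt identically."""
    return (m + k // 2) % 3


otp_ok = satisfies_condition_2(otp_2bit, range(4), range(4), range(4))
redundant_ok = satisfies_condition_2(redundant_keys, range(3), range(3), range(3))
print(otp_ok, redundant_ok)
```

In the second scheme two keys map each message to the same ciphertext and one ciphertext is never reached, so condition 2 fails and, by the theorem, the scheme is not perfectly secret even with uniform keys.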
References and Additional Reading
The one-time pad is popularly credited to Vernam [172], who filed a patent on it, but recent historical research [25] shows that it was invented some 35 years earlier. Analysis of the one-time pad had to await the groundbreaking work of Shannon [154], who introduced the notion of perfect secrecy.
In this chapter we studied perfectly secret encryption. Some other cryptographic problems can also be solved with “perfect” security. A notable example is the problem of message authentication where the aim is to prevent an adversary from (undetectably) modifying a message sent from one party to another. We study this problem in depth in Chapter 4, discussing “perfectly secure” message authentication in Section 4.6.
Exercises
2.1 Prove that, by redefining the key space, we may assume that the key-generation algorithm Gen chooses a key uniformly at random from the key space, without changing Pr[C = c | M = m] for any m, c.
Hint: Define the key space to be the set of all possible random tapes for the randomized algorithm Gen.
2.2 Prove that, by redefining the key space, we may assume that Enc is deterministic without changing Pr[C = c | M = m] for any m, c.
2.3 Prove or refute: An encryption scheme with message space M is perfectly secret if and only if for every probability distribution over M and every c0, c1 ∈ C we have Pr[C = c0] = Pr[C = c1].
2.4 Prove the second direction of Lemma 2.4.
2.5 Prove Lemma 2.6.
2.6 For each of the following encryption schemes, state whether the scheme is perfectly secret. Justify your answer in each case.
(a) The message space is M = {0, ..., 4}. Algorithm Gen chooses a uniform key from the key space {0, ..., 5}. Enck(m) returns [k + m mod 5], and Deck(c) returns [c − k mod 5].
(b) The message space is M = {m ∈ {0,1}^l | the last bit of m is 0}. Gen chooses a uniform key from {0,1}^(l−1). Enck(m) returns ciphertext m ⊕ (k∥0), and Deck(c) returns c ⊕ (k∥0).
2.7 When using the one-time pad with the key k = 0^l, we have Enck(m) = k ⊕ m = m and the message is sent in the clear! It has therefore been suggested to modify the one-time pad by only encrypting with k ≠ 0^l (i.e., to have Gen choose k uniformly from the set of nonzero keys of length l). Is this modified scheme still perfectly secret? Explain.
2.8 Let Π denote the Vigenère cipher where the message space consists of all 3-character strings (over the English alphabet), and the key is generated by first choosing the period t uniformly from {1, 2, 3} and then letting the key be a uniform string of length t.
(a) Define A as follows: A outputs m0 = aab and m1 = abb. When given a ciphertext c, it outputs 0 if the first character of c is the same as the second character of c, and outputs 1 otherwise. Compute Pr[PrivK^eav_{A,Π} = 1].
(b) Construct and analyze an adversary A′ for which Pr[PrivK^eav_{A′,Π} = 1] is greater than your answer from part (a).
2.9 In this exercise, we look at different conditions under which the shift, mono-alphabetic substitution, and Vigenère ciphers are perfectly secret:
(a) Prove that if only a single character is encrypted, then the shift cipher is perfectly secret.
(b) What is the largest message space M for which the mono-alphabetic substitution cipher provides perfect secrecy?
(c) Prove that the Vigenère cipher using (fixed) period t is perfectly secret when used to encrypt messages of length t.
Reconcile this with the attacks shown in the previous chapter.
2.10 Prove that a scheme satisfying Definition 2.5 must have |K| ≥ |M| without using Lemma 2.4. Specifically, let Π be an arbitrary encryption scheme with |K| < |M|. Show an A for which Pr[PrivK^eav_{A,Π} = 1] > 1/2.
Hint: It may be easier to let A be randomized.
2.11 Assume we require only that an encryption scheme (Gen, Enc, Dec) with message space M satisfy the following: For all m ∈ M, we have Pr[DecK(EncK(m)) = m] ≥ 2^−t. (This probability is taken over choice of the key as well as any randomness used during encryption.) Show that perfect secrecy can be achieved with |K| < |M| when t ≥ 1. Prove a lower bound on the size of K in terms of t.
2.12 Let ε ≥ 0 be a constant. Say an encryption scheme is ε-perfectly secret if for every adversary A it holds that
Pr[PrivK^eav_{A,Π} = 1] ≤ 1/2 + ε.
(Compare to Definition 2.5.) Show that ε-perfect secrecy can be achieved with |K| < |M| when ε > 0. Prove a lower bound on the size of K in terms of ε.
2.13 In this problem we consider definitions of perfect secrecy for the encryption of two messages (using the same key). Here we consider distributions over pairs of messages from the message space M; we let M1, M2 be random variables denoting the first and second message, respectively. (We stress that these random variables are not assumed to be independent.) We generate a (single) key k, sample a pair of messages (m1, m2) according to the given distribution, and then compute ciphertexts c1 ← Enck(m1) and c2 ← Enck(m2); this induces a distribution over pairs of ciphertexts and we let C1, C2 be the corresponding random variables.
(a) Say encryption scheme (Gen, Enc, Dec) is perfectly secret for two messages if for all distributions over M × M, all m1, m2 ∈ M, and all ciphertexts c1, c2 ∈ C with Pr[C1 = c1 ∧ C2 = c2] > 0:
Pr[M1 = m1 ∧ M2 = m2 | C1 = c1 ∧ C2 = c2] = Pr[M1 = m1 ∧ M2 = m2].
Prove that no encryption scheme can satisfy this definition. Hint: Take c1 = c2.
(b) Say encryption scheme (Gen, Enc, Dec) is perfectly secret for two distinct messages if for all distributions over M × M where the first and second messages are guaranteed to be different (i.e., distributions over pairs of distinct messages), all m1, m2 ∈ M, and all c1, c2 ∈ C with Pr[C1 = c1 ∧ C2 = c2] > 0:
Pr[M1 = m1 ∧ M2 = m2 | C1 = c1 ∧ C2 = c2] = Pr[M1 = m1 ∧ M2 = m2].
Show an encryption scheme that provably satisfies this definition.
Hint: The encryption scheme you propose need not be efficient, although an efficient solution is possible.
Part II
Private-Key (Symmetric) Cryptography
Chapter 3 Private-Key Encryption
In the previous chapter we saw some fundamental limitations of perfect secrecy. In this chapter we begin our study of modern cryptography by introducing the weaker (but sufficient) notion of computational secrecy. We will then show how this definition can be used to bypass the impossibility results shown previously and, in particular, how a short key (say, 128 bits long) can be used to encrypt many long messages (say, gigabytes in total).
Along the way we will study the fundamental notion of pseudorandomness, which captures the idea that something can “look” completely random even though it is not. This powerful concept underlies much of modern cryptography, and has applications and implications beyond the field as well.
3.1 Computational Security
In Chapter 2 we introduced the notion of perfect secrecy. While perfect secrecy is a worthwhile goal, it is also unnecessarily strong. Perfect secrecy requires that absolutely no information about an encrypted message is leaked, even to an eavesdropper with unlimited computational power. For all practical purposes, however, an encryption scheme would still be considered secure if it leaked only a tiny amount of information to eavesdroppers with bounded computational power. For example, a scheme that leaks information with probability at most 2^−60 to eavesdroppers investing up to 200 years of computational effort on the fastest available supercomputer is adequate for any real-world application. Security definitions that take into account computational limits on the attacker, and allow for a small probability of failure, are called computational, to distinguish them from notions (like perfect secrecy) that are information-theoretic in nature. Computational security is now the de facto way in which security is defined for all cryptographic purposes.
We stress that although we give up on obtaining perfect security, this does not mean we do away with the rigorous mathematical approach. Definitions and proofs are still essential, and the only difference is that we now consider weaker (but still meaningful) definitions of security.
Computational security incorporates two relaxations relative to information-theoretic notions of security (in the case of encryption, both these relaxations are necessary in order to go beyond the limitations of perfect secrecy discussed in the previous chapter):
1. Security is only guaranteed against efficient adversaries that run for some feasible amount of time. This means that given enough time (or sufficient computational resources) an attacker may be able to violate security. If we can make the resources required to break the scheme larger than those available to any realistic attacker, then for all practical purposes the scheme is unbreakable.
2. Adversaries can potentially succeed (i.e., security can potentially fail) with some very small probability. If we can make this probability sufficiently small, we need not worry about it.
To obtain a meaningful theory, we need to precisely define the above relaxations. There are two general approaches for doing so: the concrete approach and the asymptotic approach. These are described next.
3.1.1 The Concrete Approach
The concrete approach to computational security quantifies the security of a cryptographic scheme by explicitly bounding the maximum success probability of any (randomized) adversary running for some specified amount of time or, more precisely, investing some specific amount of computational effort. Thus, a concrete definition of security takes roughly the following form:
A scheme is (t,ε)-secure if any adversary running for time at most t succeeds in breaking the scheme with probability at most ε.
(Of course, the above serves only as a general template, and for the above statement to make sense we need to define exactly what it means to “break” the scheme in question.) As an example, one might have a scheme with the guarantee that no adversary running for at most 200 years using the fastest available supercomputer can succeed in breaking the scheme with probability better than 2^−60. Or, it may be more convenient to measure running time in terms of CPU cycles, and to construct a scheme such that no adversary using at most 2^80 cycles can break the scheme with probability better than 2^−60.
It is instructive to get a feel for the large values of t and the small values of ε that are typical of modern cryptographic schemes.
Example 3.1
Modern private-key encryption schemes are generally assumed to give almost optimal security in the following sense: when the key has length n (and so the key space has size 2^n), an adversary running for time t (measured in, say, computer cycles) succeeds in breaking the scheme with probability at most ct/2^n for some fixed constant c. (This simply corresponds to a brute-force search of the key space, and assumes no preprocessing has been done.)
Assuming c = 1 for simplicity, a key of length n = 60 provides adequate security against an adversary using a desktop computer. Indeed, on a 4 GHz processor (that executes 4 × 10^9 cycles per second) 2^60 CPU cycles require 2^60/(4 × 10^9) seconds, or about 9 years. However, the fastest supercomputer at the time of this writing can execute roughly 2 × 10^16 floating point operations per second, and 2^60 such operations require only about 1 minute on such a machine. Taking n = 80 would be a more prudent choice; even the computer just mentioned would take about 2 years to carry out 2^80 operations.
(The above numbers are for illustrative purposes only; in practice c > 1, and several other factors—such as the time required for memory access and the possibility of parallel computation on a network of computers—significantly affect the performance of brute-force attacks.)
Today, however, a recommended key length might be n = 128. The difference between $2^{80}$ and $2^{128}$ is a multiplicative factor of $2^{48}$. To get a feeling for how big this is, note that according to physicists’ estimates the number of seconds since the Big Bang is on the order of $2^{58}$.
If the probability that an attacker can successfully recover an encrypted message in one year is at most $2^{-60}$, then it is much more likely that the sender and receiver will both be hit by lightning in that same period of time. An event that occurs once every hundred years can be roughly estimated to occur with probability $2^{-30}$ in any given second. Something that occurs with probability $2^{-60}$ in any given second is $2^{30}$ times less likely, and might be expected to occur roughly once every 100 billion years. ♦
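The running times quoted in Example 3.1 can be rechecked with a few lines of arithmetic. The following is only a sketch under the example's own simplifications (c = 1, one key tested per cycle or operation); the function and variable names are introduced here for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_seconds(key_bits: int, ops_per_second: float) -> float:
    """Seconds needed to perform 2**key_bits operations (taking c = 1)."""
    return 2 ** key_bits / ops_per_second

DESKTOP = 4e9          # 4 GHz processor: cycles per second
SUPERCOMPUTER = 2e16   # rate quoted in the example: operations per second

desktop_years_60 = brute_force_seconds(60, DESKTOP) / SECONDS_PER_YEAR
super_minutes_60 = brute_force_seconds(60, SUPERCOMPUTER) / 60
super_years_80 = brute_force_seconds(80, SUPERCOMPUTER) / SECONDS_PER_YEAR

print(f"n = 60, desktop:       {desktop_years_60:.1f} years")
print(f"n = 60, supercomputer: {super_minutes_60:.2f} minutes")
print(f"n = 80, supercomputer: {super_years_80:.1f} years")
```

The three printed values should land close to the "about 9 years," "about 1 minute," and "about 2 years" figures in the example.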
The concrete approach is important in practice, since concrete guarantees are what users of a cryptographic scheme are ultimately interested in. However, precise concrete guarantees are difficult to provide. Furthermore, one must be careful in interpreting concrete security claims. For example, a claim that no adversary running for 5 years can break a given scheme with probability better than ε begs the questions: what type of computing power (e.g., desktop PC, supercomputer, network of hundreds of computers) does this assume? Does this take into account future advances in computing power (which, by Moore’s Law, roughly doubles every 18 months)? Does the estimate assume the use of “off-the-shelf” algorithms, or dedicated software implementations optimized for the attack? Furthermore, such a guarantee says little about the success probability of an adversary running for 2 years (other than the fact that it can be at most ε) and says nothing about the success probability of an adversary running for 10 years.
3.1.2 The Asymptotic Approach
As partly noted above, there are some technical and theoretical difficulties in using the concrete-security approach. These issues must be dealt with in practice, but when concrete security is not an immediate concern it is convenient instead to use an asymptotic approach to security; this is the approach taken in this book. This approach, rooted in complexity theory, introduces an integer-valued security parameter (denoted by n) that parameterizes both cryptographic schemes as well as all involved parties (namely, the honest parties as well as the attacker). When honest parties initialize a scheme (i.e., when they generate keys), they choose some value n for the security parameter; for the purposes of this discussion, one can think of the security parameter as corresponding to the length of the key. The security parameter is assumed to be known to any adversary attacking the scheme, and we now view the running time of the adversary, as well as its success probability, as functions of the security parameter rather than as concrete numbers. Then:
1. We equate “efficient adversaries” with randomized (i.e., probabilistic) algorithms running in time polynomial in n. This means there is some polynomial p such that the adversary runs for time at most p(n) when the security parameter is n. We also require, for real-world efficiency, that honest parties run in polynomial time, although we stress that the adversary may be much more powerful (and run much longer than) the honest parties.
2. We equate the notion of “small probabilities of success” with success probabilities smaller than any inverse polynomial in n (see Definition 3.4). Such probabilities are called negligible.
Let ppt stand for “probabilistic polynomial-time.” A definition of asymptotic security then takes the following general form:
A scheme is secure if any ppt adversary succeeds in breaking the scheme with at most negligible probability.
This notion of security is asymptotic since security depends on the behavior of the scheme for sufficiently large values of n. The following example makes this clear.
Example 3.2
Say we have a scheme that is asymptotically secure. Then it may be the case that an adversary running for $n^3$ minutes can succeed in “breaking the scheme” with probability $2^{40} \cdot 2^{-n}$ (which is a negligible function of n). When n ≤ 40 this means that an adversary running for $40^3$ minutes (about 6 weeks) can break the scheme with probability 1, so such values of n are not very useful. Even for n = 50 an adversary running for $50^3$ minutes (about 3 months) can break the scheme with probability roughly 1/1000, which may not be acceptable. On the other hand, when n = 500 an adversary running for 200 years breaks the scheme only with probability roughly $2^{-460}$. ♦
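To see how the numbers in Example 3.2 arise, the sketch below evaluates the hypothetical bound $2^{40} \cdot 2^{-n}$ (capped at 1) and the running time of $n^3$ minutes for the three values of n discussed. The helper names are introduced here for illustration; the bound is kept as a base-2 exponent so that no floating-point underflow occurs.

```python
MINUTES_PER_DAY = 60 * 24

def success_bound_log2(n: int) -> int:
    """Base-2 logarithm of min(1, 2^40 * 2^-n)."""
    return min(0, 40 - n)

def running_time_days(n: int) -> float:
    """The adversary's running time of n^3 minutes, expressed in days."""
    return n ** 3 / MINUTES_PER_DAY

for n in (40, 50, 500):
    days = running_time_days(n)
    print(f"n = {n:3d}: time = {days:10.1f} days, "
          f"success probability <= 2^{success_bound_log2(n)}")
```

For n = 500 the bound evaluates to exactly $2^{-460}$, and $500^3$ minutes is somewhat more than 200 years, so the 200-year adversary of the example is covered by it.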
As indicated by the previous example, we can view the security parameter as a mechanism that allows the honest parties to “tune” the security of a scheme to some desired level. (Increasing the security parameter also increases the time required to run the scheme, as well as the length of the key, so the honest parties will want to set the security parameter as small as possible subject to defending against the class of attacks they are concerned about.) Viewing the security parameter as the key length, this corresponds roughly to the fact that the time required for an exhaustive-search attack grows exponentially in the length of the key. The ability to “increase security” by increasing the security parameter has important practical ramifications, since it enables honest parties to defend against increases in computing power. The following example gives a sense of how this might play out in practice.
Example 3.3
Let us see the effect that the availability of faster computers might have on security in practice. Say we have a cryptographic scheme in which the honest parties run for $10^6 \cdot n^2$ cycles, and for which an adversary running for $10^8 \cdot n^4$ cycles can succeed in “breaking” the scheme with probability at most $2^{-n/2}$. (The numbers are intended to make calculations easier, and are not meant to correspond to any existing cryptographic scheme.)
Say all parties are using 2 GHz computers and the honest parties set n = 80. Then the honest parties run for $10^6 \cdot 6400$ cycles, or 3.2 seconds, and an adversary running for $10^8 \cdot (80)^4$ cycles, or roughly 3 weeks, can break the scheme with probability only $2^{-40}$.
Say 8 GHz computers become available, and all parties upgrade. Honest parties can increase n to 160 (which requires generating a fresh key) and maintain a running time of 3.2 seconds (i.e., $10^6 \cdot 160^2$ cycles at $8 \cdot 10^9$ cycles/second). In contrast, the adversary now has to run for over 8 million seconds, or more than 13 weeks, to achieve a success probability of $2^{-80}$. The effect of a faster computer has been to make the adversary’s job harder. ♦
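The arithmetic of Example 3.3 can be replayed directly. This is a sketch using the example's illustrative cost functions ($10^6 \cdot n^2$ cycles for the honest parties, $10^8 \cdot n^4$ cycles for the adversary); the function names are introduced here and do not describe any real scheme.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600

def honest_cycles(n: int) -> int:
    return 10 ** 6 * n ** 2

def adversary_cycles(n: int) -> int:
    return 10 ** 8 * n ** 4

def seconds(cycles: int, hertz: float) -> float:
    return cycles / hertz

# Before the upgrade: 2 GHz machines, n = 80.
honest_before = seconds(honest_cycles(80), 2e9)
adversary_before_weeks = seconds(adversary_cycles(80), 2e9) / SECONDS_PER_WEEK

# After the upgrade: 8 GHz machines, n = 160.
honest_after = seconds(honest_cycles(160), 8e9)
adversary_after_weeks = seconds(adversary_cycles(160), 8e9) / SECONDS_PER_WEEK

print(f"honest parties: {honest_before:.1f} s -> {honest_after:.1f} s")
print(f"adversary:      {adversary_before_weeks:.1f} weeks -> "
      f"{adversary_after_weeks:.1f} weeks")
```

The honest parties' running time is unchanged by the upgrade, while the adversary's time grows by a factor of four even as its success probability drops from $2^{-40}$ to $2^{-80}$.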
Even when using the asymptotic approach it is important to remember that, ultimately, when a cryptosystem is deployed in practice a concrete security guarantee will be needed. (After all, one must decide on some value of n.) As the above examples indicate, however, it is generally the case that an asymptotic security claim can be translated into a concrete security bound for any desired value of n.
The Asymptotic Approach in Detail
We now discuss more formally the notions of “polynomial-time algorithms” and “negligible success probabilities.”
Efficient algorithms. We have defined an algorithm to be efficient if it runs in polynomial time. An algorithm A runs in polynomial time if there exists a polynomial p such that, for every input $x \in \{0,1\}^*$, the computation of A(x) terminates within at most p(|x|) steps. (Here, |x| denotes the length of the string x.) As mentioned earlier, we are only interested in adversaries whose running time is polynomial in the security parameter n. Since we measure the running time of an algorithm in terms of the length of its input, we sometimes provide algorithms with the security parameter written in unary (i.e., as $1^n$, or a string of n ones) as input. Parties (or, more precisely, the algorithms they run) may take other inputs besides the security parameter (for example, a message to be encrypted), and we allow their running time to be polynomial in the (total) length of their inputs.
By default, we allow all algorithms to be probabilistic (or randomized). Any such algorithm may “toss a coin” at each step of its execution; this is a metaphorical way of saying that the algorithm can access an unbiased random bit at each step. Equivalently, we can view a randomized algorithm as one that, in addition to its input, is given a uniformly distributed random tape of sufficient length¹ whose bits it can use, as needed, throughout its execution.
We consider randomized algorithms by default for two reasons. First, randomness is essential to cryptography (e.g., in order to choose random keys and so on) and so honest parties must be probabilistic; given this, it is natural to allow adversaries to be probabilistic as well. Second, randomization is practical and, as far as we know, gives attackers additional power. Since our goal is to model all realistic attacks, we prefer a more liberal definition of efficient computation.
Negligible success probability. A negligible function is one that is asymptotically smaller than any inverse polynomial function. Formally:
DEFINITION 3.4 A function f from the natural numbers to the non-negative real numbers is negligible if for every positive polynomial p there is an N such that for all integers n > N it holds that $f(n) < \frac{1}{p(n)}$.
For shorthand, the above is also stated as follows: for every polynomial p and all sufficiently large values of n it holds that $f(n) < \frac{1}{p(n)}$. An equivalent formulation of the above is to require that for all constants c there exists an N such that for all n > N it holds that $f(n) < n^{-c}$. We typically denote an arbitrary negligible function by negl.
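To get a feel for the quantifier "for all sufficiently large n," one can search numerically for the point beyond which a candidate function stays below $n^{-c}$. The sketch below does this for three sample functions and c = 5, taking logarithms to base 2. It only inspects n up to a finite limit, so it offers numerical evidence rather than a proof; the names `LOG2_F` and `threshold` are introduced here for illustration.

```python
import math

# Each function is represented by log2 f(n), so tiny values never underflow.
LOG2_F = {
    "2^-n": lambda n: -n,
    "2^-sqrt(n)": lambda n: -math.sqrt(n),
    "n^-log n": lambda n: -(math.log2(n) ** 2),
}

def threshold(log2_f, c: int, limit: int = 100_000) -> int:
    """Smallest N such that f(n) < n^-c for every n with N <= n <= limit."""
    last_failure = 1
    for n in range(2, limit + 1):
        # f(n) >= n^-c  is equivalent to  log2 f(n) >= -c * log2 n
        if log2_f(n) >= -c * math.log2(n):
            last_failure = n
    return last_failure + 1

thresholds = {name: threshold(f, c=5) for name, f in LOG2_F.items()}
for name, value in thresholds.items():
    print(f"{name:>11} stays below n^-5 from n = {value} onward (up to the limit)")
```

All three functions eventually drop below $n^{-5}$, but the points at which they do so differ by orders of magnitude, which is why asymptotic statements must be paired with concrete parameter choices in practice.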
¹ If the algorithm in question runs for p(n) steps on inputs of length n, then a random tape of length p(n) is sufficient since the attacker can read at most one random bit per time step.
Example 3.5
The functions $2^{-n}$, $2^{-\sqrt{n}}$, and $n^{-\log n}$ are all negligible. However, they approach zero at very different rates. For example, we can look at the minimum value of n for which each function is smaller than $1/n^5$:
1. Solving $2^{-n} < n^{-5}$ we get $n > 5 \log n$. The smallest integer value of n for which this holds is n = 23.
2. Solving $2^{-\sqrt{n}} < n^{-5}$ we get $n > 25 \log^2 n$. The smallest integer value of n for which this holds is n ≈ 3500.
3. Solving $n^{-\log n} < n^{-5}$ we get $\log n > 5$. The smallest integer value of n for which this holds is n = 33.
From the above you may have the impression that $n^{-\log n}$ approaches zero more quickly than $2^{-\sqrt{n}}$. However, this is incorrect; for all n > 65536 it holds that $2^{-\sqrt{n}} < n^{-\log n}$. Nevertheless, this does show that for values of n in the hundreds or thousands, an adversarial success probability of $n^{-\log n}$ is preferable to an adversarial success probability of $2^{-\sqrt{n}}$. ♦
A technical advantage of working with negligible success probabilities is that they obey certain closure properties. The following is an easy exercise.
PROPOSITION 3.6 Let $\mathsf{negl}_1$ and $\mathsf{negl}_2$ be negligible functions. Then,
1. The function $\mathsf{negl}_3$ defined by $\mathsf{negl}_3(n) = \mathsf{negl}_1(n) + \mathsf{negl}_2(n)$ is negligible.
2. For any positive polynomial p, the function $\mathsf{negl}_4$ defined by $\mathsf{negl}_4(n) = p(n) \cdot \mathsf{negl}_1(n)$ is negligible.
The second part of the above proposition implies that if a certain event occurs with only negligible probability in a certain experiment, then the event occurs with negligible probability even if the experiment is repeated polynomially many times. (This relies on the union bound; see Proposition A.7.) For example, the probability that n fair coin flips all come up “heads” is negligible. This means that even if we repeat the experiment of flipping n coins polynomially many times, the probability that any of those experiments result in n heads is still negligible.
A corollary of the second part of the above proposition is that if a function g is not negligible, then neither is the function $f(n) \stackrel{\mathrm{def}}{=} g(n)/p(n)$ for any positive polynomial p.
Asymptotic Security: A Summary
Any security definition consists of two parts: a definition of what is considered a “break” of the scheme, and a specification of the power of the adversary. The power of the adversary can relate to many issues (e.g., in the case of encryption, whether we assume a ciphertext-only attack or a chosen-plaintext attack). However, when it comes to the computational power of the adversary, we will from now on model the adversary as efficient and thus only consider adversarial strategies that can be implemented in probabilistic polynomial time. Definitions will also always be formulated so that a break that occurs with negligible probability is not considered significant. Thus, the general framework of any security definition will be as follows:
A scheme is secure if for every probabilistic polynomial-time adversary A carrying out an attack of some formally specified type, the probability that A succeeds in the attack (where success is also formally specified) is negligible.
Such a definition is asymptotic because it is possible that for small values of n an adversary can succeed with high probability. In order to see this in more detail, we expand the term “negligible” in the above statement:
A scheme is secure if for every ppt adversary A carrying out an attack of some formally specified type, and for every positive polynomial p, there exists an integer N such that when n > N the probability that A succeeds in the attack is less than $\frac{1}{p(n)}$.
Note that nothing is guaranteed for values n ≤ N.
On the Choices Made in Defining Asymptotic Security
In defining the general notion of asymptotic security, we have made two choices: we have identified efficient adversarial strategies with the class of probabilistic, polynomial-time algorithms, and have equated small chances of success with negligible probabilities. Both of these choices are, to some extent, arbitrary, and one could build a perfectly reasonable theory by defining, say, efficient strategies as those running in quadratic time, or small success probabilities as those bounded by $2^{-n}$. Nevertheless, we briefly justify the choices we have made (which are the standard ones).
Those familiar with complexity theory or algorithms will recognize that the idea of equating efficient computation with (probabilistic) polynomial-time algorithms is not unique to cryptography. One advantage of using (probabilistic) polynomial time as our measure of efficiency is that this frees us from having to precisely specify our model of computation, since the extended Church–Turing thesis states that all “reasonable” models of computation are polynomially equivalent. Thus, we need not specify whether we use Turing machines, boolean circuits, or random-access machines; we can present algorithms in high-level pseudocode and be confident that if our analysis shows that these algorithms run in polynomial time, then any reasonable implementation will also.
Another advantage of (probabilistic) polynomial-time algorithms is that they satisfy desirable closure properties: in particular, an algorithm that makes polynomially many calls to a polynomial-time subroutine (and does only polynomial computation in addition) will itself run in polynomial time.
The most important feature of negligible probabilities is the closure property we have already seen in Proposition 3.6(2): any polynomial times a negligible function is still negligible. This means, in particular, that if an algorithm makes polynomially many calls to some subroutine that “fails” with negligible probability each time it is called, then the probability that any of the calls to that subroutine fail is still negligible.
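As a small numerical illustration of this closure property, the sketch below takes the coin-flipping event mentioned earlier (n fair flips all come up heads, probability $2^{-n}$), repeats it $n^2$ times independently, and compares the exact probability that at least one repetition "fails" with the union bound $n^2 \cdot 2^{-n}$. The choice of $n^2$ repetitions and the function names are assumptions made here for illustration; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction

def single_failure(n: int) -> Fraction:
    """Probability that n fair coin flips all come up heads."""
    return Fraction(1, 2 ** n)

def any_failure_exact(n: int, repetitions: int) -> Fraction:
    """Exact probability that at least one independent repetition fails."""
    return 1 - (1 - single_failure(n)) ** repetitions

def any_failure_union_bound(n: int, repetitions: int) -> Fraction:
    """Union bound: at most repetitions * 2^-n, capped at 1."""
    return min(Fraction(1), repetitions * single_failure(n))

for n in (10, 20, 30):
    q = n ** 2  # a polynomial number of repetitions
    exact = any_failure_exact(n, q)
    bound = any_failure_union_bound(n, q)
    print(f"n = {n}: exact = {float(exact):.3e}, union bound = {float(bound):.3e}")
```

The polynomial factor $n^2$ inflates the failure probability, but the exponential decay of $2^{-n}$ still dominates as n grows.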
Necessity of the Relaxations
Computational secrecy introduces two relaxations of perfect secrecy: first, security is guaranteed only against efficient adversaries; second, a small probability of success is allowed. Both these relaxations are essential for achieving practical encryption schemes, and in particular for bypassing the negative results for perfectly secret encryption. We informally discuss why this is the case. Assume we have an encryption scheme where the size of the key space K is much smaller than the size of the message space M. (As shown in the previous chapter, this means the scheme cannot be perfectly secret.) Two attacks apply regardless of how the encryption scheme is constructed:
• Given a ciphertext c, an adversary can decrypt c using all keys k ∈ K. This gives a list of all the messages to which c can possibly correspond. Since this list cannot contain all of M (because |K| < |M|), this attack leaks some information about the message that was encrypted.
Moreover, say the adversary carries out a known-plaintext attack and learns that ciphertexts $c_1, \ldots, c_\ell$ correspond to the messages $m_1, \ldots, m_\ell$, respectively. The adversary can again try decrypting each of these ciphertexts with all possible keys until it finds a key k for which $\mathsf{Dec}_k(c_i) = m_i$ for all i. Later, given a ciphertext c that is the encryption of an unknown message m, it is almost surely the case that $\mathsf{Dec}_k(c) = m$.
Exhaustive-search attacks like the above allow an adversary to succeed with probability essentially 1 in time linear in |K|.
• Consider again the case where the adversary learns that ciphertexts $c_1, \ldots, c_\ell$ correspond to messages $m_1, \ldots, m_\ell$. The adversary can guess a uniform key k ∈ K and check to see whether $\mathsf{Dec}_k(c_i) = m_i$ for all i. If so, then, as above, the attacker can use k to decrypt anything subsequently encrypted by the honest parties.
Here the adversary runs in essentially constant time and succeeds with nonzero (though very small) probability 1/|K|.
It follows that if we wish to encrypt many messages using a single short key, security can only be achieved if we limit the running time of the adversary (so the adversary does not have sufficient time to carry out a brute-force search) and are willing to allow a very small probability of success (so the second “attack” is ruled out).
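Both generic attacks can be demonstrated on a toy scheme. The sketch below assumes a deliberately tiny 16-bit key space and a made-up cipher (XOR with a SHA-256-derived keystream) chosen only so that the search finishes quickly; it is not a scheme from the text and must not be used for real encryption. `exhaustive_search` runs in time linear in |K|, whereas `single_guess` runs in essentially constant time and succeeds with probability about 1/|K|.

```python
import hashlib
import secrets

KEY_BITS = 16  # tiny key space (|K| = 65536), purely for illustration

def keystream(key: int, length: int) -> bytes:
    """Hypothetical toy keystream: SHA-256 of the 2-byte key, truncated."""
    return hashlib.sha256(key.to_bytes(2, "big")).digest()[:length]

def enc(key: int, message: bytes) -> bytes:
    assert len(message) <= 32, "toy scheme handles at most 32 bytes"
    return bytes(m ^ s for m, s in zip(message, keystream(key, len(message))))

dec = enc  # XOR with the same keystream undoes encryption

def consistent(key: int, pairs) -> bool:
    """Does this key decrypt every known ciphertext to its known message?"""
    return all(dec(key, c) == m for m, c in pairs)

def exhaustive_search(pairs):
    """First attack: try every key in turn."""
    for candidate in range(2 ** KEY_BITS):
        if consistent(candidate, pairs):
            return candidate
    return None

def single_guess(pairs):
    """Second attack: guess one uniform key and test it."""
    candidate = secrets.randbelow(2 ** KEY_BITS)
    return candidate if consistent(candidate, pairs) else None

secret_key = secrets.randbelow(2 ** KEY_BITS)
known = [(m, enc(secret_key, m)) for m in (b"attack at dawn", b"retreat at noon")]
target = enc(secret_key, b"unknown message")

recovered = exhaustive_search(known)
print(dec(recovered, target))
```

With a realistic key length the first loop becomes infeasible and the second attack's success probability becomes negligible, which is exactly what the two relaxations are designed to capture.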
3.2 Defining Computationally Secure Encryption
Given the background of the previous section, we are ready to present a definition of computational security for private-key encryption. First, we redefine the syntax of private-key encryption; this will be essentially the same as the syntax introduced in Chapter 2 except that we now explicitly take into account the security parameter n. We also allow the decryption algorithm to output an error message in case it is presented with an invalid ciphertext. Finally, by default, we let the message space be the set $\{0,1\}^*$ of all (finite-length) binary strings.
DEFINITION 3.7 A private-key encryption scheme is a tuple of probabilistic polynomial-time algorithms (Gen, Enc, Dec) such that:
1. The key-generation algorithm Gen takes as input $1^n$ (i.e., the security parameter written in unary) and outputs a key k; we write $k \leftarrow \mathsf{Gen}(1^n)$ (emphasizing that Gen is a randomized algorithm). We assume without loss of generality that any key k output by $\mathsf{Gen}(1^n)$ satisfies $|k| \ge n$.
2. The encryption algorithm Enc takes as input a key k and a plaintext message $m \in \{0,1\}^*$, and outputs a ciphertext c. Since Enc may be randomized, we write this as $c \leftarrow \mathsf{Enc}_k(m)$.
3. The decryption algorithm Dec takes as input a key k and a ciphertext c, and outputs a message m or an error. We assume that Dec is deterministic, and so write $m := \mathsf{Dec}_k(c)$ (assuming here that Dec does not return an error). We denote a generic error by the symbol ⊥.
It is required that for every n, every key k output by $\mathsf{Gen}(1^n)$, and every $m \in \{0,1\}^*$, it holds that $\mathsf{Dec}_k(\mathsf{Enc}_k(m)) = m$.
If (Gen, Enc, Dec) is such that for k output by $\mathsf{Gen}(1^n)$, algorithm $\mathsf{Enc}_k$ is only defined for messages $m \in \{0,1\}^{\ell(n)}$, then we say that (Gen, Enc, Dec) is a fixed-length private-key encryption scheme for messages of length $\ell(n)$.
Almost always, $\mathsf{Gen}(1^n)$ simply outputs a uniform n-bit string as the key. When this is the case, we will omit Gen and simply define a private-key encryption scheme by a pair of algorithms (Enc, Dec).
The above definition considers stateless schemes, in which each invocation of Enc (and Dec) is independent of all prior invocations. Later in the chapter, we will occasionally discuss stateful schemes in which the sender (and possibly the receiver) is required to maintain state across invocations. Unless explicitly noted otherwise, all our results assume stateless encryption/decryption.
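The syntax of Definition 3.7 can be mirrored in code. The sketch below instantiates it with the one-time pad viewed as a stateless fixed-length scheme for messages of length $\ell(n) = n$; n-bit strings are represented as Python integers below $2^n$, and `None` stands in for the error symbol ⊥. These representation choices are assumptions made here for illustration, and the instantiation only illustrates the interface and the correctness requirement, not any security claim for repeated use of a key.

```python
import secrets
from typing import Optional

def gen(n: int) -> int:
    """Gen(1^n): return a uniform n-bit key."""
    return secrets.randbits(n)

def enc(n: int, key: int, message: int) -> int:
    """Enc_k(m), defined only for n-bit messages (fixed length l(n) = n)."""
    if not 0 <= message < 2 ** n:
        raise ValueError("message must be an n-bit string")
    return key ^ message

def dec(n: int, key: int, ciphertext: int) -> Optional[int]:
    """Dec_k(c): deterministic; returns None (the error symbol) if c is invalid."""
    if not 0 <= ciphertext < 2 ** n:
        return None
    return key ^ ciphertext

# Correctness: Dec_k(Enc_k(m)) = m for every key output by Gen and every message.
n = 12
k = gen(n)
all_correct = all(dec(n, k, enc(n, k, m)) == m for m in range(2 ** n))
print(all_correct)
```

Here Enc happens to be deterministic; the definition also permits randomized encryption, which is why it writes $c \leftarrow \mathsf{Enc}_k(m)$ rather than an equality.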
3.2.1 The Basic Definition of Security
We begin by presenting the most basic notion of security for private-key encryption: security against a ciphertext-only attack where the adversary observes only a single ciphertext or, equivalently, security when a given key is used to encrypt just a single message. We consider stronger definitions of security later in the chapter.
Motivating the definition. As we have already discussed, any definition of security consists of two distinct components: a threat model (i.e., a specification of the assumed power of the adversary) and a security goal (usually specified by describing what constitutes a “break” of the scheme). We begin our definitional treatment by considering the simplest threat model, where we have an eavesdropping adversary who observes the encryption of a single message. This is exactly the threat model that was considered in the previous chapter with the exception that, as explained in the previous section, we are now interested only in adversaries that are computationally bounded and so limited to running in polynomial time.
Although we have made two assumptions about the adversary’s capabilities (namely, that it only eavesdrops, and that it runs in polynomial time), we make no assumptions whatsoever about the adversary’s strategy in trying to decipher the ciphertext it observes. This is crucial for obtaining meaningful notions of security; the definition ensures protection against any computationally bounded adversary, regardless of the algorithm it uses.
Correctly defining the security goal for encryption is not trivial, but we have already discussed this issue at length in Section 1.4.1 and in the previous chapter. We therefore just recall that the idea behind the definition is that the adversary should be unable to learn any partial information about the plaintext from the ciphertext. The definition of semantic security (cf. Section 3.2.2) exactly formalizes this notion, and was the first definition of computationally secure encryption to be proposed. Semantic security is complex and difficult to work with. Fortunately, there is an equivalent definition called indistinguishability that is much simpler.
The definition of indistinguishability is patterned on the alternative definition of perfect secrecy given as Definition 2.5. (This serves as further motivation that the definition of indistinguishability is a good one.) Recall that Definition 2.5 considers an experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}$ in which an adversary A outputs two messages $m_0$ and $m_1$, and is then given an encryption of one of those messages using a uniform key. The definition states that a scheme Π is secure if no adversary A can determine which of the messages $m_0, m_1$ was encrypted with probability any different from 1/2, which is the probability that A is correct if it just makes a random guess.
Here, we keep the experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}$ almost exactly the same (except for some technical differences discussed below), but introduce two key modifications in the definition itself:
1. We now consider only adversaries running in polynomial time, whereas Definition 2.5 considered even adversaries with unbounded running time.
2. We now concede that the adversary might determine the encrypted message with probability negligibly better than 1/2.
As discussed extensively in the previous section, the above relaxations constitute the core elements of computational security.
As for the other differences, the most prominent is that we now parameterize the experiment by a security parameter n. We then measure both the running time of the adversary A as well as its success probability as functions of n. We write $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n)$ to denote the experiment being run with security parameter n, and write
$$\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right] \tag{3.1}$$
to denote the probability that the output of experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n)$ is 1. Note that with A, Π fixed, Equation (3.1) is a function of n.
A second difference in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}$ is that we now explicitly require the adversary to output two messages $m_0, m_1$ of equal length. (In Definition 2.5 this requirement is implicit if the message space M only contains messages of some fixed length, as is the case for the one-time pad encryption scheme.) This means that, by default, we do not require a secure encryption scheme to hide the length of the plaintext. We revisit this point at the end of this section; see also Exercises 3.2 and 3.3.
Indistinguishability in the presence of an eavesdropper. We now give the formal definition, beginning with the experiment outlined above. The experiment is defined for any private-key encryption scheme Π = (Gen, Enc, Dec), any adversary A, and any value n for the security parameter:
The adversarial indistinguishability experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n)$:
1. The adversary A is given input $1^n$, and outputs a pair of messages $m_0, m_1$ with $|m_0| = |m_1|$.
2. A key k is generated by running $\mathsf{Gen}(1^n)$, and a uniform bit $b \in \{0,1\}$ is chosen. Ciphertext $c \leftarrow \mathsf{Enc}_k(m_b)$ is computed and given to A. We refer to c as the challenge ciphertext.
3. A outputs a bit $b'$.
4. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise. If $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1$, we say that A succeeds.
There is no limitation on the lengths of $m_0$ and $m_1$, as long as they are the same. (Of course, if A runs in polynomial time, then $m_0$ and $m_1$ have length polynomial in n.) If Π is a fixed-length scheme for messages of length $\ell(n)$, the above experiment is modified by requiring $m_0, m_1 \in \{0,1\}^{\ell(n)}$.
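The four steps of the experiment translate almost line by line into code. The sketch below is a hypothetical harness, not part of the formal definition: the adversary is modeled as two callbacks (`choose` and `guess`), the parameter n counts bytes rather than bits for convenience, and the two toy schemes (a one-time pad and a deliberately broken variant that sends the first byte in the clear) are introduced here only to show what a successful and an unsuccessful adversary look like empirically.

```python
import secrets

def privk_eav(gen, enc, choose, guess, n):
    """Run the eavesdropping experiment once; return 1 if the adversary succeeds."""
    m0, m1 = choose(n)                  # step 1: adversary outputs two messages
    assert len(m0) == len(m1)           #         of equal length
    k = gen(n)                          # step 2: key, uniform bit b,
    b = secrets.randbits(1)             #         and challenge ciphertext
    c = enc(k, m1 if b else m0)
    b_guess = guess(n, c)               # step 3: adversary outputs a bit
    return 1 if b_guess == b else 0     # step 4: output of the experiment

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def gen(n):
    return secrets.token_bytes(n)       # simplification: n counts bytes

def enc_pad(k, m):
    return xor(m, k)                    # one-time pad on n-byte messages

def enc_leaky(k, m):
    return m[:1] + xor(m[1:], k[1:])    # broken on purpose: first byte in the clear

def choose(n):
    return bytes(n), b"\xff" * n        # all-zero versus all-one messages

def guess(n, c):
    return 1 if c[0] == 0xFF else 0     # look only at the first ciphertext byte

def success_rate(enc, n=16, trials=2000):
    wins = sum(privk_eav(gen, enc, choose, guess, n) for _ in range(trials))
    return wins / trials

leaky_rate = success_rate(enc_leaky)
pad_rate = success_rate(enc_pad)
print(f"leaky scheme: {leaky_rate:.3f}   one-time pad: {pad_rate:.3f}")
```

Against the leaky scheme this adversary always wins, whereas against the one-time pad its empirical success rate hovers around 1/2, the rate achieved by random guessing.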
The fact that the adversary can only eavesdrop is implicit in the fact that its input is limited to a (single) ciphertext, and the adversary does not have any further interaction with the sender or the receiver. (As we will see later, allowing additional interaction makes the adversary significantly stronger.)
The definition of indistinguishability states that an encryption scheme is secure if no ppt adversary A succeeds in guessing which message was encrypted in the above experiment with probability significantly better than random guessing (which is correct with probability 1/2):
DEFINITION 3.8 A private-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable encryptions in the presence of an eavesdropper, or is EAV-secure, if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that, for all n,
$$\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n),$$
where the probability is taken over the randomness used by A and the randomness used in the experiment (for choosing the key and the bit b, as well as any randomness used by Enc).
Note: unless otherwise qualified, when we write “f(n) ≤ g(n)” we mean that inequality holds for all n.
It should be clear that Definition 3.8 is weaker than Definition 2.5, which is equivalent to perfect secrecy. Thus, any perfectly secret encryption scheme has indistinguishable encryptions in the presence of an eavesdropper. Our goal, therefore, will be to show that there exist encryption schemes satisfying the above in which the key is shorter than the message. That is, we will show schemes that satisfy Definition 3.8 but cannot satisfy Definition 2.5.
An equivalent formulation. Definition 3.8 requires that no ppt adversary can determine which of two messages was encrypted, with probability significantly better than 1/2. An equivalent formulation is that every ppt adversary behaves the same whether it sees an encryption of $m_0$ or of $m_1$. Since A outputs a single bit, “behaving the same” means it outputs 1 with almost the same probability in each case. To formalize this, define $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n, b)$ as above except that the fixed bit b is used (rather than being chosen at random). Let $\mathsf{out}_A(\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n, b))$ denote the output bit $b'$ of A in the experiment. The following essentially states that no A can determine whether it is running in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n, 0)$ or experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n, 1)$.
DEFINITION 3.9 A private-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable encryptions in the presence of an eavesdropper if for all ppt adversaries A there is a negligible function negl such that
$$\left| \Pr\left[\mathsf{out}_A(\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n, 0)) = 1\right] - \Pr\left[\mathsf{out}_A(\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n, 1)) = 1\right] \right| \le \mathsf{negl}(n).$$
The fact that this is equivalent to Definition 3.8 is left as an exercise.
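As a starting point for that exercise, the following sketch records the identity that links the two definitions. The shorthand $p_b$ is introduced here for illustration and is not notation from the text; the derivation conditions on the uniform bit b and uses the fact that A outputs either 0 or 1.

```latex
% Shorthand (introduced here): p_b = Pr[out_A(PrivK^eav_{A,Pi}(n, b)) = 1] for b in {0, 1}.
\begin{align*}
\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right]
  &= \tfrac{1}{2}\,\Pr[b' = 0 \mid b = 0] + \tfrac{1}{2}\,\Pr[b' = 1 \mid b = 1] \\
  &= \tfrac{1}{2}\,(1 - p_0) + \tfrac{1}{2}\,p_1 \\
  &= \tfrac{1}{2} + \tfrac{1}{2}\,(p_1 - p_0).
\end{align*}
```

So the advantage over 1/2 in Definition 3.8 is exactly half of $p_1 - p_0$. An adversary that outputs the complement of A's bit is also ppt and flips the sign of this difference, which suggests why bounding the advantage of every ppt adversary amounts to bounding $|p_0 - p_1|$ for every ppt adversary.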
Encryption and Plaintext Length
The default notion of secure encryption does not require the encryption scheme to hide the plaintext length and, in fact, all commonly used encryption schemes reveal the plaintext length (or a close approximation thereof). The main reason for this is that it is impossible to support arbitrary-length messages while hiding all information about the plaintext length (cf. Exercise 3.2). In many cases this is inconsequential since the plaintext length is already public or is not sensitive. This is not always the case, however, and sometimes leaking the plaintext length is problematic. As examples:
• Simple numeric/text data: Say the encryption scheme being used reveals the plaintext length exactly. Then encrypted salary information would reveal whether someone makes a 5-figure or a 6-figure salary. Similarly, encryption of “yes”/“no” responses would leak the answer exactly.
• Auto-suggestions: Websites often include an “auto-complete” or “auto-suggestion” functionality by which the webserver suggests a list of potential words or phrases based on partial information the user has already typed. The size of this list can reveal information about the letters the user has typed so far. (For example, the number of auto-completions returned for “th” is far greater than the number for “zo.”)
• Database searches: Consider a user querying a database for all records matching some search term. The number of records returned can reveal a lot of information about what the user was searching for. This can be particularly damaging if the user is searching for medical information and the query reveals information about a disease the user has.
• Compressed data: If the plaintext is compressed before being encrypted, then information about the plaintext might be revealed even if only fixed-length data is ever encrypted. (Such an encryption scheme would therefore not satisfy Definition 3.8.) For example, a short compressed plaintext would indicate that the original (uncompressed) plaintext has a lot of redundancy. If an adversary can control a portion of what gets encrypted, this vulnerability can enable an adversary to learn additional information about the plaintext; it has been shown possible to use an attack of exactly this sort (the CRIME attack) against encrypted HTTP traffic to reveal secret session cookies.
When using encryption one should determine whether leaking the plaintext length is a concern and, if so, take steps to mitigate or prevent such leakage by padding all messages to some pre-determined length before encrypting them.
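One simple way to realize such padding is sketched below: a marker byte followed by zero bytes brings every message up to a fixed length before encryption, and the marker lets the receiver strip the padding unambiguously after decryption. The helper names `pad` and `unpad`, the marker value, and the target length are assumptions chosen here for illustration; the text does not prescribe a particular padding method.

```python
def pad(message: bytes, target_length: int) -> bytes:
    """Append a 0x80 marker and then zero bytes, giving exactly target_length bytes."""
    if len(message) >= target_length:
        raise ValueError("message too long for the chosen padded length")
    return message + b"\x80" + b"\x00" * (target_length - len(message) - 1)

def unpad(padded: bytes) -> bytes:
    """Remove the trailing zero bytes and the 0x80 marker."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]

# "yes" and "no" now have the same length before they are encrypted.
padded_yes = pad(b"yes", 16)
padded_no = pad(b"no", 16)
print(len(padded_yes), len(padded_no))
print(unpad(padded_yes), unpad(padded_no))
```

Padding trades bandwidth for privacy: the target length must be at least one byte longer than the longest message to be supported, and every ciphertext grows accordingly.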
3.2.2 *Semantic Security
We motivated the definition of secure encryption by saying that it should be infeasible for an adversary to learn any partial information about the plaintext
from the ciphertext. However, the definition of indistinguishability looks very different. As we have mentioned, Definition 3.8 is equivalent to a definition called semantic security that formalizes the notion that partial information cannot be learned. We build up to that definition by discussing two weaker notions and showing that they are implied by indistinguishability.
We begin by showing that indistinguishability means that ciphertexts leak no information about individual bits of the plaintext. Formally, say encryption scheme (Enc, Dec) is EAV-secure (recall that when Gen is omitted, the key is a uniform $n$-bit string), and $m \in \{0,1\}^\ell$ is uniform. Then we show that for any index $i$, it is infeasible to guess $m^i$ from $\mathsf{Enc}_k(m)$ (where, in this section, $m^i$ denotes the $i$th bit of $m$) with probability much better than 1/2.
THEOREM 3.10 Let Π = (Enc, Dec) be a fixed-length private-key encryption scheme for messages of length $\ell$ that has indistinguishable encryptions in the presence of an eavesdropper. Then for all ppt adversaries A and any $i \in \{1, \ldots, \ell\}$, there is a negligible function negl such that
$$\Pr\left[A(1^n, \mathsf{Enc}_k(m)) = m^i\right] \le \frac{1}{2} + \mathsf{negl}(n),$$
where the probability is taken over uniform $m \in \{0,1\}^\ell$ and $k \in \{0,1\}^n$, the randomness of A, and the randomness of Enc.
PROOF The idea behind the proof of this theorem is that if it were possible to guess the ith bit of m from Enck(m), then it would also be possible to distinguish between encryptions of messages m0 and m1 whose ith bits differ. We formalize this via a proof by reduction, in which we show how to use any efficient adversary A to construct an efficient adversary A′ such that if A violates the security notion of the theorem for Π, then A′ violates the definition of indistinguishability for Π. (See Section 3.3.2.) Since Π has indistinguishable encryptions, it must also be secure in the sense of the theorem.
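To make the shape of this reduction concrete, here is a small Python sketch that enumerates all cases exactly for a toy scheme. Everything in it is an assumption chosen for illustration: the 3-bit message length, the deliberately broken scheme (a one-time pad that also appends the targeted plaintext bit in the clear), and the two sample adversaries. It checks only that the wrapper adversary A′ wins the eavesdropping experiment with exactly the probability that A guesses the targeted bit.

```python
from fractions import Fraction
from itertools import product

L = 3  # toy message length
I = 0  # index of the targeted bit (0-based in this sketch)


def bits(length):
    """All bit strings of the given length, as tuples."""
    return list(product((0, 1), repeat=length))


def enc(k, m):
    """Toy, deliberately broken scheme: one-time pad plus m[I] in the clear."""
    return tuple(kb ^ mb for kb, mb in zip(k, m)) + (m[I],)


def leak_reader(c):
    """Adversary A that reads the leaked bit at the end of the ciphertext."""
    return c[-1]


def first_bit_guesser(c):
    """Adversary A that only looks at the (padded) first ciphertext bit."""
    return c[0]


def prob_a_guesses_bit(adversary):
    """Pr[A(Enc_k(m)) = m^i] over uniform k and uniform m."""
    trials = [(k, m) for k in bits(L) for m in bits(L)]
    wins = sum(adversary(enc(k, m)) == m[I] for k, m in trials)
    return Fraction(wins, len(trials))


def prob_a_prime_wins(adversary):
    """Pr[A' wins]: A' outputs uniform m0 in I0, m1 in I1, then echoes A's guess."""
    i0 = [m for m in bits(L) if m[I] == 0]
    i1 = [m for m in bits(L) if m[I] == 1]
    trials = [(m0, m1, b, k)
              for m0 in i0 for m1 in i1 for b in (0, 1) for k in bits(L)]
    wins = sum(adversary(enc(k, (m0, m1)[b])) == b for m0, m1, b, k in trials)
    return Fraction(wins, len(trials))
```

For either sample adversary, `prob_a_guesses_bit` and `prob_a_prime_wins` return the same value, which is the equality the proof establishes in general.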
Fix an arbitrary ppt adversary A and $i \in \{1, \ldots, \ell\}$. Let $I_0 \subset \{0,1\}^\ell$ be the set of all strings whose $i$th bit is 0, and let $I_1 \subset \{0,1\}^\ell$ be the set of all strings whose $i$th bit is 1. We have
$$\Pr\left[A(1^n, \mathsf{Enc}_k(m)) = m^i\right] = \frac{1}{2} \cdot \Pr_{m_0 \leftarrow I_0}\left[A(1^n, \mathsf{Enc}_k(m_0)) = 0\right] + \frac{1}{2} \cdot \Pr_{m_1 \leftarrow I_1}\left[A(1^n, \mathsf{Enc}_k(m_1)) = 1\right].$$
Construct the following eavesdropping adversary A′:
Adversary A′:
1. Choose uniform $m_0 \in I_0$ and $m_1 \in I_1$. Output $m_0, m_1$.
2. Upon observing a ciphertext $c$, invoke $A(1^n, c)$. If A outputs 0, output $b' = 0$; otherwise, output $b' = 1$.
A′ runs in polynomial time since A does.
By the definition of experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A',\Pi}(n)$, we have that A′ succeeds if and only if A outputs $b$ upon receiving $\mathsf{Enc}_k(m_b)$. So
$$\begin{aligned}
\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A',\Pi}(n) = 1\right] &= \Pr\left[A(1^n, \mathsf{Enc}_k(m_b)) = b\right] \\
&= \frac{1}{2} \cdot \Pr_{m_0 \leftarrow I_0}\left[A(1^n, \mathsf{Enc}_k(m_0)) = 0\right] + \frac{1}{2} \cdot \Pr_{m_1 \leftarrow I_1}\left[A(1^n, \mathsf{Enc}_k(m_1)) = 1\right] \\
&= \Pr\left[A(1^n, \mathsf{Enc}_k(m)) = m^i\right].
\end{aligned}$$
By the assumption that (Enc, Dec) has indistinguishable encryptions in the presence of an eavesdropper, there is a negligible function negl such that
$$\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A',\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n).$$
We conclude that
$$\Pr\left[A(1^n, \mathsf{Enc}_k(m)) = m^i\right] \le \frac{1}{2} + \mathsf{negl}(n),$$
completing the proof.
We next claim, roughly, that indistinguishability means that no ppt adversary can learn any function of the plaintext given the ciphertext, regardless of the distribution of the message being sent. This is intended to capture the idea that no information about a plaintext is leaked by the resulting ciphertext. This requirement is, however, non-trivial to define formally. To see why, note that even for the case considered above, it is easy to compute the $i$th bit of $m$ if $m$ is chosen, say, uniformly from the set of all strings whose $i$th bit is 0 (rather than uniformly from $\{0,1\}^\ell$). Thus, what we actually want to say is that if there exists any adversary who correctly computes $f(m)$ with some probability when given $\mathsf{Enc}_k(m)$, then there exists an adversary that can correctly compute $f(m)$ with the same probability without being given the ciphertext at all (and only knowing the distribution of $m$). In what follows we focus on the case when $m$ is chosen uniformly from some set $S \subseteq \{0,1\}^\ell$.
THEOREM 3.11 Let (Enc, Dec) be a fixed-length private-key encryption scheme for messages of length $\ell$ that has indistinguishable encryptions in the presence of an eavesdropper. Then for any ppt algorithm A there is a ppt algorithm A′ such that for any $S \subseteq \{0,1\}^\ell$ and any function $f : \{0,1\}^\ell \to \{0,1\}$, there is a negligible function negl such that:
$$\left|\Pr\left[A(1^n, \mathsf{Enc}_k(m)) = f(m)\right] - \Pr\left[A'(1^n) = f(m)\right]\right| \le \mathsf{negl}(n),$$
where the first probability is taken over uniform choice of $k \in \{0,1\}^n$ and $m \in S$, the randomness of A, and the randomness of Enc, and the second probability is taken over uniform choice of $m \in S$ and the randomness of A′.
PROOF (Sketch) The fact that (Enc, Dec) is EAV-secure means that, for any $S \subseteq \{0,1\}^\ell$, no ppt adversary can distinguish between $\mathsf{Enc}_k(m)$ (for uniform $m \in S$) and $\mathsf{Enc}_k(1^\ell)$. Consider now the probability that A successfully computes $f(m)$ given $\mathsf{Enc}_k(m)$. We claim that A should successfully compute $f(m)$ given $\mathsf{Enc}_k(1^\ell)$ with almost the same probability; otherwise, A could be used to distinguish between $\mathsf{Enc}_k(m)$ and $\mathsf{Enc}_k(1^\ell)$. The distinguisher is easily constructed: choose uniform $m \in S$, and output $m_0 = m$, $m_1 = 1^\ell$. When given a ciphertext $c$ that is an encryption of either $m_0$ or $m_1$, invoke $A(1^n, c)$ and output 0 if and only if A outputs $f(m)$. If A outputs $f(m)$ when given an encryption of $m$ with probability that is significantly different from the probability that it outputs $f(m)$ when given an encryption of $1^\ell$, then the described distinguisher violates Definition 3.9.
The above suggests the following algorithm A′ that does not receive $c = \mathsf{Enc}_k(m)$, yet computes $f(m)$ almost as well as A does: $A'(1^n)$ chooses a uniform key $k \in \{0,1\}^n$, invokes A on $c \leftarrow \mathsf{Enc}_k(1^\ell)$, and outputs whatever A does. By the above, we have that A outputs $f(m)$ when run as a subroutine by A′ with almost the same probability as when it receives $\mathsf{Enc}_k(m)$. Thus, A′ fulfills the property required by the claim.
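The algorithm A′ from this sketch can be tried out on a toy example by exact enumeration. The Python sketch below is illustrative only and rests on assumptions not in the text: it uses the one-time pad on 3-bit strings as a stand-in scheme, a parity function for $f$, and one fixed strategy for A. Because the one-time pad is perfectly secret, the two probabilities come out exactly equal here; for a scheme that is only computationally secure one would expect a negligible gap instead.

```python
from fractions import Fraction
from itertools import product

L = 3
ALL = list(product((0, 1), repeat=L))  # all keys / all messages
ONES = (1,) * L                        # the fixed message 1^L


def enc(k, m):
    """One-time pad on L-bit tuples (stand-in scheme for this sketch)."""
    return tuple(a ^ b for a, b in zip(k, m))


def f(m):
    """Function the adversary tries to compute (hypothetical choice: parity)."""
    return sum(m) % 2


def adversary_a(c):
    """One fixed strategy for A: output the parity of the ciphertext."""
    return sum(c) % 2


def prob_a(S):
    """Pr[A(Enc_k(m)) = f(m)] over uniform k and uniform m in S."""
    wins = sum(adversary_a(enc(k, m)) == f(m) for k in ALL for m in S)
    return Fraction(wins, len(ALL) * len(S))


def prob_a_prime(S):
    """Pr[A' = f(m)]: A' picks its own key, encrypts 1^L, and runs A on that."""
    wins = sum(adversary_a(enc(k, ONES)) == f(m) for k in ALL for m in S)
    return Fraction(wins, len(ALL) * len(S))
```

Note that A′ never sees an encryption of $m$; it only simulates one for A using a key and message of its own choosing.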
Semantic security. The full definition of semantic security guarantees considerably more than the property considered in Theorem 3.11. The definition allows the length of the plaintext to depend on the security parameter, and allows for essentially arbitrary distributions over plaintexts. (Actually, we allow only efficiently sampleable distributions. This means that there is some probabilistic polynomial-time algorithm Samp such that $\mathsf{Samp}(1^n)$ outputs messages according to the distribution.) The definition also takes into account arbitrary “external” information $h(m)$ about the plaintext that may be leaked to the adversary through other means (e.g., because the same message $m$ is used for some other purpose as well).
DEFINITION 3.12 A private-key encryption scheme (Enc, Dec) is semantically secure in the presence of an eavesdropper if for every ppt algorithm A there exists a ppt algorithm A′ such that for any ppt algorithm Samp and polynomial-time computable functions $f$ and $h$, the following is negligible:
$$\left|\Pr\left[A(1^n, \mathsf{Enc}_k(m), h(m)) = f(m)\right] - \Pr\left[A'(1^n, |m|, h(m)) = f(m)\right]\right|,$$
where the first probability is taken over uniform $k \in \{0,1\}^n$, $m$ output by $\mathsf{Samp}(1^n)$, the randomness of A, and the randomness of Enc, and the second probability is taken over $m$ output by $\mathsf{Samp}(1^n)$ and the randomness of A′.
The adversary A is given the ciphertext Enck(m) as well as the external information h(m), and attempts to guess the value of f(m). Algorithm A′ also attempts to guess the value of f(m), but is given only h(m) and the
length of m. The security requirement states that A’s probability of correctly guessing f(m) is about the same as that of A′. Intuitively, then, the ciphertext Enck(m) does not reveal any additional information about the value of f(m).
Definition 3.12 constitutes a very strong and convincing formulation of the security guarantees that should be provided by an encryption scheme. However, it is easier to work with the definition of indistinguishability (Definition 3.8). Fortunately, the definitions are equivalent:
THEOREM 3.13 A private-key encryption scheme has indistinguishable encryptions in the presence of an eavesdropper if and only if it is semantically secure in the presence of an eavesdropper.
Looking ahead, a similar equivalence between semantic security and indistinguishability is known for all the definitions that we present in this chapter as well as those in Chapter 11. We can therefore use indistinguishability as our working definition, while being assured that the guarantees achieved are those of semantic security.
3.3 Constructing Secure Encryption Schemes
Having defined what it means for an encryption scheme to be secure, the reader may expect us to turn immediately to constructions of secure encryption schemes. Before doing so, however, we need to introduce the notions of pseudorandom generators (PRGs) and stream ciphers, important building blocks for private-key encryption. These, in turn, will lead to a discussion of pseudorandomness, which plays a fundamental role in cryptography in general and private-key encryption in particular.
3.3.1 Pseudorandom Generators and Stream Ciphers
A pseudorandom generator G is an efficient, deterministic algorithm for transforming a short, uniform string called the seed into a longer, “uniform-looking” (or “pseudorandom”) output string. Stated differently, a pseudorandom generator uses a small amount of true randomness in order to generate a large amount of pseudorandomness. This is useful whenever a large number of random(-looking) bits are needed, since generating true random bits is difficult and slow. (See the discussion at the beginning of Chapter 2.) Indeed, pseudorandom generators have been studied since at least the 1940s when they were proposed for running statistical simulations. In that context, researchers proposed various statistical tests that a pseudorandom generator should pass in order to be considered “good.” As a simple example, the first bit of the output of a pseudorandom generator should be equal to 1 with probability very close to 1/2 (where the probability is taken over uniform choice of the seed), since the first bit of a uniform string is equal to 1 with probability exactly 1/2. In fact, the parity of any fixed subset of the output bits should also be 1 with probability very close to 1/2. More complex statistical tests can also be considered.
This historical approach to determining the quality of some candidate pseudorandom generator is ad hoc, and it is not clear when passing some set of statistical tests is sufficient to guarantee the soundness of using a candidate pseudorandom generator for some application. (In particular, there may be another statistical test that does successfully distinguish the output of the generator from true random bits.) The historical approach is even more problematic when using pseudorandom generators for cryptographic applications; in that setting, security may be compromised if an attacker is able to distinguish the output of a generator from uniform, and we do not know in advance what strategy an attacker might use.
The above considerations motivated a cryptographic approach to defining pseudorandom generators in the 1980s. The basic realization was that a good pseudorandom generator should pass all (efficient) statistical tests. That is, for any efficient statistical test (or distinguisher) D, the probability that D returns 1 when given the output of the pseudorandom generator should be close to the probability that D returns 1 when given a uniform string of the same length. Informally, then, the output of a pseudorandom generator should “look like” a uniform string to any efficient observer.
(We stress that, formally speaking, it does not make sense to say that any fixed string is “pseudorandom,” in the same way that it is meaningless to refer to any fixed string as “random.” Rather, pseudorandomness is a property of a distribution on strings. Nevertheless, we sometimes informally call a string sampled according to the uniform distribution a “uniform string,” and a string output by a pseudorandom generator a “pseudorandom string.”)
Another perspective is obtained by defining what it means for a distribution to be pseudorandom. Let Dist be a distribution on $\ell$-bit strings. (This means that Dist assigns some probability to every string in $\{0,1\}^\ell$; sampling from Dist means that we choose an $\ell$-bit string according to this probability distribution.) Informally, Dist is pseudorandom if the experiment in which a string is sampled from Dist is indistinguishable from the experiment in which a uniform string of length $\ell$ is sampled. (Strictly speaking, since we are in an asymptotic setting we need to speak of the pseudorandomness of a sequence of distributions $\mathsf{Dist} = \{\mathsf{Dist}_n\}$, where distribution $\mathsf{Dist}_n$ is used for security parameter $n$. We ignore this point in our current discussion.) More precisely, it should be infeasible for any polynomial-time algorithm to tell (better than guessing) whether it is given a string sampled according to Dist, or whether it is given a uniform $\ell$-bit string. This means that a pseudorandom string is just as good as a uniform string, as long as we consider only polynomial-time observers. Just as indistinguishability is a computational relaxation of perfect secrecy, pseudorandomness is a computational relaxation of true randomness. (We will generalize this perspective when we discuss the notion of indistinguishability in Chapter 7.)
Now let $G : \{0,1\}^n \to \{0,1\}^\ell$ be a function, and define Dist to be the distribution on $\ell$-bit strings obtained by choosing a uniform $s \in \{0,1\}^n$ and outputting $G(s)$. Then G is a pseudorandom generator if and only if the distribution Dist is pseudorandom.
The formal definition. As discussed above, G is a pseudorandom generator if no efficient distinguisher can detect whether it is given a string output by G or a string chosen uniformly at random. As in Definition 3.9, this is formalized by requiring that every efficient algorithm outputs 1 with almost the same probability when given G(s) (for uniform seed s) or a uniform string. (For an equivalent definition analogous to Definition 3.8, see Exercise 3.5.) We obtain a definition in the asymptotic setting by letting the security parameter n determine the length of the seed. We then insist that G be computable by an efficient algorithm. As a technicality, we also require that G’s output be longer than its input; otherwise, G is not very useful or interesting.
DEFINITION 3.14 Let $\ell$ be a polynomial and let G be a deterministic polynomial-time algorithm such that for any $n$ and any input $s \in \{0,1\}^n$, the result $G(s)$ is a string of length $\ell(n)$. We say that G is a pseudorandom generator if the following conditions hold:
1. (Expansion:) For every $n$ it holds that $\ell(n) > n$.
2. (Pseudorandomness:) For any ppt algorithm D, there is a negligible function negl such that
$$\left|\Pr[D(G(s)) = 1] - \Pr[D(r) = 1]\right| \le \mathsf{negl}(n),$$
where the first probability is taken over uniform choice of $s \in \{0,1\}^n$ and the randomness of D, and the second probability is taken over uniform choice of $r \in \{0,1\}^{\ell(n)}$ and the randomness of D.
We call l the expansion factor of G.
We give an example of an insecure pseudorandom generator to gain familiarity with the definition.
Example 3.15
Define $G(s)$ to output $s$ followed by $\bigoplus_{i=1}^{n} s_i$, so the expansion factor of G is $\ell(n) = n + 1$. The output of G can easily be distinguished from uniform. Consider the following efficient distinguisher D: on input a string $w$, output 1 if and only if the final bit of $w$ is equal to the XOR of all the preceding bits of $w$. Since this property holds for all strings output by G, we have $\Pr[D(G(s)) = 1] = 1$. On the other hand, if $w$ is uniform, the final bit of $w$ is uniform and so $\Pr[D(w) = 1] = \frac{1}{2}$. The quantity $\left|\frac{1}{2} - 1\right|$ is constant, not negligible, and so this G is not a pseudorandom generator. (Note that D is not always “correct,” since it sometimes outputs 1 even when given a uniform string. This does not change the fact that D is a good distinguisher.) ♦
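The two probabilities in Example 3.15 can be checked by brute force for a small seed length. The following Python sketch does so; representing bit strings as tuples and the helper name `acceptance_probabilities` are choices made for this illustration.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from operator import xor


def G(s):
    """Generator of Example 3.15: s followed by the XOR of its bits."""
    return s + (reduce(xor, s, 0),)


def D(w):
    """Output 1 iff the final bit of w equals the XOR of all preceding bits."""
    return int(w[-1] == reduce(xor, w[:-1], 0))


def acceptance_probabilities(n):
    """Return (Pr[D(G(s)) = 1], Pr[D(w) = 1]) for uniform s and uniform w."""
    seeds = list(product((0, 1), repeat=n))
    strings = list(product((0, 1), repeat=n + 1))
    p_prg = Fraction(sum(D(G(s)) for s in seeds), len(seeds))
    p_uniform = Fraction(sum(D(w) for w in strings), len(strings))
    return p_prg, p_uniform
```

For any small $n$ the gap between the two returned values is exactly 1/2, matching the calculation in the example.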
Discussion. The distribution on the output of a pseudorandom generator G is far from uniform. To see this, consider the case that $\ell(n) = 2n$ and so G doubles the length of its input. Under the uniform distribution on $\{0,1\}^{2n}$, each of the $2^{2n}$ possible strings is chosen with probability exactly $2^{-2n}$. In contrast, consider the distribution of the output of G (when G is run on a uniform seed). When G receives an input of length $n$, the number of different strings in the range of G is at most $2^n$. The fraction of strings of length $2n$ that are in the range of G is thus at most $2^n/2^{2n} = 2^{-n}$, and we see that the vast majority of strings of length $2n$ do not occur as outputs of G.
This in particular means that it is trivial to distinguish between a random string and a pseudorandom string given an unlimited amount of time. Let G be as above and consider the exponential-time distinguisher D that works as follows: $D(w)$ outputs 1 if and only if there exists an $s \in \{0,1\}^n$ such that $G(s) = w$. (This computation is carried out in exponential time by exhaustively computing $G(s)$ for every $s \in \{0,1\}^n$. Recall that by Kerckhoffs’ principle, the specification of G is known to D.) Now, if $w$ were output by G, then D outputs 1 with probability 1. In contrast, if $w$ is uniformly distributed in $\{0,1\}^{2n}$, then the probability that there exists an $s$ with $G(s) = w$ is at most $2^{-n}$, and so D outputs 1 in this case with probability at most $2^{-n}$. So
$$\left|\Pr[D(r) = 1] - \Pr[D(G(s)) = 1]\right| \ge 1 - 2^{-n},$$
which is large. This is just another example of a brute-force attack, and does not contradict the pseudorandomness of G since the attack is not efficient.
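The brute-force distinguisher is easy to run when the seed is tiny. The sketch below uses 8-bit seeds and a hypothetical length-doubling map built from SHA-256 purely as a placeholder for G; nothing here claims that this map is a pseudorandom generator, and the bound it illustrates holds for any length-doubling function.

```python
import hashlib
from fractions import Fraction

N_BITS = 8  # toy seed length; real seeds must be far longer


def G(seed: int) -> int:
    """Hypothetical map from 8-bit seeds to 16-bit strings (encoded as ints)."""
    digest = hashlib.sha256(bytes([seed])).digest()
    return int.from_bytes(digest[:2], "big")


# Exhaustive search over all seeds, done once up front.
RANGE_OF_G = {G(s) for s in range(2 ** N_BITS)}


def D(w: int) -> int:
    """Output 1 iff w = G(s) for some seed s."""
    return int(w in RANGE_OF_G)


def acceptance_probabilities():
    """Return (Pr[D(G(s)) = 1], Pr[D(r) = 1]) for uniform s and uniform r."""
    n_seeds = 2 ** N_BITS
    n_strings = 2 ** (2 * N_BITS)
    p_prg = Fraction(sum(D(G(s)) for s in range(n_seeds)), n_seeds)
    p_uniform = Fraction(sum(D(r) for r in range(n_strings)), n_strings)
    return p_prg, p_uniform
```

The cost of building `RANGE_OF_G` doubles with every extra seed bit, which is why this attack stops being feasible once the seed length is set by a realistic security parameter.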
The seed and its length. The seed for a pseudorandom generator is analogous to the cryptographic key used by an encryption scheme, and the seed must be chosen uniformly and be kept secret from any adversary. Another important point, evident from the above discussion of brute-force attacks, is that $s$ must be long enough so that it is not feasible to enumerate all possible seeds. In an asymptotic sense this is taken care of by setting the length of the seed equal to the security parameter, so that exhaustive search over all possible seeds requires exponential time. In practice, the seed must be long enough so that it is impossible to try all possible seeds within some specified time bound.
On the existence of pseudorandom generators. Do pseudorandom generators exist? They certainly seem difficult to construct, and one may rightly ask whether any algorithm satisfying Definition 3.14 exists. Although we do not know how to unconditionally prove the existence of pseudorandom generators, we have strong reasons to believe they exist. For one, they can be constructed under the rather weak assumption that one-way functions exist (which is true if certain problems like factoring large numbers are hard); this will be discussed in detail in Chapter 7. We also have several practical constructions of candidate pseudorandom generators called stream ciphers for which no efficient distinguishers are known. (Later, we will introduce even stronger primitives called block ciphers.) We give a high-level overview of stream ciphers next, and discuss concrete stream ciphers in Chapter 6.
Stream Ciphers
Our definition of a pseudorandom generator is limited in two ways: the expansion factor is fixed, and the generator produces its entire output in “one shot.” Stream ciphers, used in practice to instantiate pseudorandom generators, work somewhat differently. The pseudorandom output bits of a stream cipher are produced gradually and on demand, so that an application can request exactly as many pseudorandom bits as needed. This improves efficiency (since an application can request fewer bits, if sufficient) and flexibility (since there is no upper bound on the number of bits that can be requested).
Formally, we view a stream cipher² as a pair of deterministic algorithms (Init, GetBits) where:
• Init takes as input a seed $s$ and an optional initialization vector $IV$, and outputs an initial state $\mathsf{st}_0$.
• GetBits takes as input state information $\mathsf{st}_i$, and outputs a bit $y$ and updated state $\mathsf{st}_{i+1}$. (In practice, $y$ is a block of several bits; we treat $y$ as a single bit here for generality and simplicity.)
Given a stream cipher and any desired expansion factor $\ell$, we can define an algorithm $G_\ell$ mapping inputs of length $n$ to outputs of length $\ell(n)$. The algorithm simply runs Init, and then repeatedly runs GetBits a total of $\ell$ times.
ALGORITHM 3.16
Constructing G_ℓ from (Init, GetBits)
Input: Seed s and optional initialization vector IV
Output: y_1, …, y_ℓ
st_0 := Init(s, IV)
for i = 1 to ℓ:
    (y_i, st_i) := GetBits(st_{i−1})
return y_1, …, y_ℓ
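Algorithm 3.16 translates almost line for line into code. In the Python sketch below, `init` and `get_bits` are hypothetical stand-ins built from SHA-256 in a counter-like fashion so that the loop has something to call; they are assumptions for illustration and not a vetted stream cipher. Only `g_ell` corresponds to the algorithm itself.

```python
import hashlib


def init(seed: bytes, iv: bytes = b""):
    """Init: derive the initial state st_0 from the seed and optional IV."""
    return (seed + iv, 0)


def get_bits(state):
    """GetBits: output one bit y together with the updated state."""
    material, counter = state
    block = hashlib.sha256(material + counter.to_bytes(8, "big")).digest()
    return block[0] & 1, (material, counter + 1)


def g_ell(seed: bytes, ell: int, iv: bytes = b""):
    """Algorithm 3.16: run Init once, then run GetBits a total of ell times."""
    state = init(seed, iv)
    output = []
    for _ in range(ell):
        y, state = get_bits(state)
        output.append(y)
    return output
```

Because the bits are produced one at a time from the evolving state, the output for a shorter expansion factor is always a prefix of the output for a longer one, which is what lets an application request exactly as many bits as it needs.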
² The terminology here is not completely standard, and beware that “stream cipher” is used by different people in different (but related) ways. For example, some use it to refer to $G_\ell$ (see below), while some use it to refer to Construction 3.17 when instantiated with $G_\ell$.
A stream cipher is secure in the basic sense if it takes no IV and for any polynomial $\ell$ with $\ell(n) > n$, the function $G_\ell$ constructed above is a pseudorandom generator with expansion factor $\ell$. We briefly discuss one possible security notion for stream ciphers that use an IV in Section 3.6.1.
3.3.2 Proofs by Reduction
If we wish to prove that a given construction is computationally secure, then we must rely on unproven assumptions³ (unless the scheme is information-theoretically secure). Our strategy will be to assume that some mathematical problem is hard, or that some low-level cryptographic primitive is secure, and then to prove that a given construction based on this problem/primitive is secure under this assumption. In Section 1.4.2 we have already explained in great detail why this approach is preferable so we do not repeat those arguments here.
The proof that a cryptographic construction is secure as long as some underlying problem is hard generally proceeds by presenting an explicit reduction showing how to transform any efficient adversary A that succeeds in “breaking” the construction into an efficient algorithm A′ that solves the problem that was assumed to be hard. Since this is so important, we walk through a high-level outline of the steps of such a proof in detail. (We will see numerous concrete examples throughout the book, beginning with the proof of Theorem 3.18.) We begin with an assumption that some problem X cannot be solved (in some precisely defined sense) by any polynomial-time algorithm, except with negligible probability. We want to prove that some cryptographic construction Π is secure (again, in some sense that is precisely defined). A proof proceeds via the following steps (see also Figure 3.1):
1. Fix some efficient (i.e., probabilistic polynomial-time) adversary A attacking Π. Denote this adversary’s success probability by ε(n).
2. Construct an efficient algorithm A′, called the “reduction,” that attempts to solve problem X using adversary A as a subroutine. An important point here is that A′ knows nothing about how A works; the only thing A′ knows is that A is expecting to attack Π. So, given some input instance x of problem X, our algorithm A′ will simulate for A an instance of Π such that:
(a) As far as A can tell, it is interacting with Π. That is, the view of A when run as a subroutine by A′ should be distributed identically to (or at least close to) the view of A when it interacts with Π itself.
(b) If A succeeds in “breaking” the instance of Π that is being simulated by A′, this should allow A′ to solve the instance x it was given, at least with inverse polynomial probability 1/p(n).
³ In particular, most of cryptography requires the unproven assumption that P ≠ NP.
FIGURE 3.1: A high-level overview of a security proof by reduction.
3. Taken together, 2(a) and 2(b) imply that A′ solves X with probability ε(n)/p(n). If ε(n) is not negligible, then neither is ε(n)/p(n). Moreover, if A is efficient then we obtain an efficient algorithm A′ solving X with non-negligible probability, contradicting the initial assumption.
4. Given our assumption regarding X, we conclude that no efficient adversary A can succeed in breaking Π with non-negligible probability. Stated differently, Π is computationally secure.
In the following section we will illustrate exactly the above idea: we will show how to use any pseudorandom generator G to construct an encryption scheme; we prove the encryption scheme secure by showing that any attacker who can “break” the encryption scheme can be used to distinguish the output of G from a uniform string. Under the assumption that G is a pseudorandom generator, then, the encryption scheme is secure.
3.3.3 A Secure Fixed-Length Encryption Scheme
A pseudorandom generator provides a natural way to construct a secure, fixed-length encryption scheme with a key shorter than the message. Recall that in the one-time pad (see Section 2.2), encryption is done by XORing a random pad with the message. The insight is that we can use a pseudorandom pad instead. Rather than sharing this long, pseudorandom pad, however, the sender and receiver can instead share a seed which is used to generate that pad when needed (see Figure 3.2); this seed will be shorter than the pad and hence shorter than the message. As for security, the intuition is that a pseudorandom string “looks random” to any polynomial-time adversary and so a computationally bounded eavesdropper cannot distinguish between a message encrypted using the one-time pad or a message encrypted using this “pseudo-”one-time pad encryption scheme.
FIGURE 3.2: Encryption with a pseudorandom generator.
The encryption scheme. Fix some message length $\ell$ and let G be a pseudorandom generator with expansion factor $\ell$ (that is, $|G(s)| = \ell(|s|)$). Recall that an encryption scheme is defined by three algorithms: a key-generation algorithm Gen, an encryption algorithm Enc, and a decryption algorithm Dec. The key-generation algorithm is the trivial one: $\mathsf{Gen}(1^n)$ simply outputs a uniform key $k$ of length $n$. Encryption works by applying G to the key (which serves as a seed) in order to obtain a pad that is then XORed with the plaintext. Decryption applies G to the key and XORs the resulting pad with the ciphertext to recover the message. The scheme is described formally in Construction 3.17. In Section 3.6.1, we describe how stream ciphers are used to implement a variant of this scheme in practice.
CONSTRUCTION 3.17
Let G be a pseudorandom generator with expansion factor $\ell$. Define a private-key encryption scheme for messages of length $\ell$ as follows:
• Gen: on input $1^n$, choose uniform $k \in \{0,1\}^n$ and output it as the key.
• Enc: on input a key $k \in \{0,1\}^n$ and a message $m \in \{0,1\}^{\ell(n)}$, output the ciphertext
$$c := G(k) \oplus m.$$
• Dec: on input a key $k \in \{0,1\}^n$ and a ciphertext $c \in \{0,1\}^{\ell(n)}$, output the message
$$m := G(k) \oplus c.$$
A private-key encryption scheme based on any pseudorandom generator.
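A minimal Python sketch of Construction 3.17 follows. The construction works with any pseudorandom generator; here SHAKE-256 from the standard library is used as a stand-in for G, which is an assumption of this example rather than something the construction specifies. The byte-oriented interface and the choice to take the pad length from the message are likewise conveniences of the sketch.

```python
import hashlib
import secrets


def G(k: bytes, out_len: int) -> bytes:
    """Stand-in for the pseudorandom generator G (assumption: SHAKE-256)."""
    return hashlib.shake_256(k).digest(out_len)


def gen(n_bytes: int = 16) -> bytes:
    """Gen: output a uniform key (16 bytes = 128 bits by default)."""
    return secrets.token_bytes(n_bytes)


def enc(k: bytes, m: bytes) -> bytes:
    """Enc: c := G(k) XOR m."""
    pad = G(k, len(m))
    return bytes(p ^ x for p, x in zip(pad, m))


def dec(k: bytes, c: bytes) -> bytes:
    """Dec: m := G(k) XOR c, the same operation as encryption."""
    return enc(k, c)


key = gen()
ciphertext = enc(key, b"attack at dawn")
recovered = dec(key, ciphertext)
```

As with the one-time pad, the pad is derived deterministically from the key, so the sketch is meant for encrypting a single message per key, which is the setting of the eavesdropping experiment.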
THEOREM 3.18 If G is a pseudorandom generator, then Construction 3.17 is a fixed-length private-key encryption scheme that has indistinguishable encryptions in the presence of an eavesdropper.
PROOF Let Π denote Construction 3.17. We show that Π satisfies Definition 3.8. Namely, we show that for any probabilistic polynomial-time adversary A there is a negligible function negl such that
$$\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n). \tag{3.2}$$
The intuition is that if Π used a uniform pad in place of the pseudorandom pad G(k), then the resulting scheme would be identical to the one-time pad encryption scheme and A would be unable to correctly guess which message was encrypted with probability any better than 1/2. Thus, if Equation (3.2) does not hold then A must implicitly be distinguishing the output of G from a random string. We make this explicit by showing a reduction; namely, by showing how to use A to construct an efficient distinguisher D, with the property that D’s ability to distinguish the output of G from a uniform string is directly related to A’s ability to determine which message was encrypted by Π. Security of G then implies security of Π.
Let A be an arbitrary ppt adversary. We construct a distinguisher D that takes a string w as input, and whose goal is to determine whether w was chosen uniformly (i.e., w is a “random string”) or whether w was generated by choosing a uniform k and computing w := G(k) (i.e., w is a “pseudorandom string”). We construct D so that it emulates the eavesdropping experiment for A, as described below, and observes whether A succeeds or not. If A succeeds then D guesses that w must be a pseudorandom string, while if A does not succeed then D guesses that w is a random string. In detail:
Distinguisher D:
D is given as input a string $w \in \{0,1\}^{\ell(n)}$. (We assume that $n$ can be determined from $\ell(n)$.)
1. Run $A(1^n)$ to obtain a pair of messages $m_0, m_1 \in \{0,1\}^{\ell(n)}$.
2. Choose a uniform bit $b \in \{0,1\}$. Set $c := w \oplus m_b$.
3. Give $c$ to A and obtain output $b'$. Output 1 if $b' = b$, and output 0 otherwise.
D clearly runs in polynomial time (assuming A does).
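The distinguisher D can be sketched in code and exercised against a toy instance. In the Python sketch below, the generator is the insecure one from Example 3.15 and `ParityAdversary` is a hypothetical adversary tailored to it; both are assumptions chosen so that the behavior of D can be enumerated exactly. The bit $b$ is passed to D as an argument, rather than sampled inside it, purely so that all cases can be counted.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from operator import xor

N = 4        # toy seed length
ELL = N + 1  # output length of the insecure G below


def G(s):
    """Insecure generator of Example 3.15: s followed by the XOR of its bits."""
    return s + (reduce(xor, s, 0),)


class ParityAdversary:
    """Hypothetical adversary A against Construction 3.17 built on this G."""

    def choose_messages(self):
        m0 = (0,) * ELL               # even parity
        m1 = (1,) + (0,) * (ELL - 1)  # odd parity
        return m0, m1

    def guess(self, c):
        # Every G(k) has even parity, so parity(c) = parity(m_b) = b.
        return reduce(xor, c, 0)


def D(w, adversary, b):
    """Distinguisher from the proof; b is supplied so cases can be enumerated."""
    m0, m1 = adversary.choose_messages()              # step 1
    c = tuple(x ^ y for x, y in zip(w, (m0, m1)[b]))  # step 2: c := w XOR m_b
    return int(adversary.guess(c) == b)               # step 3


def acceptance_probabilities():
    """Return (Pr[D(G(k)) = 1], Pr[D(w) = 1]) for uniform k, w, and b."""
    a = ParityAdversary()
    seeds = list(product((0, 1), repeat=N))
    strings = list(product((0, 1), repeat=ELL))
    p_prg = Fraction(sum(D(G(k), a, b) for k in seeds for b in (0, 1)),
                     2 * len(seeds))
    p_uniform = Fraction(sum(D(w, a, b) for w in strings for b in (0, 1)),
                         2 * len(strings))
    return p_prg, p_uniform
```

Here the adversary always wins when the pad comes from G and wins exactly half the time when the pad is uniform, so D separates the two cases with a constant gap. This is the contrapositive of the theorem in miniature: an adversary that breaks the encryption scheme yields a distinguisher for G.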
Before analyzing the behavior of D, we define a modified encryption scheme $\widetilde{\Pi} = (\widetilde{\mathsf{Gen}}, \widetilde{\mathsf{Enc}}, \widetilde{\mathsf{Dec}})$ that is exactly the one-time pad encryption scheme, except that we now incorporate a security parameter that determines the length of the message to be encrypted. That is, $\widetilde{\mathsf{Gen}}(1^n)$ outputs a uniform key $k$ of length $\ell(n)$, and the encryption of message $m \in \{0,1\}^{\ell(n)}$ using key $k \in \{0,1\}^{\ell(n)}$ is the ciphertext $c := k \oplus m$. (Decryption can be performed as usual, but is inessential to what follows.) Perfect secrecy of the one-time pad implies
$$\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n) = 1\right] = \frac{1}{2}. \tag{3.3}$$
To analyze the behavior of D, the main observations are:
1. If $w$ is chosen uniformly from $\{0,1\}^{\ell(n)}$, then the view of A when run as a subroutine by D is distributed identically to the view of A in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n)$. This is because when A is run as a subroutine by $D(w)$ in this case, A is given a ciphertext $c = w \oplus m_b$ where $w \in \{0,1\}^{\ell(n)}$ is uniform. Since D outputs 1 exactly when A succeeds in its eavesdropping experiment, we therefore have (cf. Equation (3.3))
$$\Pr_{w \leftarrow \{0,1\}^{\ell(n)}}[D(w) = 1] = \Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n) = 1\right] = \frac{1}{2}. \tag{3.4}$$
(The subscript on the first probability just makes explicit that $w$ is chosen uniformly from $\{0,1\}^{\ell(n)}$ there.)
2. If $w$ is instead generated by choosing uniform $k \in \{0,1\}^n$ and then setting $w := G(k)$, the view of A when run as a subroutine by D is distributed identically to the view of A in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n)$. This is because A, when run as a subroutine by D, is now given a ciphertext $c = w \oplus m_b$ where $w = G(k)$ for a uniform $k \in \{0,1\}^n$. Thus,
$$\Pr_{k \leftarrow \{0,1\}^n}[D(G(k)) = 1] = \Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right]. \tag{3.5}$$
Since G is a pseudorandom generator (and since D runs in polynomial time), we know there is a negligible function negl such that
$$\left|\Pr_{w \leftarrow \{0,1\}^{\ell(n)}}[D(w) = 1] - \Pr_{k \leftarrow \{0,1\}^n}[D(G(k)) = 1]\right| \le \mathsf{negl}(n).$$
Using Equations (3.4) and (3.5), we thus see that
$$\left|\frac{1}{2} - \Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right]\right| \le \mathsf{negl}(n),$$
which implies $\Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n)$. Since A was an arbitrary ppt adversary, this completes the proof that Π has indistinguishable encryptions in the presence of an eavesdropper.
It is easy to get lost in the details of the proof and wonder whether anything has been gained as compared to the one-time pad; after all, the one-time pad also encrypts an l-bit message by XORing it with an l-bit string! The point of the construction, of course, is that the l-bit string G(k) can be much
longer than the shared key k. In particular, using the above scheme it is possible to securely encrypt a 1 Mb file using only a 128-bit key. By relying on computational secrecy we have thus circumvented the impossibility result of Theorem 2.10, which states that any perfectly secret encryption scheme must use a key at least as long as the message.
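To make the key-length saving tangible, here is a minimal sketch of encrypting a message of $10^6$ bits under a 128-bit key by XORing it with the output of a generator. It rests on assumptions: SHAKE-256 from Python's `hashlib` is used as a stand-in for G purely for illustration (it is not proven to be a pseudorandom generator), and the function names are hypothetical.

```python
import hashlib
import secrets


def G(k: bytes, out_len: int) -> bytes:
    """Stand-in generator (assumption): stretch k to out_len bytes with SHAKE-256."""
    return hashlib.shake_256(k).digest(out_len)


def enc(k: bytes, m: bytes) -> bytes:
    """c := G(k) XOR m, with the pad stretched to the message length."""
    return bytes(p ^ x for p, x in zip(G(k, len(m)), m))


def dec(k: bytes, c: bytes) -> bytes:
    """Decryption is the same XOR."""
    return enc(k, c)


key = secrets.token_bytes(16)            # 128-bit key
message = secrets.token_bytes(125_000)   # 10^6 bits
ciphertext = enc(key, message)
```

As with the one-time pad, the same key must not be reused for a second message in this scheme.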
Reductions: a discussion. We do not prove unconditionally that Construction 3.17 is secure. Rather, we prove that it is secure under the assumption that G is a pseudorandom generator. This approach of reducing the security of a higher-level construction to a lower-level primitive has a number of advantages (as discussed in Section 1.4.2). One of these advantages is that, in general, it is easier to design a lower-level primitive than a higher-level one; it is also easier, in general, to directly analyze an algorithm G with respect to a lower-level definition than to analyze a more complex scheme Π with respect to a higher-level definition. This does not mean that constructing a pseudorandom generator is “easy,” only that it is easier than constructing an encryption scheme from scratch. (In the present case the encryption scheme does nothing except XOR the output of a pseudorandom generator with the message, and so this isn’t really true. However, we will see more complex constructions, and in those cases the ability to reduce the task to a simpler one is of great importance.) Another advantage is that once an appropriate G has been constructed, it can be used as a component of various other schemes.
Concrete security. Although Theorem 3.18 and its proof are in an asymptotic setting, we can readily adapt the proof to bound the concrete security of the encryption scheme in terms of the concrete security of G. Fix some value of n for the remainder of this discussion, and let Π now denote Construction 3.17 using this value of n. Assume G is (t, ε)-pseudorandom (for the given value of n), in the sense that for all distinguishers D running in time at most t we have

\[ \left| \Pr[D(r) = 1] - \Pr[D(G(s)) = 1] \right| \le \varepsilon. \tag{3.6} \]

(Think of $t \approx 2^{80}$ and $\varepsilon \approx 2^{-60}$, though precise values are irrelevant for our discussion.) We claim that Π is (t − c, ε)-secure for some (small) constant c, in the sense that for all A running in time at most t − c we have

\[ \Pr\left[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi} = 1\right] \le \frac{1}{2} + \varepsilon. \tag{3.7} \]
(Note that the above are now fixed numbers, not functions of n, since we are not in an asymptotic setting here.) To see this, let A be an arbitrary adversary running in time at most t − c. Distinguisher D, as constructed in the proof of Theorem 3.18, has very little overhead besides running A; setting c appropriately ensures that D runs in time at most t. Our assumption on the concrete security of G then implies Equation (3.6); proceeding exactly as in the proof of Theorem 3.18, we obtain Equation (3.7).
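Spelled out, the chain of steps behind Equation (3.7) can be written as follows. This is a sketch under the notation of the proof of Theorem 3.18: D is the distinguisher built from A, r is a uniform string of the pad length, s is a uniform seed, and $\tilde{\Pi}$ denotes the one-time pad with pad-length keys (the tilde notation is an assumption made here to keep the two schemes apart).

```latex
\begin{aligned}
\Pr\bigl[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi} = 1\bigr]
  &= \Pr[D(G(s)) = 1]
     && \text{(D perfectly simulates the experiment for } \Pi\text{)} \\
  &\le \Pr[D(r) = 1] + \varepsilon
     && \text{(Equation (3.6), since D runs in time at most } t\text{)} \\
  &= \Pr\bigl[\mathsf{PrivK}^{\mathsf{eav}}_{A,\tilde{\Pi}} = 1\bigr] + \varepsilon
     && \text{(D perfectly simulates the experiment for } \tilde{\Pi}\text{)} \\
  &= \tfrac{1}{2} + \varepsilon
     && \text{(perfect secrecy of the one-time pad).}
\end{aligned}
```

With the illustrative value $\varepsilon \approx 2^{-60}$, any adversary running in time at most t − c therefore guesses b with probability at most $\frac{1}{2} + 2^{-60}$.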
3.4 Stronger Security Notions
Until now we have considered a relatively weak definition of security in which the adversary only passively eavesdrops on a single ciphertext sent between the honest parties. In this section, we consider two stronger security notions. Recall that a security definition specifies a security goal and an attack model. In defining the first new security notion, we modify the security goal; for the second we strengthen the attack model.
3.4.1 Security for Multiple Encryptions
Definition 3.8 deals with the case where the communicating parties transmit a single ciphertext that is observed by an eavesdropper. It would be convenient, however, if the communicating parties could send multiple ciphertexts to each other, all generated using the same key, even if an eavesdropper might observe all of them. For such applications we need an encryption scheme secure for the encryption of multiple messages.
We begin with an appropriate definition of security for this setting. As in the case of Definition 3.8, we first introduce an appropriate experiment defined for any encryption scheme Π, adversary A, and security parameter n:
The multiple-message eavesdropping experiment $\mathsf{PrivK}^{\mathsf{mult}}_{A,\Pi}(n)$:
1. The adversary A is given input $1^n$, and outputs a pair of equal-length lists of messages $\vec{M}_0 = (m_{0,1}, \ldots, m_{0,t})$ and $\vec{M}_1 = (m_{1,1}, \ldots, m_{1,t})$, with $|m_{0,i}| = |m_{1,i}|$ for all i.
2. A key k is generated by running $\mathsf{Gen}(1^n)$, and a uniform bit b ∈ {0,1} is chosen. For all i, the ciphertext $c_i \leftarrow \mathsf{Enc}_k(m_{b,i})$ is computed and the list $\vec{C} = (c_1, \ldots, c_t)$ is given to A.
3. A outputs a bit b′.
4. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise.
The definition of security is the same as before, except that it now refers to
the above experiment.
DEFINITION 3.19 A private-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable multiple encryptions in the presence of an eavesdropper if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
\[ \Pr\left[\mathsf{PrivK}^{\mathsf{mult}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n), \]
where the probability is taken over the randomness used by A and the randomness used in the experiment.
Any scheme that has indistinguishable multiple encryptions in the presence of an eavesdropper clearly also satisfies Definition 3.8, since experiment $\mathsf{PrivK}^{\mathsf{eav}}$ corresponds to the special case of $\mathsf{PrivK}^{\mathsf{mult}}$ where the adversary outputs two lists containing only a single message each. In fact, our new definition is strictly stronger than Definition 3.8, as the following shows.
PROPOSITION 3.20 There is a private-key encryption scheme that has indistinguishable encryptions in the presence of an eavesdropper, but not indistinguishable multiple encryptions in the presence of an eavesdropper.
PROOF We do not have to look far to find an example of an encryption scheme satisfying the proposition. The one-time pad is perfectly secret, and so also has indistinguishable encryptions in the presence of an eavesdropper. We show that it is not secure in the sense of Definition 3.19. (We have discussed this attack in Chapter 2 already; here, we merely analyze the attack with respect to Definition 3.19.)
Concretely, consider the following adversary A attacking the scheme (in the sense defined by experiment $\mathsf{PrivK}^{\mathsf{mult}}$): A outputs $\vec{M}_0 = (0^l, 0^l)$ and $\vec{M}_1 = (0^l, 1^l)$. (The first contains the same plaintext twice, while the second contains two different messages.) Let $\vec{C} = (c_1, c_2)$ be the list of ciphertexts that A receives. If $c_1 = c_2$, then A outputs b′ = 0; otherwise, A outputs b′ = 1.
We now analyze the probability that b′ = b. The crucial point is that the one-time pad is deterministic, so encrypting the same message twice (using the same key) yields the same ciphertext. Thus, if b = 0 then we must have $c_1 = c_2$ and A outputs 0 in this case. On the other hand, if b = 1 then a different message is encrypted each time; hence $c_1 \neq c_2$ and A outputs 1. We conclude that A correctly outputs b′ = b with probability 1, and so the encryption scheme is not secure with respect to Definition 3.19.
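The attack in this proof is short enough to run. The following sketch simulates the multiple-message eavesdropping experiment against the one-time pad used with a single key for both messages; the function names and the 16-byte message length are assumptions chosen for illustration.

```python
import secrets

L = 16  # message length in bytes (toy parameter, an assumption)


def otp_enc(k: bytes, m: bytes) -> bytes:
    """One-time pad encryption, here (mis)used with the same key for every message."""
    return bytes(a ^ b for a, b in zip(k, m))


def adversary_messages():
    """A's two lists: M0 = (0^l, 0^l) and M1 = (0^l, 1^l)."""
    zeros, ones = bytes(L), b"\xff" * L
    return [zeros, zeros], [zeros, ones]


def adversary_guess(ciphertexts) -> int:
    """Output 0 if the two ciphertexts are equal, and 1 otherwise."""
    c1, c2 = ciphertexts
    return 0 if c1 == c2 else 1


def privk_mult() -> int:
    """One run of the multiple-message eavesdropping experiment."""
    M0, M1 = adversary_messages()
    k = secrets.token_bytes(L)                       # Gen
    b = secrets.randbelow(2)                         # uniform bit
    C = [otp_enc(k, m) for m in (M0, M1)[b]]         # encrypt the chosen list
    return 1 if adversary_guess(C) == b else 0


wins = sum(privk_mult() for _ in range(1000))
```

The adversary wins every run, matching the success probability of 1 derived above.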
Necessity of probabilistic encryption. The above might appear to show that Definition 3.19 is impossible to achieve using any encryption scheme. But in fact this is true only if the encryption scheme is deterministic and so encrypting the same message multiple times (using the same key) always yields the same result. This is important enough to state as a theorem.
THEOREM 3.21 If Π is a (stateless⁴) encryption scheme in which Enc is a deterministic function of the key and the message, then Π cannot have indistinguishable multiple encryptions in the presence of an eavesdropper.
This should not be taken to mean that Definition 3.19 is too strong. Indeed, leaking to an eavesdropper the fact that two encrypted messages are the same can be a significant security breach. (Consider, e.g., a scenario in which a student encrypts a series of true/false answers!)
⁴We will see in Section 3.6.1 that if the encryption scheme is stateful, then it is possible to securely encrypt multiple messages even if encryption is deterministic.
To construct a scheme secure for encrypting multiple messages, we must design a scheme in which encryption is randomized so that when the same message is encrypted multiple times, different ciphertexts can be produced. This may seem impossible since decryption must always be able to recover the message. However, we will soon see how to achieve it.
3.4.2 Chosen-Plaintext Attacks and CPA-Security
Chosen-plaintext attacks capture the ability of an adversary to exercise (partial) control over what the honest parties encrypt. We imagine a scenario in which two honest parties share a key k, and the attacker can influence these parties to encrypt messages m1, m2, . . . (using k) and send the resulting ciphertexts over a channel that the attacker can observe. At some later point in time, the attacker observes a ciphertext corresponding to some unknown message m encrypted using the same key k; let us even assume that the attacker knows that m is one of two possibilities m0,m1. Security against chosen-plaintext attacks means that even in this case the attacker cannot tell which of these two messages was encrypted with probability significantly better than random guessing. (For now we revert back to the case where the eavesdropper is given only a single encryption of an unknown message. Shortly, we will return to consideration of the multiple-message case.)
Chosen-plaintext attacks in the real world. Are chosen-plaintext attacks a realistic concern? For starters, note that chosen-plaintext attacks also encompass known-plaintext attacks (in which the attacker knows what messages are being encrypted, even if it does not get to choose them) as a special case. Moreover, there are several real-world scenarios in which an adversary might have significant influence over what messages get encrypted. A simple example is given by an attacker typing on a terminal, which in turn encrypts and sends everything the adversary types using a key shared with a remote server (and unknown to the attacker). Here the attacker exactly controls what gets encrypted, but the encryption scheme should remain secure when it is used (with the same key) to encrypt data for another user.
Interestingly, chosen-plaintext attacks have also been used successfully as part of historical efforts to break military encryption schemes. For example, during World War II the British placed mines at certain locations, knowing that the Germans—when finding those mines—would encrypt the locations and send them back to headquarters. These encrypted messages were used by cryptanalysts at Bletchley Park to break the German encryption scheme.
Another example is given by the famous story involving the Battle of Midway. In May 1942, US Navy cryptanalysts intercepted an encrypted message from the Japanese which they were able to partially decode. The result indicated that the Japanese were planning an attack on AF, where AF was a ciphertext fragment that the US was unable to decode. For other reasons, the US believed that Midway Island was the target. Unfortunately, their attempts to convince Washington planners that this was the case were futile; the general belief was that Midway could not possibly be the target. The Navy cryptanalysts devised the following plan: They instructed US forces at Midway to send a fake message that their freshwater supplies were low. The Japanese intercepted this message and immediately reported to their superiors that “AF is low on water.” The Navy cryptanalysts now had their proof that AF corresponded to Midway, and the US dispatched three aircraft carriers to that location. The result was that Midway was saved, and the Japanese incurred significant losses. This battle was a turning point in the war between the US and Japan in the Pacific.
The Navy cryptanalysts here carried out a chosen-plaintext attack, as they were able to influence the Japanese (albeit in a roundabout way) to encrypt the word “Midway.” If the Japanese encryption scheme had been secure against chosen-plaintext attacks, this strategy by the US cryptanalysts would not have worked (and history may have turned out very differently)!
CPA-security. In the formal definition we model chosen-plaintext attacks by giving the adversary A access to an encryption oracle, viewed as a “black box” that encrypts messages of A’s choice using a key k that is unknown to A. That is, we imagine A has access to an “oracle” $\mathsf{Enc}_k(\cdot)$; when A queries this oracle by providing it with a message m as input, the oracle returns a ciphertext $c \leftarrow \mathsf{Enc}_k(m)$ as the reply. (When Enc is randomized, the oracle uses fresh randomness each time it answers a query.) The adversary is allowed to interact with the encryption oracle adaptively, as many times as it likes.
Consider the following experiment defined for any encryption scheme Π = (Gen, Enc, Dec), adversary A, and value n for the security parameter:
The CPA indistinguishability experiment $\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n)$:
1. A key k is generated by running $\mathsf{Gen}(1^n)$.
2. The adversary A is given input $1^n$ and oracle access to $\mathsf{Enc}_k(\cdot)$, and outputs a pair of messages $m_0, m_1$ of the same length.
3. A uniform bit b ∈ {0,1} is chosen, and then a ciphertext $c \leftarrow \mathsf{Enc}_k(m_b)$ is computed and given to A.
4. The adversary A continues to have oracle access to $\mathsf{Enc}_k(\cdot)$, and outputs a bit b′.
5. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise. In the former case, we say that A succeeds.
DEFINITION 3.22 A private-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable encryptions under a chosen-plaintext attack, or is CPA-secure, if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
\[ \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n), \]
where the probability is taken over the randomness used by A, as well as the randomness used in the experiment.
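The CPA experiment can be sketched as a small test harness. In the code below, the harness `privk_cpa`, the toy deterministic scheme `det_enc` (XOR with the key), and the adversary `ReplayAdversary` are all hypothetical names introduced for illustration. The example shows how an encryption oracle alone already defeats any deterministic scheme: the adversary asks the oracle to encrypt $m_0$ and compares the answer with the challenge ciphertext.

```python
import secrets


def privk_cpa(gen, enc, adversary) -> int:
    """One run of the CPA indistinguishability experiment."""
    k = gen()
    oracle = lambda m: enc(k, m)          # Enc_k(.), with k hidden in the closure
    m0, m1 = adversary.choose(oracle)     # phase 1: oracle access, then (m0, m1)
    assert len(m0) == len(m1)
    b = secrets.randbelow(2)
    c = enc(k, (m0, m1)[b])               # challenge ciphertext
    return 1 if adversary.guess(c, oracle) == b else 0   # phase 2


# Toy deterministic scheme (an assumption for illustration): XOR with the key.
gen = lambda: secrets.token_bytes(16)
det_enc = lambda k, m: bytes(a ^ b for a, b in zip(k, m))


class ReplayAdversary:
    """Re-encrypts m0 through the oracle and compares with the challenge."""

    def choose(self, oracle):
        self.m0, self.m1 = bytes(16), b"\x01" * 16
        return self.m0, self.m1

    def guess(self, c, oracle) -> int:
        return 0 if oracle(self.m0) == c else 1


wins = sum(privk_cpa(gen, det_enc, ReplayAdversary()) for _ in range(1000))
```

Against a randomized scheme the comparison in `guess` would no longer be informative, since the oracle's answer and the challenge would use independent randomness.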
CPA-Security for Multiple Encryptions
Definition 3.22 can be extended to the case of multiple encryptions in the same way that Definition 3.8 is extended to give Definition 3.19, i.e., by using lists of plaintexts. Here, we take a different approach that is somewhat simpler and has the advantage of modeling attackers that can adaptively choose plaintexts to be encrypted, even after observing previous ciphertexts. In the present definition, we give the attacker access to a “left-or-right” oracle $\mathsf{LR}_{k,b}$ that, on input a pair of equal-length messages $m_0, m_1$, computes the ciphertext $c \leftarrow \mathsf{Enc}_k(m_b)$ and returns c. That is, if b = 0 then the adversary receives an encryption of the “left” plaintext, and if b = 1 then it receives an encryption of the “right” plaintext. Here, b is a random bit chosen at the beginning of the experiment, and as in previous definitions the goal of the attacker is to guess b. This generalizes the previous definition of multiple-message security (Definition 3.19) because instead of outputting the lists $(m_{0,1}, \ldots, m_{0,t})$ and $(m_{1,1}, \ldots, m_{1,t})$, one of whose messages will be encrypted, the attacker can now sequentially query $\mathsf{LR}_{k,b}(m_{0,1}, m_{1,1}), \ldots, \mathsf{LR}_{k,b}(m_{0,t}, m_{1,t})$. This also encompasses the attacker’s access to an encryption oracle, since the attacker can simply query $\mathsf{LR}_{k,b}(m, m)$ to obtain $\mathsf{Enc}_k(m)$.
We now formally define this experiment, called the LR-oracle experiment. Let Π be an encryption scheme, A an adversary, and n the security parameter:
The LR-oracle experiment $\mathsf{PrivK}^{\mathsf{LR\text{-}cpa}}_{A,\Pi}(n)$:
1. A key k is generated by running $\mathsf{Gen}(1^n)$.
2. A uniform bit b ∈ {0, 1} is chosen.
3. The adversary A is given input $1^n$ and oracle access to $\mathsf{LR}_{k,b}(\cdot, \cdot)$, as defined above.
4. The adversary A outputs a bit b′.
5. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise. In the former case, we say that A succeeds.
DEFINITION 3.23 Private-key encryption scheme Π has indistinguishable multiple encryptions under a chosen-plaintext attack, or is CPA-secure for multiple encryptions, if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
\[ \Pr\left[\mathsf{PrivK}^{\mathsf{LR\text{-}cpa}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n), \]
where the probability is taken over the randomness used by A and the randomness used in the experiment.
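A left-or-right oracle is naturally expressed as a closure over k and b. The sketch below uses hypothetical names (`make_lr_oracle`, `privk_lr_cpa`) and the same kind of toy deterministic scheme as a target; it illustrates both that a query of the form (m, m) acts as an ordinary encryption oracle and that queries may be made one after another.

```python
import secrets


def make_lr_oracle(enc, k: bytes, b: int):
    """Return LR_{k,b}: on equal-length (m0, m1), encrypt m_b under k."""
    def lr(m0: bytes, m1: bytes) -> bytes:
        if len(m0) != len(m1):
            raise ValueError("messages must have equal length")
        return enc(k, (m0, m1)[b])
    return lr


def privk_lr_cpa(gen, enc, adversary) -> int:
    """One run of the LR-oracle experiment."""
    k = gen()
    b = secrets.randbelow(2)
    return 1 if adversary(make_lr_oracle(enc, k, b)) == b else 0


# Toy deterministic scheme (an assumption for illustration): XOR with the key.
gen = lambda: secrets.token_bytes(16)
det_enc = lambda k, m: bytes(x ^ y for x, y in zip(k, m))


def adversary(lr) -> int:
    m0, m1 = bytes(16), b"\x01" * 16
    reference = lr(m0, m0)     # LR(m, m) returns Enc_k(m) regardless of b
    challenge = lr(m0, m1)     # returns Enc_k(m_b)
    return 0 if challenge == reference else 1


wins = sum(privk_lr_cpa(gen, det_enc, adversary) for _ in range(1000))
```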
Our earlier discussion shows that CPA-security for multiple encryptions is at least as strong as all our previous definitions. In particular, if a private-key encryption scheme is CPA-secure for multiple encryptions then it is clearly CPA-secure as well. Importantly, the converse also holds; that is, CPA-security implies CPA-security for multiple encryptions. (This stands in contrast to the case of eavesdropping adversaries; see Proposition 3.20.) We state the following theorem here without proof; a similar result in the public-key setting is proved in Section 11.2.2.
THEOREM 3.24 Any private-key encryption scheme that is CPA-secure is also CPA-secure for multiple encryptions.
This is a significant technical advantage of CPA-security: It suffices to prove that a scheme is CPA-secure (for a single encryption), and we then obtain “for free” that it is CPA-secure for multiple encryptions as well.
Security against chosen-plaintext attacks is nowadays the minimal notion of security an encryption scheme should satisfy, though it is becoming more common to require even the stronger security properties discussed in Section 4.5.
Fixed-length vs. arbitrary-length messages. Another advantage of working with the definition of CPA-security is that it allows us to treat fixed-length encryption schemes without loss of generality. In particular, given any CPA-secure fixed-length encryption scheme Π = (Gen, Enc, Dec), it is possible to construct a CPA-secure encryption scheme Π′ = (Gen′, Enc′, Dec′) for arbitrary-length messages quite easily. For simplicity, say Π encrypts messages that are 1 bit long (though everything we say extends in the natural way regardless of the message length supported by Π). Leave Gen′ the same as Gen. Define $\mathsf{Enc}'_k$ for any message m (having some arbitrary length l) as $\mathsf{Enc}'_k(m) = \mathsf{Enc}_k(m_1), \ldots, \mathsf{Enc}_k(m_l)$, where $m_i$ denotes the ith bit of m. Decryption is done in the natural way. Π′ is CPA-secure if Π is; a proof follows from Theorem 3.24.
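The bit-by-bit extension can be sketched as follows. The wrapper functions `enc_prime` and `dec_prime` mirror Enc′ and Dec′. The underlying 1-bit scheme `enc1`/`dec1` is only a placeholder assumed for illustration (it masks the bit with one bit of HMAC-SHA256 applied to a fresh random value); its CPA-security is not argued here, and any CPA-secure 1-bit scheme could be substituted.

```python
import hashlib
import hmac
import secrets


def gen() -> bytes:
    """Gen' is the same as Gen: here, a uniform 128-bit key."""
    return secrets.token_bytes(16)


def _pad_bit(k: bytes, r: bytes) -> int:
    # One pseudorandom-looking bit derived from (k, r); placeholder assumption.
    return hmac.new(k, r, hashlib.sha256).digest()[0] & 1


def enc1(k: bytes, bit: int):
    """Placeholder randomized encryption of a single bit."""
    r = secrets.token_bytes(16)
    return (r, bit ^ _pad_bit(k, r))


def dec1(k: bytes, c) -> int:
    r, masked = c
    return masked ^ _pad_bit(k, r)


def enc_prime(k: bytes, bits):
    """Enc'_k(m) = Enc_k(m_1), ..., Enc_k(m_l): encrypt each bit independently."""
    return [enc1(k, b) for b in bits]


def dec_prime(k: bytes, ciphertexts):
    """Decrypt each component ciphertext in order."""
    return [dec1(k, c) for c in ciphertexts]


k = gen()
message = [1, 0, 1, 1, 0, 0, 1]
ciphertext = enc_prime(k, message)
```

Note the cost: each message bit expands to a full ciphertext of the underlying scheme, which is why this approach is simple but inefficient.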
There are more efficient ways to encrypt messages of arbitrary length than by adapting a fixed-length encryption scheme in the above manner. We explore this further in Section 3.6.
3.5 Constructing CPA-Secure Encryption Schemes
Before constructing encryption schemes secure against chosen-plaintext attacks, we first introduce the important notion of pseudorandom functions.
3.5.1 Pseudorandom Functions and Block Ciphers
Pseudorandom functions (PRFs) generalize the notion of pseudorandom generators. Now, instead of considering “random-looking” strings we consider “random-looking” functions. As in our earlier discussion of pseudorandomness, it does not make much sense to say that any fixed function $f : \{0,1\}^* \to \{0,1\}^*$ is pseudorandom (in the same way that it makes little sense to say that any fixed function is random). Thus, we must instead refer to the pseudorandomness of a distribution on functions. Such a distribution is induced naturally by considering keyed functions, defined next.
A keyed function $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ is a two-input function, where the first input is called the key and denoted k. We say F is efficient if there is a polynomial-time algorithm that computes F(k, x) given k and x. (We will only be interested in efficient keyed functions.) In typical usage a key k is chosen and fixed, and we are then interested in the single-input function $F_k : \{0,1\}^* \to \{0,1\}^*$ defined by $F_k(x) = F(k, x)$. The security parameter n dictates the key length, input length, and output length. That is, we associate with F three functions $l_{key}$, $l_{in}$, and $l_{out}$; for any key $k \in \{0,1\}^{l_{key}(n)}$, the function $F_k$ is only defined for inputs $x \in \{0,1\}^{l_{in}(n)}$, in which case $F_k(x) \in \{0,1\}^{l_{out}(n)}$. Unless stated otherwise, we assume for simplicity that F is length-preserving, meaning $l_{key}(n) = l_{in}(n) = l_{out}(n) = n$. That is, by fixing a key $k \in \{0,1\}^n$ we obtain a function $F_k$ mapping n-bit input strings to n-bit output strings.
A keyed function F induces a natural distribution on functions given by choosing a uniform key $k \in \{0,1\}^n$ and then considering the resulting single-input function $F_k$. We call F pseudorandom if the function $F_k$ (for a uniform key k) is indistinguishable from a function chosen uniformly at random from the set of all functions having the same domain and range; that is, if no efficient adversary can distinguish (in a sense we more carefully define below) whether it is interacting with $F_k$ (for uniform k) or f (where f is chosen uniformly from the set of all functions mapping n-bit inputs to n-bit outputs).
Since choosing a function at random is less intuitive than choosing a string at random, it is worth spending a bit more time on this idea. Consider the set $\mathsf{Func}_n$ of all functions mapping n-bit strings to n-bit strings. This set is finite, and selecting a uniform function mapping n-bit strings to n-bit strings means choosing an element uniformly from this set. How large is $\mathsf{Func}_n$? A function f is specified by giving its value on each point in its domain. We can view any function (over a finite domain) as a large look-up table that stores f(x) in the row of the table labeled by x. For $f \in \mathsf{Func}_n$, the look-up table for f has $2^n$ rows (one for each point of the domain $\{0,1\}^n$), with each row containing an n-bit string (since the range of f is $\{0,1\}^n$). Concatenating all the entries of the table, we see that any function in $\mathsf{Func}_n$ can be represented by a string of length $2^n \cdot n$. Moreover, this correspondence is one-to-one, as each string of length $2^n \cdot n$ (i.e., each table containing $2^n$ entries of length n) defines a unique function in $\mathsf{Func}_n$. Thus, the size of $\mathsf{Func}_n$ is exactly the number of strings of length $n \cdot 2^n$, or $|\mathsf{Func}_n| = 2^{n \cdot 2^n}$.
Viewing a function as a look-up table provides another useful way to think about selecting a uniform function $f \in \mathsf{Func}_n$: It is exactly equivalent to choosing each row in the look-up table of f uniformly. This means, in particular, that the values f(x) and f(y) (for any two inputs $x \neq y$) are uniform and independent. We can view this look-up table being populated by random entries in advance, before f is evaluated on any input, or we can view entries of the table being chosen uniformly “on-the-fly,” as needed, whenever f is evaluated on a new input on which f has not been evaluated before.
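Both views of a uniform function are easy to express in code. The sketch below (with hypothetical names) implements the “on-the-fly” view as a lazily filled table, and checks the count $|\mathsf{Func}_n| = 2^{n \cdot 2^n}$ by brute-force enumeration of look-up tables, which is feasible only for very small n.

```python
import secrets
from itertools import product


class LazyRandomFunction:
    """A uniform function {0,1}^n -> {0,1}^n whose table rows are sampled on first use."""

    def __init__(self, n: int):
        self.n = n
        self.table = {}            # rows filled in so far: input -> output

    def __call__(self, x: int) -> int:
        if not 0 <= x < 2 ** self.n:
            raise ValueError("input must be an n-bit value")
        if x not in self.table:    # new input: choose this row uniformly now
            self.table[x] = secrets.randbits(self.n)
        return self.table[x]       # repeated input: same answer as before


def count_functions(n: int) -> int:
    """Count Func_n by enumerating every possible look-up table (tiny n only)."""
    rows = 2 ** n                  # one row per n-bit input
    values = range(2 ** n)         # each row holds an n-bit output
    return sum(1 for _ in product(values, repeat=rows))


f = LazyRandomFunction(128)
y = f(42)
```

A full table for n = 128 could never be stored, yet the lazy version answers any polynomial number of queries with exactly the same distribution.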
Coming back to our discussion of pseudorandom functions, recall that a pseudorandom function is a keyed function F such that $F_k$ (for $k \in \{0,1\}^n$ chosen uniformly at random) is indistinguishable from f (for $f \in \mathsf{Func}_n$ chosen uniformly at random). The former is chosen from a distribution over (at most) $2^n$ distinct functions, whereas the latter is chosen from all $2^{n \cdot 2^n}$ functions in $\mathsf{Func}_n$. Despite this, the “behavior” of these functions must look the same to any polynomial-time distinguisher.
A first attempt at formalizing the notion of a pseudorandom function would be to proceed in the same way as in Definition 3.14. That is, we could require that every polynomial-time distinguisher D that receives a description of the pseudorandom function $F_k$ outputs 1 with “almost” the same probability as when it receives a description of a random function f. However, this definition is inappropriate since the description of a random function has exponential length (given by its look-up table of length $n \cdot 2^n$), while D is limited to running in polynomial time. So, D would not even have sufficient time to examine its entire input.
The definition therefore gives D access to an oracle O which is either equal to Fk (for uniform k) or f (for a uniform function f). The distinguisher D may query its oracle at any point x, in response to which the oracle returns O(x). We treat the oracle as a black box in the same way as when we provided the adversary with oracle access to the encryption algorithm in the definition of a chosen-plaintext attack. Here, however, the oracle computes a deterministic function and so returns the same result if queried twice on the same input. (For this reason, we may assume without loss of generality that D never queries the oracle twice on the same input.) D may interact freely with its oracle, choosing its queries adaptively based on all previous outputs. Since D runs in polynomial time, however, it can ask only polynomially many queries.
We now present the formal definition. (The definition assumes F is length-preserving for simplicity only.)
DEFINITION 3.25 Let $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ be an efficient, length-preserving, keyed function. F is a pseudorandom function if for all probabilistic polynomial-time distinguishers D, there is a negligible function negl such that:
\[ \left| \Pr\left[D^{F_k(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n), \]
where the first probability is taken over uniform choice of $k \in \{0,1\}^n$ and the randomness of D, and the second probability is taken over uniform choice of $f \in \mathsf{Func}_n$ and the randomness of D.
An important point is that D is not given the key k. It is meaningless to require that $F_k$ be pseudorandom if k is known, since given k it is trivial to distinguish an oracle for $F_k$ from an oracle for f. (All the distinguisher has to do is query the oracle at any point x to obtain the answer y, and compare this to the result $y' := F_k(x)$ that it computes itself using the known value k. An oracle for $F_k$ will return y = y′, while an oracle for a random function will have y = y′ with probability only $2^{-n}$.) This means that if k is revealed, any claims about the pseudorandomness of $F_k$ no longer hold. To take a concrete example, if F is a pseudorandom function, then given oracle access to $F_k$ (for uniform k) it must be hard to find an input x for which $F_k(x) = 0^n$ (since it would be hard to find such an input for a truly random function f). But if k is known, finding such an input may be easy.
Example 3.26
As usual, we can gain familiarity with the definition by looking at an insecure example. Define the keyed, length-preserving function F by F(k, x) = k ⊕ x. For any input x, the value of $F_k(x)$ is uniformly distributed (when k is uniform). Nevertheless, F is not pseudorandom since its values on any two points are correlated. Concretely, consider the distinguisher D that queries its oracle O on arbitrary, distinct points $x_1, x_2$ to obtain values $y_1 = O(x_1)$ and $y_2 = O(x_2)$, and outputs 1 if and only if $y_1 \oplus y_2 = x_1 \oplus x_2$. If $O = F_k$, for any k, then D outputs 1. On the other hand, if O = f for f chosen uniformly from $\mathsf{Func}_n$, then the probability that $f(x_1) \oplus f(x_2) = x_1 \oplus x_2$ is exactly the probability that $f(x_2) = x_1 \oplus x_2 \oplus f(x_1)$, or $2^{-n}$, and D outputs 1 with this probability. The difference is $|1 - 2^{-n}|$, which is not negligible. ♦
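The distinguisher from this example can be run directly. In the sketch below, n = 32 and the query points 0 and 1 are arbitrary choices made for illustration, and the random-function oracle is simulated by sampling outputs on first use; all names are hypothetical.

```python
import secrets

N = 32  # bit length n (toy parameter, an assumption)


def F(k: int, x: int) -> int:
    """The insecure keyed function F(k, x) = k XOR x."""
    return k ^ x


def keyed_oracle():
    """Oracle for F_k with a fresh uniform key k."""
    k = secrets.randbits(N)
    return lambda x: F(k, x)


def random_function_oracle():
    """Oracle for a uniform f in Func_n, with outputs sampled on first use."""
    table = {}

    def f(x: int) -> int:
        if x not in table:
            table[x] = secrets.randbits(N)
        return table[x]

    return f


def D(oracle) -> int:
    """Output 1 iff y1 XOR y2 == x1 XOR x2 for two distinct query points."""
    x1, x2 = 0, 1
    y1, y2 = oracle(x1), oracle(x2)
    return 1 if y1 ^ y2 == x1 ^ x2 else 0


hits_keyed = sum(D(keyed_oracle()) for _ in range(200))
hits_random = sum(D(random_function_oracle()) for _ in range(200))
```

D outputs 1 in every run against $F_k$, while against a random function each run outputs 1 only with probability $2^{-32}$.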
Pseudorandom Permutations/Block Ciphers
Let $\mathsf{Perm}_n$ be the set of all permutations (i.e., bijections) on $\{0,1\}^n$. Viewing any $f \in \mathsf{Perm}_n$ as a look-up table as before, we now have the added constraint that the entries in any two distinct rows must be different. We have $2^n$ different choices for the entry in the first row of the table; once we fix this entry, we are left with only $2^n - 1$ choices for the second row, and so on. We thus see that the size of $\mathsf{Perm}_n$ is $(2^n)!$.
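For very small n the count $(2^n)!$ can be confirmed by enumeration, and compared against the number of all functions on the same domain. The helper names below are hypothetical and the code is only a sanity check of the counting argument.

```python
from itertools import permutations
from math import factorial


def count_permutations(n: int) -> int:
    """Count Perm_n by enumerating every bijection on {0,1}^n (tiny n only)."""
    return sum(1 for _ in permutations(range(2 ** n)))


def fraction_that_are_permutations(n: int) -> float:
    """|Perm_n| / |Func_n| = (2^n)! / 2^(n * 2^n)."""
    return factorial(2 ** n) / 2 ** (n * 2 ** n)


perms_n2 = count_permutations(2)                    # (2^2)! = 24
share_n2 = fraction_that_are_permutations(2)        # 24 / 256
```

Already for n = 2 fewer than one in ten functions is a permutation, and the fraction shrinks rapidly as n grows.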
Let F be a keyed function. We call F a keyed permutation if $l_{in} = l_{out}$, and furthermore for all $k \in \{0,1\}^{l_{key}(n)}$ the function $F_k : \{0,1\}^{l_{in}(n)} \to \{0,1\}^{l_{in}(n)}$ is one-to-one (i.e., $F_k$ is a permutation). We call $l_{in}$ the block length of F. As before, unless stated otherwise we assume F is length-preserving and so $l_{key}(n) = l_{in}(n) = n$. A keyed permutation is efficient if there is a polynomial-time algorithm for computing $F_k(x)$ given k and x, as well as a polynomial-time algorithm for computing $F_k^{-1}(y)$ given k and y. That is, $F_k$ should be both efficiently computable and efficiently invertible given k.
The definition of what it means for an efficient, keyed permutation F to be a pseudorandom permutation is exactly analogous to Definition 3.25, with the only difference being that now we require $F_k$ to be indistinguishable from a uniform permutation rather than a uniform function. That is, we require that no efficient algorithm can distinguish between access to $F_k$ (for uniform key k) and access to f (for uniform $f \in \mathsf{Perm}_n$). It turns out that this is merely an aesthetic choice since, whenever the block length is sufficiently long, a random permutation is itself indistinguishable from a random function. Intuitively this is due to the fact that a uniform function f looks identical to a uniform permutation unless distinct values x and y are found for which f(x) = f(y), since in such a case the function cannot be a permutation. However, the probability of finding such values x, y using a polynomial number of queries is negligible. (This follows from the results of Appendix A.4.)
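The size of this collision probability can be checked numerically. The sketch below computes the exact probability that q distinct queries to a uniform function on n-bit strings produce a repeated output, and compares it with the standard union-bound (birthday-style) estimate $q(q-1)/2^{n+1}$. The function names are hypothetical, and the bound is stated here as a well-known estimate rather than as the exact form used in Appendix A.4.

```python
from fractions import Fraction


def collision_probability(n: int, q: int) -> Fraction:
    """Exact probability that q distinct queries to a uniform function
    {0,1}^n -> {0,1}^n yield at least one repeated output."""
    no_collision = Fraction(1)
    for i in range(q):
        # The (i+1)-th output must avoid the i outputs seen so far.
        no_collision *= Fraction(2 ** n - i, 2 ** n)
    return 1 - no_collision


def birthday_bound(n: int, q: int) -> Fraction:
    """Union bound over all pairs of queries: q(q-1)/2 pairs, each colliding w.p. 2^-n."""
    return Fraction(q * (q - 1), 2 ** (n + 1))


exact_small = collision_probability(8, 16)
bound_small = birthday_bound(8, 16)
bound_large = birthday_bound(128, 2 ** 40)   # 2^40 queries, 128-bit blocks
```

For a 128-bit block length, even $2^{40}$ queries give a bound below $2^{-48}$, which is the quantitative content of “negligible” in this setting.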
PROPOSITION 3.27 If F is a pseudorandom permutation and additionally $l_{in}(n) \ge n$, then F is also a pseudorandom function.
If F is a keyed permutation then cryptographic schemes based on F might require the honest parties to compute the inverse $F_k^{-1}$ in addition to computing $F_k$ itself. This potentially introduces new security concerns. In particular, it may now be necessary to impose the stronger requirement that $F_k$ be indistinguishable from a uniform permutation even if the distinguisher is additionally given oracle access to the inverse of the permutation. If F has this property, we call it a strong pseudorandom permutation.
DEFINITION 3.28 Let $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ be an efficient, length-preserving, keyed permutation. F is a strong pseudorandom permutation if for all probabilistic polynomial-time distinguishers D, there exists a negligible function negl such that:
\[ \left| \Pr\left[D^{F_k(\cdot),\,F_k^{-1}(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f(\cdot),\,f^{-1}(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n), \]
where the first probability is taken over uniform choice of $k \in \{0,1\}^n$ and the randomness of D, and the second probability is taken over uniform choice of $f \in \mathsf{Perm}_n$ and the randomness of D.
Of course, any strong pseudorandom permutation is also a pseudorandom permutation.
Block ciphers. In practice, block ciphers are designed to be secure instantiations of (strong) pseudorandom permutations with some fixed key length and block length. We discuss approaches for building block ciphers, and some popular candidate block ciphers, in Chapter 6. For the purposes of the present chapter the details of these constructions are unimportant, and for now we simply assume that (strong) pseudorandom permutations exist.
Pseudorandom functions and pseudorandom generators. As one might expect, there is a close relationship between pseudorandom functions and pseudorandom generators. It is fairly easy to construct a pseudorandom generator G from a pseudorandom function F by simply evaluating F on a series of different inputs; e.g., we can define $G(s) \stackrel{\text{def}}{=} F_s(1) \,\|\, F_s(2) \,\|\, \cdots \,\|\, F_s(l)$ for any desired l. If $F_s$ were replaced by a uniform function f, the output of G would be uniform; thus, when using F instead, the output is pseudorandom. You are asked to prove this formally in Exercise 3.14.
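A minimal sketch of this generator follows. It assumes HMAC-SHA256 as a stand-in for the pseudorandom function F; this is an illustrative choice only, and unlike the length-preserving F of the text it maps a 16-byte key and a 16-byte input encoding to a 32-byte output. The names `F` and `G` mirror the text; the encoding of the integers 1, ..., l as 16-byte strings is also an assumption.

```python
import hashlib
import hmac
import secrets


def F(s: bytes, x: int) -> bytes:
    """Stand-in PRF (assumption): HMAC-SHA256 keyed by s, applied to the integer x."""
    return hmac.new(s, x.to_bytes(16, "big"), hashlib.sha256).digest()


def G(s: bytes, l: int) -> bytes:
    """G(s) = F_s(1) || F_s(2) || ... || F_s(l)."""
    return b"".join(F(s, i) for i in range(1, l + 1))


seed = secrets.token_bytes(16)
output = G(seed, 4)      # four 32-byte blocks
```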
More generally, we can use the above idea to construct a stream cipher (Init, GetBits) that accepts an initialization vector IV. (See Section 3.3.1.) The only difference is that instead of evaluating $F_s$ on the fixed input sequence 1, 2, 3, ..., we evaluate $F_s$ on the inputs IV + 1, IV + 2, ....
CONSTRUCTION 3.29
Let F be a pseudorandom function. Define a stream cipher (Init, GetBits), where each call to GetBits outputs n bits, as follows:
• Init: on input s ∈ {0,1}^n and IV ∈ {0,1}^n, set st_0 := (s, IV).
• GetBits: on input st_i = (s, IV), compute IV′ := IV + 1 and set y := F_s(IV′) and st_{i+1} := (s, IV′). Output (y, st_{i+1}).
A stream cipher from any pseudorandom function/block cipher.
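As a rough illustration of Construction 3.29, the following Python sketch implements Init and GetBits. It rests on assumptions that are not part of the construction: HMAC-SHA256 stands in for the pseudorandom function F, the block length is fixed at 256 bits, and IV + 1 is computed modulo 2^n so that it remains an n-bit value. The names prf, init, get_bits, and keystream are illustrative only.

```python
import hashlib
import hmac

N_BYTES = 32                  # block length n, in bytes (an assumption)
MODULUS = 1 << (8 * N_BYTES)  # 2^n, so that IV + 1 stays an n-bit value


def prf(s: bytes, x: int) -> bytes:
    """Stand-in for F_s(x): HMAC-SHA256 over the n-bit encoding of x (assumption)."""
    return hmac.new(s, x.to_bytes(N_BYTES, "big"), hashlib.sha256).digest()


def init(s: bytes, iv: int):
    """Init: on input s and IV, set st_0 := (s, IV)."""
    return (s, iv % MODULUS)


def get_bits(st):
    """GetBits: IV' := IV + 1, y := F_s(IV'), st_{i+1} := (s, IV')."""
    s, iv = st
    iv_next = (iv + 1) % MODULUS
    y = prf(s, iv_next)
    return y, (s, iv_next)


def keystream(s: bytes, iv: int, num_blocks: int) -> bytes:
    """Run Init once and GetBits num_blocks times; concatenate the outputs."""
    st = init(s, iv)
    blocks = []
    for _ in range(num_blocks):
        y, st = get_bits(st)
        blocks.append(y)
    return b"".join(blocks)
```

Calling keystream(s, iv, l) yields F_s(IV + 1) ∥ ··· ∥ F_s(IV + l), which is the pad that would be XORed with a plaintext of l blocks.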
Although stream ciphers can be constructed from block ciphers, dedicated stream ciphers used in practice typically have better performance, especially in resource-constrained environments. On the other hand, stream ciphers appear to be less well understood (in practice) than block ciphers, and confidence in their security is lower. It is therefore recommended to use block ciphers (possibly by converting them to stream ciphers first) whenever possible.
Considering the other direction, a pseudorandom generator G immediately gives a pseudorandom function F with small block length. Specifically, say G has expansion factor n · 2^{t(n)}. We can define the keyed function F : {0,1}^n × {0,1}^{t(n)} → {0,1}^n as follows: to compute F_k(i), first compute G(k) and interpret the result as a look-up table with 2^{t(n)} rows each containing n bits; output the ith row. This runs in polynomial time only if t(n) = O(log n). It is possible, though more difficult, to construct pseudorandom functions with large block length from pseudorandom generators; this is shown in Section 7.5. Pseudorandom generators, in turn, can be constructed based on certain mathematical problems conjectured to be hard. The existence of pseudorandom functions based on these hard mathematical problems represents one of the amazing contributions of modern cryptography.
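The look-up-table construction can be sketched in a few lines of Python. This is only an illustration under assumptions: SHAKE-256 stands in for the pseudorandom generator G, lengths are measured in bytes rather than bits, and the names prg and small_domain_prf are hypothetical.

```python
import hashlib


def prg(seed: bytes, out_len: int) -> bytes:
    """Stand-in for G: expand the seed to out_len bytes with SHAKE-256 (assumption)."""
    return hashlib.shake_256(seed).digest(out_len)


def small_domain_prf(k: bytes, i: int, t: int) -> bytes:
    """F_k(i) for i in {0, ..., 2^t - 1}: row i of the table G(k)."""
    n_bytes = len(k)                  # the key length plays the role of n
    rows = 1 << t                     # the table has 2^t rows
    if not 0 <= i < rows:
        raise ValueError("input must be a t-bit value")
    table = prg(k, n_bytes * rows)    # expansion factor n * 2^t
    return table[i * n_bytes:(i + 1) * n_bytes]
```

Every evaluation recomputes the whole table, so the running time grows with 2^t; this is the reason the text requires t(n) = O(log n).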
3.5.2 CPA-Secure Encryption from Pseudorandom Functions
We focus here on constructing a CPA-secure, fixed-length encryption scheme. By what we have said at the end of Section 3.4.2, this implies the existence of a CPA-secure encryption scheme for arbitrary-length messages. In Section 3.6 we will discuss more efficient ways of encrypting messages of arbitrary length.
A naive attempt at constructing a secure encryption scheme from a pseudorandom permutation is to define Enc_k(m) = F_k(m). Although we expect that this “reveals no information about m” (since, if f is a uniform function, then f(m) is simply a uniform n-bit string), this method of encryption is deterministic and so cannot possibly be CPA-secure. In particular, encrypting the same plaintext twice will yield the same ciphertext.
Our secure construction is randomized. Specifically, we encrypt by applying the pseudorandom function to a random value r (rather than the message) and XORing the result with the plaintext. (See Figure 3.3 and Construction 3.30.) This can again be viewed as an instance of XORing a pseudorandom pad with the plaintext, with the major difference being the fact that a fresh pseudorandom pad is used each time. (In fact, the pseudorandom pad is only “fresh” if the pseudorandom function is applied to a “fresh” value on which it has never been applied before. While it is possible that a random r will be equal to some r-value chosen previously, this happens with only negligible probability.)
FIGURE 3.3: Encryption with a pseudorandom function.
Proofs of security based on pseudorandom functions. Before turning to the proof that the above construction is CPA-secure, we highlight a common template that is used by most proofs of security (even outside the context of encryption) for constructions based on pseudorandom functions. The first step of such proofs is to consider a hypothetical version of the construction in which the pseudorandom function is replaced with a random function. It is then argued—using a proof by reduction—that this modification does not significantly affect the attacker’s success probability. We are then left with analyzing a scheme that uses a completely random function. At this point the rest of the proof typically relies on probabilistic analysis and does not rely on any computational assumptions. We will utilize this proof template several times in this and the next chapter.
CONSTRUCTION 3.30
Let F be a pseudorandom function. Define a private-key encryption scheme for messages of length n as follows:
• Gen: on input 1^n, choose uniform k ∈ {0,1}^n and output it.
• Enc: on input a key k ∈ {0,1}^n and a message m ∈ {0,1}^n, choose uniform r ∈ {0,1}^n and output the ciphertext c := ⟨r, F_k(r) ⊕ m⟩.
• Dec: on input a key k ∈ {0,1}^n and a ciphertext c = ⟨r, s⟩, output the plaintext message m := F_k(r) ⊕ s.
A CPA-secure encryption scheme from any pseudorandom function.
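Construction 3.30 translates almost line for line into code. The sketch below is a minimal illustration, assuming HMAC-SHA256 as a stand-in for the pseudorandom function F and a fixed length of n = 256 bits; the function names gen, enc, and dec mirror the algorithms of the construction but are otherwise illustrative.

```python
import hashlib
import hmac
import secrets

N = 32  # key, message, and block length in bytes (an assumption: n = 256 bits)


def F(k: bytes, r: bytes) -> bytes:
    """Stand-in for the pseudorandom function F_k(r) (assumption: HMAC-SHA256)."""
    return hmac.new(k, r, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def gen() -> bytes:
    """Gen: choose a uniform key k."""
    return secrets.token_bytes(N)


def enc(k: bytes, m: bytes):
    """Enc: choose uniform r and output <r, F_k(r) XOR m>."""
    if len(m) != N:
        raise ValueError("this scheme encrypts messages of length exactly n")
    r = secrets.token_bytes(N)
    return r, xor(F(k, r), m)


def dec(k: bytes, c) -> bytes:
    """Dec: on ciphertext <r, s>, output F_k(r) XOR s."""
    r, s = c
    return xor(F(k, r), s)
```

Because a fresh r is drawn on every call, encrypting the same message twice gives two different ciphertexts (except with negligible probability), in contrast to the deterministic attempt Enc_k(m) = F_k(m).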
THEOREM 3.31 If F is a pseudorandom function, then Construction 3.30 is a CPA-secure private-key encryption scheme for messages of length n.
PROOF Let Π̃ = (G̃en, Ẽnc, D̃ec) be an encryption scheme that is exactly the same as Π = (Gen, Enc, Dec) from Construction 3.30, except that a truly random function f is used in place of F_k. That is, G̃en(1^n) chooses a uniform function f ∈ Func_n, and Ẽnc encrypts just like Enc except that f is used instead of F_k. (This modified encryption scheme is not efficient. But we can still define it as a hypothetical encryption scheme for the sake of the proof.)
Fix an arbitrary ppt adversary A, and let q(n) be an upper bound on the number of queries that A(1^n) makes to its encryption oracle. (Note that q must be upper-bounded by some polynomial.) As the first step of the proof, we show that there is a negligible function negl such that
$$\left| \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n) = 1\right] - \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right] \right| \le \mathsf{negl}(n). \tag{3.8}$$
We prove this by reduction. We use A to construct a distinguisher D for the pseudorandom function F. The distinguisher D is given oracle access to some function O, and its goal is to determine whether this function is “pseudorandom” (i.e., equal to Fk for uniform k ∈ {0, 1}n) or “random” (i.e., equal to f for uniform f ∈ Funcn). To do this, D emulates experiment PrivKcpa for A in the manner described below, and observes whether A succeeds or not. If A succeeds then D guesses that its oracle must be a pseudorandom function, whereas if A does not succeed then D guesses that its oracle must be a random function. In detail:
Distinguisher D:
D is given input 1^n and access to an oracle O : {0,1}^n → {0,1}^n.
1. Run A(1^n). Whenever A queries its encryption oracle on a message m ∈ {0,1}^n, answer this query in the following way:
(a) Choose uniform r ∈ {0,1}^n.
(b) Query O(r) and obtain response y.
(c) Return the ciphertext ⟨r, y ⊕ m⟩ to A.
2. When A outputs messages m_0, m_1 ∈ {0,1}^n, choose a uniform bit b ∈ {0,1} and then:
(a) Choose uniform r ∈ {0,1}^n.
(b) Query O(r) and obtain response y.
(c) Return the challenge ciphertext ⟨r, y ⊕ m_b⟩ to A.
3. Continue answering encryption-oracle queries of A as before until A outputs a bit b′. Output 1 if b′ = b, and 0 otherwise.
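The structure of D can be made concrete with a small Python sketch. Everything about the interface is an assumption made for illustration: the adversary is modeled as an object with two hypothetical methods, choose and guess, the oracle is any callable on n-bit strings, and random_function lazily samples a uniform function. The sample adversary is deliberately trivial; it only shows how D wraps an adversary.

```python
import secrets

N = 16  # block length in bytes (an assumption)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def distinguisher(oracle, adversary) -> int:
    """Emulate the CPA experiment for `adversary`, using `oracle` in place of F_k."""

    def enc_oracle(m: bytes):
        r = secrets.token_bytes(N)            # (a) choose uniform r
        y = oracle(r)                         # (b) query O(r)
        return r, xor(y, m)                   # (c) return <r, y XOR m>

    m0, m1 = adversary.choose(enc_oracle)     # step 1: A queries, then outputs m0, m1
    b = secrets.randbelow(2)                  # step 2: uniform bit b
    challenge = enc_oracle(m1 if b else m0)
    b_guess = adversary.guess(challenge, enc_oracle)  # step 3: A outputs b'
    return 1 if b_guess == b else 0


def random_function():
    """Lazily sampled uniform function f: a fresh uniform output for each new input."""
    table = {}

    def f(x: bytes) -> bytes:
        if x not in table:
            table[x] = secrets.token_bytes(N)
        return table[x]

    return f


class ReadSecondComponent:
    """Hypothetical adversary that treats the second ciphertext component as m_b."""

    def choose(self, enc_oracle):
        return bytes(N), b"\xff" * N

    def guess(self, challenge, enc_oracle):
        _, s = challenge
        return 1 if s == b"\xff" * N else 0
```

If the oracle is badly non-random (say, constant zero), this adversary always wins and D always outputs 1; against random_function() its guess is correct only about half the time. The gap between those two behaviors is what D exploits in the reduction.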
D runs in polynomial time since A does. The key points are as follows:
1. If D’s oracle is a pseudorandom function, then the view of A when run as a subroutine by D is distributed identically to the view of A in experiment PrivK^cpa_{A,Π}(n). This is because, in this case, a key k is chosen uniformly at random and then every encryption is carried out by choosing a uniform r, computing y := F_k(r), and setting the ciphertext equal to ⟨r, y ⊕ m⟩, exactly as in Construction 3.30. Thus,
$$\Pr_{k \leftarrow \{0,1\}^n}\left[D^{F_k(\cdot)}(1^n) = 1\right] = \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n) = 1\right], \tag{3.9}$$
where we emphasize that k is chosen uniformly on the left-hand side.
2. If D’s oracle is a random function, then the view of A when run as a subroutine by D is distributed identically to the view of A in experiment PrivK^cpa_{A,Π̃}(n). This can be seen exactly as above, with the only difference being that a uniform function f ∈ Func_n is used instead of F_k. Thus,
$$\Pr_{f \leftarrow \mathsf{Func}_n}\left[D^{f(\cdot)}(1^n) = 1\right] = \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right], \tag{3.10}$$
where f is chosen uniformly from Func_n on the left-hand side.
By the assumption that F is a pseudorandom function (and since D is efficient), there exists a negligible function negl for which
$$\left| \Pr\left[D^{F_k(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n).$$
Combining the above with Equations (3.9) and (3.10) gives Equation (3.8).
For the second part of the proof, we show that
$$\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right] \le \frac{1}{2} + \frac{q(n)}{2^n}. \tag{3.11}$$
(Recall that q(n) is a bound on the number of encryption queries made by A. The above holds even if we place no computational restrictions on A.) To see that Equation (3.11) holds, observe that every time a message m is encrypted in PrivK^cpa_{A,Π̃}(n) (either by the encryption oracle or when the challenge ciphertext is computed), a uniform r ∈ {0,1}^n is chosen and the ciphertext is set equal to ⟨r, f(r) ⊕ m⟩. Let r∗ denote the random string used when generating the challenge ciphertext ⟨r∗, f(r∗) ⊕ m_b⟩. There are two possibilities:
1. The value r∗ is never used when answering any of A’s encryption-oracle queries: In this case, A learns nothing about f(r∗) from its interaction with the encryption oracle (since f is a truly random function). This means that, as far as A is concerned, the value f(r∗) that is XORed with m_b is uniformly distributed and independent of the rest of the experiment, and so the probability that A outputs b′ = b in this case is exactly 1/2 (as in the case of the one-time pad).
2. The value r∗ is used when answering at least one of A’s encryption-oracle queries: In this case, A may easily determine whether m_0 or m_1 was encrypted. This is so because if the encryption oracle ever returns a ciphertext ⟨r∗, s⟩ in response to a request to encrypt the message m, the adversary learns that f(r∗) = s ⊕ m.
However, since A makes at most q(n) queries to its encryption oracle (and thus at most q(n) values of r are used when answering A’s encryption-oracle queries), and since r∗ is chosen uniformly from {0,1}^n, the probability of this event is at most q(n)/2^n.
Let Repeat denote the event that r∗ is used by the encryption oracle when answering at least one of A’s queries. As just discussed, the probability of Repeat is at most q(n)/2^n, and the probability that A succeeds in PrivK^cpa_{A,Π̃} if Repeat does not occur is exactly 1/2. Therefore:
$$\begin{aligned}
\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right]
&= \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1 \wedge \mathsf{Repeat}\right] + \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1 \wedge \overline{\mathsf{Repeat}}\right] \\
&\le \Pr[\mathsf{Repeat}] + \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1 \mid \overline{\mathsf{Repeat}}\right] \le \frac{q(n)}{2^n} + \frac{1}{2}.
\end{aligned}$$
Combining the above with Equation (3.8), we see that there is a negligible function negl such that Pr[PrivK^cpa_{A,Π}(n) = 1] ≤ 1/2 + q(n)/2^n + negl(n). Since q is polynomial, q(n)/2^n is negligible. In addition, the sum of two negligible functions is negligible, and thus there exists a negligible function negl′ such that Pr[PrivK^cpa_{A,Π}(n) = 1] ≤ 1/2 + negl′(n), completing the proof.
Concrete security. The above proof shows that
$$\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \frac{q(n)}{2^n} + \mathsf{negl}(n)$$
for some negligible function negl. The final term depends on the security of F as a pseudorandom function; it is a bound on the distinguishing probability of algorithm D (which has roughly the same running time as the adversary A). The term q(n)/2^n represents a bound on the probability that the value r∗ used to encrypt the challenge ciphertext was used to encrypt some other message.
3.6 Modes of Operation
Modes of operation provide a way to securely (and efficiently) encrypt long messages using stream or block ciphers.
3.6.1 Stream-Cipher Modes of Operation
Construction 3.17 provides a way to construct an encryption scheme using a pseudorandom generator. That scheme has two main drawbacks. First, as presented, the length of the message to be encrypted must be fixed and known in advance. Second, the scheme is only EAV-secure, not CPA-secure.
Stream ciphers, which can be viewed as flexible pseudorandom generators, can be used to address these drawbacks. In practice, stream ciphers are used for encryption in two ways: synchronized mode and unsynchronized mode.
Synchronized mode. In this stateful encryption scheme, the sender and receiver must be synchronized in the sense that they know how much plaintext has been encrypted (resp., decrypted) so far. Synchronized mode is typically used in a single communication session between parties (see Section 4.5.3), where statefulness is acceptable and messages are received in order without being lost. The intuition here is that a long pseudorandom stream is generated, and a different part of it is used to encrypt each message. Synchronization is needed to ensure correct decryption (i.e., so the receiver knows which part of the stream was used to encrypt the next message), and to ensure that no portion of the stream is re-used. We describe this mode in detail next.
FIGURE 3.4: Synchronized mode and unsynchronized mode.
We have seen in Algorithm 3.16 that a stream cipher can be used to construct a pseudorandom generator G_l with any desired expansion factor l. We can easily modify that algorithm to obtain a pseudorandom generator G∞ with variable output length. G∞ takes two inputs: a seed s and a desired output length 1^l (we specify this in unary since G∞ will run in time polynomial in l). As in Algorithm 3.16, G∞(s, 1^l) runs Init(s) and then repeatedly runs GetBits a total of l times.
We can use G∞ in Construction 3.17 to handle encryption of arbitrary-length messages: encryption of a message m using the key k is done by computing the ciphertext c := G∞(k, 1^{|m|}) ⊕ m; decryption of a ciphertext c using the key k is carried out by computing the message m := G∞(k, 1^{|c|}) ⊕ c. A minor modification of the proof of Theorem 3.18 shows that if the stream cipher is secure then this encryption scheme has indistinguishable encryptions in the presence of an eavesdropper.
A little thought shows that if the communicating parties are willing to maintain state, then they can use the same key to encrypt multiple messages. (See Figure 3.4.) The conceptual insight is that the parties can treat multiple messages m_1, m_2, … as a single, long message; furthermore, Construction 3.17 (as well as the modified version in the previous paragraph) has the property that initial portions of a message can be encrypted and transmitted even if the rest of the message is not yet known. Concretely, the parties share a key k and both begin by computing st_0 := Init(k). To encrypt the first message m_1 of length l_1, the sender repeatedly runs GetBits a total of l_1 times, beginning at st_0, to obtain a stream of bits pad_1 := y_1, …, y_{l_1} along with updated state st_{l_1}; it then sends c_1 := pad_1 ⊕ m_1. Upon receiving c_1, the other party repeatedly runs GetBits a total of l_1 times to obtain the same values pad_1 and st_{l_1}; it uses pad_1 to recover m_1 := pad_1 ⊕ c_1. Later, to encrypt a second message m_2 of length l_2, the sender will repeatedly run GetBits a total of l_2 times, beginning at st_{l_1}, to obtain pad_2 := y_{l_1+1}, …, y_{l_1+l_2} and updated state st_{l_1+l_2}, and then compute the ciphertext c_2 := pad_2 ⊕ m_2, and so on. This can continue indefinitely, allowing the parties to send an unlimited number of messages of arbitrary length. We remark that in this mode, the stream cipher does not need to use an IV.
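The bookkeeping in synchronized mode can be sketched as a small Python class. This is an illustration under assumptions: the stream is produced by HMAC-SHA256 applied to a running counter (in the spirit of Construction 3.29, without an IV), state is tracked in bytes rather than bits, and the class name SyncStreamParty is hypothetical.

```python
import hashlib
import hmac


class SyncStreamParty:
    """One endpoint of a synchronized-mode session; both parties start from Init(k)."""

    def __init__(self, k: bytes):
        self._k = k
        self._counter = 0    # part of the state st: stream blocks generated so far
        self._unused = b""   # part of the state st: generated but unused pad bytes

    def _next_pad(self, length: int) -> bytes:
        # Extend the stream until it covers `length` bytes, then consume them.
        while len(self._unused) < length:
            self._counter += 1
            block = hmac.new(self._k, self._counter.to_bytes(16, "big"),
                             hashlib.sha256).digest()
            self._unused += block
        pad, self._unused = self._unused[:length], self._unused[length:]
        return pad

    def process(self, data: bytes) -> bytes:
        """Encrypt or decrypt: XOR with the next unused portion of the stream."""
        pad = self._next_pad(len(data))
        return bytes(a ^ b for a, b in zip(data, pad))
```

A sender and a receiver each create SyncStreamParty(k) and call process on the messages (resp., ciphertexts) in the same order. A receiver that skips a ciphertext is out of sync and recovers garbage from every later one.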
This method of encrypting multiple messages requires the communicating parties to maintain synchronized state, explaining the terminology “synchronized mode.” For that reason, the method is appropriate when two parties are communicating within a single “session,” but it does not work well for sporadic communication or when a party might, over time, communicate from different devices. (It is relatively easy to keep copies of a fixed key in different locations; it is harder to maintain synchronized state across multiple locations.) Furthermore, if the parties ever get out of sync (e.g., because one of the transmissions between the parties is dropped), decryption will return an incorrect result. Resynchronization is possible, but adds overhead.
Unsynchronized mode. For stream ciphers whose Init function accepts an initialization vector as input, we can achieve stateless CPA-secure encryption for messages of arbitrary length. Here we modify G∞ to accept three inputs: a seed s, an initialization vector IV, and a desired output length 1^l. Now this algorithm first computes st_0 := Init(s, IV) before repeatedly running GetBits a total of l times. Encryption can then be carried out using a variant of Construction 3.30: the encryption of a message m using the key k is done by choosing a uniform initialization vector IV ∈ {0,1}^n and computing the ciphertext ⟨IV, G∞(k, IV, 1^{|m|}) ⊕ m⟩; decryption is performed in the natural way. (See Figure 3.4.) This scheme is CPA-secure if the stream cipher now has the stronger property that, for any polynomial l, the function F defined by F_k(IV) := G∞(k, IV, 1^l) is a pseudorandom function. (In fact, F need only be pseudorandom when evaluated on uniform inputs. Keyed functions with this weaker property are called weak pseudorandom functions.)
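A stateless counterpart can be sketched as follows. The stand-in for G∞ is an assumption made purely for illustration: SHAKE-256 applied to k ∥ IV, with a fixed-length key so that the concatenation is unambiguous. Whether a real stream cipher has the required (weak) pseudorandom-function property is a separate question that this sketch does not address.

```python
import hashlib
import secrets

IV_BYTES = 16  # IV length in bytes (an assumption)


def g_inf(k: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in for G_inf(k, IV, 1^l): `length` bytes of SHAKE-256 over k || IV."""
    return hashlib.shake_256(k + iv).digest(length)


def enc(k: bytes, m: bytes):
    """Choose a fresh uniform IV and output <IV, G_inf(k, IV, 1^|m|) XOR m>."""
    iv = secrets.token_bytes(IV_BYTES)
    pad = g_inf(k, iv, len(m))
    return iv, bytes(a ^ b for a, b in zip(pad, m))


def dec(k: bytes, c) -> bytes:
    """Recompute the pad from the IV carried in the ciphertext and XOR it off."""
    iv, body = c
    pad = g_inf(k, iv, len(body))
    return bytes(a ^ b for a, b in zip(pad, body))
```

No state is kept between calls: everything the receiver needs, apart from the key, travels in the ciphertext.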
3.6.2 Block-Cipher Modes of Operation
We have already seen a construction of a CPA-secure encryption scheme based on pseudorandom functions/block ciphers. But Construction 3.30 (and its extension to arbitrary-length messages as discussed at the end of Section 3.4.2) has the drawback that the length of the ciphertext is double the length of the plaintext. Block-cipher modes of operation provide a way of encrypting arbitrary-length messages using shorter ciphertexts.
In this section, let F be a block cipher with block length n. We assume here that all messages m being encrypted have length a multiple of n, and write m = m_1, m_2, …, m_l where each m_i ∈ {0,1}^n represents a block of the plaintext. Messages whose length is not a multiple of n can always be unambiguously padded to have length a multiple of n by appending a 1 followed by sufficiently many 0s, and so this assumption is without much loss of generality.
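The padding rule just mentioned is easy to state in code. The sketch below works at byte granularity, which is an assumption on top of the bit-level description: the byte 0x80 (a 1 bit followed by seven 0 bits) is appended, followed by as many 0x00 bytes as needed. The names pad and unpad are illustrative.

```python
def pad(m: bytes, block_bytes: int = 16) -> bytes:
    """Append 0x80, then 0x00 bytes up to the next multiple of block_bytes."""
    padded = m + b"\x80"
    padded += b"\x00" * (-len(padded) % block_bytes)
    return padded


def unpad(padded: bytes) -> bytes:
    """Strip trailing 0x00 bytes and the single 0x80 marker that precedes them."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]
```

The padding is unambiguous because the marker is always added, even when the message length is already a multiple of the block length; in that case a full extra block is appended.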
Several block-cipher modes of operation are known; we present four of the most common ones and discuss their security.
Electronic Code Book (ECB) mode. This is a naive mode of operation in which the ciphertext is obtained by direct application of the block cipher to each plaintext block. That is, c := ⟨F_k(m_1), F_k(m_2), …, F_k(m_l)⟩; see Figure 3.5. Decryption is done in the obvious way, using the fact that F_k^{-1} is efficiently computable.
FIGURE 3.5: Electronic Code Book (ECB) mode.
ECB mode is deterministic and therefore cannot be CPA-secure. Worse, ECB-mode encryption does not even have indistinguishable encryptions in the presence of an eavesdropper. This is because if a block is repeated in the plaintext, it will result in a repeating block in the ciphertext. Thus, it is easy to distinguish an encryption of a plaintext that consists of two identical blocks from an encryption of a plaintext that consists of two different blocks. This is not just a theoretical problem. Consider encrypting an image in which small groups of pixels correspond to a plaintext block. Encrypting using ECB mode may reveal a significant amount of information about patterns in the image, something that should not happen when using a secure encryption scheme. Figure 3.6 demonstrates this.
For these reasons, ECB mode should never be used. (We include it only because of its historical significance.)
FIGURE 3.6: An illustration of the dangers of using ECB mode. The middle figure is an encryption of the image on the left using ECB mode; the figure on the right is an encryption of the same image using a secure mode. (Taken from http://en.wikipedia.org and derived from images created by Larry Ewing (lewing@isc.tamu.edu) using The GIMP.)
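The repeated-block leakage of ECB mode can be reproduced in a few lines. Since the Python standard library has no block cipher, the sketch assumes a toy stand-in: a 4-round Feistel network over 16-byte blocks whose round function is HMAC-SHA256. It is a keyed permutation suitable for illustration only, and the names toy_block_enc and ecb_enc are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16          # block length in bytes (an assumption)
HALF = BLOCK // 2


def _round(k: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network (assumption: truncated HMAC-SHA256).
    return hmac.new(k, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_enc(k: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation on 16-byte blocks (illustration only)."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(k, i, right))
    return left + right


def ecb_enc(k: bytes, m: bytes) -> bytes:
    """ECB mode: apply the block cipher to each plaintext block independently."""
    if len(m) % BLOCK:
        raise ValueError("message length must be a multiple of the block length")
    blocks = [m[i:i + BLOCK] for i in range(0, len(m), BLOCK)]
    return b"".join(toy_block_enc(k, b) for b in blocks)
```

Encrypting b"A" * 32 gives two identical ciphertext blocks, whereas encrypting b"A" * 16 + b"B" * 16 gives two different ones, so an eavesdropper can tell the two plaintexts apart without knowing the key.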
FIGURE 3.7: Cipher Block Chaining (CBC) mode.
Cipher Block Chaining (CBC) mode. To encrypt using this mode, a uniform initialization vector (IV) of length n is first chosen. Then, ciphertext blocks are generated by applying the block cipher to the XOR of the current plaintext block and the previous ciphertext block. That is, set c_0 := IV and then, for i = 1 to l, set c_i := F_k(c_{i−1} ⊕ m_i). The final ciphertext is ⟨c_0, c_1, …, c_l⟩. (See Figure 3.7.) Decryption of a ciphertext c_0, …, c_l is done by computing m_i := F_k^{-1}(c_i) ⊕ c_{i−1} for i = 1, …, l. We stress that the IV is included in the ciphertext; this is crucial so decryption can be carried out.
Importantly, encryption in CBC mode is probabilistic and it has been proven that if F is a pseudorandom permutation then CBC-mode encryption is CPA-secure. The main drawback of this mode is that encryption must be carried out sequentially because the ciphertext block c_{i−1} is needed in order to encrypt the plaintext block m_i. Thus, if parallel processing is available, CBC-mode encryption may not be the most efficient choice.
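The chaining rule and its inverse can be sketched as follows. As before, the block cipher is an assumed toy stand-in (a 4-round Feistel network with an HMAC-SHA256 round function over 16-byte blocks), used only so that the example is self-contained; cbc_enc and cbc_dec are illustrative names.

```python
import hashlib
import hmac
import secrets

BLOCK = 16          # block length in bytes (an assumption)
HALF = BLOCK // 2


def _round(k: bytes, i: int, half: bytes) -> bytes:
    return hmac.new(k, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def block_enc(k: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation standing in for F_k (illustration only)."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(k, i, right))
    return left + right


def block_dec(k: bytes, block: bytes) -> bytes:
    """Inverse permutation, standing in for F_k^{-1}."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(k, i, left)), left
    return left + right


def cbc_enc(k: bytes, m: bytes) -> bytes:
    """c_0 := IV, c_i := F_k(c_{i-1} XOR m_i); output c_0, c_1, ..., c_l."""
    if len(m) % BLOCK:
        raise ValueError("message length must be a multiple of the block length")
    prev = secrets.token_bytes(BLOCK)       # uniform IV
    out = [prev]
    for i in range(0, len(m), BLOCK):
        prev = block_enc(k, _xor(prev, m[i:i + BLOCK]))
        out.append(prev)
    return b"".join(out)


def cbc_dec(k: bytes, c: bytes) -> bytes:
    """m_i := F_k^{-1}(c_i) XOR c_{i-1} for i = 1, ..., l."""
    blocks = [c[i:i + BLOCK] for i in range(0, len(c), BLOCK)]
    return b"".join(_xor(block_dec(k, blocks[i]), blocks[i - 1])
                    for i in range(1, len(blocks)))
```

Note that the loop in cbc_enc is inherently sequential, while each term in cbc_dec depends only on two ciphertext blocks and could be computed independently.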
One may be tempted to think that it suffices to use a distinct IV (rather than a random IV) for every encryption, e.g., to first use IV = 1 and then increment the IV by one each time a message is encrypted. In Exercise 3.20, we ask you to show that this variant of CBC-mode encryption is not secure.
One might also consider a stateful variant of CBC-mode encryption, called chained CBC mode, in which the last block of the previous ciphertext is used as the IV when encrypting the next message. This reduces the bandwidth, as the IV need not be sent each time. See Figure 3.8, where an initial message m_1, m_2, m_3 is encrypted using a random IV, and then subsequently a second message m_4, m_5 is encrypted using c_3 as the IV. (In contrast, encryption using stateless CBC mode would generate a fresh IV when encrypting the second message.) Chained CBC mode is used in SSL 3.0 and TLS 1.0.
It may appear that chained CBC mode is as secure as CBC mode, since the chained-CBC encryption of m_1, m_2, m_3 followed by encryption of m_4, m_5 yields the same ciphertext blocks as CBC-mode encryption of the (single) message m_1, m_2, m_3, m_4, m_5. Nevertheless, chained CBC mode is vulnerable to a chosen-plaintext attack. The basis of the attack is that the adversary knows in advance the “initialization vector” that will be used for the second encrypted message. We describe the attack informally, based on Figure 3.8. Assume the attacker knows that m_1 ∈ {m_1^0, m_1^1}, and observes the first ciphertext IV, c_1, c_2, c_3. The attacker then requests an encryption of a second message m_4, m_5 with m_4 = IV ⊕ m_1^0 ⊕ c_3, and observes a second ciphertext c_4, c_5. Since c_4 = F_k(c_3 ⊕ m_4) = F_k(IV ⊕ m_1^0) and c_1 = F_k(IV ⊕ m_1), one can verify that m_1 = m_1^0 if and only if c_4 = c_1. This example should serve as a strong warning against making any modifications to cryptographic schemes, even if those modifications seem benign.
FIGURE 3.8: Chained CBC.
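The attack on chained CBC mode can be simulated end to end. The sketch below is an illustration under assumptions: the block cipher is the same kind of toy Feistel stand-in (HMAC-SHA256 round function, 16-byte blocks), and a single hypothetical function plays both roles, with comments marking which steps belong to the honest parties (who hold k) and which to the attacker (who does not).

```python
import hashlib
import hmac
import secrets

BLOCK = 16          # block length in bytes (an assumption)
HALF = BLOCK // 2


def _round(k: bytes, i: int, half: bytes) -> bytes:
    return hmac.new(k, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def block_enc(k: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation standing in for F_k (illustration only)."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(k, i, right))
    return left + right


def cbc_blocks(k: bytes, iv: bytes, m: bytes):
    """CBC encryption with a given IV; returns the list [c_1, ..., c_l]."""
    prev, out = iv, []
    for i in range(0, len(m), BLOCK):
        prev = block_enc(k, _xor(prev, m[i:i + BLOCK]))
        out.append(prev)
    return out


def attack_concludes_m1_is(k: bytes, m1: bytes, m1_0: bytes) -> bool:
    """Simulate the attack; True iff the attacker concludes that m_1 = m_1^0."""
    # Honest sender: encrypt m_1, m_2, m_3 under a random IV.
    iv = secrets.token_bytes(BLOCK)
    m2, m3 = bytes(BLOCK), bytes(BLOCK)
    c1, c2, c3 = cbc_blocks(k, iv, m1 + m2 + m3)
    # Attacker: sees IV, c_1, c_2, c_3 and knows c_3 will be the next "IV".
    m4 = _xor(_xor(iv, m1_0), c3)
    m5 = bytes(BLOCK)
    # Honest sender (acting as encryption oracle): chained CBC uses c_3 as the IV.
    c4, c5 = cbc_blocks(k, c3, m4 + m5)
    # Attacker: c_4 = F_k(IV XOR m_1^0), which equals c_1 iff m_1 = m_1^0.
    return c4 == c1
```

Because F_k is a permutation, the comparison c_4 = c_1 identifies the first plaintext block with certainty, not merely with high probability.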
Output Feedback (OFB) mode. The third mode we present can be viewed as an unsynchronized stream-cipher mode, where the stream cipher is constructed in a specific way from the underlying block cipher. We describe the mode directly. First, a uniform IV ∈ {0,1}^n is chosen. Then, a pseudorandom stream is generated from IV in the following way: Define y_0 := IV, and set the ith block y_i of the stream to be y_i := F_k(y_{i−1}). Each block of the plaintext is encrypted by XORing it with the appropriate block of the stream; that is, c_i := y_i ⊕ m_i. (See Figure 3.9.) As in CBC mode, the IV is included as part of the ciphertext to enable decryption. However, in contrast to CBC mode, here it is not required that F be invertible. (In fact, it need not even be a permutation.) Furthermore, as in stream-cipher modes of operation, here it is not necessary for the plaintext length to be a multiple of the block length. Instead, the generated stream can be truncated to exactly the plaintext length. Another advantage of OFB mode is that its stateful variant (in which the final value y_l used to encrypt some message is used as the IV for encrypting the next message) is secure. This stateful variant is equivalent to a synchronized stream-cipher mode, with the stream cipher constructed from the block cipher in the specific manner just described.
FIGURE 3.9: Output Feedback (OFB) mode.
OFB mode can be shown to be CPA-secure if F is a pseudorandom function. Although both encryption and decryption must be carried out sequentially, this mode has the advantage relative to CBC mode that the bulk of the computation (namely, computation of the pseudorandom stream) can be done independently of the actual message to be encrypted. So, it is possible to generate a pseudorandom stream ahead of time using preprocessing, after which point encryption of the plaintext (once it is known) is incredibly fast.
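A minimal sketch of OFB mode follows, assuming truncated HMAC-SHA256 as a stand-in for F (which, as noted above, need not be invertible). The helper ofb_stream is separated out to show that the stream depends only on the key and the IV and can therefore be precomputed; all names are illustrative.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block length in bytes (an assumption)


def F(k: bytes, x: bytes) -> bytes:
    """Stand-in for F_k (assumption: HMAC-SHA256 truncated to one block)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]


def ofb_stream(k: bytes, iv: bytes, length: int) -> bytes:
    """y_0 := IV, y_i := F_k(y_{i-1}); concatenate and truncate to `length` bytes."""
    y, stream = iv, b""
    while len(stream) < length:
        y = F(k, y)
        stream += y
    return stream[:length]


def ofb_enc(k: bytes, m: bytes):
    """Choose a uniform IV and output <IV, stream XOR m>."""
    iv = secrets.token_bytes(BLOCK)
    stream = ofb_stream(k, iv, len(m))
    return iv, bytes(a ^ b for a, b in zip(stream, m))


def ofb_dec(k: bytes, c) -> bytes:
    """Regenerate the same stream from the IV and XOR it with the ciphertext."""
    iv, body = c
    stream = ofb_stream(k, iv, len(body))
    return bytes(a ^ b for a, b in zip(stream, body))
```

The message length here is arbitrary; the last stream block is simply truncated.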
FIGURE 3.10: Counter (CTR) mode.
Counter (CTR) mode. Counter mode can also be viewed as an unsynchronized stream-cipher mode, where the stream cipher is constructed from the block cipher as in Construction 3.29. We give a self-contained description here. To encrypt using CTR mode, a uniform value ctr ∈ {0,1}^n is first chosen. Then, a pseudorandom stream is generated by computing y_i := F_k(ctr + i), where ctr and i are viewed as integers and addition is done modulo 2^n. The ith ciphertext block is c_i := y_i ⊕ m_i, and the IV (here, ctr) is again sent as part of the ciphertext; see Figure 3.10. Note again that decryption does not require F to be invertible, or even a permutation. As with OFB mode, another “stream-cipher” mode, the generated stream can be truncated to exactly the plaintext length, preprocessing can be used to generate the pseudorandom stream before the message is known, and the stateful variant of CTR mode is secure.
In contrast to all the secure modes discussed previously, CTR mode has the advantage that encryption and decryption can be fully parallelized, since all the blocks of the pseudorandom stream can be computed independently of each other. In contrast to OFB, it is also possible to decrypt the ith block of the ciphertext using only a single evaluation of F. These features make CTR mode an attractive choice.
CTR mode is also fairly simple to analyze:
THEOREM 3.32 If F is a pseudorandom function, then CTR mode is CPA-secure.
PROOF We follow the same template as in the proof of Theorem 3.31: we first replace F with a random function and then analyze the resulting scheme. Let Π = (Gen, Enc, Dec) be the (stateless) CTR-mode encryption scheme, and let Π̃ = (G̃en, Ẽnc, D̃ec) be the encryption scheme that is identical to Π except that a truly random function is used in place of F_k. That is, G̃en(1^n) chooses a uniform function f ∈ Func_n, and Ẽnc encrypts just like Enc except that f is used instead of F_k. (Once again, neither G̃en nor Ẽnc is efficient but this does not matter for the purposes of defining an experiment involving Π̃.)
Fix an arbitrary ppt adversary A, and let q(n) be a polynomial upper bound on the number of encryption-oracle queries made by A(1^n) as well as the maximum number of blocks in any such query and the maximum number of blocks in m_0 and m_1. As the first step of the proof, we claim that there is a negligible function negl such that
$$\left| \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n) = 1\right] - \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right] \right| \le \mathsf{negl}(n). \tag{3.12}$$
This is proved by reduction in a way similar to the analogous step in the proof of Theorem 3.31, and is left as an exercise for the reader.
We next claim that
$$\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right] < \frac{1}{2} + \frac{2q(n)^2}{2^n}. \tag{3.13}$$
Combined with Equation (3.12) this means that
$$\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\Pi}(n) = 1\right] < \frac{1}{2} + \frac{2q(n)^2}{2^n} + \mathsf{negl}(n).$$
Since q is polynomial, 2q(n)^2/2^n is negligible and this completes the proof.
We now prove Equation (3.13). Fix some value n for the security parameter. Let l∗ ≤ q(n) denote the length (in blocks) of the messages m_0, m_1 output by A in experiment PrivK^cpa_{A,Π̃}(n), and let ctr∗ denote the initial value used when generating the challenge ciphertext. Similarly, let l_i ≤ q(n) be the length (in blocks) of the ith encryption-oracle query made by A, and let ctr_i denote the initial value used when answering this query. When the ith encryption-oracle query is answered, f is applied to the values ctr_i + 1, …, ctr_i + l_i. When the challenge ciphertext is encrypted, f is applied to ctr∗ + 1, …, ctr∗ + l∗, and the ith ciphertext block is computed by XORing f(ctr∗ + i) with the ith message block. There are two cases:
1. There do not exist any i, j, j∗ ≥ 1 (with j ≤ l_i and j∗ ≤ l∗) for which ctr_i + j = ctr∗ + j∗: In this case, the values f(ctr∗ + 1), …, f(ctr∗ + l∗) used when encrypting the challenge ciphertext are uniformly distributed and independent of the rest of the experiment since f was not applied to any of these inputs when encrypting the adversary’s oracle queries. This means that the challenge ciphertext is computed by XORing a stream of uniform bits with the message m_b, and so the probability that A outputs b′ = b is exactly 1/2 (as in the case of the one-time pad).
2. There exist i, j, j∗ ≥ 1 (with j ≤ l_i and j∗ ≤ l∗) for which ctr_i + j = ctr∗ + j∗: We denote this event by Overlap. In this case A may potentially determine which of its messages was encrypted to give the challenge ciphertext, since A can deduce the value of f(ctr_i + j) = f(ctr∗ + j∗) from the answer to its ith oracle query.
Let us analyze the probability that Overlap occurs. The probability is maximized if l∗ and each l_i are as large as possible, so assume l∗ = l_i = q(n) for all i. Let Overlap_i denote the event that the sequence ctr_i + 1, …, ctr_i + q(n) overlaps the sequence ctr∗ + 1, …, ctr∗ + q(n); then Overlap is the event that Overlap_i occurs for some i. Since there are at most q(n) oracle queries, a union bound (cf. Proposition A.7) gives
$$\Pr[\mathsf{Overlap}] \le \sum_{i=1}^{q(n)} \Pr[\mathsf{Overlap}_i]. \tag{3.14}$$
Fixing ctr∗, event Overlap_i occurs exactly when ctr_i satisfies
$$\mathsf{ctr}^* - q(n) + 1 \le \mathsf{ctr}_i \le \mathsf{ctr}^* + q(n) - 1.$$
Since there are 2q(n) − 1 values of ctr_i for which Overlap_i occurs, and ctr_i is chosen uniformly from {0,1}^n, we see that
$$\Pr[\mathsf{Overlap}_i] = \frac{2q(n) - 1}{2^n} < \frac{2q(n)}{2^n}.$$
Combined with Equation (3.14), this gives Pr[Overlap] < 2q(n)^2/2^n.
Given the above, we can easily bound the success probability of A:
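$$\begin{aligned}
\Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1\right]
&= \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1 \wedge \overline{\mathsf{Overlap}}\right] + \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1 \wedge \mathsf{Overlap}\right] \\
&\le \Pr\left[\mathsf{PrivK}^{\mathsf{cpa}}_{A,\widetilde{\Pi}}(n) = 1 \mid \overline{\mathsf{Overlap}}\right] + \Pr[\mathsf{Overlap}] \\
&< \frac{1}{2} + \frac{2q(n)^2}{2^n}.
\end{aligned}$$
This proves Equation (3.13) and completes the proof.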
Modes of operation and message tampering. In many texts, modes of operation are also compared based on how well they protect against adversarial modification of the ciphertext. We do not include such a comparison here because the issue of message integrity or message authentication should be dealt with separately from encryption, and we do so in the next chapter. None of the above modes achieves message integrity in the sense we will define there. We defer further discussion to the next chapter.
With regard to the behavior of different modes in the presence of “benign” (i.e., non-adversarial) transmission errors, see Exercises 3.21 and 3.22. We remark, however, that in general such errors can be addressed using standard techniques (e.g., error correction or re-transmission).
Block length and concrete security. CBC, OFB, and CTR modes all use a random IV . This has the effect of randomizing the encryption process, and ensures that (with high probability) the underlying block cipher is always evaluated on fresh (i.e., new) inputs. This is important because, as we have seen in the proofs of Theorem 3.31 and Theorem 3.32, if an input to the block cipher is used more than once then security can be violated.
The block length of a block cipher thus has a significant impact on the concrete security of encryption schemes based on that cipher. Consider, e.g., CTR mode using a block cipher F with block length l. The IV is then a uniform l-bit string, and we expect an IV to repeat after encrypting about 2^{l/2} messages (see Appendix A.4). If l is too short, then (even if F is secure as a pseudorandom permutation) the resulting concrete-security bound will be too weak for practical applications. Concretely, if l = 64 as is the case for DES (a block cipher we will study in Chapter 6), then after 2^32 ≈ 4,300,000,000 encryptions (roughly 34 gigabytes of plaintext) a repeated IV is expected to occur. Although this may seem like a lot of data, it is smaller than the capacity of modern hard drives.
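The figures quoted for l = 64 can be checked with a short calculation. The sketch assumes that each encrypted message consists of a single l-bit block, which is one reading under which 2^32 encryptions amount to roughly 34 gigabytes; the function names are illustrative.

```python
def iv_repeat_threshold(block_bits: int) -> int:
    """Approximate number of uniform IVs after which a repeat is expected: 2^(l/2)."""
    return 2 ** (block_bits // 2)


def plaintext_bytes_at_threshold(block_bits: int) -> int:
    """Plaintext volume at that point, assuming one block per message (assumption)."""
    return iv_repeat_threshold(block_bits) * (block_bits // 8)
```

For l = 64 this gives 4,294,967,296 messages and 34,359,738,368 bytes. For l = 128 the threshold rises to 2^64 messages, far beyond what any party would encrypt under one key in practice.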
IV misuse. In our description and discussion of the various (secure) modes, we have assumed a random IV is chosen each time a message is encrypted. What if this assumption goes wrong due, e.g., to poor randomness generation or a mistaken implementation? Certainly we can no longer guarantee security in the sense of Definition 3.22. From a practical point of view, though, the “stream-cipher modes” (OFB and CTR) are much worse than CBC mode. If an IV repeats when using the former, an attacker can XOR the two resulting ciphertexts and learn a lot of information about the entire contents of both encrypted messages (as we have seen previously in the context of the one-time pad if the key is re-used). With CBC mode, however, it is likely that after only a few blocks the inputs to the block cipher will “diverge” and the attacker will be unable to learn any information beyond the first few message blocks.
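The effect of a repeated IV in CTR mode can be seen in a small sketch. Here HMAC-SHA256 truncated to one block stands in for the keyed function $F_k$; that choice, the block length, and the sample messages are assumptions made only for illustration.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block length in bytes (illustrative)


def prf(k: bytes, x: int) -> bytes:
    """Stand-in for F_k: HMAC-SHA256 truncated to one block (an assumption)."""
    return hmac.new(k, x.to_bytes(BLOCK, "big"), hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_encrypt(k: bytes, iv: int, m: bytes) -> bytes:
    """CTR mode: pad block i is F_k(iv + i). The IV is not included in the output."""
    nblocks = -(-len(m) // BLOCK)  # ceiling division
    pad = b"".join(prf(k, (iv + i) % 2 ** (8 * BLOCK))
                   for i in range(1, nblocks + 1))
    return xor(m, pad)


k = os.urandom(16)
iv = int.from_bytes(os.urandom(BLOCK), "big")
m1 = b"transfer $100 to account 12345"
m2 = b"meeting moved to 3pm on Friday"

c1 = ctr_encrypt(k, iv, m1)  # the same IV is used twice: this is the misuse
c2 = ctr_encrypt(k, iv, m2)

# The key-dependent pad cancels, leaving the XOR of the two plaintexts.
assert xor(c1, c2) == xor(m1, m2)
```

The attacker never needs the key: the XOR of the two ciphertexts alone already equals the XOR of the two messages.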
One way to address potential IV misuse is to use stateful encryption, as discussed in the context of OFB and CTR modes. If stateful encryption is not possible, and there is concern about potential IV misuse, then CBC mode is recommended for the reasons described above.
3.7 Chosen-Ciphertext Attacks
3.7.1 Defining CCA-Security
Until now we have defined security against two types of attacks: passive eavesdropping and chosen-plaintext attacks. A chosen-ciphertext attack is even more powerful. In a chosen-ciphertext attack, the adversary has the ability not only to obtain encryptions of messages of its choice (as in a chosen-plaintext attack), but also to obtain the decryption of ciphertexts of its choice (with one exception discussed later). Formally, we give the adversary access to a decryption oracle in addition to an encryption oracle. We present the formal definition and defer further discussion.
Consider the following experiment for any private-key encryption scheme Π = (Gen, Enc, Dec), adversary A, and value n for the security parameter.
The CCA indistinguishability experiment $\mathsf{PrivK}^{\mathsf{cca}}_{A,\Pi}(n)$:
1. A key $k$ is generated by running $\mathsf{Gen}(1^n)$.
2. Adversary A is given input $1^n$ and oracle access to $\mathsf{Enc}_k(\cdot)$ and $\mathsf{Dec}_k(\cdot)$. It outputs a pair of messages $m_0, m_1$ of the same length.
3. A uniform bit $b \in \{0,1\}$ is chosen, and then a ciphertext $c \leftarrow \mathsf{Enc}_k(m_b)$ is computed and given to A. We call $c$ the challenge ciphertext.
4. The adversary A continues to have oracle access to $\mathsf{Enc}_k(\cdot)$ and $\mathsf{Dec}_k(\cdot)$, but is not allowed to query the latter on the challenge ciphertext itself. Eventually, A outputs a bit $b'$.
5. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise. If the output of the experiment is 1, we say that A succeeds.
DEFINITION 3.33 A private-key encryption scheme Π has indistinguishable encryptions under a chosen-ciphertext attack, or is CCA-secure, if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that:
$$\Pr\left[\mathsf{PrivK}^{\mathsf{cca}}_{A,\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n),$$
where the probability is taken over all randomness used in the experiment.
For completeness, we remark that the natural analogue of Theorem 3.24 holds for CCA-security as well. (Namely, if a scheme has indistinguishable encryptions under a chosen-ciphertext attack then it has indistinguishable multiple encryptions under a chosen-ciphertext attack, defined appropriately.)
In the experiment above, the adversary’s access to the decryption oracle is unlimited except for the restriction that the adversary may not request decryption of the challenge ciphertext itself. This restriction is necessary or else there is clearly no hope for any encryption scheme to satisfy the definition.
At this point you may be wondering if chosen-ciphertext attacks realistically model any real-world attack. As in the case of chosen-plaintext attacks, we do not expect honest parties to decrypt arbitrary ciphertexts of an adversary’s choice. Nevertheless, there may be scenarios where an adversary might be able to influence what gets decrypted and learn some partial information about the result. For example:
1. In the Midway example from Section 3.4.2, it is conceivable that the US cryptanalysts might also have tried to send encrypted messages to the Japanese and then monitor their behavior. Such behavior (e.g., movement of forces and the like) could have provided important information about the underlying plaintext.
2. Imagine a user sending encrypted messages to their bank. An adversary may be able to send ciphertexts on behalf of that user; the bank will decrypt those ciphertexts and the adversary may learn something about the result. For example, if a ciphertext corresponds to an ill-formed plaintext (e.g., an unintelligible message, or simply one that is not formatted correctly), the adversary may be able to deduce this from the bank’s reaction (i.e., the pattern of subsequent communication). A practical attack of this type is presented in Section 3.7.2 below.
3. Encryption is often used in higher-level protocols; e.g., an encryption scheme might be used as part of an authentication protocol where one party sends a ciphertext to the other, who decrypts it and returns the result. In this case, one of the honest parties behaves exactly like a decryption oracle.
Insecurity of the schemes we have studied. None of the encryption schemes we have seen thus far is CCA-secure. We demonstrate this for Construction 3.30, where encryption is computed as $\mathsf{Enc}_k(m) = \langle r, F_k(r) \oplus m \rangle$. Consider an adversary A running in the CCA indistinguishability experiment who chooses $m_0 = 0^n$ and $m_1 = 1^n$. Then, upon receiving a ciphertext $c = \langle r, s \rangle$, the adversary can flip the first bit of $s$ and ask for a decryption of the resulting ciphertext $c'$. Since $c' \neq c$, this query is allowed and the decryption oracle answers with either $10^{n-1}$ (in which case it is clear that $b = 0$) or $01^{n-1}$ (in which case $b = 1$). This example demonstrates that CCA-security is quite stringent. Any encryption scheme that allows ciphertexts to be “manipulated” in any controlled way cannot be CCA-secure. Thus, CCA-security implies a very important property called non-malleability. Loosely speaking, a non-malleable encryption scheme has the property that if the adversary tries to modify a given ciphertext, the result is either an invalid ciphertext or one that decrypts to a plaintext having no relation to the original one. This is a very useful property for schemes used in complex cryptographic protocols.
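The bit-flipping adversary just described can be run against a toy instantiation of Construction 3.30 inside a sketch of the CCA indistinguishability experiment. HMAC-SHA256 truncated to $n$ bits stands in for the pseudorandom function $F$; that choice, and all class and function names, are assumptions made for illustration.

```python
import hashlib
import hmac
import os
import secrets

N = 16  # n = 128 bits, expressed in bytes


def F(k: bytes, r: bytes) -> bytes:
    """Stand-in PRF: HMAC-SHA256 truncated to N bytes (an assumption)."""
    return hmac.new(k, r, hashlib.sha256).digest()[:N]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def enc(k: bytes, m: bytes):
    r = os.urandom(N)
    return r, xor(F(k, r), m)  # <r, F_k(r) XOR m>


def dec(k: bytes, c) -> bytes:
    r, s = c
    return xor(F(k, r), s)


def cca_experiment(adversary) -> int:
    """Return 1 if the adversary guesses b correctly, and 0 otherwise."""
    k = os.urandom(N)
    enc_oracle = lambda m: enc(k, m)
    m0, m1 = adversary.choose(enc_oracle, lambda c: dec(k, c))
    b = secrets.randbelow(2)
    challenge = enc(k, (m0, m1)[b])

    def dec_oracle(c):
        if c == challenge:  # the single forbidden query
            raise ValueError("challenge ciphertext may not be queried")
        return dec(k, c)

    return int(adversary.guess(challenge, enc_oracle, dec_oracle) == b)


class BitFlipAdversary:
    def choose(self, enc_oracle, dec_oracle):
        return bytes(N), b"\xff" * N  # m0 = 0^n, m1 = 1^n

    def guess(self, challenge, enc_oracle, dec_oracle):
        r, s = challenge
        s_flipped = bytes([s[0] ^ 0x80]) + s[1:]  # flip the first bit of s
        m = dec_oracle((r, s_flipped))
        # 10^{n-1} means m_b = 0^n (b = 0); 01^{n-1} means m_b = 1^n (b = 1)
        return 0 if m[0] == 0x80 else 1


wins = sum(cca_experiment(BitFlipAdversary()) for _ in range(100))
print(f"adversary succeeded in {wins} of 100 runs")
```

The adversary makes a single decryption query and never touches the encryption oracle, which illustrates how little is needed to break a malleable scheme in this model.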
Constructing a CCA-secure encryption scheme. We show how to construct a CCA-secure encryption scheme in Section 4.5.4. The construction is presented there because it uses tools that we develop in Chapter 4.
3.7.2 Padding-Oracle Attacks
The chosen-ciphertext attack on Construction 3.30 described in the previous section is a bit contrived, since it assumes the attacker can obtain the complete decryption of a modified ciphertext. While this sort of attack is allowed under Definition 3.33, it is not clear that it represents a serious concern in practice. Here we show a chosen-ciphertext attack on a natural and widely used encryption scheme; moreover, the attack only requires the ability of an attacker to determine whether or not a modified ciphertext decrypts correctly. Such information is frequently very easy to obtain since, for example, a server might request a retransmission or terminate a session if it ever receives a ciphertext that does not decrypt correctly, and either of these events would generate a noticeable change in the observed traffic. The attack has been shown to work in practice on various deployed protocols; we give one concrete example at the end of this section.
As mentioned previously, when using CBC mode the length of the plaintext must be a multiple of the block length; if a plaintext does not satisfy this property, it must be padded before being encrypted. We refer to the original plaintext as the message, and the result after padding as encoded data. The padding scheme we use must allow the receiver to unambiguously determine from the encoded data where the message ends. One popular and standardized approach is to use PKCS #5 padding. Assume the original message has an integral number of bytes, and let L denote the block length (in bytes) of the block cipher being used. Let b denote the number of bytes that need to be appended to the message in order to make the total length of the resulting encoded data a multiple of the block length. Here, b is an integer between 1 and L, inclusive. (We cannot have b = 0 since this would lead to ambiguous padding. Thus, if the message length is already a multiple of the block length, L bytes of padding are appended.) Then we append to the message the string containing the integer b (represented in 1 byte, or 2 hexadecimal digits) repeated b times. That is, if 1 byte of padding is needed then the string 0x01 (written in hexadecimal) is appended; if 4 bytes of padding are needed then the hexadecimal string 0x04040404 is appended; etc. The encoded data is then encrypted using regular CBC-mode encryption.
When decrypting, the receiver first applies CBC-mode decryption as usual to recover the encoded data, and then checks that the encoded data is correctly padded. (This is easily done: simply read the value b of the final byte and then verify that the final b bytes of the result all have value b.) If so, then the padding is stripped off and the original message returned. Otherwise, the standard procedure is to return a “bad padding” error (e.g., in Java the standard exception is called javax.crypto.BadPaddingException). The presence of such an error message provides an adversary with a partial decryption oracle. That is, an adversary can send any ciphertext to the server and learn (based on whether a “bad padding” error is returned or not) whether the underlying encoded data is padded in the correct format. Although this may seem like meaningless information, we show that it enables an adversary to completely recover the original message for any ciphertext of its choice.
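The padding rule and the receiver's check described above can be sketched as follows. The function names are illustrative, and a Python ValueError plays the role of the “bad padding” error; the sketch assumes the block length L is at most 255 bytes so that b fits in one byte.

```python
def pad(message: bytes, L: int) -> bytes:
    """Append b copies of the byte b, where 1 <= b <= L makes the
    total length a multiple of L (b is never 0)."""
    b = L - (len(message) % L)
    return message + bytes([b]) * b


def unpad(encoded: bytes, L: int) -> bytes:
    """Check the padding and strip it, raising ValueError on bad padding."""
    if len(encoded) == 0 or len(encoded) % L != 0:
        raise ValueError("bad padding")
    b = encoded[-1]  # value of the final byte
    if not 1 <= b <= L or encoded[-b:] != bytes([b]) * b:
        raise ValueError("bad padding")
    return encoded[:-b]


print(pad(b"abc", 8).hex())        # three message bytes, then 0x05 five times
print(pad(b"12345678", 8).hex())   # a full extra block of 0x08 bytes
```

Note that a message whose length is already a multiple of L still receives a full block of padding, which is what keeps the encoding unambiguous.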
We describe the attack on a 3-block ciphertext for simplicity. Let $IV, c_1, c_2$ be a ciphertext observed by the attacker, and let $m_1, m_2$ be the underlying encoded data (unknown to the attacker) which corresponds to the padded message, as discussed above. (Each block is $L$ bytes long.) Note that
$$m_2 = F_k^{-1}(c_2) \oplus c_1, \tag{3.15}$$
where $k$ is the key (which is, of course, not known to the attacker) being used by the honest parties. The second block $m_2$ ends in $\underbrace{\mathtt{0x}b \cdots \mathtt{0x}b}_{b \text{ times}}$, where we let $\mathtt{0x}b$ denote the 1-byte representation of the integer $b$. The key property used in the attack is that certain changes to the ciphertext yield predictable changes in the underlying encoded data after CBC-mode decryption. Specifically, let $c_1'$ be identical to $c_1$ except for a modification in the final byte. Consider decryption of the modified ciphertext $IV, c_1', c_2$. This will result in encoded data $m_1', m_2'$ where $m_2' = F_k^{-1}(c_2) \oplus c_1'$. Comparing to Equation (3.15) we see that $m_2'$ will be identical to $m_2$ except for a modification in the final byte. (The value of $m_1'$ is unpredictable, but this will not adversely affect the attack.) Similarly, if $c_1'$ is the same as $c_1$ except for a change in its $i$th byte, then decryption of $IV, c_1', c_2$ will result in $m_1', m_2'$ where $m_2'$ is the same as $m_2$ except for a change in its $i$th byte. More generally, if $c_1' = c_1 \oplus \Delta$ for any string $\Delta$, then decryption of $IV, c_1', c_2$ yields $m_1', m_2'$ where $m_2' = m_2 \oplus \Delta$. The upshot is that the attacker can exercise significant control over the final block of the encoded data.
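The relation between a change to the first ciphertext block and the resulting change to the second block of encoded data can be checked numerically. In the sketch below a 4-round Feistel network built from HMAC-SHA256 stands in for the block cipher; it is a toy permutation chosen only so that the example is self-contained, and all names are illustrative assumptions.

```python
import hashlib
import hmac
import os

L = 16  # block length in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(k: bytes, i: int, half: bytes) -> bytes:
    return hmac.new(k, bytes([i]) + half, hashlib.sha256).digest()[:L // 2]


def F(k: bytes, x: bytes) -> bytes:
    """Toy 4-round Feistel permutation standing in for the block cipher."""
    l, r = x[:L // 2], x[L // 2:]
    for i in range(4):
        l, r = r, xor(l, _f(k, i, r))
    return l + r


def F_inv(k: bytes, y: bytes) -> bytes:
    l, r = y[:L // 2], y[L // 2:]
    for i in reversed(range(4)):
        l, r = xor(r, _f(k, i, l)), l
    return l + r


def cbc_encrypt(k: bytes, blocks):
    out = [os.urandom(L)]  # uniform IV
    for m in blocks:
        out.append(F(k, xor(out[-1], m)))
    return out  # [IV, c1, c2, ...]


def cbc_decrypt(k: bytes, ct):
    return [xor(F_inv(k, ct[i]), ct[i - 1]) for i in range(1, len(ct))]


k = os.urandom(16)
m1, m2 = os.urandom(L), os.urandom(L)
iv, c1, c2 = cbc_encrypt(k, [m1, m2])

delta = os.urandom(L)
m1_new, m2_new = cbc_decrypt(k, [iv, xor(c1, delta), c2])

assert m2_new == xor(m2, delta)  # the second block changes by exactly delta
```

The first recovered block `m1_new` is garbled in an unpredictable way, matching the remark in the text that its value does not matter for the attack.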
As a warmup, let us see how the adversary can exploit this to learn b, the amount of padding. (This reveals the length of the original message.) Recall that upon decryption, the receiver looks at the value b of the final byte of the second block of the encoded data, and then verifies that the final b bytes all have the same value. The attacker begins by modifying the first byte of c1 and sending the resulting ciphertext IV, c′1, c2 to the receiver. If decryption fails (i.e., the receiver returns an error) then it must be the case that the receiver is checking all L bytes of m′2, and therefore b = L! Otherwise, the attacker learns that b < L, and it can then repeat the process with the second byte, and so on. The left-most modified byte for which decryption fails reveals exactly the left-most byte being checked by the receiver, and so reveals exactly b.
With $b$ known, the attacker can proceed to learn the bytes of the message one-by-one. We illustrate the idea for the final byte of the message, which we denote by $B$. The attacker knows that $m_2$ ends in $\mathtt{0x}B\,\mathtt{0x}b \cdots \mathtt{0x}b$ (with $\mathtt{0x}b$ repeated $b$ times) and wishes to learn $B$. Define
$$\Delta_i \stackrel{\text{def}}{=} \Big(\mathtt{0x00} \cdots \mathtt{0x00}\ \ \mathtt{0x}i\ \ \underbrace{\mathtt{0x}(b+1) \cdots \mathtt{0x}(b+1)}_{b \text{ times}}\Big) \oplus \Big(\mathtt{0x00} \cdots \mathtt{0x00}\ \ \mathtt{0x00}\ \ \underbrace{\mathtt{0x}b \cdots \mathtt{0x}b}_{b \text{ times}}\Big)$$
for $0 \le i < 2^8$; i.e., the final $b+1$ bytes of $\Delta_i$ contain the integer $i$ (represented in hexadecimal) followed by the value $(b+1) \oplus b$ (in hexadecimal) repeated $b$ times. If the attacker submits the ciphertext $IV, c_1 \oplus \Delta_i, c_2$ to the receiver then, after CBC-mode decryption, the resulting encoded data will end in $\mathtt{0x}(B \oplus i)\,\mathtt{0x}(b+1) \cdots \mathtt{0x}(b+1)$ (with $\mathtt{0x}(b+1)$ repeated $b$ times). Decryption will fail unless $\mathtt{0x}(B \oplus i) = \mathtt{0x}(b+1)$. The attacker simply needs to try at most $2^8$ values $\Delta_0, \ldots, \Delta_{2^8-1}$ until decryption succeeds for some $\Delta_i$, at which point it can deduce that $B = \mathtt{0x}(b+1) \oplus i$. We leave a full description of how to extend this attack so as to learn the entire plaintext, and not just the final block, as an exercise.
A padding-oracle attack on CAPTCHAs. A CAPTCHA is a distorted image of, say, an English word that is easy for humans to read, but hard for a computer to process. CAPTCHAs are used in order to ensure that a human user, and not some automated software, is interacting with a webpage. CAPTCHAs are used, e.g., by online email services to ensure that humans open accounts; this is important to prevent spammers from automatically opening thousands of accounts and using them to send spam.
One way that CAPTCHAs can be configured is as a separate service run on an independent server. In order to see how this works, we denote a webserver by $S_W$, the CAPTCHA server by $S_C$, and the user by $U$. When the user $U$ loads a webpage served by $S_W$, the following events occur: $S_W$ encrypts a random English word $w$ using a key $k$ that was initially shared between $S_W$ and $S_C$, and sends the resulting ciphertext to the user (along with the webpage). $U$ forwards the ciphertext to $S_C$, who decrypts it, obtains $w$, and renders a distorted image of $w$ to $U$. Finally, $U$ sends the word $w$ back to $S_W$ for verification. The interesting observation here is that $S_C$ decrypts any ciphertext it receives from $U$ and will issue a “bad padding” error message if decryption fails as described earlier. This presents $U$ with an opportunity to carry out a padding-oracle attack as described above, and thus to solve the CAPTCHA (i.e., to determine $w$) automatically without any human involvement, rendering the CAPTCHA ineffective. While ad hoc measures can be used (e.g., having $S_C$ return a random image instead of a decryption error), what is really needed is to use an encryption scheme that is CCA-secure.
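The two steps of the padding-oracle attack described in this section (learning the amount of padding b, then the final message byte B) can be sketched against a toy receiver. The block cipher is again a 4-round Feistel network built from HMAC-SHA256, used only to keep the example self-contained; the oracle, the function names, and the sample message are illustrative assumptions, and the sketch covers only a 3-block ciphertext with b < L.

```python
import hashlib
import hmac
import os

L = 16  # block length in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(k, i, half):
    return hmac.new(k, bytes([i]) + half, hashlib.sha256).digest()[:L // 2]


def F(k, x):
    """Toy 4-round Feistel permutation standing in for the block cipher."""
    l, r = x[:L // 2], x[L // 2:]
    for i in range(4):
        l, r = r, xor(l, _f(k, i, r))
    return l + r


def F_inv(k, y):
    l, r = y[:L // 2], y[L // 2:]
    for i in reversed(range(4)):
        l, r = xor(r, _f(k, i, l)), l
    return l + r


def encrypt(k, message):
    """PKCS #5 padding followed by CBC mode; returns [IV, c1, c2, ...]."""
    b = L - len(message) % L
    data = message + bytes([b]) * b
    ct = [os.urandom(L)]
    for j in range(0, len(data), L):
        ct.append(F(k, xor(ct[-1], data[j:j + L])))
    return ct


def make_padding_oracle(k):
    """The receiver, reduced to what the attacker sees: good or bad padding."""
    def oracle(ct):
        last = xor(F_inv(k, ct[-1]), ct[-2])  # final block of encoded data
        b = last[-1]
        return 1 <= b <= L and last[-b:] == bytes([b]) * b
    return oracle


def learn_b(oracle, iv, c1, c2):
    """Warmup: modify bytes of c1 from left to right until decryption fails."""
    for pos in range(L):
        c1_mod = c1[:pos] + bytes([c1[pos] ^ 0x01]) + c1[pos + 1:]
        if not oracle([iv, c1_mod, c2]):
            return L - pos
    raise RuntimeError("oracle never reported bad padding")


def learn_B(oracle, iv, c1, c2, b):
    """Recover the final message byte B by trying Delta_0, Delta_1, ..."""
    for i in range(256):
        delta = bytes(L - b - 1) + bytes([i]) + bytes([(b + 1) ^ b]) * b
        if oracle([iv, xor(c1, delta), c2]):
            return (b + 1) ^ i
    raise RuntimeError("no Delta_i produced valid padding")


k = os.urandom(16)
message = b"attack at dawn: 0600 hrs"  # 24 bytes, so b = 8
iv, c1, c2 = encrypt(k, message)
oracle = make_padding_oracle(k)

b = learn_b(oracle, iv, c1, c2)
B = learn_B(oracle, iv, c1, c2, b)
assert b == 8 and B == message[-1]
```

The attacker's functions receive only the ciphertext and the oracle; the key appears solely inside the simulated receiver.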
References and Additional Reading
The modern computational approach to cryptography was initiated in a groundbreaking paper by Goldwasser and Micali [80]. That paper introduced the notion of semantic security, and showed how this goal could be achieved in the setting of public-key encryption (see Chapters 10 and 11). That paper also proposed the notion of indistinguishability (i.e., Definition 3.8), and showed that it implies semantic security. The converse was shown in [125]. The reader is referred to [76] for further discussion of semantic security.
Blum and Micali [37] introduced the notion of pseudorandom generators and proved their existence based on a specific number-theoretic assumption. In the same work, Blum and Micali also pointed out the connection between pseudorandom generators and private-key encryption as in Construction 3.17. The definition of pseudorandom generators given by Blum and Micali is different from the definition we use in this book (Definition 3.14); the latter definition originates in the work of Yao [179], who showed equivalence of the two formulations. Yao also showed constructions of pseudorandom generators based on general assumptions; we will explore this result in Chapter 7.
Formal definitions of security against chosen-plaintext attacks were given by Luby [115] and Bellare et al. [15]. Chosen-ciphertext attacks (in the context of public-key encryption) were first formally defined by Naor and Yung [129] and Rackoff and Simon [147], and were considered also in [61] and [15]. See [101] for other notions of security for private-key encryption.
Pseudorandom functions were defined and constructed by Goldreich et al. [78], and their application to encryption was demonstrated in subsequent work by the same authors [77]. Pseudorandom permutations and strong pseudorandom permutations were studied by Luby and Rackoff [116]. A proof of Proposition 3.27 (the “switching lemma”) can be found in [24]. Block ciphers had been used for many years before they began to be studied in the theoretical sense initiated by the above works. Practical stream ciphers and block ciphers are studied in detail in Chapter 6.
The ECB, CBC, and OFB modes of operation (as well as CFB, a mode of operation not covered here) were standardized along with the DES block cipher [130]. CTR mode was standardized by NIST in 2001. CBC and CTR mode were proven CPA-secure in [15]. For more recent modes of operation, see http://csrc.nist.gov/groups/ST/toolkit/BCM/index.html.
The attack on chained CBC was first described by Rogaway (unpublished), and was used to attack SSL/TLS in the so-called “BEAST attack” by Rizzo and Duong. The padding-oracle attack we describe here originated in the work of Vaudenay [171].
Exercises
3.1 Prove Proposition 3.6.
3.2 Prove that Definition 3.8 cannot be satisfied if Π can encrypt arbitrary-length messages and the adversary is not restricted to output equal-length messages in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}$.
Hint: Let $q(n)$ be a polynomial upper-bound on the length of the ciphertext when Π is used to encrypt a single bit. Then consider an adversary who outputs $m_0 \in \{0,1\}$ and a uniform $m_1 \in \{0,1\}^{q(n)+2}$.
3.3 Say Π = (Gen, Enc, Dec) is such that for $k \in \{0,1\}^n$, algorithm $\mathsf{Enc}_k$ is only defined for messages of length at most $\ell(n)$ (for some polynomial $\ell$). Construct a scheme satisfying Definition 3.8 even when the adversary is not restricted to outputting equal-length messages in $\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi}$.
3.4 Prove the equivalence of Definition 3.8 and Definition 3.9.
3.5 Let $|G(s)| = \ell(|s|)$ for some $\ell$. Consider the following experiment:
The PRG indistinguishability experiment $\mathsf{PRG}_{A,G}(n)$:
(a) A uniform bit $b \in \{0,1\}$ is chosen. If $b = 0$ then choose a uniform $r \in \{0,1\}^{\ell(n)}$; if $b = 1$ then choose a uniform $s \in \{0,1\}^n$ and set $r := G(s)$.
(b) The adversary A is given $r$, and outputs a bit $b'$.
(c) The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise.
Provide a definition of a pseudorandom generator based on this experiment, and prove that your definition is equivalent to Definition 3.14. (That is, show that G satisfies your definition if and only if it satisfies Definition 3.14.)
3.6 Let $G$ be a pseudorandom generator with expansion factor $\ell(n) > 2n$. In each of the following cases, say whether $G'$ is necessarily a pseudorandom generator. If yes, give a proof; if not, show a counterexample.
(a) Define $G'(s) \stackrel{\text{def}}{=} G(s_1 \cdots s_{\lfloor n/2 \rfloor})$, where $s = s_1 \cdots s_n$.
(b) Define $G'(s) \stackrel{\text{def}}{=} G(0^{|s|} \| s)$.
(c) Define $G'(s) \stackrel{\text{def}}{=} G(s) \| G(s+1)$.
3.7 Prove the converse of Theorem 3.18. Namely, show that if G is not a pseudorandom generator then Construction 3.17 does not have indistinguishable encryptions in the presence of an eavesdropper.
3.8 (a) Define a notion of indistinguishability for the encryption of multiple distinct messages, in which a scheme need not hide whether the same message is encrypted twice.
(b) Show that Construction 3.17 does not satisfy your definition.
(c) Give a construction of a deterministic (stateless) encryption scheme that satisfies your definition.
3.9 Prove unconditionally the existence of a pseudorandom function $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$ with $\ell_{key}(n) = n$ and $\ell_{in}(n) = O(\log n)$.
Hint: Implement a uniform function with logarithmic input length.
3.10 Let $F$ be a length-preserving pseudorandom function. For the following constructions of a keyed function $F' : \{0,1\}^n \times \{0,1\}^{n-1} \to \{0,1\}^{2n}$, state whether $F'$ is a pseudorandom function. If yes, prove it; if not, show an attack.
(a) $F'_k(x) \stackrel{\text{def}}{=} F_k(0 \| x) \| F_k(1 \| x)$.
(b) $F'_k(x) \stackrel{\text{def}}{=} F_k(0 \| x) \| F_k(x \| 1)$.
3.11 Assuming the existence of pseudorandom functions, prove that there is an encryption scheme that has indistinguishable multiple encryptions in the presence of an eavesdropper (i.e., satisfies Definition 3.19), but is not CPA-secure (i.e., does not satisfy Definition 3.22).
Hint: The scheme need not be “natural.” You will need to use the fact that in a chosen-plaintext attack the adversary can choose its queries to the encryption oracle adaptively.
3.12 Let $F$ be a keyed function and consider the following experiment:
The PRF indistinguishability experiment $\mathsf{PRF}_{A,F}(n)$:
(a) A uniform bit $b \in \{0,1\}$ is chosen. If $b = 1$ then choose uniform $k \in \{0,1\}^n$.
(b) A is given $1^n$ for input. If $b = 0$ then A is given access to a uniform function $f \in \mathsf{Func}_n$. If $b = 1$ then A is instead given access to $F_k(\cdot)$.
(c) A outputs a bit $b'$.
(d) The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise.
Define pseudorandom functions using this experiment, and prove that your definition is equivalent to Definition 3.25.
3.13 Consider the following keyed function $F$: For security parameter $n$, the key is an $n \times n$ boolean matrix $A$ and an $n$-bit boolean vector $b$. Define $F_{A,b} : \{0,1\}^n \to \{0,1\}^n$ by $F_{A,b}(x) \stackrel{\text{def}}{=} Ax + b$, where all operations are done modulo 2. Show that $F$ is not a pseudorandom function.
3.14 Prove that if $F$ is a length-preserving pseudorandom function, then $G(s) \stackrel{\text{def}}{=} F_s(1) \| F_s(2) \| \cdots \| F_s(\ell)$ is a pseudorandom generator with expansion factor $\ell \cdot n$.
3.15 Define a notion of perfect secrecy under a chosen-plaintext attack by adapting Definition 3.22. Show that the definition cannot be achieved.
3.16 Prove Proposition 3.27.
Hint: Use the results of Appendix A.4.
3.17 Assume pseudorandom permutations exist. Show that there exists a function F′ that is a pseudorandom permutation but is not a strong pseudorandom permutation.
Hint: Construct $F'$ such that $F'_k(k) = 0^{|k|}$.
3.18 Let $F$ be a pseudorandom permutation, and define a fixed-length encryption scheme (Enc, Dec) as follows: On input $m \in \{0,1\}^{n/2}$ and key $k \in \{0,1\}^n$, algorithm Enc chooses a uniform string $r \in \{0,1\}^{n/2}$ of length $n/2$ and computes $c := F_k(r \| m)$.
Show how to decrypt, and prove that this scheme is CPA-secure for messages of length $n/2$. (If you are looking for a real challenge, prove that this scheme is CCA-secure if $F$ is a strong pseudorandom permutation.)
3.19 Let $F$ be a pseudorandom function and $G$ be a pseudorandom generator with expansion factor $\ell(n) = n + 1$. For each of the following encryption schemes, state whether the scheme has indistinguishable encryptions in the presence of an eavesdropper and whether it is CPA-secure. (In each case, the shared key is a uniform $k \in \{0,1\}^n$.) Explain your answer.
(a) To encrypt $m \in \{0,1\}^{n+1}$, choose uniform $r \in \{0,1\}^n$ and output the ciphertext $\langle r, G(r) \oplus m \rangle$.
(b) To encrypt $m \in \{0,1\}^n$, output the ciphertext $m \oplus F_k(0^n)$.
(c) To encrypt $m \in \{0,1\}^{2n}$, parse $m$ as $m_1 \| m_2$ with $|m_1| = |m_2|$, then choose uniform $r \in \{0,1\}^n$ and send $\langle r, m_1 \oplus F_k(r), m_2 \oplus F_k(r+1) \rangle$.
3.20 Consider a stateful variant of CBC-mode encryption where the sender simply increments the IV by 1 each time a message is encrypted (rather than choosing IV at random each time). Show that the resulting scheme is not CPA-secure.
3.21 What is the effect of a single-bit error in the ciphertext when using the CBC, OFB, and CTR modes of operation?
3.22 What is the effect of a dropped ciphertext block (e.g., if the transmitted ciphertext c1,c2,c3,… is received as c1,c3,…) when using the CBC, OFB, and CTR modes of operation?
3.23 Say CBC-mode encryption is used with a block cipher having a 256-bit key and 128-bit block length to encrypt a 1024-bit message. What is the length of the resulting ciphertext?
3.24 Give the details of the proof by reduction for Equation (3.12).
3.25 Let $F$ be a pseudorandom function such that for $k \in \{0,1\}^n$ the function $F_k$ maps $\ell_{in}(n)$-bit inputs to $\ell_{out}(n)$-bit outputs.
(a) Consider implementing CTR-mode encryption using $F$. For which functions $\ell_{in}, \ell_{out}$ is the resulting encryption scheme CPA-secure?
(b) Consider implementing CTR-mode encryption using $F$, but only for fixed-length messages of length $\ell(n)$ (which is an integer multiple of $\ell_{out}(n)$). For which $\ell_{in}, \ell_{out}, \ell$ does the scheme have indistinguishable encryptions in the presence of an eavesdropper?
3.26 For any function $g : \{0,1\}^n \to \{0,1\}^n$, define $g^{\$}(\cdot)$ to be a probabilistic oracle that, on input $1^n$, chooses uniform $r \in \{0,1\}^n$ and returns $\langle r, g(r) \rangle$. A keyed function $F$ is a weak pseudorandom function if for all ppt algorithms $D$, there exists a negligible function negl such that:
$$\left| \Pr\left[D^{F_k^{\$}(\cdot)}(1^n) = 1\right] - \Pr\left[D^{f^{\$}(\cdot)}(1^n) = 1\right] \right| \le \mathsf{negl}(n),$$
where $k \in \{0,1\}^n$ and $f \in \mathsf{Func}_n$ are chosen uniformly.
(a) Prove that if $F$ is pseudorandom then it is weakly pseudorandom.
(b) Let $F'$ be a pseudorandom function, and define
$$F_k(x) \stackrel{\text{def}}{=} \begin{cases} F'_k(x) & \text{if } x \text{ is even} \\ F'_k(x+1) & \text{if } x \text{ is odd.} \end{cases}$$
Prove that $F$ is weakly pseudorandom, but not pseudorandom.
(c) Is CTR-mode encryption using a weak pseudorandom function necessarily CPA-secure? Does it necessarily have indistinguishable encryptions in the presence of an eavesdropper? Prove your answers.
(d) Prove that Construction 3.30 is CPA-secure if $F$ is a weak pseudorandom function.
3.27 Let $F$ be a pseudorandom permutation. Consider the mode of operation in which a uniform value $\mathsf{ctr} \in \{0,1\}^n$ is chosen, and the $i$th ciphertext block $c_i$ is computed as $c_i := F_k(\mathsf{ctr} + i + m_i)$. Show that this scheme does not have indistinguishable encryptions in the presence of an eavesdropper.
3.28 Show that the CBC, OFB, and CTR modes of operation do not yield CCA-secure encryption schemes (regardless of F).
3.29 Let Π1 = (Enc1, Dec1) and Π2 = (Enc2, Dec2) be two encryption schemes for which it is known that at least one is CPA-secure (but you don’t know which one). Show how to construct an encryption scheme Π that is guaranteed to be CPA-secure as long as at least one of Π1 or Π2 is CPA-secure. Provide a full proof of your solution.
Hint: Generate two plaintext messages from the original plaintext so that knowledge of either one reveals nothing about the original plaintext, but knowledge of both enables the original plaintext to be computed.
3.30 Write pseudocode for obtaining the entire plaintext via a padding-oracle attack on CBC-mode encryption using PKCS #5 padding, as described in the text.
3.31 Describe a padding-oracle attack on CTR-mode encryption (assuming PKCS #5 padding is used to pad messages to a multiple of the block length before encrypting).
Chapter 4
Message Authentication Codes
4.1 Message Integrity
4.1.1 Secrecy vs. Integrity
One of the most basic goals of cryptography is to enable parties to communicate over an open communication channel in a secure way. But what does “secure communication” entail? In Chapter 3 we showed that it is possible to obtain secret communication over an open channel. That is, we showed how encryption can be used to prevent an eavesdropper (or possibly a more active adversary) from learning anything about the content of messages sent over an unprotected communication channel. However, not all security concerns are related to secrecy. In many cases, it is of equal or greater importance to guarantee message integrity (or message authentication) in the sense that each party should be able to identify when a message it receives was sent by the party claiming to send it, and was not modified in transit. We look at two canonical examples.
Consider the case of a user communicating with their bank over the Internet. When the bank receives a request to transfer $1,000 from the user’s account to the account of some other user X, the bank has to consider the following:
1. Is the request authentic? That is, did the user in question really issue this request, or was the request issued by an adversary (perhaps X itself) who is impersonating the legitimate user?
2. Assuming a transfer request was issued by the legitimate user, are the details of the request as received exactly those intended by the legitimate user? Or was, e.g., the transfer amount modified?
Note that standard error-correction techniques do not suffice for the second concern. Error-correcting codes are only intended to detect and recover from “random” errors that affect only a small portion of the transmission, but they do nothing to protect against a malicious adversary who can choose exactly where to introduce an arbitrary number of errors.
A second scenario where the need for message integrity arises in practice is with regard to web cookies. The HTTP protocol used for web traffic is stateless, and so when a client and server communicate in some session (e.g., when a user [client] shops at a merchant’s [server’s] website), any state generated as part of that session (e.g., the contents of the user’s shopping cart) is often placed in a “cookie” that is stored by the client and sent from the client to the server as part of each message the client sends. Assume that the cookie stored by some user includes the items in the user’s shopping cart along with a price for each item, as might be done if the merchant offers different prices to different users (reflecting, e.g., discounts and promotions, or user-specific pricing). It should be infeasible here for the user to modify the cookie that it stores so as to alter the prices of the items in its cart. The merchant thus needs a technique to ensure the integrity of the cookie that it stores at the user. Note that the contents of the cookie (namely, the items and their prices) are not secret and, in fact, must be known by the user. The problem here is thus purely one of integrity.
In general, one cannot assume the integrity of communication without taking specific measures to ensure it. Indeed, any unprotected online purchase order, online banking operation, email, or SMS message cannot, in general, be trusted to have originated from the claimed source and to have been unmodified in transit. Unfortunately, people are in general trusting, and thus information such as the caller-ID or an email return address is taken to be a “proof of origin” in many cases, even though it is relatively easy to forge. This leaves the door open to potentially damaging attacks.
In this chapter we will show how to achieve message integrity by using cryptographic techniques to prevent the undetected tampering of messages sent over an open communication channel. Note that we cannot hope to prevent adversarial tampering of messages altogether, as that can only be defended against at the physical level. Instead, what we will guarantee is that any such tampering will be detected by the honest parties.
4.1.2 Encryption vs. Message Authentication
Just as the goals of secrecy and message integrity are different, so are the techniques and tools for achieving them. Unfortunately, secrecy and integrity are often confused and unnecessarily intertwined, so let us be clear up front: encryption does not (in general) provide any integrity, and encryption should never be used with the intent of achieving message authentication unless it is specifically designed with that purpose in mind (something we will return to in Section 4.5).
One might mistakenly think that encryption solves the problem of message authentication. (In fact, this is a common error.) This is due to the fuzzy, and incorrect, reasoning that since a ciphertext completely hides the contents of the message, an adversary cannot possibly modify an encrypted message in any meaningful way. Despite its intuitive appeal, this reasoning is completely false. We illustrate this point by showing that all the encryption schemes we have seen thus far do not provide message integrity.
Encryption using stream ciphers. Consider the simple encryption scheme in which Enc_k(m) computes the ciphertext c := G(k) ⊕ m, where G is a pseudorandom generator. Ciphertexts in this case are very easy to manipulate: flipping any bit in the ciphertext c results in the same bit being flipped in the message that is recovered upon decryption. Thus, given a ciphertext c that encrypts a (possibly unknown) message m, it is possible to generate a modified ciphertext c′ such that m′ := Dec_k(c′) is the same as m but with one (or more) of the bits flipped. This simple attack can have severe consequences. As an example, consider the case of a user encrypting some dollar amount he wants to transfer from his bank account, where the amount is represented in binary. Flipping the least significant bit has the effect of changing this amount by only $1, but flipping the 11th least significant bit changes the amount by more than $1,000! (Interestingly, the adversary in this example does not necessarily learn whether it is increasing or decreasing the initial amount, i.e., whether it is flipping a 0 to a 1 or vice versa. But if the adversary has some partial knowledge about the amount, say, that it is less than $1,000 to begin with, then the modifications it introduces can have a predictable effect.) We stress that this attack does not contradict the secrecy of the encryption scheme (in the sense of Definition 3.8). In fact, the exact same attack applies to the one-time pad encryption scheme, showing that even perfect secrecy is not sufficient to ensure the most basic level of message integrity.
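The bit-flipping attack on c := G(k) ⊕ m can be tried out directly. The following is a minimal sketch, assuming SHAKE-256 as a stand-in for the pseudorandom generator G and a 4-byte big-endian encoding of the dollar amount; both choices are ours, for illustration only.

```python
import hashlib
import secrets

def G(k: bytes, out_len: int) -> bytes:
    """Stand-in for a pseudorandom generator (assumption: SHAKE-256 keyed by k)."""
    return hashlib.shake_256(k).digest(out_len)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def enc(k: bytes, m: bytes) -> bytes:
    return xor(G(k, len(m)), m)

dec = enc  # decryption XORs with the same pad

k = secrets.token_bytes(16)
amount = 500                                  # known to be below $1,000
c = enc(k, amount.to_bytes(4, "big"))

# The adversary flips the 11th least significant bit (value 2**10 = 1024)
# of the ciphertext, knowing neither k nor the exact amount.
mask = (1 << 10).to_bytes(4, "big")
c_forged = xor(c, mask)

recovered = int.from_bytes(dec(k, c_forged), "big")
print(recovered)  # 1524
```

Since the original amount is below 1,024, the flipped bit was a 0, so the decrypted amount is exactly $1,024 larger than intended.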
Encryption using block ciphers. The attack described above utilizes the fact that flipping a single bit in a ciphertext keeps the underlying plaintext unchanged except for the corresponding bit (which is also flipped). The same attack applies to the OFB- and CTR-mode encryption schemes, which also encrypt messages by XORing them with a pseudorandom stream (albeit one that changes each time a message is encrypted). We thus see that even using CPA-secure encryption is not enough to prevent message tampering.
One may hope that attacking ECB- or CBC-mode encryption would be more difficult since decryption in these cases involves inverting a (strong) pseudorandom permutation F, and we expect that F_k^{-1}(x) and F_k^{-1}(x′) will be completely uncorrelated even if x and x′ differ in only a single bit. (Of course, ECB mode does not even guarantee the most basic notion of secrecy, but that does not matter for the present discussion.) Nevertheless, single-bit modifications of a ciphertext still cause partially predictable changes in the plaintext. For example, when using ECB mode, flipping a bit in the ith block of the ciphertext affects only the ith block of the plaintext; all other blocks remain unchanged. Although the effect on the ith block of the plaintext may be impossible to predict, changing that one block (while leaving everything else unchanged) may represent a harmful attack. Moreover, the order of plaintext blocks can be changed (without garbling any block) by simply changing the order of the corresponding ciphertext blocks, and the message can be truncated by just dropping ciphertext blocks.
Similarly, when using CBC mode, flipping the jth bit of the IV changes only the jth bit of the first message block m_1 (since m_1 := F_k^{-1}(c_1) ⊕ IV′, where IV′ is the modified IV); all plaintext blocks other than the first remain unchanged (since the ith block of the plaintext is computed as m_i := F_k^{-1}(c_i) ⊕ c_{i−1}, and blocks c_i and c_{i−1} have not been modified). Therefore, the first block of a CBC-encrypted message can be changed arbitrarily. This is a serious concern in practice since the first block often contains important header information.
Finally, note that all encryption schemes we have seen thus far have the property that every ciphertext (perhaps satisfying some length constraint) corresponds to some valid message. So it is trivial for an adversary to “spoof” a message on behalf of one of the communicating parties (by sending some arbitrary ciphertext) even if the adversary has no idea what the underlying message will be. As we will see when we formally define authenticated encryption in Section 4.5, even an attack of this sort should be ruled out.
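The CBC observation (modifying the IV changes exactly the corresponding bits of the first plaintext block) can be checked with a toy experiment. The sketch below assumes a 4-round Feistel network built from HMAC-SHA256 as a stand-in for the pseudorandom permutation F; this toy cipher, the helper names, and the sample messages are illustrative assumptions, not a recommended construction.

```python
import hashlib
import hmac
import secrets

BLOCK, HALF = 16, 8  # block length and half-block length, in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(k: bytes, i: int, half: bytes) -> bytes:
    return hmac.new(k, bytes([i]) + half, hashlib.sha256).digest()[:HALF]

def F(k: bytes, x: bytes) -> bytes:
    """Toy stand-in for F_k: a 4-round Feistel network."""
    L, R = x[:HALF], x[HALF:]
    for i in range(4):
        L, R = R, xor(L, _round(k, i, R))
    return L + R

def F_inv(k: bytes, y: bytes) -> bytes:
    L, R = y[:HALF], y[HALF:]
    for i in reversed(range(4)):
        L, R = xor(R, _round(k, i, L)), L
    return L + R

def cbc_encrypt(k, iv, blocks):
    out, prev = [], iv
    for m in blocks:
        prev = F(k, xor(m, prev))        # c_i := F_k(m_i XOR c_{i-1})
        out.append(prev)
    return out

def cbc_decrypt(k, iv, cblocks):
    out, prev = [], iv
    for c in cblocks:
        out.append(xor(F_inv(k, c), prev))  # m_i := F_k^{-1}(c_i) XOR c_{i-1}
        prev = c
    return out

k, iv = secrets.token_bytes(16), secrets.token_bytes(BLOCK)
plaintext = [b"PAY 0100 TO BOB.", b"MEMO: RENT MARCH"]
ciphertext = cbc_encrypt(k, iv, plaintext)

# The adversary changes only the IV, turning the digit '0' at offset 4 into '9'.
delta = bytearray(BLOCK)
delta[4] = ord("0") ^ ord("9")
tampered = cbc_decrypt(k, xor(iv, bytes(delta)), ciphertext)
print(tampered)
```

The first recovered block now reads "PAY 9100 TO BOB." while the second block is untouched, even though the adversary never learned the key.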
4.2 Message Authentication Codes – Definitions
We have seen that, in general, encryption does not solve the problem of message integrity. Rather, an additional mechanism is needed that will enable the communicating parties to know whether or not a message was tampered with. The right tool for this task is a message authentication code (MAC).
The aim of a message authentication code is to prevent an adversary from modifying a message sent by one party to another, or from injecting a new message, without the receiver detecting that the message did not originate from the intended party. As in the case of encryption, this is only possible if the communicating parties share some secret that the adversary does not know (otherwise nothing can prevent an adversary from impersonating the party sending the message). Here, we will continue to consider the private-key setting where the parties share the same secret key.1
The Syntax of a Message Authentication Code
1In the web-cookie example discussed earlier, the merchant is (in effect) communicating “with itself” with the user acting as a communication channel. In that setting, the server alone needs to know the key since it is acting as both sender and receiver.
Before formally defining security of a message authentication code, we first define what a MAC is and how it is used. Two users who wish to communicate in an authenticated manner begin by generating and sharing a secret key k in advance of their communication. When one party wants to send a message m to the other, she computes a MAC tag (or simply a tag) t based on the message and the shared key, and sends the message m and the tag t to the other party. The tag is computed using a tag-generation algorithm denoted by Mac; thus, rephrasing what we have already said, the sender of a message m computes t ← Mac_k(m) and transmits (m, t) to the receiver. Upon receiving (m, t), the second party verifies whether t is a valid tag on the message m (with respect to the shared key) or not. This is done by running a verification algorithm Vrfy that takes as input the shared key as well as a message m and a tag t, and indicates whether the given tag is valid. Formally:
DEFINITION 4.1 A message authentication code (or MAC) consists of three probabilistic polynomial-time algorithms (Gen, Mac, Vrfy) such that:
1. The key-generation algorithm Gen takes as input the security parameter 1^n and outputs a key k with |k| ≥ n.
2. The tag-generation algorithm Mac takes as input a key k and a message m ∈ {0,1}^*, and outputs a tag t. Since this algorithm may be randomized, we write this as t ← Mac_k(m).
3. The deterministic verification algorithm Vrfy takes as input a key k, a message m, and a tag t. It outputs a bit b, with b = 1 meaning valid and b = 0 meaning invalid. We write this as b := Vrfy_k(m, t).
It is required that for every n, every key k output by Gen(1^n), and every m ∈ {0,1}^*, it holds that Vrfy_k(m, Mac_k(m)) = 1.
If there is a function l such that for every k output by Gen(1^n), algorithm Mac_k is only defined for messages m ∈ {0,1}^{l(n)}, then we call the scheme a fixed-length MAC for messages of length l(n).
As with private-key encryption, Gen(1^n) almost always simply chooses a uniform key k ∈ {0,1}^n, and we omit Gen in that case.
Canonical verification. For deterministic message authentication codes (that is, where Mac is a deterministic algorithm), the canonical way to perform verification is to simply re-compute the tag and check for equality. In other words, Vrfy_k(m, t) first computes t̃ := Mac_k(m) and then outputs 1 if and only if t̃ = t. Even for deterministic MACs, however, it is useful to define a separate Vrfy algorithm in order to explicitly distinguish the semantics of authenticating a message vs. verifying its authenticity.
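The syntax (Gen, Mac, Vrfy) together with canonical verification can be sketched in a few lines. We assume HMAC-SHA256 from the Python standard library as the deterministic tag-generation algorithm; the function names gen, mac, and vrfy are ours and simply mirror the notation of Definition 4.1.

```python
import hashlib
import hmac
import secrets

def gen(n: int = 128) -> bytes:
    """Gen(1^n): choose a uniform n-bit key (n assumed to be a multiple of 8)."""
    return secrets.token_bytes(n // 8)

def mac(k: bytes, m: bytes) -> bytes:
    """Mac_k(m): a deterministic tag (assumption: HMAC-SHA256)."""
    return hmac.new(k, m, hashlib.sha256).digest()

def vrfy(k: bytes, m: bytes, t: bytes) -> int:
    """Canonical verification: recompute the tag and check for equality."""
    t_tilde = mac(k, m)
    return 1 if hmac.compare_digest(t_tilde, t) else 0

k = gen()
m = b"cart: 2 x widget @ $9.99"
t = mac(k, m)
print(vrfy(k, m, t))                              # correctness: an honest tag verifies
print(vrfy(k, b"cart: 2 x widget @ $0.01", t))    # a modified message should be rejected
```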
Security of Message Authentication Codes
We now define the default notion of security for message authentication codes. The intuitive idea behind the definition is that no efficient adversary should be able to generate a valid tag on any “new” message that was not previously sent (and authenticated) by one of the communicating parties.
As with any security definition, to formalize this notion we have to define both the adversary’s power as well as what should be considered a “break.” As usual, we consider only probabilistic polynomial-time adversaries2 and so the real question is how we model the adversary’s interaction with the communicating parties. In the setting of message authentication, an adversary observing the communication between the honest parties may be able to see all the messages sent by these parties along with their corresponding MAC tags. The adversary may also be able to influence the content of these messages, whether directly or indirectly (if, e.g., external actions of the adversary affect the messages sent by the parties). This is true, for example, in the web cookie example from earlier, where the user’s own actions influence the contents of the cookie being stored on his computer.
2See Section 4.6 for a discussion of information-theoretic message authentication, where no computational restrictions are placed on the adversary.
To formally model the above we allow the adversary to request MAC tags for any messages of its choice. Formally, we give the adversary access to a MAC oracle Mack(·); the adversary can repeatedly submit any message m of its choice to this oracle, and is given in return a tag t ← Mack(m). (For a fixed-length MAC, only messages of the correct length can be submitted.)
We will consider it a “break” of the scheme if the adversary is able to output any message m along with a tag t such that: (1) t is a valid tag on the message m (i.e., Vrfyk(m,t) = 1), and (2) the adversary had not previously requested a MAC tag on the message m (i.e., from its oracle). The first condition means that if the adversary were to send (m, t) to one of the honest parties, then this party would be mistakenly fooled into thinking that m originated from the legitimate party since Vrfyk(m,t) = 1. The second condition is required because it is always possible for the adversary to just copy a message and MAC tag that were previously sent by one of the legitimate parties (and, of course, these would be accepted as valid). Such a replay attack is not considered a “break” of the message authentication code. This does not mean that replay attacks are not a security concern; they are, and we will have more to say about them below.
A MAC satisfying the level of security specified above is said to be existentially unforgeable under an adaptive chosen-message attack. “Existential unforgeability” refers to the fact that the adversary must not be able to forge a valid tag on any message, and “adaptive chosen-message attack” refers to the fact that the adversary is able to obtain MAC tags on arbitrary messages chosen adaptively during its attack.
Toward the formal definition, consider the following experiment for a message authentication code Π = (Gen, Mac, Vrfy), an adversary A, and value n for the security parameter:
The message authentication experiment Mac-forge_{A,Π}(n):
1. A key k is generated by running Gen(1^n).
2. The adversary A is given input 1^n and oracle access to Mac_k(·). The adversary eventually outputs (m, t). Let Q denote the set of all queries that A asked its oracle.
3. A succeeds if and only if (1) Vrfy_k(m, t) = 1 and (2) m ∉ Q. In that case the output of the experiment is defined to be 1.
A MAC is secure if no efficient adversary can succeed in the above experiment with non-negligible probability:
DEFINITION 4.2 A message authentication code Π = (Gen, Mac, Vrfy) is existentially unforgeable under an adaptive chosen-message attack, or just secure, if for all probabilistic polynomial-time adversaries A, there is a negligible function negl such that:
Pr[Mac-forge_{A,Π}(n) = 1] ≤ negl(n).
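The experiment Mac-forge can be written as a small test harness, which makes the role of the query set Q concrete. This is a sketch under our own assumptions: HMAC-SHA256 plays the role of Π, and the two adversaries shown (one that replays an oracle answer, one that guesses a tag) are hypothetical examples rather than meaningful attacks.

```python
import hashlib
import hmac
import secrets

def gen(n=128):
    return secrets.token_bytes(n // 8)

def mac(k, m):
    return hmac.new(k, m, hashlib.sha256).digest()

def vrfy(k, m, t):
    return int(hmac.compare_digest(mac(k, m), t))

def mac_forge(adversary, n=128) -> int:
    """Run the experiment once; return 1 iff the adversary forges."""
    k = gen(n)                      # step 1: k <- Gen(1^n)
    Q = set()                       # the set of all oracle queries

    def oracle(m: bytes) -> bytes:  # step 2: oracle access to Mac_k(.)
        Q.add(m)
        return mac(k, m)

    m, t = adversary(oracle)
    return int(vrfy(k, m, t) == 1 and m not in Q)   # step 3

def replay_adversary(oracle):
    """Outputs a pair obtained from the oracle; valid, but m is in Q."""
    m = b"transfer $1,000 to Bob"
    return m, oracle(m)

def guessing_adversary(oracle):
    """Outputs a new message with a uniformly random 32-byte tag."""
    oracle(b"transfer $1,000 to Bob")
    return b"transfer $10,000 to Bob", secrets.token_bytes(32)

print(mac_forge(replay_adversary), mac_forge(guessing_adversary))
```

The replaying adversary always loses because of condition (2); the guessing adversary satisfies condition (2) but passes condition (1) only with probability 2^{-256}.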
Is the definition too strong? The above definition is rather strong in two respects. First, the adversary is allowed to request MAC tags for any messages of its choice. Second, the adversary is considered to have “broken” the scheme if it can output a valid tag on any previously unauthenticated message. One might object that both these components of the definition are unrealistic and overly strong: in “real-world” usage of a MAC, the honest parties would only authenticate “meaningful” messages (over which the adversary might have only limited control), and similarly it should only be considered a breach of security if the adversary can forge a valid tag on a “meaningful” message. Why not tailor the definition to capture this?
The crucial point is that what constitutes a meaningful message is entirely application dependent. While some applications of a MAC may only ever authenticate English-text messages, other applications may authenticate spreadsheet files, others database entries, and others raw data. Protocols may also be designed where anything will be authenticated; in fact, certain protocols for entity authentication do exactly this. By making the definition of security for MACs as strong as possible, we ensure that secure MACs are broadly applicable for a wide range of purposes, without having to worry about compatibility of the MAC with the semantics of the application.
Replay attacks. We emphasize that the above definition, and message authentication codes on their own, offer no protection against replay attacks whereby a previously sent message (and its MAC tag) is replayed to an honest party. Nevertheless, replay attacks are a serious concern! Consider again the scenario where a user (say, Alice) sends a request to her bank to transfer $1,000 from her account to some other user (say, Bob). In doing so, Alice can compute a MAC tag and append it to the request so the bank knows the request is authentic. If the MAC is secure, Bob will be unable to intercept the request and change the amount to $10,000 because this would involve forging a valid tag on a previously unauthenticated message. However, nothing prevents Bob from intercepting Alice’s message and replaying it ten times to the bank. If the bank accepts each of these messages, the net effect is that $10,000 will be transferred to Bob’s account rather than the desired $1,000.
Despite the real threat that replay attacks represent, a MAC by itself cannot protect against such attacks since the definition of a MAC (Definition 4.1) does not incorporate any notion of state into the verification algorithm (and so every time a valid pair (m, t) is presented to the verification algorithm, it will always output 1). Rather, protection against replay attacks (if such protection is necessary at all) must be handled by some higher-level application. The reason the definition of a MAC is structured this way is, once again, because we are unwilling to assume any semantics regarding applications that use MACs; in particular, the decision as to whether or not a replayed message should be treated as “valid” may be application dependent.
Two common techniques for preventing replay attacks are to use sequence numbers (also known as counters) or time-stamps. The first approach, described (in a more general context) in Section 4.5.3, requires the communicating users to maintain (synchronized) state, and can be problematic when users communicate over a lossy channel where messages are occasionally dropped (though this problem can be mitigated). In the second approach, using time-stamps, the sender prepends the current time T (say, to the nearest millisecond) to the message before authenticating, and sends T along with the message and the resulting tag t. When the receiver obtains T, m, t, it verifies that t is a valid tag on T∥m and that T is within some acceptable clock skew from the current time T′ at the receiver. This method has certain drawbacks as well, including the need for the sender and receiver to maintain closely synchronized clocks, and the possibility that a replay attack can still take place if it is done quickly enough (specifically, within the acceptable time window).
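The time-stamp approach can be sketched as follows. We assume HMAC-SHA256 as the MAC, an 8-byte big-endian encoding of T in milliseconds (so that T∥m parses unambiguously), and an acceptable skew of 2 seconds; the clock value is passed in as a parameter so the behavior is easy to examine. All names and constants here are illustrative.

```python
import hashlib
import hmac
import struct

MAX_SKEW_MS = 2000  # assumed acceptable clock skew

def mac(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()

def send(k: bytes, m: bytes, now_ms: int):
    """Sender: authenticate T || m and transmit (T, m, t)."""
    T = struct.pack(">Q", now_ms)
    return T, m, mac(k, T + m)

def receive(k: bytes, T: bytes, m: bytes, t: bytes, now_ms: int) -> bool:
    """Receiver: accept iff t is valid on T || m and T is within the allowed skew."""
    if len(T) != 8 or not hmac.compare_digest(mac(k, T + m), t):
        return False
    (sent_ms,) = struct.unpack(">Q", T)
    return abs(now_ms - sent_ms) <= MAX_SKEW_MS

k = b"\x01" * 16
T, m, t = send(k, b"transfer $1,000 to Bob", now_ms=1_000_000)

print(receive(k, T, m, t, now_ms=1_000_300))   # fresh message
print(receive(k, T, m, t, now_ms=1_060_000))   # replay one minute later
print(receive(k, T, m, t, now_ms=1_001_500))   # replay inside the window
```

The last call illustrates the drawback noted above: a replay that arrives within the acceptable window is still accepted.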
Strong MACs. As defined, a secure MAC ensures that an adversary cannot generate a valid tag on a new message that was never previously authenticated. But it does not rule out the possibility that an attacker might be able to generate a new tag on a previously authenticated message. That is, a MAC guarantees that if an attacker learns tags t_1, . . . on messages m_1, . . ., then it will not be able to forge a valid tag t on any message m ∉ {m_1, . . .}. However, it may be possible for an adversary to “forge” a different valid tag t′_1 ≠ t_1 on the message m_1. In general, this type of adversarial behavior is not a concern. Nevertheless, in some settings it is useful to consider a stronger definition of security for MACs where such behavior is ruled out.
Formally, we consider a modified experiment Mac-sforge that is defined in exactly the same way as Mac-forge, except that now the set Q contains pairs of oracle queries and their associated responses. (That is, (m, t) ∈ Q if A queried Mac_k(m) and received in response the tag t.) The adversary A succeeds (and experiment Mac-sforge evaluates to 1) if and only if A outputs (m, t) such that Vrfy_k(m, t) = 1 and (m, t) ∉ Q.
DEFINITION 4.3 A message authentication code Π = (Gen, Mac, Vrfy) is strongly secure, or a strong MAC, if for all probabilistic polynomial-time adversaries A, there is a negligible function negl such that:
Pr[Mac-sforge_{A,Π}(n) = 1] ≤ negl(n).
It is not hard to see that if a secure MAC uses canonical verification then it is also strongly secure. This is important since all real-world MACs use canonical verification. We leave the proof of the following as an exercise.
PROPOSITION 4.4 Let Π = (Gen, Mac, Vrfy) be a secure MAC that uses canonical verification. Then Π is a strong MAC.
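To see why the hypothesis of canonical verification matters, it may help to look at a contrived scheme that does not use it. The sketch below is our own hypothetical example (not from the text): Mac appends one random byte to an HMAC-SHA256 tag and Vrfy ignores that byte. Forging a tag on a new message is as hard as for the underlying MAC, yet a second valid tag on an already authenticated message is trivial to produce, so the scheme is not strong.

```python
import hashlib
import hmac
import secrets

def base_mac(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()

def mac_padded(k: bytes, m: bytes) -> bytes:
    """Randomized Mac: a 32-byte tag followed by one random byte."""
    return base_mac(k, m) + secrets.token_bytes(1)

def vrfy_padded(k: bytes, m: bytes, t: bytes) -> int:
    """Non-canonical Vrfy: checks the first 32 bytes and ignores the last one."""
    return int(len(t) == 33 and hmac.compare_digest(base_mac(k, m), t[:32]))

k = secrets.token_bytes(16)
m1 = b"message one"
t1 = mac_padded(k, m1)            # obtained from the MAC oracle

# Strong forgery: same message, different tag, never returned by the oracle.
t1_prime = t1[:32] + bytes([t1[32] ^ 0x01])
print(vrfy_padded(k, m1, t1_prime), t1_prime != t1)
```

In the experiment Mac-sforge the pair (m1, t1_prime) is not in Q, so this adversary succeeds with probability 1, whereas in Mac-forge the same output does not count because m1 ∈ Q.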
Verification Queries
Definitions 4.2 and 4.3 give the adversary access to a MAC oracle, which corresponds to a real-world adversary who can influence an honest sender to generate a tag for some message m. One could also consider an adversary who interacts with an honest receiver, sending m′,t′ to the receiver to learn whether Vrfyk(m′,t′) = 1. Such an adversary could be captured formally in the natural way by giving the adversary in the above definitions access to a verification oracle as well.
A definition that incorporates a verification oracle in this way is, perhaps, the “right” way to define security for message authentication codes. It turns out, however, that for MACs that use canonical verification it makes no difference: any such MAC that satisfies Definition 4.2 also satisfies the definitional variant in which verification queries are allowed. Similarly, any strong MAC automatically also remains secure even in a setting where verification queries are possible. (This, in fact, serves as one motivation for the definition of strong security for MACs.) For MACs that do not use canonical verification, however, allowing verification queries can make a difference; see Exercises 4.2 and 4.3. Since most MACs covered in this book (as well as MACs used in practice) use canonical verification, we use the traditional definitions that omit access to a verification oracle.
A potential timing attack. One issue not addressed by the above is the possibility of carrying out a timing attack on MAC verification. Here, we consider an adversary who can send message/tag pairs to the receiver (thus using the receiver as a verification oracle) and learn not only whether the receiver accepts or rejects, but also the time it takes for the receiver to make this decision. We show that if such an attack is possible then a natural implementation of MAC verification leads to an easily exploitable vulnerability.
(Note that in our usual cryptographic definitions of security, the attacker learns only the output of the oracles it has access to, but nothing else. The attack we describe here, which is an example of a side-channel attack, shows that certain real-world attacks are not captured by the usual definitions.)
Concretely, assume a MAC using canonical verification. To verify a tag t on a message m, the receiver computes t′ := Mac_k(m) and then compares t′ to t, outputting 1 if and only if t′ and t are equal. Assume this comparison is implemented using a standard routine (like strcmp in C) that compares t and t′ one byte at a time, and rejects as soon as the first unequal byte is encountered. The observation is that, when implemented in this way, the time to reject differs depending on the position of the first unequal byte.
It is possible to use this seemingly inconsequential information to forge a tag on any desired message m. The basic idea is this: say the attacker knows the first i bytes of the correct tag for m. (At the outset, i = 0.) The attacker will learn the next byte of the correct tag by sending (m, t0), . . . , (m, t255) to the receiver, where tj is the string with the first i bytes set correctly, the (i + 1)st byte equal to j (in hexadecimal), and the remaining bytes set to 0x00. All of these tags will likely be rejected (if not, then the attacker succeeds anyway); however, for exactly one of these tags the first (i + 1) bytes will match the correct tag and rejection will take slightly longer than the rest. If tj is the tag that caused rejection to take the longest, the attacker learns that the (i + 1)st byte of the correct tag is j. In this way, the attacker learns each byte of the correct tag using at most 256 queries to the verification oracle. For a 16-byte tag, this attack requires only 4096 queries in the worst case.
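The byte-by-byte recovery just described can be simulated without measuring real time. In the sketch below, which is a model under our own assumptions, the verification oracle returns the number of byte comparisons performed before it stops, standing in for the observable rejection time; the MAC is HMAC-SHA256 truncated to 16 bytes, and all names are hypothetical.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 16
_key = secrets.token_bytes(16)     # known only to the receiver
stats = {"queries": 0}

def mac(m: bytes) -> bytes:
    return hmac.new(_key, m, hashlib.sha256).digest()[:TAG_LEN]

def leaky_verify(m: bytes, t: bytes):
    """Early-exit comparison; 'steps' models the time taken to decide."""
    stats["queries"] += 1
    steps = 0
    for a, b in zip(mac(m), t):
        steps += 1
        if a != b:
            return False, steps
    return True, steps

def forge(m: bytes) -> bytes:
    known = b""                                    # first i bytes of the correct tag
    for i in range(TAG_LEN):
        best_j, best_steps = 0, -1
        for j in range(256):
            guess = known + bytes([j]) + b"\x00" * (TAG_LEN - i - 1)
            accepted, steps = leaky_verify(m, guess)
            if accepted:
                return guess                       # the guess is a valid tag
            if steps > best_steps:                 # slowest rejection so far
                best_j, best_steps = j, steps
        known += bytes([best_j])
    return known

forged = forge(b"firmware-update.bin")
print(leaky_verify(b"firmware-update.bin", forged)[0], stats["queries"])
```

The attacker never sees the key, yet obtains a valid tag after at most 16 · 256 = 4096 queries.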
One might wonder whether this attack is realistic, as it requires access to a verification oracle as well as the ability to measure the difference in time taken to compare i vs. i + 1 bytes. In fact, exactly such attacks have been carried out against real systems! As just one example, MACs were used to verify code updates in the Xbox 360, and the implementation of MAC verification used there had a difference of 2.2 milliseconds between rejection times. Attackers were able to exploit this and load pirated games onto the hardware.
Based on the above, we conclude that MAC verification should use time-independent string comparison that always compares all bytes.
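In Python, the standard library provides hmac.compare_digest for this purpose. The second function below is only a sketch of the underlying idea (accumulate differences over all bytes instead of exiting early); as interpreted Python code it should not be assumed to run in constant time, and the function names are ours.

```python
import hmac

def tags_equal(expected_tag: bytes, received_tag: bytes) -> bool:
    """Preferred: the library routine designed to resist timing analysis."""
    return hmac.compare_digest(expected_tag, received_tag)

def compare_all_bytes(a: bytes, b: bytes) -> bool:
    """Illustration of the logic: always examine every byte before deciding."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y          # stays 0 only if every pair of bytes matches
    return diff == 0

print(tags_equal(b"\x01\x02\x03", b"\x01\x02\x03"))
print(compare_all_bytes(b"\x01\x02\x03", b"\x01\x02\x04"))
```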
4.3 Constructing Secure Message Authentication Codes
4.3.1 A Fixed-Length MAC
Pseudorandom functions are a natural tool for constructing secure message authentication codes. Intuitively, if the MAC tag t is obtained by applying a pseudorandom function to the message m, then forging a tag on a previously unauthenticated message requires the adversary to correctly guess the value of the pseudorandom function at a “new” input point. The probability of guessing the value of a random function on a new point is 2^{-n} (if the output length of the function is n). The probability of guessing such a value for a pseudorandom function can be only negligibly greater.
The above idea, shown in Construction 4.5, works for constructing a secure fixed-length MAC for messages of length n (since our pseudorandom functions by default have n-bit block length). This is useful, but falls short of our goal. In Section 4.3.2, we show how to extend this to handle messages of arbitrary length. We explore more efficient constructions of MACs for arbitrary-length messages in Sections 4.4 and 5.3.2.
CONSTRUCTION 4.5
Let F be a pseudorandom function. Define a fixed-length MAC for messages of length n as follows:
• Mac: on input a key k ∈ {0,1}^n and a message m ∈ {0,1}^n, output the tag t := F_k(m). (If |m| ≠ |k| then output nothing.)
• Vrfy: on input a key k ∈ {0,1}^n, a message m ∈ {0,1}^n, and a tag t ∈ {0,1}^n, output 1 if and only if t = F_k(m). (If |m| ≠ |k|, then output 0.)
A fixed-length MAC from any pseudorandom function.
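Construction 4.5 translates almost line for line into code. In the sketch below we assume n = 128 and use HMAC-SHA256 truncated to n bits as a stand-in for the length-preserving pseudorandom function F; that instantiation is our assumption, since the construction itself works with any pseudorandom function.

```python
import hashlib
import hmac
import secrets

N_BYTES = 16  # n = 128 bits

def F(k: bytes, x: bytes) -> bytes:
    """Stand-in for F_k(x), mapping n-bit inputs to n-bit outputs."""
    return hmac.new(k, x, hashlib.sha256).digest()[:N_BYTES]

def mac(k: bytes, m: bytes):
    if len(m) != len(k):
        return None                 # "output nothing"
    return F(k, m)                  # t := F_k(m)

def vrfy(k: bytes, m: bytes, t: bytes) -> int:
    if len(m) != len(k):
        return 0
    return int(hmac.compare_digest(F(k, m), t))

k = secrets.token_bytes(N_BYTES)
m = b"sixteen byte msg"
t = mac(k, m)
print(vrfy(k, m, t), mac(k, b"too short"))
```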
THEOREM 4.6 If F is a pseudorandom function, then Construction 4.5 is a secure fixed-length MAC for messages of length n.
PROOF As in previous uses of pseudorandom functions, this proof follows the paradigm of first analyzing the security of the scheme using a truly random function, and then considering the result of replacing the truly random function with a pseudorandom one.
Let A be a probabilistic polynomial-time adversary. Consider the message authentication code Π̃ = (G̃en, M̃ac, Ṽrfy) which is the same as Π = (Mac, Vrfy) in Construction 4.5 except that a truly random function f is used instead of the pseudorandom function F_k. That is, G̃en(1^n) works by choosing a uniform function f ∈ Func_n, and M̃ac computes a tag just as Mac does except that f is used instead of F_k. It is immediate that
Pr[Mac-forge_{A,Π̃}(n) = 1] ≤ 2^{-n}    (4.1)
because for any message m ∉ Q, the value t = f(m) is uniformly distributed in {0,1}^n from the point of view of the adversary A.
We next show that there is a negligible function negl such that
|Pr[Mac-forge_{A,Π}(n) = 1] − Pr[Mac-forge_{A,Π̃}(n) = 1]| ≤ negl(n);    (4.2)
combined with Equation (4.1), this shows that
Pr[Mac-forge_{A,Π}(n) = 1] ≤ 2^{-n} + negl(n),
proving the theorem.
To prove Equation (4.2), we construct a polynomial-time distinguisher D that is given oracle access to some function, and whose goal is to determine whether this function is pseudorandom (i.e., equal to F_k for uniform k ∈ {0,1}^n) or random (i.e., equal to f for uniform f ∈ Func_n). To do this, D emulates the message authentication experiment for A and observes whether A succeeds in outputting a valid tag on a “new” message. If so, D guesses that its oracle is a pseudorandom function; otherwise, D guesses that its oracle is a random function. In detail:
Distinguisher D:
D is given input 1^n and access to an oracle O : {0,1}^n → {0,1}^n, and works as follows:
1. Run A(1^n). Whenever A queries its MAC oracle on a message m (i.e., whenever A requests a tag on a message m), answer this query in the following way:
Query O with m and obtain response t; return t to A.
2. When A outputs (m, t) at the end of its execution, do:
(a) Query O with m and obtain response t̂.
(b) If (1) t̂ = t and (2) A never queried its MAC oracle on m, then output 1; otherwise, output 0.
It is clear that D runs in polynomial time.
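The reduction can also be read as a program: D wraps its own oracle O so that it looks like a MAC oracle to A. The following sketch assumes n = 128, truncated HMAC-SHA256 as the pseudorandom function, and a lazily sampled table as the random function; the adversary shown is a hypothetical one that merely replays an oracle answer.

```python
import hashlib
import hmac
import secrets

N = 16  # n = 128 bits, in bytes

def prf_oracle():
    """O = F_k for a uniform key k (assumption: truncated HMAC-SHA256)."""
    k = secrets.token_bytes(N)
    return lambda x: hmac.new(k, x, hashlib.sha256).digest()[:N]

def random_function_oracle():
    """O = f for a uniform f in Func_n, sampled lazily point by point."""
    table = {}
    def f(x):
        if x not in table:
            table[x] = secrets.token_bytes(N)
        return table[x]
    return f

def D(O, A) -> int:
    asked = set()
    def mac_oracle(m):              # step 1: answer A's MAC queries using O
        asked.add(m)
        return O(m)
    m, t = A(mac_oracle)            # step 2: A outputs (m, t)
    t_hat = O(m)                    # (a) query O on m
    return int(t_hat == t and m not in asked)   # (b)

def replaying_A(mac_oracle):
    m = bytes(N)
    return m, mac_oracle(m)

print(D(prf_oracle(), replaying_A), D(random_function_oracle(), replaying_A))
```

Against either oracle this particular A makes D output 0, since its output message was queried; D outputs 1 only when A produces a genuine forgery.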
Notice that if D’s oracle is a pseudorandom function, then the view of A when run as a sub-routine by D is distributed identically to the view of A in experiment Mac-forge_{A,Π}(n). Furthermore, D outputs 1 exactly when Mac-forge_{A,Π}(n) = 1. Therefore
Pr[D^{F_k(·)}(1^n) = 1] = Pr[Mac-forge_{A,Π}(n) = 1],
where k ∈ {0,1}^n is chosen uniformly in the above. If D’s oracle is a random function, then the view of A when run as a sub-routine by D is distributed identically to the view of A in experiment Mac-forge_{A,Π̃}(n), and again D outputs 1 exactly when Mac-forge_{A,Π̃}(n) = 1. Thus,
Pr[D^{f(·)}(1^n) = 1] = Pr[Mac-forge_{A,Π̃}(n) = 1],
where f ∈ Func_n is chosen uniformly.
Since F is a pseudorandom function and D runs in polynomial time, there exists a negligible function negl such that
|Pr[D^{F_k(·)}(1^n) = 1] − Pr[D^{f(·)}(1^n) = 1]| ≤ negl(n).
This implies Equation (4.2), completing the proof of the theorem.
4.3.2 Domain Extension for MACs
Construction 4.5 is important in that it shows a general paradigm for constructing secure message authentication codes from pseudorandom functions. Unfortunately, the construction is only capable of handling fixed-length messages that are furthermore rather short.3 These limitations are unacceptable in most applications. We show here how a general MAC, handling arbitrary-length messages, can be constructed from any fixed-length MAC for messages of length n. The construction we show is not very efficient and is unlikely to be used in practice. Indeed, far more efficient constructions of secure MACs are known, as we discuss in Sections 4.4 and 5.3.2. We include the present construction for its simplicity and generality.
3Given a pseudorandom function taking arbitrary-length inputs, Construction 4.5 would yield a secure MAC for messages of arbitrary length. Likewise, a pseudorandom function with a larger domain would yield a secure MAC for longer messages. However, existing practical pseudorandom functions (i.e., block ciphers) take short, fixed-length inputs.
Let Π′ = (Mac′, Vrfy′) be a secure fixed-length MAC for messages of length n. Before presenting the construction of a MAC for arbitrary-length messages based on Π′, we rule out some simple ideas and describe some canonical attacks that must be prevented. Below, we parse the message m to be authenticated as a sequence of blocks m_1, . . . , m_d; note that, since our aim is to handle messages of arbitrary length, d can vary from message to message.
1. A natural first idea is to simply authenticate each block separately, i.e., compute t_i := Mac′_k(m_i) for all i, and output ⟨t_1, . . . , t_d⟩ as the tag. This prevents an adversary from sending any previously unauthenticated block without being detected. However, it does not prevent a block reordering attack in which the attacker shuffles the order of blocks in an authenticated message. Specifically, if ⟨t_1, t_2⟩ is a valid tag on the message m_1, m_2 (with m_1 ≠ m_2), then ⟨t_2, t_1⟩ is a valid tag on the (different) message m_2, m_1 (something that is not allowed by Definition 4.2).
2. We can prevent the previous attack by authenticating a block index along with each block. That is, we now compute t_i = Mac′_k(i∥m_i) for all i, and output ⟨t_1, . . . , t_d⟩ as the tag. (Note that the block length |m_i| will have to change.) This does not prevent a truncation attack whereby an attacker simply drops blocks from the end of the message (and drops the corresponding blocks of the tag as well).
3. The truncation attack can be thwarted by additionally authenticating the message length along with each block. (Authenticating the message length as a separate block does not work. Do you see why?) That is, compute t_i = Mac′_k(l∥i∥m_i) for all i, where l denotes the length of the message in bits. (Once again, the block length |m_i| will need to decrease.) This scheme is vulnerable to a “mix-and-match” attack where the adversary combines blocks from different messages. For example, if the adversary obtains tags ⟨t_1, . . . , t_d⟩ and ⟨t′_1, . . . , t′_d⟩ on messages m = m_1, . . . , m_d and m′ = m′_1, . . . , m′_d, respectively, it can output the valid tag ⟨t_1, t′_2, t_3, t′_4, . . .⟩ on the message m_1, m′_2, m_3, m′_4, . . ..
We can prevent this last attack by also including a random “message identifier” along with each block that prevents blocks from different messages from being combined. This leads us to Construction 4.7.
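The first of the flawed ideas above (authenticating each block separately) is easy to break in code, which may make the reordering and truncation attacks more tangible. The sketch assumes 16-byte blocks and truncated HMAC-SHA256 as the fixed-length MAC Mac′; the function names are ours.

```python
import hashlib
import hmac
import secrets

BLOCK = 16

def mac_fixed(k: bytes, block: bytes) -> bytes:
    """Stand-in for Mac'_k on a single block."""
    assert len(block) == BLOCK
    return hmac.new(k, block, hashlib.sha256).digest()[:BLOCK]

def split(m: bytes):
    return [m[i:i + BLOCK] for i in range(0, len(m), BLOCK)]

def naive_mac(k: bytes, m: bytes):
    """Flawed idea 1: t_i := Mac'_k(m_i) for each block independently."""
    return [mac_fixed(k, b) for b in split(m)]

def naive_vrfy(k: bytes, m: bytes, tags) -> int:
    blocks = split(m)
    return int(len(tags) == len(blocks) and
               all(hmac.compare_digest(mac_fixed(k, b), t)
                   for b, t in zip(blocks, tags)))

k = secrets.token_bytes(16)
m = b"PAY ALICE $0010." + b"PAY BOB   $9000."
tags = naive_mac(k, m)                      # one oracle query on m

# Block reordering: swap the blocks and swap the tags accordingly.
m_swapped = m[BLOCK:] + m[:BLOCK]
print(naive_vrfy(k, m_swapped, tags[::-1]))

# Truncation: drop the last block and its tag.
print(naive_vrfy(k, m[:BLOCK], tags[:1]))
```

Both forged pairs verify even though neither message was ever submitted to the MAC oracle.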
CONSTRUCTION 4.7
Let Π′ = (Mac′, Vrfy′) be a fixed-length MAC for messages of length n. Define a MAC as follows:
• Mac: on input a key k ∈ {0,1}^n and a message m ∈ {0,1}^* of (nonzero) length l < 2^{n/4}, parse m into d blocks m_1, . . . , m_d, each of length n/4. (The final block is padded with 0s if necessary.) Choose a uniform identifier r ∈ {0,1}^{n/4}.
For i = 1, . . . , d, compute t_i ← Mac′_k(r∥l∥i∥m_i), where i, l are encoded as strings of length n/4.† Output the tag t := ⟨r, t_1, . . . , t_d⟩.
• Vrfy: on input a key k ∈ {0,1}^n, a message m ∈ {0,1}^* of length l < 2^{n/4}, and a tag t = ⟨r, t_1, . . . , t_{d′}⟩, parse m into d blocks m_1, . . . , m_d, each of length n/4. (The final block is padded with 0s if necessary.) Output 1 if and only if d′ = d and Vrfy′_k(r∥l∥i∥m_i, t_i) = 1 for 1 ≤ i ≤ d.
† Note that i and l can be encoded using n/4 bits because i, l < 2^{n/4}.
A MAC for arbitrary-length messages from any fixed-length MAC.
(Technically, the scheme only handles messages of length less than 2^{n/4}. Asymptotically, since this is an exponential bound, honest parties will not authenticate messages that long and any polynomial-time adversary could not submit messages that long to its MAC oracle. In practice, when a concrete value of n is fixed, one must ensure that this bound is acceptable.)
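Construction 4.7 can be sketched in code as follows. This is an illustration, not a recommended implementation: it assumes n = 128 (so every field is 32 bits), restricts messages to whole bytes, and uses HMAC-SHA256 as a stand-in for Mac′; the names `Q`, `blocks_of` and `block_inputs` are hypothetical.

```python
import hashlib
import hmac
import os

Q = 4  # n/4 in bytes: illustrative choice n = 128 bits, so each field has 32 bits

def mac_prime(k: bytes, x: bytes) -> bytes:
    # Stand-in for Mac'_k on n-bit inputs (assumption: HMAC-SHA256).
    assert len(x) == 4 * Q
    return hmac.new(k, x, hashlib.sha256).digest()

def blocks_of(m: bytes):
    # Parse m into blocks of n/4 bits, padding the final block with 0s.
    bs = [m[j:j + Q] for j in range(0, len(m), Q)]
    bs[-1] = bs[-1].ljust(Q, b"\x00")
    return bs

def block_inputs(r: bytes, m: bytes):
    # Build the strings r || l || i || m_i that are fed to Mac'_k.
    l = 8 * len(m)  # message length in bits
    if len(r) != Q or not 0 < l < 2 ** (8 * Q):
        raise ValueError("bad identifier, or length not in (0, 2^(n/4))")
    return [r + l.to_bytes(Q, "big") + i.to_bytes(Q, "big") + b
            for i, b in enumerate(blocks_of(m), 1)]

def mac(k: bytes, m: bytes):
    r = os.urandom(Q)  # uniform message identifier
    return r, [mac_prime(k, x) for x in block_inputs(r, m)]

def vrfy(k: bytes, m: bytes, tag) -> bool:
    r, ts = tag
    try:
        xs = block_inputs(r, m)
    except ValueError:
        return False
    return len(ts) == len(xs) and all(
        hmac.compare_digest(mac_prime(k, x), t) for x, t in zip(xs, ts))

k = os.urandom(16)
m = b"attack at dawn!!"
tag = mac(k, m)
accepted = vrfy(k, m, tag)
```

Note the cost that the text points out later: the tag contains one Mac′ output per n/4 bits of message, plus the identifier r.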
THEOREM 4.8 If Π′ is a secure fixed-length MAC for messages of length n, then Construction 4.7 is a secure MAC (for arbitrary-length messages).
PROOF The intuition is that as long as Π′ is secure, an adversary cannot introduce a new block with a valid tag. Furthermore, the extra information included in each block prevents the various attacks (dropping blocks, re-ordering blocks, etc.) sketched earlier. We will prove security by essentially showing that these attacks are the only ones possible.
Let Π be the MAC given by Construction 4.7, and let A be a probabilistic polynomial-time adversary. We show that Pr[Mac-forge_{A,Π}(n) = 1] is negligible. We first introduce some notation that will be used in the proof. Let Repeat denote the event that the same random identifier appears in two of the tags returned by the MAC oracle in experiment Mac-forge_{A,Π}(n). Letting (m, t = ⟨r, t_1, . . .⟩) denote the final output of A, where m = m_1, . . . has length l, we let NewBlock be the event that at least one of the blocks r∥l∥i∥m_i was never previously authenticated by Mac′ in the course of answering A’s Mac queries. (Note that, by construction of Π, it is easy to tell exactly which blocks are authenticated by Mac′_k when computing Mac_k(m).) Informally, NewBlock is the event that A tries to output a valid tag on a block that was never authenticated by the underlying fixed-length MAC Π′.
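We have

$$
\begin{aligned}
\Pr[\text{Mac-forge}_{A,\Pi}(n) = 1]
 &= \Pr[\text{Mac-forge}_{A,\Pi}(n) = 1 \wedge \text{Repeat}] \\
 &\quad + \Pr[\text{Mac-forge}_{A,\Pi}(n) = 1 \wedge \overline{\text{Repeat}} \wedge \text{NewBlock}] \\
 &\quad + \Pr[\text{Mac-forge}_{A,\Pi}(n) = 1 \wedge \overline{\text{Repeat}} \wedge \overline{\text{NewBlock}}] \\
 &\le \Pr[\text{Repeat}] \\
 &\quad + \Pr[\text{Mac-forge}_{A,\Pi}(n) = 1 \wedge \text{NewBlock}] \\
 &\quad + \Pr[\text{Mac-forge}_{A,\Pi}(n) = 1 \wedge \overline{\text{Repeat}} \wedge \overline{\text{NewBlock}}].
\end{aligned}
\tag{4.3}
$$

We show that the first two terms of Equation (4.3) are negligible, and the final term is 0. This implies Pr[Mac-forge_{A,Π}(n) = 1] is negligible, as desired.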
CLAIM 4.9 Pr[Repeat] is negligible.
PROOF Let q(n) be the number of MAC oracle queries made by A. To answer the ith oracle query of A, the oracle chooses r_i uniformly from a set of size 2^{n/4}. The probability of event Repeat is exactly the probability that r_i = r_j for some i ≠ j. Applying the “birthday bound” (Lemma A.15), we have that

$$\Pr[\text{Repeat}] \le \frac{q(n)^2}{2^{n/4}}.$$

Since A makes only polynomially many queries, this value is negligible.
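Although the bound on Pr[Repeat] is negligible asymptotically, it is worth evaluating for concrete parameters. The helper below is a small sketch (the name `repeat_bound` and the sample values of n and q are illustrative assumptions) that computes q²/2^{n/4} exactly.

```python
from fractions import Fraction

def repeat_bound(q: int, n: int) -> Fraction:
    """Upper bound q^2 / 2^(n/4) on Pr[Repeat]; assumes n is divisible by 4."""
    if n % 4 != 0:
        raise ValueError("n must be a multiple of 4")
    return Fraction(q * q, 2 ** (n // 4))

# With n = 128 the identifier has only 32 bits.
bound_small = repeat_bound(2 ** 10, 128)  # 2^20 / 2^32 = 2^-12
bound_large = repeat_bound(2 ** 16, 128)  # 2^32 / 2^32 = 1, so the bound is vacuous
```

For n = 128 the bound already becomes meaningless at about 2^16 queries, which illustrates why a concrete choice of n must be checked against the expected number of authenticated messages.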
We next consider the final term on the right-hand side of Equation (4.3). We argue that if Mac-forge_{A,Π}(n) = 1, but Repeat did not occur, then it must be the case that NewBlock occurred. That is, Mac-forge_{A,Π}(n) = 1 ∧ \overline{Repeat} implies NewBlock, and so

$$\Pr[\text{Mac-forge}_{A,\Pi}(n) = 1 \wedge \overline{\text{Repeat}} \wedge \overline{\text{NewBlock}}] = 0.$$
This is, in some sense, the heart of the proof.
Again let q = q(n) denote the number of MAC oracle queries made by A, and let r_i denote the random identifier used to answer the ith oracle query of A. If Repeat does not occur then the values r_1, . . . , r_q are distinct. Let (m, t = ⟨r, t_1, . . .⟩) be the output of A, with m = m_1, . . .. If r ∉ {r_1, . . . , r_q}, then NewBlock clearly occurs. If not, then r = r_j for some unique j, and the blocks r∥l∥1∥m_1, . . . could then not possibly have been authenticated during the course of answering any Mac queries other than the jth such query. Let m^{(j)} be the message that was used by A for its jth oracle query, and let l_j be its length. There are two cases to consider:
Case 1: l ≠ l_j. The blocks authenticated when answering the jth Mac query all have l_j ≠ l in the second position. So r∥l∥1∥m_1, in particular, was never authenticated in the course of answering the jth Mac query, and NewBlock occurs.
Case 2: l = l_j. If Mac-forge_{A,Π}(n) = 1, then we must have m ≠ m^{(j)}. Let m^{(j)} = m^{(j)}_1, . . .. Since m and m^{(j)} have equal length, there must be at least one index i for which m_i ≠ m^{(j)}_i. The block r∥l∥i∥m_i was then never authenticated in the course of answering the jth Mac query, and so NewBlock occurs. (Because i is included in the third position of the block, the block r∥l∥i∥m_i could only possibly have been authenticated if r∥l∥i∥m_i = r_j∥l_j∥i∥m^{(j)}_i, but this is not true since m_i ≠ m^{(j)}_i.)
To complete the proof of the theorem, we bound the second term on the right-hand side of Equation (4.3):
CLAIM 4.10 Pr[Mac-forgeA,Π(n) = 1 ∧ NewBlock] is negligible.
The claim relies on security of Π′. We construct a ppt adversary A′ who attacks the fixed-length MAC Π′ and succeeds in outputting a valid forgery on a previously unauthenticated message with probability
Pr[Mac-forgeA′ ,Π′ (n) = 1] ≥ Pr[Mac-forgeA,Π (n) = 1 ∧ NewBlock]. (4.4)
Security of Π′ means that the left-hand side is negligible, proving the claim. The construction of A′ is the obvious one and so we describe it briefly. A′ runs A as a sub-routine, and answers the request by A for a tag on m by choosing r ← {0,1}^{n/4} itself, parsing m appropriately, and making the necessary queries to its own MAC oracle Mac′_k(·). When A outputs (m, t = ⟨r, t_1, . . .⟩), then A′ checks whether NewBlock occurs (this is easy to do since A′ can keep track of all the queries it makes to its own oracle). If so, then A′ finds the first block r∥l∥i∥m_i that was never previously authenticated by Mac′ and outputs (r∥l∥i∥m_i, t_i). (If not, A′ outputs nothing.)
The view of A when run as a sub-routine by A′ is distributed identically to the view of A in experiment Mac-forgeA,Π(n), and so the probabilities of events Mac-forgeA,Π(n) = 1 and NewBlock do not change. If NewBlock occurs then A′ outputs a block r∥l∥i∥mi that was never previously authenticated by its own MAC oracle; if Mac-forgeA,Π(n) = 1 then the tag on every block is valid (with respect to Π′), and so in particular this is true for the block output by A′. This means that whenever Mac-forgeA,Π(n) = 1 and NewBlock
occur we have Mac-forgeA′ ,Π′ (n) = 1, proving Equation (4.4).
4.4 CBC-MAC
Theorems 4.6 and 4.8 show that it is possible to construct a secure message authentication code for arbitrary-length messages from a pseudorandom function taking inputs of fixed length n. This demonstrates, in principle, that secure MACs can be constructed from block ciphers. Unfortunately, the resulting construction is extremely inefficient: to compute a tag on a message of length dn, the block cipher is evaluated 4d times; the tag is more than 4dn bits long. Fortunately, far more efficient constructions are available. We explore one such construction here that relies solely on block ciphers, and another in Section 5.3.2 that uses an additional cryptographic primitive.
4.4.1 The Basic Construction
CBC-MAC is a standardized message authentication code used widely in practice. A basic version of CBC-MAC, secure when authenticating messages of any fixed length, is given as Construction 4.11. (See also Figure 4.1.) We caution that this basic scheme is not secure in the general case when messages of different lengths may be authenticated; see further discussion below.
CONSTRUCTION 4.11
Let F be a pseudorandom function, and fix a length function l > 0. The basic CBC-MAC construction is as follows:
• Mac: on input a key k ∈ {0,1}^n and a message m of length l(n)·n, do the following (we set l = l(n) in what follows):
1. Parse m as m = m_1, . . . , m_l where each m_i is of length n.
2. Set t_0 := 0^n. Then, for i = 1 to l: Set t_i := F_k(t_{i−1} ⊕ m_i).
Output t_l as the tag.
• Vrfy: on input a key k ∈ {0,1}^n, a message m, and a tag t, do: If m is not of length l(n)·n then output 0. Otherwise, output 1 if and only if t = Mac_k(m).
Basic CBC-MAC (for fixed-length messages).
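A minimal sketch of Construction 4.11 follows. It assumes a block length of n = 128 bits and, because the Python standard library has no block cipher, it uses HMAC-SHA256 truncated to n bits as a stand-in for the pseudorandom function F; in practice F would be a block cipher. The names `F`, `xor`, `cbc_mac` and `cbc_vrfy` are hypothetical.

```python
import hashlib
import hmac

N = 16  # block length n in bytes (illustrative choice: n = 128 bits)

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in PRF on n-bit inputs (assumption: truncated HMAC-SHA256).
    assert len(x) == N
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(k: bytes, m: bytes, num_blocks: int) -> bytes:
    # Mac: only messages of exactly num_blocks * n bits are handled.
    if len(m) != num_blocks * N:
        raise ValueError("message must consist of exactly num_blocks blocks")
    t = bytes(N)  # t_0 := 0^n
    for j in range(0, len(m), N):
        t = F(k, xor(t, m[j:j + N]))  # t_i := F_k(t_{i-1} XOR m_i)
    return t  # only the final value t_l is output

def cbc_vrfy(k: bytes, m: bytes, t: bytes, num_blocks: int) -> bool:
    # Vrfy: reject wrong lengths, otherwise recompute and compare.
    if len(m) != num_blocks * N:
        return False
    return hmac.compare_digest(cbc_mac(k, m, num_blocks), t)

key = bytes(16)
msg = bytes(range(32))  # two blocks
tag = cbc_mac(key, msg, 2)
```

The comparison uses `hmac.compare_digest` so that verification time does not depend on where the tags first differ.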
THEOREM 4.12 Let l be a polynomial. If F is a pseudorandom function, then Construction 4.11 is a secure MAC for messages of length l(n) · n.
The proof of Theorem 4.12 is somewhat involved. In the following section we will prove a more general result from which the above theorem follows.
Although Construction 4.11 can be extended in the obvious way to handle messages whose length is an arbitrary multiple of n, the construction is only secure when the length of the messages being authenticated is fixed and agreed upon in advance by the sender and receiver. (See Exercise 4.13.)
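One well-known way to see the problem with variable lengths is an extension forgery: given the tag t on a one-block message m, the two-block message m∥(m ⊕ t) has the same tag, since F_k(t ⊕ (m ⊕ t)) = F_k(m) = t. The sketch below checks this; as before, truncated HMAC-SHA256 is only a stand-in for F, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

N = 16  # block length in bytes (illustrative)

def F(k: bytes, x: bytes) -> bytes:
    # Stand-in PRF (assumption: truncated HMAC-SHA256).
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc(k: bytes, m: bytes) -> bytes:
    # Basic CBC-MAC chain applied to any nonempty multiple of n bits,
    # i.e., WITHOUT the fixed-length restriction of Construction 4.11.
    assert m and len(m) % N == 0
    t = bytes(N)
    for j in range(0, len(m), N):
        t = F(k, xor(t, m[j:j + N]))
    return t

k = os.urandom(N)
m = b"sixteen byte msg"          # one block; its tag is obtained legitimately
t = cbc(k, m)

forged = m + xor(m, t)           # two blocks, never authenticated
forgery_valid = hmac.compare_digest(cbc(k, forged), t)
```

The forger needs only one tag and no knowledge of the key, so `forgery_valid` holds for every key.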
The advantage of this construction over Construction 4.5, which also gives a fixed-length MAC, is that the present construction can authenticate longer messages. Compared to Construction 4.7, CBC-MAC is much more efficient, requiring only d block-cipher evaluations for a message of length dn, and with a tag of length n only.
CBC-MAC vs. CBC-mode encryption. CBC-MAC is similar to the CBC mode of operation. There are, however, some important differences:
1. CBC-mode encryption uses a random IV and this is crucial for security. In contrast, CBC-MAC uses no IV (alternately, it can be viewed as using the fixed value IV = 0^n) and this is also crucial for security. Specifically, CBC-MAC using a random IV is not secure.
2. In CBC-mode encryption all intermediate values ti (called ci in the case of CBC-mode encryption) are output by the encryption algorithm as part of the ciphertext, whereas in CBC-MAC only the final block is output as the tag. If CBC-MAC is modified to output all the {ti} obtained during the course of the computation then it is no longer secure.
In Exercise 4.14 you are asked to verify that the modifications of CBC-MAC discussed above are insecure. These examples illustrate the fact that harmless-looking modifications to cryptographic constructions can render them insecure. One should always implement a cryptographic construction exactly as specified and not introduce any variations (unless the variations themselves can be proven secure). Furthermore, it is essential to understand the construction being used. In many cases a cryptographic library provides a programmer with a “CBC function,” but does not distinguish between the use of this function for encryption or message authentication.
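The following sketch shows one possible forgery against each of the two modified schemes. It is an illustration under assumptions: truncated HMAC-SHA256 stands in for F, two-block messages are used, and all function names are hypothetical. With a random IV, XORing the same value into the IV and the first block leaves the tag unchanged; when all intermediate values are output, the pair (t_1, t_2) on (m_1, m_2) yields the valid tag (t_2, t_1) on the message (t_1 ⊕ m_2, t_2 ⊕ m_1).

```python
import hashlib
import hmac
import os

N = 16  # block length in bytes (illustrative)

def F(k, x):
    # Stand-in PRF (assumption: truncated HMAC-SHA256).
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def chain(k, iv, m):
    # Return all intermediate values t_1, ..., t_l of the CBC chain.
    ts, t = [], iv
    for j in range(0, len(m), N):
        t = F(k, xor(t, m[j:j + N]))
        ts.append(t)
    return ts

# Modification 1: random IV, tag = (IV, t_l).
def mac_random_iv(k, m):
    iv = os.urandom(N)
    return iv, chain(k, iv, m)[-1]

def vrfy_random_iv(k, m, tag):
    iv, t = tag
    return hmac.compare_digest(chain(k, iv, m)[-1], t)

# Modification 2: fixed IV = 0^n, but the tag is (t_1, ..., t_l).
def mac_all_outputs(k, m):
    return chain(k, bytes(N), m)

def vrfy_all_outputs(k, m, tag):
    return chain(k, bytes(N), m) == list(tag)

k = os.urandom(N)
m1, m2 = b"A" * N, b"B" * N
m = m1 + m2

# Forgery 1: shift the IV and the first block by the same delta.
iv, t = mac_random_iv(k, m)
delta = b"\x01" * N
forged1 = xor(m1, delta) + m2
forgery1_valid = vrfy_random_iv(k, forged1, (xor(iv, delta), t))

# Forgery 2: recombine the intermediate values of a single tag.
t1, t2 = mac_all_outputs(k, m)
forged2 = xor(t1, m2) + xor(t2, m1)
forgery2_valid = vrfy_all_outputs(k, forged2, [t2, t1])
```

Both forged messages differ from m, and neither was submitted to the tagging algorithm.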
FIGURE 4.1: Basic CBC-MAC (for fixed-length messages).
Secure CBC-MAC for arbitrary-length messages. We briefly describe two ways Construction 4.11 can be modified, in a provably secure fashion, to handle arbitrary-length messages. (Here for simplicity we assume that all messages being authenticated have length a multiple of n, and that Vrfy rejects any message whose length is not a multiple of n. In the following section we treat the more general case where messages can have arbitrary length.)
1. Prepend the message m with its length |m| (encoded as an n-bit string), and then compute basic CBC-MAC on the result; see Figure 4.2. Security of this variant follows from the results proved in the next section.
Note that appending |m| to the end of the message and then computing the basic CBC-MAC is not secure.
2. Change the scheme so that key generation chooses two independent, uniform keys k_1 ∈ {0,1}^n and k_2 ∈ {0,1}^n. Then to authenticate a message m, first compute the basic CBC-MAC of m using k_1 and let t be the result; output the tag t̂ := F_{k_2}(t).
The second option has the advantage of not needing to know the message length in advance (i.e., when beginning to compute the tag). However, it has the drawback of using two keys for F. Note that, at the expense of two additional applications of the pseudorandom function, it is possible to store a single key k and then derive keys k1 := Fk(1) and k2 := Fk(2) at the beginning of the computation. Despite this, in practice, the operation of initializing a key for a block cipher is considered relatively expensive. Therefore, requiring two different keys—even if they are derived on the fly—is less desirable.
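Both variants, together with the single-key derivation k_1 := F_k(1), k_2 := F_k(2), can be sketched as follows. The code is illustrative only: truncated HMAC-SHA256 stands in for F, messages are assumed to be nonempty multiples of the block length, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

N = 16  # block length n in bytes (illustrative)

def F(k, x):
    # Stand-in PRF (assumption: truncated HMAC-SHA256).
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc(k, m):
    # Basic CBC-MAC chain on a nonempty multiple of n bits.
    assert m and len(m) % N == 0
    t = bytes(N)
    for j in range(0, len(m), N):
        t = F(k, xor(t, m[j:j + N]))
    return t

def cbc_mac_len_prepend(k, m):
    # Variant 1: prepend |m| (in bits, as an n-bit string), then basic CBC-MAC.
    return cbc(k, (8 * len(m)).to_bytes(N, "big") + m)

def derive_keys(k):
    # k1 := F_k(1), k2 := F_k(2), with 1 and 2 encoded as n-bit strings.
    return F(k, (1).to_bytes(N, "big")), F(k, (2).to_bytes(N, "big"))

def cbc_mac_two_key(k, m):
    # Variant 2: apply F under a second key to the basic CBC-MAC output.
    k1, k2 = derive_keys(k)
    return F(k2, cbc(k1, m))

k = os.urandom(N)
m = b"sixteen byte msg"

# An extension m || (m XOR t), which keeps the tag of basic CBC-MAC
# unchanged when lengths may vary, fails against both variants.
extension_fails = [
    mac(k, m + xor(m, mac(k, m))) != mac(k, m)
    for mac in (cbc_mac_len_prepend, cbc_mac_two_key)
]
```

Variant 1 must know |m| before processing the first block, whereas variant 2 can process a stream and apply the final call to F once the message ends.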
4.4.2 *Proof of Security
In this section we prove security of different variants of CBC-MAC. We begin by summarizing the results, and then give the details of the proof. Before beginning, we remark that the proof in this section is quite involved, and is intended for advanced readers.
FIGURE 4.2: A variant of CBC-MAC secure for authenticating arbitrary-length messages.
Throughout this section, fix a keyed function F that, for security parameter n, maps n-bit keys and n-bit inputs to n-bit outputs. We define a keyed function CBC that, for security parameter n, maps n-bit keys and inputs in ({0,1}^n)^* (i.e., strings whose length is a multiple of n) to n-bit outputs. This function is defined as

$$\mathsf{CBC}_k(x_1, \ldots, x_l) \stackrel{\text{def}}{=} F_k\bigl(F_k\bigl(\cdots F_k(F_k(x_1) \oplus x_2) \oplus \cdots\bigr) \oplus x_l\bigr),$$

where |x_1| = ··· = |x_l| = |k| = n. (We leave CBC_k undefined on the empty string.) Note that CBC is exactly basic CBC-MAC, although here we consider inputs of different lengths.
A set of strings P ⊂ ({0, 1}n)∗ is prefix-free if it does not contain the empty string, and no string X ∈ P is a prefix of any other string X′ ∈ P. We show:
THEOREM 4.13 If F is a pseudorandom function, then CBC is a pseudorandom function as long as the set of inputs on which it is queried is prefix-free. Formally, for all probabilistic polynomial-time distinguishers D that query their oracle on a prefix-free set of inputs, there is a negligible function negl such that

$$\Bigl|\Pr[D^{\mathsf{CBC}_k(\cdot)}(1^n) = 1] - \Pr[D^{f(\cdot)}(1^n) = 1]\Bigr| \le \mathsf{negl}(n),$$

where k is chosen uniformly from {0,1}^n and f is chosen uniformly from the set of functions mapping ({0,1}^n)^* to {0,1}^n (i.e., the value of f at each input is uniform and independent of the values of f at all other inputs).
Thus, we can convert a pseudorandom function F for fixed-length inputs into a pseudorandom function CBC for arbitrary-length inputs (subject to a constraint on which inputs can be queried)! To use this for message authentication, we adapt the idea of Construction 4.5 as follows: to authenticate a message m, first apply some encoding function encode to obtain a (nonempty) string encode(m) ∈ ({0,1}^n)^*; then output the tag CBC_k(encode(m)). For this to be secure (cf. the proof of Theorem 4.6), the encoding needs to be prefix-free, namely, to have the property that for any distinct (legal) messages m_1, m_2, the string encode(m_1) is not a prefix of encode(m_2). This implies that for any set of (legal) messages {m_1, . . .}, the set of encoded messages {encode(m_1), . . .} is prefix-free.
We now examine two concrete applications of this idea:
• Fix l, and let the set of legal messages be {0,1}^{l(n)·n}. Then we can take the trivial encoding encode(m) = m, which is prefix-free since one string cannot be a prefix of a different string of the same length. This is exactly basic CBC-MAC, and what we have said above implies that basic CBC-MAC is secure for messages of any fixed length (cf. Theorem 4.12).
• One way of handling arbitrary-length (nonempty) messages (technically, messages of length less than 2^n) is to encode a string m ∈ {0,1}^* by prepending its length |m| (encoded as an n-bit string), and then appending as many 0s as needed to make the length of the resulting string a multiple of n. (This is essentially what is shown in Figure 4.2.) This encoding is prefix-free, and we therefore obtain a secure MAC for arbitrary-length messages.
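The length-prepending encoding and the prefix-free property can be checked mechanically on small examples. The sketch below assumes byte-oriented messages and a 128-bit block; the names `encode` and `is_prefix_free` are hypothetical helpers.

```python
from itertools import product

N = 16  # block length n in bytes (illustrative)

def encode(m: bytes) -> bytes:
    # Prepend |m| in bits as an n-bit string, then pad with 0s to a multiple of n.
    body = (8 * len(m)).to_bytes(N, "big") + m
    return body + bytes(-len(body) % N)

def is_prefix_free(strings) -> bool:
    # No empty string, and no string is a prefix of a different string.
    s = set(strings)
    if b"" in s:
        return False
    return not any(x != y and y.startswith(x) for x in s for y in s)

# All messages of 1 to 4 bytes over a two-letter alphabet, plus two longer ones.
messages = [bytes(p) for r in range(1, 5) for p in product(b"ab", repeat=r)]
messages += [b"a" * 17, b"a" * 33]

encoded_ok = is_prefix_free(encode(m) for m in messages)

# The trivial encoding is not prefix-free once lengths may differ.
trivial_ok = is_prefix_free([b"a" * N, b"a" * (2 * N)])
```

The encoding is prefix-free because the first block determines the length of the whole encoded string: if encode(m_1) were a prefix of encode(m_2), both would share the first block, hence the same length, hence be equal.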
The rest of this section is devoted to a proof of Theorem 4.13. In proving the theorem, we analyze CBC when it is “keyed” with a random function g rather than a random key k for some underlying pseudorandom function F. That is, we consider the keyed function CBCg defined as

$$\mathsf{CBC}_g(x_1, \ldots, x_l) \stackrel{\text{def}}{=} g\bigl(g\bigl(\cdots g(g(x_1) \oplus x_2) \oplus \cdots\bigr) \oplus x_l\bigr)$$

where, for security parameter n, the function g maps n-bit inputs to n-bit outputs, and |x1| = ··· = |xl| = n. Note that CBCg as defined here is not efficient (since the representation of g requires space exponential in n); nevertheless, it is still a well-defined, keyed function.
We show that if g is chosen uniformly from Func_n, then CBC_g is indistinguishable from a random function mapping ({0,1}^n)^* to n-bit strings, as long as a prefix-free set of inputs is queried. More precisely:
CLAIM 4.14 Fix any n ≥ 1. For all distinguishers D that query their oracle on a prefix-free set of q inputs, where the longest such input contains l blocks, it holds that:

$$\Bigl|\Pr[D^{\mathsf{CBC}_g(\cdot)}(1^n) = 1] - \Pr[D^{f(\cdot)}(1^n) = 1]\Bigr| \le \frac{q^2 l^2}{2^n},$$

where g is chosen uniformly from Func_n, and f is chosen uniformly from the set of functions mapping ({0,1}^n)^* to {0,1}^n.
(The claim is unconditional, and does not impose any constraints on the running time of D. Thus we may take D to be deterministic.) The above implies Theorem 4.13 using standard techniques that we have already seen. In particular, for any D running in polynomial time we must have q(n), l(n) = poly(n) and so q(n)^2 l(n)^2 · 2^{−n} is negligible.
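To get a feel for the bound of Claim 4.14, consider some purely illustrative parameters (they are assumptions chosen for the arithmetic, not values taken from the text): q = 2^{20} queries of at most l = 2^{10} blocks each. For a 128-bit block length,

$$q^2 l^2 \cdot 2^{-n} = (2^{20})^2 \cdot (2^{10})^2 \cdot 2^{-128} = 2^{40 + 20 - 128} = 2^{-68},$$

whereas for a 64-bit block length the same workload gives

$$q^2 l^2 \cdot 2^{-n} = 2^{40 + 20 - 64} = 2^{-4},$$

which is far too large to be meaningful. The bound degrades quadratically in both the number of queries and the message length.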
PROOF (of Claim 4.14) Fix some n ≥ 1. The proof proceeds in two steps: We first define a notion of smoothness and prove that CBC is smooth; we then show that smoothness implies the claim.
Let P = {X_1, . . . , X_q} be a prefix-free set of q inputs, where each X_i is in ({0,1}^n)^* and the longest string in P contains l blocks (i.e., each X_i ∈ P contains at most l blocks of length n). Note that for any t_1, . . . , t_q ∈ {0,1}^n it holds that Pr[∀i : f(X_i) = t_i] = 2^{−nq}, where the probability is over uniform choice of the function f from the set of functions mapping ({0,1}^n)^* to {0,1}^n. We say that CBC is (q, l, δ)-smooth if for every prefix-free set P = {X_1, . . . , X_q} as above and every t_1, . . . , t_q ∈ {0,1}^n, it holds that

$$\Pr[\forall i : \mathsf{CBC}_g(X_i) = t_i] \ge (1 - \delta) \cdot 2^{-nq},$$

where the probability is over uniform choice of g ∈ Func_n.
In words, CBC is smooth if for every fixed set of input/output pairs {(Xi, ti)}, where the {Xi} form a prefix-free set, the probability that CBCg(Xi) = ti for all i is δ-close to the probability that f(Xi) = ti for all i (where g is a random function from {0, 1}n to {0, 1}n, and f is a random function from ({0, 1}n)∗ to {0, 1}n).
CLAIM 4.15 CBC_g is (q, l, δ)-smooth, for δ = q^2 l^2 · 2^{−n}.
PROOF For any X ∈ ({0,1}^n)^*, with X = x_1, . . . and x_i ∈ {0,1}^n, let C_g(X) denote the set of inputs on which g is evaluated during the computation of CBC_g(X); i.e., if X ∈ ({0,1}^n)^m then

$$C_g(X) \stackrel{\text{def}}{=} \bigl(x_1,\ \mathsf{CBC}_g(x_1) \oplus x_2,\ \ldots,\ \mathsf{CBC}_g(x_1, \ldots, x_{m-1}) \oplus x_m\bigr).$$

For X ∈ ({0,1}^n)^m and X′ ∈ ({0,1}^n)^{m′}, with C_g(X) = (I_1, . . . , I_m) and C_g(X′) = (I′_1, . . . , I′_{m′}), say there is a non-trivial collision in X if I_i = I_j for some i ≠ j, and say that there is a non-trivial collision between X and X′ if I_i = I′_j but (x_1, . . . , x_i) ≠ (x′_1, . . . , x′_j) (in this latter case i may equal j). We say that there is a non-trivial collision in P if there is a non-trivial collision in some X ∈ P or between some pair of strings X, X′ ∈ P. Let Coll be the event that there is a non-trivial collision in P.
We prove the claim in two steps. First, we show that conditioned on there not being a non-trivial collision in P, the probability that CBC_g(X_i) = t_i for all i is exactly 2^{−nq}. Next, we show that the probability that there is a non-trivial collision in P is less than δ = q^2 l^2 · 2^{−n}.
Consider choosing a uniform g by choosing, one-by-one, uniform values for the outputs of g on different inputs. Determining whether there is a non-trivial collision between two strings X, X′ ∈ P can be done by first choosing the values of g(I_1) and g(I′_1) (if I′_1 = I_1, these values are the same), then choosing values for g(I_2) and g(I′_2) (note that I_2 = g(I_1) ⊕ x_2 and I′_2 = g(I′_1) ⊕ x′_2 are defined once g(I_1), g(I′_1) have been fixed), and continuing in this way until we choose values for g(I_{m−1}) and g(I′_{m′−1}). In particular, the values of g(I_m), g(I′_{m′}) need not be chosen in order to determine whether there is a non-trivial collision between X and X′. Continuing this line of reasoning, it is possible to determine whether Coll occurs by choosing the values of g on all but the final entries of each of C_g(X_1), . . . , C_g(X_q).
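Assume Coll has not occurred after fixing the values of g on various inputs as described above. Consider the final entries in each of C_g(X_1), . . . , C_g(X_q). These entries are all distinct (this is immediate from the fact that Coll has not occurred), and we claim that the value of g on each of those points has not yet been fixed. Indeed, the only way the value of g could already be fixed on any of those points is if the final entry I_m of some C_g(X) is equal to a non-final entry I′_j of some C_g(X′). But since Coll has not occurred, this can only happen if X ≠ X′ and (x_1, . . . , x_m) = (x′_1, . . . , x′_j), in which case X would be a proper prefix of X′. This is impossible because P is prefix-free.
Since g is a random function, it follows that CBC_g(X_1), . . . , CBC_g(X_q) are uniform and independent of each other, as well as of all the values of g that have already been fixed. Thus, for any t_1, . . . , t_q ∈ {0,1}^n we have

$$\Pr\bigl[\forall i : \mathsf{CBC}_g(X_i) = t_i \mid \overline{\mathsf{Coll}}\bigr] = 2^{-nq}. \tag{4.5}$$

We next upper-bound Pr[Coll]. For distinct X_i, X_j ∈ P, let Coll_{i,j} denote the event that there is a non-trivial collision in X_i, in X_j, or between X_i and X_j. Then Coll is the union of the events Coll_{i,j}, and a union bound gives

$$\Pr[\mathsf{Coll}] \le \sum_{i,j:\ i<j} \Pr[\mathsf{Coll}_{i,j}] \le \frac{q^2}{2} \cdot \max_{i<j} \Pr[\mathsf{Coll}_{i,j}]. \tag{4.6}$$

Fix distinct X = X_i and X′ = X_j in P. The bound derived below only increases with the lengths of X and X′, so we assume each contains exactly l blocks. Let X = (x_1, . . . , x_l) and X′ = (x′_1, . . . , x′_l), and let t be the largest integer such that (x_1, . . . , x_t) = (x′_1, . . . , x′_t). (Note that t < l, or else X = X′.) We assume t > 0; the analysis is easily adapted, with the same result, when t = 0. As before, let I_1, I_2, . . . (resp., I′_1, I′_2, . . .) denote the inputs to g during the computation of CBC_g(X) (resp., CBC_g(X′)); note that (I′_1, . . . , I′_t) = (I_1, . . . , I_t). Consider choosing g by choosing uniform values for its outputs one-by-one, in the following steps:
Steps 1 through t − 1 (if t > 1): In each step i, choose a uniform value for g(I_i), thus defining I_{i+1} and I′_{i+1} (which are equal).
Step t: Choose a uniform value for g(I_t), thus defining I_{t+1} and I′_{t+1}.
Steps t + 1 to l − 1 (if t < l − 1): Choose, in turn, uniform values for each of g(I_{t+1}), . . . , g(I_{l−1}), thus defining I_{t+2}, . . . , I_l.
Steps l to 2l − t − 2 (if t < l − 1): Choose, in turn, uniform values for each of g(I′_{t+1}), . . . , g(I′_{l−1}), thus defining I′_{t+2}, . . . , I′_l.
Let Coll(k) be the event that a non-trivial collision occurs by step k. Since Coll(k − 1) implies Coll(k), we have

$$\Pr[\mathsf{Coll}_{i,j}] \le \Pr[\mathsf{Coll}(1)] + \sum_{k=2}^{2l-t-2} \Pr\bigl[\mathsf{Coll}(k) \mid \overline{\mathsf{Coll}(k-1)}\bigr].$$

The term for a step k < t is at most k/2^n: if no non-trivial collision has occurred by step k − 1, then g(I_k) is chosen uniformly in step k, and a non-trivial collision occurs only if I_{k+1} = g(I_k) ⊕ x_{k+1} equals one of the (distinct) values I_1, . . . , I_k. Similarly, the term for step t is at most 2t/2^n, since here there are two new values I_{t+1} and I′_{t+1} to consider (these cannot equal each other, because x_{t+1} ≠ x′_{t+1}). Finally, the term for a step k > t is at most (k + 1)/2^n, since the newly defined input must avoid k + 1 distinct, previously defined inputs. Because every term is nonnegative, extending the last sum up to k = 2l − 2 gives

$$\Pr[\mathsf{Coll}_{i,j}] \le 2^{-n} \cdot \Bigl(\sum_{k=1}^{t-1} k + 2t + \sum_{k=t+1}^{2l-2} (k+1)\Bigr) = 2^{-n} \cdot \sum_{k=2}^{2l-1} k = 2^{-n} \cdot (2l+1) \cdot (l-1) < 2l^2 \cdot 2^{-n}.$$

From Equation (4.6) we get Pr[Coll] < q^2 l^2 · 2^{−n} = δ. Finally, using Equation (4.5) we see that

$$\Pr[\forall i : \mathsf{CBC}_g(X_i) = t_i] \ge \Pr\bigl[\forall i : \mathsf{CBC}_g(X_i) = t_i \mid \overline{\mathsf{Coll}}\bigr] \cdot \Pr[\overline{\mathsf{Coll}}] = 2^{-nq} \cdot \Pr[\overline{\mathsf{Coll}}] \ge (1 - \delta) \cdot 2^{-nq},$$

as claimed.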
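The arithmetic in the bound on Pr[Coll_{i,j}], namely that the sum of k for k = 1, . . . , t − 1, plus 2t, plus the sum of (k + 1) for k = t + 1, . . . , 2l − 2, equals (2l + 1)(l − 1) regardless of t, is easy to get wrong by one. The short check below (a sanity test only, with a hypothetical helper name) verifies the identity and the final inequality for small parameters.

```python
def collision_sum(l: int, t: int) -> int:
    # sum_{k=1}^{t-1} k  +  2t  +  sum_{k=t+1}^{2l-2} (k+1)
    return sum(range(1, t)) + 2 * t + sum(k + 1 for k in range(t + 1, 2 * l - 1))

# The closed form (2l+1)(l-1) and the bound 2l^2 hold for every 1 <= t < l.
identity_holds = all(
    collision_sum(l, t) == (2 * l + 1) * (l - 1) < 2 * l * l
    for l in range(2, 60)
    for t in range(1, l)
)
```

The value of the sum does not depend on t because the two terms for steps t and t + 1 that are "missing" from the plain sum of k up to 2l − 1 are replaced by 2t, a net change of exactly −1.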
We now show that smoothness implies the theorem. Assume without loss of generality that D always makes q (distinct) queries, each containing at most l blocks. D may choose its queries adaptively (i.e., depending on the answers to previous queries), but the set of D’s queries must be prefix-free.
For distinct X_1, . . . , X_q ∈ ({0,1}^n)^* and arbitrary t_1, . . . , t_q ∈ {0,1}^n, define α(X_1, . . . , X_q; t_1, . . . , t_q) to be 1 if and only if D outputs 1 when making queries X_1, . . . , X_q and getting responses t_1, . . . , t_q. (If, say, D does not make query X_1 as its first query, then α(X_1, . . . ; . . .) = 0.) Letting X⃗ = (X_1, . . . , X_q) and t⃗ = (t_1, . . . , t_q), we then have

$$
\begin{aligned}
\Pr[D^{\mathsf{CBC}_g(\cdot)}(1^n) = 1]
 &= \sum_{\vec{X}\ \text{prefix-free};\ \vec{t}} \alpha(\vec{X}, \vec{t}) \cdot \Pr[\forall i : \mathsf{CBC}_g(X_i) = t_i] \\
 &\ge \sum_{\vec{X}\ \text{prefix-free};\ \vec{t}} \alpha(\vec{X}, \vec{t}) \cdot (1 - \delta) \cdot \Pr[\forall i : f(X_i) = t_i] \\
 &= (1 - \delta) \cdot \Pr[D^{f(\cdot)}(1^n) = 1],
\end{aligned}
$$

where, above, g is chosen uniformly from Func_n, and f is chosen uniformly from the set of functions mapping ({0,1}^n)^* to {0,1}^n. This implies

$$\Pr[D^{f(\cdot)}(1^n) = 1] - \Pr[D^{\mathsf{CBC}_g(\cdot)}(1^n) = 1] \le \delta \cdot \Pr[D^{f(\cdot)}(1^n) = 1] \le \delta.$$

A symmetric argument for when D outputs 0 completes the proof.
4.5 Authenticated Encryption
In Chapter 3, we studied how it is possible to obtain secrecy in the private-key setting using encryption. In this chapter, we have shown how to ensure integrity using message authentication codes. One might naturally want to achieve both goals simultaneously, and this is the problem we turn to now.
It is best practice to always ensure secrecy and integrity by default in the private-key setting. Indeed, in many applications where secrecy is required it turns out that integrity is essential also. Moreover, a lack of integrity can sometimes lead to a breach of secrecy.
4.5.1 Definitions
We begin, as usual, by defining precisely what we wish to achieve. At an abstract level, our goal is to realize an “ideally secure” communication channel that provides both secrecy and integrity. Pursuing a definition of this sort is beyond the scope of this book. Instead, we provide a simpler set of definitions that treat secrecy and integrity separately. These definitions and our subsequent analysis suffice for understanding the key issues at hand. (We caution the reader, however, that—in contrast to encryption and message authentication codes—the field has not yet settled on standard terminology and definitions for authenticated encryption.)
Let Π = (Gen, Enc, Dec) be a private-key encryption scheme. As mentioned already, we define security by separately defining secrecy and integrity. The notion of secrecy we consider is one we have seen before: we require that Π be secure against chosen-ciphertext attacks, i.e., that it be CCA-secure. (Refer to Section 3.7 for a discussion and definition of CCA-security.) We are concerned about chosen-ciphertext attacks here because we are explicitly considering an active adversary who can modify the data sent from one honest party to the other. Our notion of integrity will be essentially that of existential unforgeability under an adaptive chosen-message attack. Since Π does not satisfy the syntax of a message authentication code, however, we introduce a definition specific to this case. Consider the following experiment defined for a private-key encryption scheme Π = (Gen, Enc, Dec), adversary A, and value n for the security parameter:
The unforgeable encryption experiment Enc-Forge_{A,Π}(n):
1. Run Gen(1^n) to obtain a key k.
2. The adversary A is given input 1^n and access to an encryption oracle Enc_k(·). The adversary outputs a ciphertext c.
3. Let m := Dec_k(c), and let Q denote the set of all queries that A asked its encryption oracle. The output of the experiment is 1 if and only if (1) m ≠ ⊥ and (2) m ∉ Q.
DEFINITION 4.16 A private-key encryption scheme Π is unforgeable if for all probabilistic polynomial-time adversaries A, there is a negligible function negl such that:
Pr[Enc-ForgeA,Π(n) = 1] ≤ negl(n).
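The experiment can be written as a small test harness. The sketch below is illustrative: the encryption scheme is a hypothetical counter-mode-style stream cipher built from HMAC-SHA256 (standing in for a typical CPA-secure scheme with no integrity protection), and the names `enc_forge` and `flip_adversary` are assumptions. Because every ciphertext of this scheme decrypts to some message, flipping one ciphertext bit already wins the experiment.

```python
import hashlib
import hmac
import os

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def stream(k, nonce, n):
    # Hypothetical PRF-based keystream: HMAC-SHA256 over nonce || counter.
    blocks = (hmac.new(k, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]

def enc(k, m):
    nonce = os.urandom(16)
    return nonce + xor(m, stream(k, nonce, len(m)))

def dec(k, c):
    if len(c) < 16:
        return None  # None plays the role of the error symbol
    return xor(c[16:], stream(k, c[:16], len(c) - 16))

def enc_forge(adversary, key_bytes=16):
    # The experiment Enc-Forge: returns 1 iff the adversary's ciphertext
    # decrypts to a message that was never queried to the encryption oracle.
    k = os.urandom(key_bytes)
    queries = []

    def oracle(m):
        queries.append(m)
        return enc(k, m)

    c = adversary(oracle)
    m = dec(k, c)
    return int(m is not None and m not in queries)

def flip_adversary(oracle):
    # Ask for one encryption, then flip the lowest bit of the last byte.
    c = bytearray(oracle(b"pay 100"))
    c[-1] ^= 1
    return bytes(c)

result = enc_forge(flip_adversary)
```

Here the modified ciphertext decrypts to the message "pay 101", which was never queried, so this scheme is not unforgeable even though nothing about its secrecy was attacked.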
Paralleling our discussion about verification queries following Definition 4.2, here one could also consider a stronger definition in which A is additionally given access to a decryption oracle. One can verify that the secure construction we present below also satisfies that stronger definition.
We now define a (secure) authenticated encryption scheme.
DEFINITION 4.17 A private-key encryption scheme is an authenticated encryption scheme if it is CCA-secure and unforgeable.
4.5.2 Generic Constructions
It may be tempting to think that any reasonable combination of a secure encryption scheme and a secure message authentication code should result in an authenticated encryption scheme. In this section we show that this is not the case. This demonstrates that even secure cryptographic tools can be combined in such a way that the result is insecure, and highlights once again the importance of definitions and proofs of security. On the positive side, we show how encryption and message authentication can be combined properly to achieve joint secrecy and integrity.
Throughout, let Π_E = (Enc, Dec) be a CPA-secure encryption scheme and let Π_M = (Mac, Vrfy) denote a message authentication code, where key generation in both schemes simply involves choosing a uniform n-bit key. There are three natural approaches to combining encryption and message authentication using independent keys⁴ k_E and k_M for Π_E and Π_M, respectively:
1. Encrypt-and-authenticate: In this method, encryption and message authentication are computed independently in parallel. That is, given a plaintext message m, the sender transmits the ciphertext ⟨c, t⟩ where:
c ← Enc_{k_E}(m) and t ← Mac_{k_M}(m).
The receiver decrypts c to recover m; assuming no error occurred, it then verifies the tag t. If Vrfy_{k_M}(m, t) = 1, the receiver outputs m; otherwise, it outputs an error.
⁴ Independent cryptographic keys should always be used when different schemes are combined. We return to this point at the end of this section.
2. Authenticate-then-encrypt: Here a MAC tag t is first computed, and then the message and tag are encrypted together. That is, given a message m, the sender transmits the ciphertext c computed as:
t ← Mac_{k_M}(m) and c ← Enc_{k_E}(m∥t).
The receiver decrypts c to obtain m∥t; assuming no error occurs, it then verifies the tag t. As before, if Vrfy_{k_M}(m, t) = 1 the receiver outputs m; otherwise, it outputs an error.
3. Encrypt-then-authenticate: In this case, the message m is first encrypted and then a MAC tag is computed over the result. That is, the ciphertext is the pair ⟨c, t⟩ where:
c ← Enc_{k_E}(m) and t ← Mac_{k_M}(c).
(See also Construction 4.18.) If Vrfy_{k_M}(c, t) = 1, then the receiver decrypts c and outputs the result; otherwise, it outputs an error.
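The three compositions differ only in what is tagged and in the order of operations at the receiver, which the following sketch makes explicit. It is illustrative only: Enc is a hypothetical stream-cipher-style scheme built from HMAC-SHA256, Mac is HMAC-SHA256, `None` plays the role of the error output, and all function names are assumptions.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 output length in bytes

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def stream(k, nonce, n):
    # Hypothetical PRF-based keystream: HMAC-SHA256 over nonce || counter.
    blocks = (hmac.new(k, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]

def enc(k, m):
    nonce = os.urandom(16)
    return nonce + xor(m, stream(k, nonce, len(m)))

def dec(k, c):
    return None if len(c) < 16 else xor(c[16:], stream(k, c[:16], len(c) - 16))

def mac(k, x):
    return hmac.new(k, x, hashlib.sha256).digest()

def vrfy(k, x, t):
    return hmac.compare_digest(mac(k, x), t)

# 1. Encrypt-and-authenticate: c <- Enc(m), t <- Mac(m).
def e_and_a_enc(kE, kM, m):
    return enc(kE, m), mac(kM, m)

def e_and_a_dec(kE, kM, ct):
    c, t = ct
    m = dec(kE, c)
    return m if m is not None and vrfy(kM, m, t) else None

# 2. Authenticate-then-encrypt: t <- Mac(m), c <- Enc(m || t).
def a_then_e_enc(kE, kM, m):
    return enc(kE, m + mac(kM, m))

def a_then_e_dec(kE, kM, c):
    p = dec(kE, c)
    if p is None or len(p) < TAG_LEN:
        return None
    m, t = p[:-TAG_LEN], p[-TAG_LEN:]
    return m if vrfy(kM, m, t) else None

# 3. Encrypt-then-authenticate: c <- Enc(m), t <- Mac(c).
def e_then_a_enc(kE, kM, m):
    c = enc(kE, m)
    return c, mac(kM, c)

def e_then_a_dec(kE, kM, ct):
    c, t = ct
    return dec(kE, c) if vrfy(kM, c, t) else None  # verify BEFORE decrypting

kE, kM = os.urandom(16), os.urandom(16)  # independent keys
msg = b"transfer 10 coins"
```

Only in the third variant does the receiver check integrity before the decryption algorithm touches the ciphertext at all.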
We analyze each of the above approaches when they are instantiated with “generic” secure components, i.e., an arbitrary CPA-secure encryption scheme and an arbitrary (strongly) secure MAC. We want an approach that provides joint secrecy and integrity when using any (secure) components, and we will therefore reject as “unsafe” any approach for which there exists even a single counterexample of a secure encryption scheme/MAC for which the combination is insecure. This “all-or-nothing” approach reduces the likelihood of implementation flaws. Specifically, an authenticated encryption scheme might be implemented by making calls to an “encryption subroutine” and a “message authentication subroutine,” and the implementation of those subroutines may be changed at some later point in time. (This commonly occurs when cryptographic libraries are updated, or when standards are modified.) Implementing an approach whose security depends on how its components are implemented (rather than on the security they provide) is therefore dangerous.
We stress that if an approach is rejected this does not mean that it is insecure for all possible instantiations of the components; it does, however, mean that any instantiation of the approach must be analyzed and proven secure before it is used.
Encrypt-and-authenticate. Recall that in this approach encryption and message authentication are carried out independently. Given a message m, the transmitted value is ⟨c, t⟩ where
c ← Enc_{k_E}(m) and t ← Mac_{k_M}(m).
This approach may not achieve even the most basic level of secrecy. To see this, note that a secure MAC does not guarantee any secrecy and so it is possible for the tag Mac_{k_M}(m) to leak information about m to an eavesdropper.
(As a trivial example, consider a secure MAC where the first bit of the tag is always equal to the first bit of the message.) So the encrypt-and-authenticate approach may yield a scheme that does not even have indistinguishable encryptions in the presence of an eavesdropper.
In fact, the encrypt-and-authenticate approach is likely to be insecure against chosen-plaintext attacks even when instantiated with standard components (unlike the contrived counterexample in the previous paragraph). In particular, if a deterministic MAC like CBC-MAC is used, then the tag computed on a message (for some fixed key k_M) is the same every time. This allows an eavesdropper to identify when the same message is sent twice, and so the scheme is not CPA-secure. Most MACs used in practice are deterministic, so this represents a real concern.
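The leak caused by a deterministic MAC is easy to observe. In the sketch below (illustrative only: a hypothetical randomized stream-cipher-style Enc built from HMAC-SHA256, with HMAC-SHA256 itself as the deterministic MAC), the same message is sent twice using encrypt-and-authenticate. The ciphertext parts differ, but the tags are identical, so an eavesdropper learns that the message was repeated.

```python
import hashlib
import hmac
import os

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def enc(k, m):
    # Hypothetical randomized encryption for short messages (at most 32 bytes):
    # a fresh nonce selects a fresh pad, so repeated plaintexts look unrelated.
    assert len(m) <= 32
    nonce = os.urandom(16)
    pad = hmac.new(k, nonce, hashlib.sha256).digest()
    return nonce + xor(m, pad)

def mac(k, m):
    # Deterministic MAC: the same (key, message) pair always gives the same tag.
    return hmac.new(k, m, hashlib.sha256).digest()

def encrypt_and_authenticate(kE, kM, m):
    return enc(kE, m), mac(kM, m)

kE, kM = os.urandom(16), os.urandom(16)
c1, t1 = encrypt_and_authenticate(kE, kM, b"attack")
c2, t2 = encrypt_and_authenticate(kE, kM, b"attack")
c3, t3 = encrypt_and_authenticate(kE, kM, b"retreat")

same_message_detected = (t1 == t2)   # tags repeat although c1 and c2 differ
different_message_seen = (t1 != t3)
```

The eavesdropper needs no key: comparing the tag components of two transmissions is enough.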
Authenticate-then-encrypt. Here, a MAC tag t ← Mac_{k_M}(m) is first computed; then m∥t is encrypted and the resulting value Enc_{k_E}(m∥t) is transmitted. We show that this combination also does not necessarily yield an authenticated encryption scheme.
Actually, we have already encountered a CPA-secure encryption scheme for which this approach is insecure: the CBC-mode-with-padding scheme discussed in Section 3.7.2. (We assume in what follows that the reader is familiar with that section.) Recall that this scheme works by first padding the plaintext (which in our case will be m∥t) in a specific way so the result is a multiple of the block length, and then encrypting the result using CBC mode. During decryption, if an error in the padding is detected after performing the CBC-mode decryption, then a “bad padding” error is returned. With regard to authenticate-then-encrypt, this means there are now two sources of potential decryption failure: the padding may be incorrect, or the MAC tag may not verify. Schematically, the decryption algorithm Dec′ in the combined scheme works as follows:
Dec′_{k_E,k_M}(c):
1. Compute m̃ := Dec_{k_E}(c). If an error in the padding is detected, return “bad padding” and stop.
2. Parse m̃ as m∥t. If Vrfy_{k_M}(m, t) = 1 return m; else, output “authentication failure.”
Assuming the attacker can distinguish between the two error messages, the attacker can apply the same chosen-ciphertext attack described in Section 3.7.2 to the above scheme to recover the entire original plaintext from a given ciphertext. (This is due to the fact that the padding-oracle attack shown in Section 3.7.2 relies only on the ability to learn whether or not there was a padding error, something that is revealed by this scheme.) This type of attack has been carried out successfully in the real world in various settings, e.g., in configurations of IPsec that use authenticate-then-encrypt.
One way to fix the above scheme would be to ensure that only a single error message is returned, regardless of the source of decryption failure. This is an unsatisfying solution for several reasons: (1) there may be legitimate reasons (e.g., usability, debugging) to have multiple error messages; (2) forcing the error messages to be the same means that the combination is no longer truly generic, i.e., it requires the implementer of the authenticate-then-encrypt approach to be aware of what error messages are returned by the underlying CPA-secure encryption scheme; (3) most of all, it is extraordinarily hard to ensure that the different errors cannot be distinguished since, e.g., even a difference in the time to return each of these errors may be used by an adversary to distinguish between them (cf. our earlier discussion of timing attacks at the end of Section 4.2). Some versions of SSL tried using only a single error message in conjunction with an authenticate-then-encrypt approach, but a padding-oracle attack was still successfully carried out using timing information of this sort. We conclude that authenticate-then-encrypt does not provide authenticated encryption in general, and should not be used.
Encrypt-then-authenticate. In this approach, the message is first encrypted and then a MAC is computed over the result. That is, the ciphertext is the pair ⟨c, t⟩ where
    c ← Enc_{k_E}(m) and t ← Mac_{k_M}(c).
Decryption of ⟨c, t⟩ is done by outputting ⊥ if Vrfy_{k_M}(c, t) ≠ 1, and otherwise outputting Dec_{k_E}(c). See Construction 4.18 for a formal description.
CONSTRUCTION 4.18
Let Π_E = (Enc, Dec) be a private-key encryption scheme and let Π_M = (Mac, Vrfy) be a message authentication code, where in each case key generation is done by simply choosing a uniform n-bit key. Define a private-key encryption scheme (Gen′, Enc′, Dec′) as follows:
• Gen′: on input 1^n, choose independent, uniform k_E, k_M ∈ {0,1}^n and output the key (k_E, k_M).
• Enc′: on input a key (k_E, k_M) and a plaintext message m, compute c ← Enc_{k_E}(m) and t ← Mac_{k_M}(c). Output the ciphertext ⟨c, t⟩.
• Dec′: on input a key (k_E, k_M) and a ciphertext ⟨c, t⟩, first check whether Vrfy_{k_M}(c, t) ≟ 1. If yes, then output Dec_{k_E}(c); if no, then output ⊥.
A generic construction of an authenticated encryption scheme.
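To make the data flow of Construction 4.18 concrete, here is a minimal Python sketch. It is an illustration under assumptions that are not part of the text: HMAC-SHA256 plays the role of the MAC Π_M, and the encryption scheme Π_E is a toy randomized stream cipher whose keystream is derived from HMAC-SHA256 used as a pseudorandom function. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def gen():
    """Gen': choose independent, uniform keys k_E and k_M."""
    return secrets.token_bytes(32), secrets.token_bytes(32)


def _keystream(k_e: bytes, nonce: bytes, length: int) -> bytes:
    # Assumed PRF-based keystream: HMAC-SHA256(k_E, nonce || counter)
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(k_e, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def enc(k_e: bytes, m: bytes) -> bytes:
    """Toy stand-in for Enc of the underlying scheme (fresh nonce per call)."""
    nonce = secrets.token_bytes(16)
    return nonce + bytes(a ^ b for a, b in zip(m, _keystream(k_e, nonce, len(m))))


def dec(k_e: bytes, c: bytes) -> bytes:
    """Toy stand-in for Dec of the underlying scheme."""
    nonce, body = c[:16], c[16:]
    return bytes(a ^ b for a, b in zip(body, _keystream(k_e, nonce, len(body))))


def enc_prime(key, m: bytes):
    """Enc': encrypt first, then compute the tag over the ciphertext c."""
    k_e, k_m = key
    c = enc(k_e, m)
    t = hmac.new(k_m, c, hashlib.sha256).digest()
    return c, t


def dec_prime(key, ciphertext):
    """Dec': verify the tag before touching the ciphertext; None plays the role of ⊥."""
    k_e, k_m = key
    c, t = ciphertext
    expected = hmac.new(k_m, c, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, t):
        return None
    return dec(k_e, c)
```

Note that `dec_prime` never calls `dec` on a ciphertext whose tag fails to verify, mirroring the order of operations in Dec′.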
This approach is sound, as long as the MAC is strongly secure, as in Definition 4.3. As intuition for the security of this approach, say a ciphertext ⟨c, t⟩ is valid if t is a valid MAC tag on c. Strong security of the MAC ensures that an adversary will be unable to generate any valid ciphertext that it did not receive from its encryption oracle. This immediately implies that Construction 4.18 is unforgeable. As for CCA-security, the MAC computed over the ciphertext has the effect of rendering the decryption oracle useless since for every ciphertext ⟨c, t⟩ the adversary submits to its decryption oracle, the adversary either already knows the decryption (if it received ⟨c, t⟩ from its own encryption oracle) or else can expect the result to be an error (since the adversary cannot generate any new, valid ciphertexts). This means that CCA-security of the combined scheme reduces to the CPA-security of Π_E. Observe also that the MAC is verified before decryption takes place; thus, MAC verification cannot leak anything about the plaintext (in contrast to the padding-oracle attack we saw for the authenticate-then-encrypt approach). We now formalize the above arguments.
THEOREM 4.19 Let Π_E be a CPA-secure private-key encryption scheme, and let Π_M be a strongly secure message authentication code. Then Construction 4.18 is an authenticated encryption scheme.
PROOF Let Π′ denote the scheme resulting from Construction 4.18. We need to show that Π′ is unforgeable, and that it is CCA-secure. Following the intuition given above, say a ciphertext ⟨c, t⟩ is valid (with respect to some fixed secret key (k_E, k_M)) if Vrfy_{k_M}(c, t) = 1. We show that strong security of Π_M implies that (except with negligible probability) any “new” ciphertexts the adversary submits to the decryption oracle will be invalid. As discussed already, this immediately implies unforgeability. (In fact, it is stronger than unforgeability.) This fact also renders the decryption oracle useless, and means that CCA-security of Π′ = (Gen′, Enc′, Dec′) reduces to the CPA-security of Π_E.
In more detail, let A be a probabilistic polynomial-time adversary attacking Construction 4.18 in a chosen-ciphertext attack (cf. Definition 3.33). Say a ciphertext ⟨c, t⟩ is new if A did not receive ⟨c, t⟩ from its encryption oracle or as the challenge ciphertext. Let ValidQuery be the event that A submits a new ciphertext ⟨c, t⟩ to its decryption oracle which is valid, i.e., for which Vrfy_{k_M}(c, t) = 1. We prove:
CLAIM 4.20 Pr[ValidQuery] is negligible.
PROOF Intuitively, this is due to the fact that if ValidQuery occurs then the adversary has forged a new, valid pair (c, t) in the Mac-sforge experiment. Formally, let q(·) be a polynomial upper bound on the number of decryption-oracle queries made by A, and consider the following adversary A_M attacking the message authentication code Π_M (i.e., running in experiment Mac-sforge_{A_M,Π_M}(n)):
Adversary A_M:
A_M is given input 1^n and has access to a MAC oracle Mac_{k_M}(·).
1. Choose uniform k_E ∈ {0,1}^n and i ∈ {1, ..., q(n)}.
2. Run A on input 1^n. When A makes an encryption-oracle query for the message m, answer it as follows:
(i) Compute c ← Enc_{k_E}(m).
(ii) Query c to the MAC oracle and receive t in response.
Return ⟨c, t⟩ to A.
The challenge ciphertext is prepared in the exact same way (with a uniform bit b ∈ {0, 1} chosen to select the message m_b that gets encrypted).
When A makes a decryption-oracle query for the ciphertext ⟨c, t⟩, answer it as follows: If this is the ith decryption-oracle query, output (c, t). Otherwise:
(i) If ⟨c, t⟩ was a response to a previous encryption-oracle query for a message m, return m.
(ii) Otherwise, return ⊥.
In essence, A_M is “guessing” that the ith decryption-oracle query of A will be the first new, valid query A makes. In that case, A_M outputs a valid forgery on a message c that it had never previously submitted to its own MAC oracle.
Clearly A_M runs in probabilistic polynomial time. We now analyze the probability that A_M produces a good forgery. The key point is that the view of A when run as a subroutine by A_M is distributed identically to the view of A in experiment PrivK^{cca}_{A,Π′}(n) until event ValidQuery occurs. To see this, note that the encryption-oracle queries of A (as well as computation of the challenge ciphertext) are simulated perfectly by A_M. As for the decryption-oracle queries of A, until ValidQuery occurs these are all simulated properly. In case (i) this is obvious. As for case (ii), if the ciphertext ⟨c, t⟩ submitted to the decryption oracle is new, then as long as ValidQuery has not yet occurred the correct answer to the decryption-oracle query is indeed ⊥. (Note that case (i) is exactly the case that ⟨c, t⟩ is not new, and case (ii) is exactly the case that ⟨c, t⟩ is new.) Recall that A is disallowed from submitting the challenge ciphertext to the decryption oracle.
Because the view of A when run as a subroutine by A_M is distributed identically to the view of A in experiment PrivK^{cca}_{A,Π′}(n) until event ValidQuery occurs, the probability of event ValidQuery in experiment Mac-sforge_{A_M,Π_M}(n) is the same as the probability of that event in experiment PrivK^{cca}_{A,Π′}(n).
If A_M correctly guesses the first index i when ValidQuery occurs, then A_M outputs (c, t) for which Vrfy_{k_M}(c, t) = 1 (since ⟨c, t⟩ is valid) and for which it was never given tag t in response to the query Mac_{k_M}(c) (since ⟨c, t⟩ is new). In this case, then, A_M succeeds in experiment Mac-sforge_{A_M,Π_M}(n). The probability that A_M guesses i correctly is 1/q(n). Therefore
    Pr[Mac-sforge_{A_M,Π_M}(n) = 1] ≥ Pr[ValidQuery]/q(n).
Since Π_M is a strongly secure MAC and q is polynomial, we conclude that Pr[ValidQuery] is negligible.
We use Claim 4.20 to prove security of Π′. The easier case is to prove that Π′ is unforgeable. This follows immediately from the claim, and so we just provide informal reasoning rather than a formal proof. Observe first that the adversary A′ in the unforgeable encryption experiment is a restricted version of the adversary in the chosen-ciphertext experiment (in the former, the adversary only has access to an encryption oracle). When A′ outputs a ciphertext ⟨c, t⟩ at the end of its experiment, it “succeeds” only if ⟨c, t⟩ is valid and new. But the previous claim shows precisely that the probability of such an event is negligible.
It is slightly more involved to prove that Π′ is CCA-secure. Let A again be a probabilistic polynomial-time adversary attacking Π′ in a chosen-ciphertext attack. We have
    Pr[PrivK^{cca}_{A,Π′}(n) = 1] ≤ Pr[ValidQuery] + Pr[PrivK^{cca}_{A,Π′}(n) = 1 ∧ \overline{ValidQuery}].    (4.8)
We have already shown that Pr[ValidQuery] is negligible. The following claim thus completes the proof of the theorem.
CLAIM 4.21 There exists a negligible function negl such that
    Pr[PrivK^{cca}_{A,Π′}(n) = 1 ∧ \overline{ValidQuery}] ≤ 1/2 + negl(n).
To prove this claim, we rely on CPA-security of Π_E. Consider the following adversary A_E attacking Π_E in a chosen-plaintext attack:
Adversary A_E:
A_E is given input 1^n and has access to Enc_{k_E}(·).
1. Choose uniform k_M ∈ {0,1}^n.
2. Run A on input 1^n. When A makes an encryption-oracle query for the message m, answer it as follows:
(i) Query m to Enc_{k_E}(·) and receive c in response.
(ii) Compute t ← Mac_{k_M}(c) and return ⟨c, t⟩ to A.
When A makes a decryption-oracle query for the ciphertext ⟨c, t⟩, answer it as follows:
• If ⟨c, t⟩ was a response to a previous encryption-oracle query for a message m, return m. Otherwise, return ⊥.
3. When A outputs messages (m_0, m_1), output these same messages and receive a challenge ciphertext c in response. Compute t ← Mac_{k_M}(c), and return ⟨c, t⟩ as the challenge ciphertext for A. Continue answering A’s oracle queries as above.
4. Output the same bit b′ that is output by A.
Notice that A_E does not need a decryption oracle because it simply assumes that any decryption query by A that was not the result of a previous encryption-oracle query is invalid.
Clearly, A_E runs in probabilistic polynomial time. Furthermore, the view of A when run as a subroutine by A_E is distributed identically to the view of A in experiment PrivK^{cca}_{A,Π′}(n) as long as event ValidQuery never occurs. Therefore, the probability that A_E succeeds when ValidQuery does not occur is the same as the probability that A succeeds when ValidQuery does not occur; i.e.,
    Pr[PrivK^{cpa}_{A_E,Π_E}(n) = 1 ∧ \overline{ValidQuery}] = Pr[PrivK^{cca}_{A,Π′}(n) = 1 ∧ \overline{ValidQuery}],
implying that
    Pr[PrivK^{cpa}_{A_E,Π_E}(n) = 1] ≥ Pr[PrivK^{cpa}_{A_E,Π_E}(n) = 1 ∧ \overline{ValidQuery}]
                                     = Pr[PrivK^{cca}_{A,Π′}(n) = 1 ∧ \overline{ValidQuery}].
Since Π_E is CPA-secure, there exists a negligible function negl such that Pr[PrivK^{cpa}_{A_E,Π_E}(n) = 1] ≤ 1/2 + negl(n). This proves the claim.
The need for independent keys. We conclude this section by stressing a basic principle of cryptography: different instances of cryptographic primitives should always use independent keys. To illustrate this here, consider what can happen to the encrypt-then-authenticate methodology when the same key k is used for both encryption and authentication. Let F be a strong pseudorandom permutation. It follows that F^{-1} is a strong pseudorandom permutation also. Define Enc_k(m) = F_k(m∥r) for m ∈ {0,1}^{n/2} and a uniform r ∈ {0,1}^{n/2}, and define Mac_k(c) = F_k^{-1}(c). It can be shown that this encryption scheme is CPA-secure (in fact, it is even CCA-secure; see Exercise 4.25), and we know that the given message authentication code is a secure MAC. However, the encrypt-then-authenticate combination using the same key k as applied to the message m yields:
    Enc_k(m), Mac_k(Enc_k(m)) = F_k(m∥r), F_k^{-1}(F_k(m∥r)) = F_k(m∥r), m∥r,
and the message m is revealed in the clear! This does not in any way contradict Theorem 4.19, since Construction 4.18 expressly requires that k_M, k_E are chosen (uniformly and) independently. We encourage the reader to examine where this independence is used in the proof of Theorem 4.19.
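The failure is easy to reproduce. In the Python sketch below, the permutation F is a toy four-round Feistel network whose round function is HMAC-SHA256. This is an assumption made purely for illustration (the text only requires some strong pseudorandom permutation), and the names `F`, `F_inv`, `enc`, and `mac` are hypothetical. The block length is 32 bytes, so messages and the random string r are 16 bytes each.

```python
import hashlib
import hmac
import secrets

HALF = 16  # bytes per Feistel half; the block length is 2 * HALF


def _round(k: bytes, i: int, x: bytes) -> bytes:
    # Round function number i, derived from HMAC-SHA256 (an assumption)
    return hmac.new(k, bytes([i]) + x, hashlib.sha256).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def F(k: bytes, block: bytes) -> bytes:
    """Toy keyed permutation: four Feistel rounds."""
    L, R = block[:HALF], block[HALF:]
    for i in range(4):
        L, R = R, _xor(L, _round(k, i, R))
    return L + R


def F_inv(k: bytes, block: bytes) -> bytes:
    """Inverse permutation: undo the rounds in reverse order."""
    L, R = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        L, R = _xor(R, _round(k, i, L)), L
    return L + R


def enc(k: bytes, m: bytes) -> bytes:
    """Enc_k(m) = F_k(m || r) for a uniform r."""
    r = secrets.token_bytes(HALF)
    return F(k, m + r)


def mac(k: bytes, c: bytes) -> bytes:
    """Mac_k(c) = F_k^{-1}(c)."""
    return F_inv(k, c)


k = secrets.token_bytes(32)      # the SAME key is used for both primitives
m = b"sixteen byte msg"
c = enc(k, m)
t = mac(k, c)                    # equals m || r
leaked = t[:HALF]                # the plaintext, readable by any eavesdropper
```

With independent keys for `enc` and `mac` the tag would be the inverse permutation under an unrelated key, and the first half of `t` would no longer equal `m`.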
4.5.3 Secure Communication Sessions
We briefly describe the application of authenticated encryption to the setting of two parties who wish to communicate “securely,” that is, with joint secrecy and integrity, over the course of a communication session. (For the purposes of this section, a communication session is simply a period of time during which the communicating parties maintain state.) In our treatment here we are deliberately informal; a formal definition is quite involved, and this topic arguably lies more in the area of network security than cryptography.
Let Π = (Enc, Dec) be an authenticated encryption scheme. Consider two parties A and B who share a key k and wish to use this key to secure their communication over the course of a session. The obvious thing to do here is to use Π: Whenever, say, A wants to transmit a message m to B, it computes c ← Enc_k(m) and sends c to B; in turn, B decrypts c to recover the result (ignoring the result if decryption returns ⊥). Likewise, the same procedure is followed when B wants to send a message to A. This simple approach, however, does not suffice, as there are various potential attacks:
Re-ordering attack An attacker can swap the order of messages. For example, if A transmits c_1 (an encryption of m_1) and subsequently transmits c_2 (an encryption of m_2), an attacker who has some control over the network can deliver c_2 before c_1 and thus cause B to output the messages in the wrong order. This causes a mismatch between the two parties’ views of their communication session.
Replay attack An attacker can replay a (valid) ciphertext c sent previously by one of the parties. Again, this causes a mismatch between what is sent by one party and received by the other.
Reflection attack An attacker can take a ciphertext c sent from A to B and send it back to A. This again can cause a mismatch between the two parties’ transcripts of their communication session: A may output a message m, even though B never sent such a message.
Fortunately, the above attacks are easy to prevent using counters to address the first two and a directionality bit to prevent the third.⁵ We describe these in tandem. Each party maintains two counters ctr_{A,B} and ctr_{B,A} keeping track of the number of messages sent from A to B (resp., B to A) during the session. These counters are initialized to 0 and incremented each time a party sends or receives a (valid) message. The parties also agree on a bit b_{A,B}, and define b_{B,A} to be its complement. (One way to do this is to set b_{A,B} = 0 iff the identity of A is lexicographically smaller than the identity of B.)
⁵ In practice, the issue of directionality is often solved by simply having separate keys for each direction (i.e., the parties use a key k_A for messages sent from A to B, and a different key k_B for messages sent from B to A).
When A wants to transmit a message m to B, she computes the ciphertext c ← Enc_k(b_{A,B}∥ctr_{A,B}∥m) and sends c; she then increments ctr_{A,B}. Upon receiving c, party B decrypts; if the result is ⊥, he immediately rejects. Otherwise, he parses the decrypted message as b∥ctr∥m. If b = b_{A,B} and ctr = ctr_{A,B}, then B outputs m and increments ctr_{A,B}; otherwise, B rejects. The above steps, mutatis mutandis, are applied when B sends a message to A.
We remark that since the parties are anyway maintaining state (namely, the counters ctr_{A,B} and ctr_{B,A}), the parties could easily use a stateful authenticated encryption scheme Π.
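The counter and directionality mechanism can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a definitive protocol: the authenticated encryption scheme is a toy encrypt-then-authenticate construction built from HMAC-SHA256, the counter is encoded in 8 bytes and the direction bit in one byte, and the class name `Party` and its methods are hypothetical.

```python
import hashlib
import hmac
import secrets


def _stream(k: bytes, nonce: bytes, n: int) -> bytes:
    out, i = b"", 0
    while len(out) < n:
        out += hmac.new(k, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest()
        i += 1
    return out[:n]


def ae_enc(key, m: bytes) -> bytes:
    """Toy authenticated encryption (encrypt-then-authenticate, assumed)."""
    k_e, k_m = key
    nonce = secrets.token_bytes(16)
    c = nonce + bytes(a ^ b for a, b in zip(m, _stream(k_e, nonce, len(m))))
    return c + hmac.new(k_m, c, hashlib.sha256).digest()


def ae_dec(key, ct: bytes):
    """Returns the plaintext, or None (playing the role of ⊥)."""
    k_e, k_m = key
    if len(ct) < 48:
        return None
    c, t = ct[:-32], ct[-32:]
    if not hmac.compare_digest(hmac.new(k_m, c, hashlib.sha256).digest(), t):
        return None
    return bytes(a ^ b for a, b in zip(c[16:], _stream(k_e, c[:16], len(c) - 16)))


class Party:
    def __init__(self, key, send_bit: int):
        self.key = key
        self.send_bit = send_bit      # direction bit for messages this party sends
        self.recv_bit = 1 - send_bit  # complement: bit expected on incoming messages
        self.ctr_send = 0             # number of messages sent so far
        self.ctr_recv = 0             # number of valid messages received so far

    def send(self, m: bytes) -> bytes:
        header = bytes([self.send_bit]) + self.ctr_send.to_bytes(8, "big")
        self.ctr_send += 1
        return ae_enc(self.key, header + m)

    def receive(self, ct: bytes):
        pt = ae_dec(self.key, ct)
        if pt is None or len(pt) < 9:
            return None               # reject: decryption failed
        b, ctr, m = pt[0], int.from_bytes(pt[1:9], "big"), pt[9:]
        if b != self.recv_bit or ctr != self.ctr_recv:
            return None               # reject: wrong direction or wrong position
        self.ctr_recv += 1
        return m
```

In this sketch a re-ordered or replayed ciphertext fails the counter check, and a reflected ciphertext fails the direction-bit check, so the receiver rejects in all three cases described above.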
4.5.4 CCA-Secure Encryption
It follows directly from the definition that any authenticated encryption scheme is also secure against chosen-ciphertext attacks. Can there be CCA-secure private-key encryption schemes that are not unforgeable? Indeed, there are; see Exercise 4.25.
One can imagine applications where CCA-security is needed but authenticated encryption is not. One example might be when private-key encryption is used for key transport. As a concrete example, say a server gives a tamper-proof hardware token to a user, where embedded in the token is a long-term key k. The server can upload a fresh, short-term key k′ to this token by giving the user Enc_k(k′); the user is supposed to give this ciphertext to the token, which will decrypt it and use k′ for the next time period. A chosen-ciphertext attack in this setting could allow the user to learn k′, something the user is not supposed to be able to do. (Note that here a padding-oracle attack, which only relies on the user’s ability to determine whether a decryption failure occurs, could potentially be carried out rather easily.) On the other hand, not much harm is done if the user can generate a “valid” ciphertext that causes the token to use an arbitrary (unrelated) key k′′ for the next time period. (Of course, this depends on what the token does with this short-term key.)
Notwithstanding the above, for private-key encryption most “natural” constructions of CCA-secure schemes that we know anyway satisfy the stronger definition of authenticated encryption. Put differently, there is no real reason to ever use a CCA-secure scheme that is not an authenticated encryption scheme, simply because we don’t really have any constructions satisfying the former that are more efficient than constructions achieving the latter.
From a conceptual point of view, however, it is important to keep the notions of CCA-security and authenticated encryption distinct. With regard to CCA-security we are not interested in message integrity per se; rather, we wish to ensure privacy even against an active adversary who can interfere with the communication as it goes from sender to receiver. In contrast, with regard to authenticated encryption we are interested in the twin goals of secrecy and integrity. We stress this here because in the public-key setting that we study later in the book, the difference between authenticated encryption and CCA-security is more pronounced.
4.6 *Information-Theoretic MACs
In previous sections we have explored message authentication codes with computational security, i.e., where bounds on the attacker’s running time are assumed. Recalling the results of Chapter 2, it is natural to ask whether message authentication in the presence of an unbounded adversary is possible. In this section, we show under which conditions information-theoretic (as opposed to computational) security is attainable.
A first observation is that it is impossible to achieve “perfect” security in this context: namely, we cannot hope to have a message authentication code for which the probability that an adversary outputs a valid tag on a previously unauthenticated message is 0. The reason is that an adversary can simply guess a valid tag t on any message and the guess will be correct with probability (at least) 1/2^{|t|}, where |t| denotes the tag length of the scheme.
The above example tells us what we can hope to achieve: a MAC with tags of length |t| where the probability of forgery is at most 1/2^{|t|}, even for unbounded adversaries. We will see that this is achievable, but only under restrictions on how many messages are authenticated by the honest parties.
We first define information-theoretic security for message authentication codes. A starting point is to take experiment Mac-forge_{A,Π}(n) that is used to define security for computationally secure MACs (cf. Definition 4.2), but to drop the security parameter n and require that Pr[Mac-forge_{A,Π} = 1] should be “small” for all adversaries A (and not just adversaries running in polynomial time). As mentioned above (and as will be proved formally in Section 4.6.2), however, such a definition is impossible to achieve. Rather, information-theoretic security can be achieved only if we place some bound on the number of messages authenticated by the honest parties. We look here at the most basic setting, where the parties authenticate just a single message. We refer to this as one-time message authentication. The following experiment modifies Mac-forge_{A,Π}(n) following the above discussion:
The one-time message authentication experiment Mac-forge^{1-time}_{A,Π}:
1. A key k is generated by running Gen.
2. The adversary A outputs a message m′, and is given in return a tag t′ ← Mac_k(m′).
3. A outputs (m, t).
4. The output of the experiment is defined to be 1 if and only if (1) Vrfy_k(m, t) = 1 and (2) m ≠ m′.
DEFINITION 4.22 A message authentication code Π = (Gen, Mac, Vrfy) is one-time ε-secure, or just ε-secure, if for all (even unbounded) adversaries A:
    Pr[Mac-forge^{1-time}_{A,Π} = 1] ≤ ε.
4.6.1 Constructing Information-Theoretic MACs
In this section we show how to build an ε-secure MAC based on any strongly universal function.⁶ We then show a simple construction of the latter.
Let h : K × M → T be a keyed function whose first input is a key k ∈ K and whose second input is taken from some domain M. As usual, we write h_k(m) instead of h(k, m). Then h is strongly universal (or pairwise-independent) if for any two distinct inputs m, m′ the values h_k(m) and h_k(m′) are uniformly and independently distributed in T when k is a uniform key. This is equivalent to saying that the probability that h_k(m), h_k(m′) take on any particular values t, t′ is exactly 1/|T|^2. That is:
DEFINITION 4.23 A function h : K × M → T is strongly universal if for all distinct m, m′ ∈ M and all t, t′ ∈ T it holds that
    Pr[h_k(m) = t ∧ h_k(m′) = t′] = 1/|T|^2,
where the probability is taken over uniform choice of k ∈ K.
The above should motivate the construction of a one-time message authentication code from any strongly universal function h. The tag t on a message m is obtained by computing h_k(m), where the key k is uniform; see Construction 4.24. Intuitively, even after an adversary observes the tag t′ := h_k(m′) for any message m′, the correct tag h_k(m) for any other message m is still uniformly distributed in T from the adversary’s point of view. Thus, the adversary can do nothing more than blindly guess the tag, and this guess will be correct only with probability 1/|T|.
CONSTRUCTION 4.24
Let h : K × M → T be a strongly universal function. Define a MAC for messages in M as follows:
• Gen: choose uniform k ∈ K and output it as the key.
• Mac: on input a key k ∈ K and a message m ∈ M, output the tag t := h_k(m).
• Vrfy: on input a key k ∈ K, a message m ∈ M, and a tag t ∈ T, output 1 if and only if t ≟ h_k(m). (If m ∉ M, then output 0.)
A MAC from any strongly universal function.
⁶ These are often called strongly universal hash functions, but in cryptographic contexts the term “hash” has another meaning that we will see later in the book.
The above construction can be viewed as analogous to Construction 4.5. This is because a strongly universal function h is identical to a random function, as long as it is only evaluated twice.
THEOREM 4.25 Let h : K × M → T be a strongly universal function. Then Construction 4.24 is a 1/|T|-secure MAC for messages in M.
PROOF Let A be an adversary. As usual in the information-theoretic setting, we may assume A is deterministic without loss of generality. So the message m′ that A outputs at the outset of the experiment is fixed. Furthermore, the pair (m, t) that A outputs at the end of the experiment is a deterministic function of the tag t′ on m′ that A receives. We thus have
    Pr[Mac-forge^{1-time}_{A,Π} = 1] = Σ_{t′ ∈ T} Pr[Mac-forge^{1-time}_{A,Π} = 1 ∧ h_k(m′) = t′]
                                     = Σ_{t′ ∈ T, (m,t) := A(t′)} Pr[h_k(m) = t ∧ h_k(m′) = t′]
                                     = Σ_{t′ ∈ T, (m,t) := A(t′)} 1/|T|^2 = 1/|T|.
This proves the theorem.
We now turn to a classical construction of a strongly universal function. We assume some basic knowledge about arithmetic modulo a prime number; readers may refer to Sections 8.1.1 and 8.1.2 for the necessary background.
Fix some prime p, and let Z_p ≝ {0, ..., p−1}. We take as our message space M = Z_p; the space of possible tags will also be T = Z_p. A key (a, b) consists of a pair of elements from Z_p; thus, K = Z_p × Z_p. Define h as
    h_{a,b}(m) ≝ [a·m + b mod p],
where the notation [X mod p] refers to the reduction of the integer X modulo p (and so [X mod p] ∈ Z_p always).
THEOREM 4.26 For any prime p, the function h is strongly universal.
PROOF Fix any distinct m, m′ ∈ Z_p and any t, t′ ∈ Z_p. For which keys (a, b) does it hold that both h_{a,b}(m) = t and h_{a,b}(m′) = t′? This holds only if
    a·m + b = t mod p and a·m′ + b = t′ mod p.
We thus have two linear equations in the two unknowns a, b. These two equations are both satisfied exactly when a = [(t−t′)·(m−m′)^{-1} mod p] and b = [t − a·m mod p]; note that [(m−m′)^{-1} mod p] exists because m ≠ m′ and so m − m′ ≠ 0 mod p. Restated, this means that for any m, m′, t, t′ as above there is a unique key (a, b) with h_{a,b}(m) = t and h_{a,b}(m′) = t′. Since there are |T|^2 keys, we conclude that the probability (over choice of the key) that h_{a,b}(m) = t and h_{a,b}(m′) = t′ is exactly 1/|K| = 1/|T|^2 as required.
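For small p, Theorem 4.26 can be checked by brute force. The Python sketch below implements h_{a,b} and Construction 4.24 and counts, for every choice of distinct m, m′ and every t, t′, the keys that map m to t and m′ to t′. The function names, and the convention of passing p as an explicit parameter, are hypothetical choices made for this illustration.

```python
import secrets
from itertools import product


def h(key, m: int, p: int) -> int:
    """h_{a,b}(m) = [a*m + b mod p]."""
    a, b = key
    return (a * m + b) % p


def gen(p: int):
    """Gen: a uniform key (a, b) from Z_p x Z_p."""
    return secrets.randbelow(p), secrets.randbelow(p)


def mac(key, m: int, p: int) -> int:
    """Mac: the tag on m is h_k(m)."""
    return h(key, m, p)


def vrfy(key, m: int, t: int, p: int) -> bool:
    """Vrfy: accept iff m is in Z_p and t equals h_k(m)."""
    return 0 <= m < p and t == h(key, m, p)


def is_strongly_universal(p: int) -> bool:
    """Exhaustively test Definition 4.23 for h over Z_p (feasible only for small p)."""
    keys = list(product(range(p), repeat=2))
    for m, m2 in product(range(p), repeat=2):
        if m == m2:
            continue
        for t, t2 in product(range(p), repeat=2):
            hits = sum(1 for k in keys if h(k, m, p) == t and h(k, m2, p) == t2)
            # Probability 1/|T|^2 over |T|^2 keys means exactly one matching key.
            if hits != 1:
                return False
    return True
```

The check succeeds for a prime such as 7 and fails for a composite modulus such as 6, where m − m′ need not be invertible. In any real use p would have to be large, and each key may authenticate only a single message.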
Parameters of Construction 4.24. We briefly discuss the parameters of Construction 4.24 when instantiated with the strongly universal function described above, ignoring the fact that p is not a power of 2. The construction is a 1/|T |-secure MAC with tags of length log |T |; the tag length is optimal for the level of security achieved.
Let M be some fixed message space for which we want to construct a one-time secure MAC. The construction above gives a 1/|M|-secure MAC with keys that are twice the length of the messages. The reader may notice two problems here, at opposite ends of the spectrum: First, if |M| is small then a 1/|M| probability of forgery may be unacceptably large. On the flip side, if |M| is large then a 1/|M| probability of forgery may be overkill; one might be willing to accept a (somewhat) larger probability of forgery if that level of security can be achieved with shorter keys. The first problem (when |M| is small) is easy to deal with by simply embedding M into a larger message space M′ by, for example, padding messages with 0s. The second problem can be addressed as well; see the references at the end of this chapter.
4.6.2 Limitations on Information-Theoretic MACs
In this section we explore limitations on information-theoretic message authentication. We show that any 2^{-n}-secure MAC must have keys of length at least 2n. An extension of the proof shows that any l-time 2^{-n}-secure MAC (where security is defined via a natural modification of Definition 4.22) requires keys of length at least (l + 1) · n. A corollary is that no MAC with bounded-length keys can provide information-theoretic security when authenticating an unbounded number of messages.
In the following, we assume the message space contains at least two messages; if not, there is no point in communicating, let alone authenticating.
THEOREM 4.27 Let Π = (Gen, Mac, Vrfy) be a 2^{-n}-secure MAC where all keys output by Gen are the same length. Then the keys output by Gen must have length at least 2n.
PROOF Fix two distinct messages m_0, m_1 in the message space. The intuition for the proof is that there must be at least 2^n possibilities for the tag of m_0 (or else the adversary could guess it with probability better than 2^{-n}); furthermore, even conditioned on the value of the tag for m_0, there must be 2^n possibilities for the tag of m_1 (or else the adversary could forge a tag on m_1 with probability better than 2^{-n}). Since each key defines tags for m_0 and m_1, this means there must be at least 2^n × 2^n keys. We make this formal below.
Let K denote the key space (i.e., the set of all possible keys that can be output by Gen). For any possible tag t_0, let K(t_0) denote the set of keys for which t_0 is a valid tag on m_0; i.e.,
    K(t_0) ≝ {k | Vrfy_k(m_0, t_0) = 1}.
For any t_0 we must have |K(t_0)| ≤ 2^{-n} · |K|. Otherwise the adversary could simply output (m_0, t_0) as its forgery; this would be a valid forgery with probability at least |K(t_0)|/|K| > 2^{-n}, contradicting the claimed security.
Consider now the adversary A who requests a tag on the message m_0, receives in return a tag t_0, chooses a uniform key k ∈ K(t_0), and outputs (m_1, Mac_k(m_1)) as its forgery. The probability that A outputs a valid forgery is at least
    Σ_{t_0} Pr[Mac_k(m_0) = t_0] · 1/|K(t_0)| ≥ Σ_{t_0} Pr[Mac_k(m_0) = t_0] · 2^n/|K| = 2^n/|K|,
where the inequality uses the bound |K(t_0)| ≤ 2^{-n} · |K| established above. By the claimed security of the scheme, the probability that the adversary can output a valid forgery is at most 2^{-n}. Thus, we must have |K| ≥ 2^{2n}. Since all keys have the same length, each key must have length at least 2n.
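The adversary used in this proof can be run against the MAC of Section 4.6.1 for a small prime p. The Python sketch below is an illustration only; the function name `forgery_probability` and the default choices m_0 = 0 and m_1 = 1 are assumptions made here. It computes the adversary's exact success probability by enumerating all keys.

```python
from fractions import Fraction
from itertools import product


def h(key, m: int, p: int) -> int:
    """h_{a,b}(m) = [a*m + b mod p], used as Mac_k(m)."""
    a, b = key
    return (a * m + b) % p


def forgery_probability(p: int, m0: int = 0, m1: int = 1) -> Fraction:
    """Exact success probability of the adversary from the proof of Theorem 4.27."""
    keys = list(product(range(p), repeat=2))
    total = Fraction(0)
    for true_key in keys:                      # Gen outputs a uniform key
        t0 = h(true_key, m0, p)                # tag on m0 given to the adversary
        # K(t0): all keys under which t0 is a valid tag on m0
        consistent = [k for k in keys if h(k, m0, p) == t0]
        # The adversary picks k uniformly from K(t0) and outputs (m1, Mac_k(m1));
        # it wins when that tag matches the tag under the true key.
        wins = sum(1 for k in consistent if h(k, m1, p) == h(true_key, m1, p))
        total += Fraction(wins, len(consistent))
    return total / len(keys)
```

For p = 7 the result is exactly 1/7, which equals |T|/|K| with |T| = 7 and |K| = 49. The construction is 1/|T|-secure with |K| = |T|^2, so (ignoring that p is not a power of 2) it meets the key-length bound of Theorem 4.27 with equality.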
References and Additional Reading
The definition of security for message authentication codes was adapted by Bellare et al. [18] from the definition of security for digital signatures given by Goldwasser et al. [81] (see Chapter 12). For more on the definitional variant where verification queries are allowed, see [17].
The paradigm of using pseudorandom functions for message authentication (as in Construction 4.5) was introduced by Goldreich et al. [77]. Construction 4.7 is due to Goldreich [76].
CBC-MAC was standardized in the early 1980s [94, 178] and is still used widely today. Basic CBC-MAC was proven secure (for authenticating fixed-length messages) by Bellare et al. [18]. Bernstein [26, 27] gives a more direct (though perhaps less intuitive) proof, and also discusses some generalized versions of CBC-MAC. As noted in this chapter, basic CBC-MAC is insecure when used to authenticate messages of different lengths. One way to fix this is to prepend the length to the message. This has the disadvantage of not being able to cope with streaming data, where the length of the message is not known in advance. Petrank and Rackoff [137] suggest an alternate, “on-line” approach addressing this issue. Further improvements were given by Black and Rogaway [32] and Iwata and Kurosawa [95]; these led to a new proposed standard called CMAC.
The importance of authenticated encryption was first explicitly highlighted in [100, 19], who propose definitions similar to what we have given here. Bellare and Namprempre [19] analyze the three generic approaches discussed here, although the idea of using encrypt-then-authenticate for achieving CCA-security goes back at least to the work of Dolev et al. [61]. Krawczyk [108] examines other methods for achieving secrecy and authentication, and also analyzes the authenticate-then-encrypt approach used by SSL. Degabriele and Paterson [54] show an attack on IPsec when configured to authenticate-then-encrypt (the default for authenticated encryption is actually encrypt-then-authenticate; however it is possible to achieve authenticate-then-encrypt in some configurations). Several nongeneric schemes for authenticated encryption have also been proposed; see [110] for a detailed comparison.
Information-theoretic MACs were first studied by Gilbert et al. [73]. Wegman and Carter [177] introduced the notion of strongly universal functions, and noted their application to one-time message authentication. They also show how to reduce the key length for this task by using an almost strongly universal function. Specifically, the construction we give here achieves 2^{-n}-security for messages of length n with keys of length O(n); Wegman and Carter show how to construct a 2^{-n}-secure MAC for messages of length l with keys of (essentially) length O(n·log l). The simple construction of a strongly universal function that we give here is (with minor differences) due to Carter and Wegman [42]. The reader interested in learning more about information-theoretic MACs is referred to the paper by Stinson [166], the survey by Simmons [162], or the first edition of Stinson’s textbook [167, Chapter 10].
Exercises
4.1 Say Π = (Gen, Mac, Vrfy) is a secure MAC, and for $k \in \{0,1\}^n$ the tag-generation algorithm $\mathsf{Mac}_k$ always outputs tags of length t(n). Prove that t must be super-logarithmic or, equivalently, that if t(n) = O(log n) then Π cannot be a secure MAC.
Hint: Consider the probability of randomly guessing a valid tag.
4.2 Consider an extension of the definition of secure message authentication
where the adversary is provided with both a Mac and a Vrfy oracle.
(a) Provide a formal definition of security for this case.
(b) Assume Π is a deterministic MAC using canonical verification that satisfies Definition 4.2. Prove that Π also satisfies your definition from part (a).
4.3 Assume secure MACs exist. Give a construction of a MAC that is secure with respect to Definition 4.2 but that is not secure when the adversary is additionally given access to a Vrfy oracle (cf. the previous exercise).
4.4 Prove Proposition 4.4.
4.5 Assume secure MACs exist. Prove that there exists a MAC that is
secure (by Definition 4.2) but is not strongly secure (by Definition 4.3).
4.6 Consider the following MAC for messages of length l(n) = 2n − 2 using a pseudorandom function F: On input a message $m_0 \| m_1$ (with $|m_0| = |m_1| = n - 1$) and key $k \in \{0,1\}^n$, algorithm Mac outputs $t = F_k(0 \| m_0) \,\|\, F_k(1 \| m_1)$. Algorithm Vrfy is defined in the natural way. Is (Gen, Mac, Vrfy) secure? Prove your answer.
4.7 Let F be a pseudorandom function. Show that each of the following MACs is insecure, even if used to authenticate fixed-length messages. (In each case Gen outputs a uniform k ∈ {0,1}n. Let ⟨i⟩ denote an n/2-bit encoding of the integer i.)
(a) To authenticate a message m = m1,…,ml, where mi ∈ {0,1}n, compute t := Fk(m1) ⊕ · · · ⊕ Fk(ml).
(b) To authenticate a message m = m1, . . . , ml, where mi ∈ {0, 1}n/2, compute t := Fk(⟨1⟩∥m1) ⊕ · · · ⊕ Fk(⟨l⟩∥ml).
(c) To authenticate a message m = m1, . . . , ml, where mi ∈ {0, 1}n/2, choose uniform r ← {0, 1}n, compute
t := Fk(r) ⊕ Fk(⟨1⟩∥m1) ⊕ · · · ⊕ Fk(⟨l⟩∥ml), and let the tag be ⟨r, t⟩.
4.8 Let F be a pseudorandom function. Show that the following MAC for messages of length 2n is insecure: Gen outputs a uniform k ∈ {0, 1}n. To authenticate a message m1∥m2 with |m1| = |m2| = n, compute the tag Fk(m1)∥Fk(Fk(m2)).
4.9 Given any deterministic MAC (Mac, Vrfy), we may view Mac as a keyed function. In both Constructions 4.5 and 4.11, Mac is a pseudorandom function. Give a construction of a secure, deterministic MAC in which Mac is not a pseudorandom function.
4.10 Is Construction 4.5 necessarily secure when instantiated using a weak
pseudorandom function (cf. Exercise 3.26)? Explain.
4.11 Prove that Construction 4.7 is secure even when the adversary is additionally given access to a Vrfy oracle (cf. Exercise 4.2).
4.12 Prove that Construction 4.7 is secure if it is changed as follows: Set ti := Fk(r∥b∥i∥mi) where b is a single bit such that b = 0 in all blocks but the last one, and b = 1 in the last block. (Assume for simplicity that the length of all messages being authenticated is always an integer multiple of n/2 − 1.) What is the advantage of this modification?
4.13 We explore what happens when the basic CBC-MAC construction is used with messages of different lengths.
(a) Say the sender and receiver do not agree on the message length in advance (and so $\mathsf{Vrfy}_k(m,t) = 1$ iff $t = \mathsf{Mac}_k(m)$, regardless of the length of m), but the sender is careful to only authenticate messages of length 2n. Show that an adversary can forge a valid tag on a message of length 4n.
(b) Say the receiver only accepts 3-block messages (so $\mathsf{Vrfy}_k(m,t) = 1$ only if m has length 3n and $t = \mathsf{Mac}_k(m)$), but the sender authenticates messages of any length a multiple of n. Show that an adversary can forge a valid tag on a new message.
4.14 Prove that the following modifications of basic CBC-MAC do not yield a secure MAC (even for fixed-length messages):
(a) Mac outputs all blocks t1, . . . , tl, rather than just tl. (Verification only checks whether tl is correct.)
(b) A random initial block is used each time a message is authenticated. That is, choose uniform $t_0 \in \{0,1\}^n$, run basic CBC-MAC over the “message” $t_0, m_1, \ldots, m_l$, and output the tag $\langle t_0, t_l \rangle$. Verification is done in the natural way.
4.15 Show that appending the message length to the end of the message before applying basic CBC-MAC does not result in a secure MAC for arbitrary-length messages.
4.16 Show that the encoding for arbitrary-length messages described in Section 4.4.2 is prefix-free.
4.17 Consider the following encoding that handles messages whose length is less than n · 2n: We encode a string m ∈ {0, 1}∗ by first appending as many 0s as needed to make the length of the resulting string mˆ a nonzero multiple of n. Then we prepend the number of blocks in mˆ (equivalently, prepend the integer |mˆ |/n), encoded as an n-bit string. Show that this encoding is not prefix-free.
4.18 Prove that the following modification of basic CBC-MAC gives a secure MAC for arbitrary-length messages (for simplicity, assume all messages have length a multiple of the block length). Mack(m) first computes kl = Fk(l), where l is the length of m. The tag is then computed using basic CBC-MAC with key kl. Verification is done in the natural way.
4.19 Let F be a keyed function that is a secure (deterministic) MAC for messages of length n. (Note that F need not be a pseudorandom permutation.) Show that basic CBC-MAC is not necessarily a secure MAC (even for fixed-length messages) when instantiated with F.
4.20 Show that Construction 4.7 is strongly secure.
4.21 Show that Construction 4.18 might not be CCA-secure if it is instantiated with a secure MAC that is not strongly secure.
4.22 Prove that Construction 4.18 is unforgeable when instantiated with any encryption scheme (even if not CPA-secure) and any secure MAC (even if the MAC is not strongly secure).
4.23 Consider a strengthened version of unforgeability (Definition 4.16) where A is additionally given access to a decryption oracle.
(a) Write a formal definition for this version of unforgeability.
(b) Prove that Construction 4.18 satisfies this stronger definition if ΠM is a strongly secure MAC.
(c) Show by counterexample that Construction 4.18 need not satisfy this stronger definition if ΠM is a secure MAC that is not strongly secure. (Compare to the previous exercise.)
4.24 Prove that the authenticate-then-encrypt approach, instantiated with any CPA-secure encryption scheme and any secure MAC, yields a CPA-secure encryption scheme that is unforgeable.
4.25 Let F be a strong pseudorandom permutation, and define the following fixed-length encryption scheme: On input a message m ∈ {0, 1}n/2 and key k ∈ {0,1}n, algorithm Enc chooses a uniform r ∈ {0,1}n/2 and computes c := Fk(m∥r). (See Exercise 3.18.) Prove that this scheme is CCA-secure, but is not an authenticated encryption scheme.
4.26 Show a CPA-secure private-key encryption scheme that is unforgeable but is not CCA-secure.
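4.27 Fix l > 0 and a prime p. Let $\mathcal{K} = \mathbb{Z}_p^{l+1}$, $\mathcal{M} = \mathbb{Z}_p^{l}$, and $\mathcal{T} = \mathbb{Z}_p$. Define $h : \mathcal{K} \times \mathcal{M} \to \mathcal{T}$ as
$$h_{k_0,k_1,\ldots,k_l}(m_1,\ldots,m_l) = \Big[\, k_0 + \sum_{i=1}^{l} k_i m_i \bmod p \,\Big].$$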
Prove that h is strongly universal.
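4.28 Fix l, n > 0. Let $\mathcal{K} = \{0,1\}^{l \times n} \times \{0,1\}^{l}$ (interpreted as a boolean l × n matrix and an l-dimensional vector), let $\mathcal{M} = \{0,1\}^n$, and let $\mathcal{T} = \{0,1\}^l$. Define $h : \mathcal{K} \times \mathcal{M} \to \mathcal{T}$ as $h_{K,v}(m) = K \cdot m \oplus v$, where all operations are performed modulo 2. Prove that h is strongly universal.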
4.29 A Toeplitz matrix K is a matrix in which $K_{i,j} = K_{i-1,j-1}$ when i, j > 1; i.e., the values along any diagonal are equal. So an l × n Toeplitz matrix (for l > n) has the form
$$\begin{pmatrix}
K_{1,1} & K_{1,2} & K_{1,3} & \cdots & K_{1,n} \\
K_{2,1} & K_{1,1} & K_{1,2} & \cdots & K_{1,n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
K_{l,1} & K_{l-1,1} & K_{l-2,1} & \cdots & K_{l-n+1,1}
\end{pmatrix}.$$
Let $\mathcal{K} = T_{l \times n} \times \{0,1\}^l$ (where $T_{l \times n}$ denotes the set of l × n Toeplitz matrices), let $\mathcal{M} = \{0,1\}^n$, and let $\mathcal{T} = \{0,1\}^l$. Define $h : \mathcal{K} \times \mathcal{M} \to \mathcal{T}$ as $h_{K,v}(m) = K \cdot m \oplus v$, where all operations are performed modulo 2. Prove that h is strongly universal. What is the advantage here as compared to the construction in the previous exercise?
4.30 Define an appropriate notion of a two-time ε-secure MAC, and give a construction that meets your definition.
4.31 Let $\{h_n : \mathcal{K}_n \times \{0,1\}^{10 \cdot n} \to \{0,1\}^n\}_{n \in \mathbb{N}}$ be such that $h_n$ is strongly universal for all n, and let F be a pseudorandom function. (When $K \in \mathcal{K}_n$ we write $h_K(\cdot)$ instead of $h_{n,K}(\cdot)$.) Consider the following MAC: Gen($1^n$) chooses uniform $K \in \mathcal{K}_n$ and $k \in \{0,1\}^n$, and outputs (K, k). To authenticate a message $m \in \{0,1\}^{10 \cdot n}$, choose uniform $r \in \{0,1\}^n$ and output $\langle r, h_K(m) \oplus F_k(r) \rangle$. Verification is done in the natural way. Prove that this gives a (computationally) secure MAC for messages of length 10n.
Chapter 5
Hash Functions and Applications
In this chapter we introduce cryptographic hash functions and explore a few of their applications. At the most basic level, a hash function provides a way to map a long input string to a shorter output string sometimes called a digest. The primary requirement is to avoid collisions, or two inputs that map to the same digest. Collision-resistant hash functions have numerous uses. One example that we will see here is another approach—standardized as HMAC—for achieving domain extension for message authentication codes.
Beyond that, hash functions have become ubiquitous in cryptography, and they are often used in scenarios that require properties much stronger than collision resistance. It has become common to model cryptographic hash functions as being “completely unpredictable” (a.k.a., random oracles), and we discuss this framework, along with the controversy that surrounds it, in detail later in the chapter. We touch on only a few applications of the random-oracle model here, but will encounter it again when we turn to the setting of public-key cryptography.
Hash functions are intriguing in that they can be viewed as lying between the worlds of private- and public-key cryptography. On the one hand, as we will see in Chapter 6, they are (in practice) constructed using symmetric-key techniques, and many of the canonical applications of hash functions are in the symmetric-key setting. From a theoretical point of view, however, the existence of collision-resistant hash functions appears to represent a qualitatively stronger assumption than the existence of pseudorandom functions (yet a weaker assumption than the existence of public-key encryption).
5.1 Definitions
Hash functions are simply functions that take inputs of some length and compress them into short, fixed-length outputs. The classic use of hash functions is in data structures, where they can be used to build hash tables that enable O(1) lookup time when storing a set of elements. Specifically, if the range of the hash function H is of size N, then element x is stored in row H(x) of a table of size N. To retrieve x, it suffices to compute H(x) and probe that row of the table for the elements stored there. A “good” hash function for this purpose is one that yields few collisions, where a collision is a pair of distinct items x and x′ for which H(x) = H(x′); in this case we also say that x and x′ collide. (When a collision occurs, two elements end up being stored in the same cell, increasing the lookup time.)
Collision-resistant hash functions are similar in spirit. Again, the goal is to avoid collisions. However, there are fundamental differences. For one, the desire to minimize collisions in the setting of data structures becomes a requirement to avoid collisions in the setting of cryptography. Furthermore, in the context of data structures we can assume that the set of data elements is chosen independently of the hash function and without any intention to cause collisions. In the context of cryptography, in contrast, we are faced with an adversary who may select elements with the explicit goal of causing collisions. This means that collision-resistant hash functions are much harder to design.
5.1.1 Collision Resistance
Informally, a function H is collision resistant if it is infeasible for any probabilistic polynomial-time algorithm to find a collision in H. We will only be interested in hash functions whose domain is larger than their range. In this case collisions must exist, but such collisions should be hard to find.
Formally, we consider keyed hash functions. That is, H is a two-input function that takes as input a key s and a string x, and outputs a string $H^s(x) \stackrel{\text{def}}{=} H(s,x)$. The requirement is that it must be hard to find a collision in $H^s$ for a randomly generated key s. There are at least two differences between keys in this context and keys as we have used them until now. First, not all strings necessarily correspond to valid keys (i.e., $H^s$ may not be defined for certain s), and therefore the key s will typically be generated by an algorithm Gen rather than being chosen uniformly. Second, and perhaps more importantly, this key s is (generally) not kept secret, and collision resistance is required even when the adversary is given s. In order to emphasize this, we superscript the key and write $H^s$ rather than $H_s$.
DEFINITION 5.1 A hash function (with output length l) is a pair of probabilistic polynomial-time algorithms (Gen,H) satisfying the following:
• Gen is a probabilistic algorithm which takes as input a security parameter $1^n$ and outputs a key s. We assume that $1^n$ is implicit in s.
• H takes as input a key s and a string $x \in \{0,1\}^*$ and outputs a string $H^s(x) \in \{0,1\}^{l(n)}$ (where n is the value of the security parameter implicit in s).
If $H^s$ is defined only for inputs $x \in \{0,1\}^{l'(n)}$ and $l'(n) > l(n)$, then we say that (Gen, H) is a fixed-length hash function for inputs of length l′. In this case, we also call H a compression function.
In the fixed-length case we require that l′ be greater than l. This ensures that the function compresses its input. In the general case the function takes as input strings of arbitrary length. Thus, it also compresses (albeit only strings of length greater than l(n)). Note that without compression, collision resistance is trivial (since one can just take the identity function Hs(x) = x).
We now proceed to define security. As usual, we first define an experiment for a hash function Π = (Gen, H ), an adversary A, and a security parameter n:
The collision-finding experiment Hash-collA,Π(n):
1. A key s is generated by running Gen(1n).
2. The adversary A is given s and outputs x, x′. (If Π is a fixed-length hash function for inputs of length l′(n), then we require $x, x' \in \{0,1\}^{l'(n)}$.)
3. The output of the experiment is defined to be 1 if and only if $x \neq x'$ and $H^s(x) = H^s(x')$. In such a case we say that A has found a collision.
The definition of collision resistance states that no efficient adversary can find a collision in the above experiment except with negligible probability.
DEFINITION 5.2 A hash function Π = (Gen,H) is collision resistant if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
Pr [Hash-collA,Π(n) = 1] ≤ negl(n).
For simplicity, we sometimes refer to H or Hs as a “collision-resistant hash function,” even though technically we should only say that (Gen,H) is. This should not cause any confusion.
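The collision-finding experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyed hash below is a hypothetical toy (SHA-256 of s∥x truncated to 16 bits) chosen so that a brute-force "birthday" adversary succeeds quickly, and the names `gen`, `H`, `birthday_adversary`, and `hash_coll` are not from the text. A function with such a short output is, of course, not collision resistant; the point is only to make the three steps of the experiment concrete.

```python
import hashlib
import secrets

OUT_BYTES = 2  # toy output length: 16 bits, deliberately small (assumption)


def gen(n_bytes: int = 16) -> bytes:
    """Gen: output a key s. The key is public; the adversary receives it."""
    return secrets.token_bytes(n_bytes)


def H(s: bytes, x: bytes) -> bytes:
    """Toy keyed hash H^s(x): SHA-256 of s || x, truncated to OUT_BYTES."""
    return hashlib.sha256(s + x).digest()[:OUT_BYTES]


def birthday_adversary(s: bytes):
    """Given s, hash distinct inputs until two of them share a digest.

    Terminates after at most 2^16 + 1 inputs by the pigeonhole principle.
    """
    seen = {}
    counter = 0
    while True:
        x = counter.to_bytes(8, "big")
        digest = H(s, x)
        if digest in seen:
            return seen[digest], x
        seen[digest] = x
        counter += 1


def hash_coll(adversary) -> int:
    """The experiment: 1 iff the adversary outputs x != x' with equal hashes."""
    s = gen()                      # step 1: generate the key
    x, x_prime = adversary(s)      # step 2: adversary sees s, outputs a pair
    return int(x != x_prime and H(s, x) == H(s, x_prime))  # step 3
```

With a realistic output length (say 256 bits) the same adversary would need on the order of 2^128 hash evaluations, which is why the definition only asks that efficient adversaries succeed with negligible probability.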
Cryptographic hash functions are designed with the explicit goal of being collision resistant (among other things). We will discuss some common real- world hash functions in Chapter 6. In Section 8.4.2 we will see how it is possible to construct hash functions with proven collision resistance based on an assumption about the hardness of a certain number-theoretic problem.
Unkeyed hash functions. Cryptographic hash functions used in practice generally have a fixed output length (just as block ciphers have a fixed key length) and are usually unkeyed, meaning that the hash function is just a fixed function $H : \{0,1\}^* \to \{0,1\}^l$. This is problematic from a theoretical standpoint since for any such function there is always a constant-time algorithm that outputs a collision in H: the algorithm simply outputs a colliding pair (x, x′) hardcoded into the algorithm itself. Using keyed hash functions solves this technical issue since it is impossible to hardcode a colliding pair for every possible key using a reasonable amount of space (and in an asymptotic setting, it would be impossible to hardcode a colliding pair for every value of the security parameter).
Notwithstanding the above, the (unkeyed) cryptographic hash functions used in the real world are collision resistant for all practical purposes since colliding pairs are unknown (and computationally difficult to find) even though they must exist. Proofs of security for some construction based on collision resistance of a hash function are meaningful even when an unkeyed hash function H is used, as long as the proof shows that any efficient adversary “breaking” the primitive can be used to efficiently find a collision in H. (All the proofs in this book satisfy this condition.) In this case, the interpretation of the proof of security is that if an adversary can break the scheme in practice, then it can be used to find a collision in practice, something that we believe is hard to do.
5.1.2 Weaker Notions of Security
In some applications it suffices to rely on security requirements weaker than collision resistance. These include:
• Second-preimage or target-collision resistance: Informally, a hash function is second preimage resistant if given s and a uniform x it is infeasible for a ppt adversary to find $x' \neq x$ such that $H^s(x') = H^s(x)$.
• Preimage resistance: Informally, a hash function is preimage resistant if given s and a uniform y it is infeasible for a ppt adversary to find a value x such that $H^s(x) = y$. (Looking ahead to Chapter 7, this essentially means that $H^s$ is one-way.)
Any hash function that is collision resistant is also second preimage resistant. This holds since if, given a uniform x, an adversary can find $x' \neq x$ for which $H^s(x') = H^s(x)$, then it can clearly find a colliding pair x and x′. Likewise, any hash function that is second preimage resistant is also preimage resistant. This is due to the fact that if it were possible, given y, to find an x such that $H^s(x) = y$, then one could also take a given input x′, compute $y := H^s(x')$, and then obtain an x with $H^s(x) = y$. With high probability $x' \neq x$ (relying on the fact that H compresses, and so multiple inputs map to the same output), in which case a second preimage has been found.
We do not formally define the above notions or prove the above implications, since they are not used in the rest of the book. You are asked to formalize the above in Exercise 5.1.
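The second implication is a reduction, and it can be sketched directly. The code below is an informal illustration, not a formalization: it assumes a hypothetical toy hash mapping 16-bit inputs to 8-bit outputs (so that it compresses and can be inverted by brute force), and the names `toy_inverter` and `second_preimage` are invented here. The function `second_preimage` uses any preimage finder as a black box, exactly as in the argument above.

```python
import hashlib


def H(s: bytes, x: bytes) -> bytes:
    """Toy compressing hash (assumption): 2-byte inputs, 1-byte outputs."""
    return hashlib.sha256(s + x).digest()[:1]


def toy_inverter(s: bytes, y: bytes):
    """Stand-in preimage finder: brute force over all 2-byte inputs."""
    for i in range(2 ** 16):
        x = i.to_bytes(2, "big")
        if H(s, x) == y:
            return x
    return None


def second_preimage(s: bytes, x_prime: bytes, inverter):
    """Turn a preimage finder into a second-preimage finder.

    Compute y := H^s(x'), ask the inverter for some x with H^s(x) = y,
    and succeed whenever x differs from x'.
    """
    y = H(s, x_prime)
    x = inverter(s, y)
    if x is not None and x != x_prime:
        return x
    return None  # the inverter happened to return x' itself
```

The reduction fails only when the inverter returns x′ itself. Because the toy hash has at most 256 distinct outputs, this particular inverter can return x′ for at most 256 of the 65536 possible inputs, which mirrors the "with high probability" step in the text.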
5.2 Domain Extension: The Merkle–Damgård Transform
Hash functions are often constructed by first designing a collision-resistant compression function handling fixed-length inputs, and then using domain extension to handle arbitrary-length inputs. In this section, we show one solution to the problem of domain extension. We return to the question of designing collision-resistant compression functions in Section 6.3.
The Merkle–Damgård transform is a common approach for extending a compression function to a full-fledged hash function, while maintaining the collision-resistance property of the former. It is used extensively in practice for hash functions including MD5 and the SHA family (see Section 6.3). The existence of this transform means that when designing collision-resistant hash functions, we can restrict our attention to the fixed-length case. This, in turn, makes the job of designing collision-resistant hash functions much easier. The Merkle–Damgård transform is also interesting from a theoretical point of view since it implies that compressing by a single bit is as easy (or as hard) as compressing by an arbitrary amount.
For concreteness, assume the compression function (Gen, h) compresses its input by half; say its input length is 2n and its output length is n. (The construction works regardless of the input/output lengths, as long as h compresses.) We construct a collision-resistant hash function (Gen, H) that maps inputs of arbitrary length to outputs of length n. (Gen remains unchanged.) The Merkle–Damgård transform is defined in Construction 5.3 and depicted in Figure 5.1. The value $z_0$ used in step 2 of the construction, called the initialization vector or IV, is arbitrary and can be replaced by any constant.
CONSTRUCTION 5.3
Let (Gen, h) be a fixed-length hash function for inputs of length 2n and with output length n. Construct hash function (Gen,H) as follows:
• Gen: remains unchanged.
• H: on input a key s and a string $x \in \{0,1\}^*$ of length $L < 2^n$, do the following:
1. Set $B := \lceil L/n \rceil$ (i.e., the number of blocks in x). Pad x with zeros so its length is a multiple of n. Parse the padded result as the sequence of n-bit blocks $x_1, \ldots, x_B$. Set $x_{B+1} := L$, where L is encoded as an n-bit string.
2. Set $z_0 := 0^n$. (This is also called the IV.)
3. For i = 1, ..., B + 1, compute $z_i := h^s(z_{i-1} \| x_i)$.
4. Output $z_{B+1}$.
The Merkle–Damgård transform.
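The four steps of the construction translate almost line for line into code. The following is a minimal sketch under stated assumptions: blocks are n = 256 bits (32 bytes), inputs are whole bytes, and the compression function `h` is a hypothetical stand-in built from SHA-256 applied to s∥input. None of these choices comes from the text, which leaves (Gen, h) abstract.

```python
import hashlib

N = 32  # block length and output length n, in bytes (assumption: n = 256 bits)


def h(s: bytes, block: bytes) -> bytes:
    """Stand-in compression function h^s: 2n-bit input, n-bit output."""
    assert len(block) == 2 * N
    return hashlib.sha256(s + block).digest()


def merkle_damgard(s: bytes, x: bytes) -> bytes:
    """H^s(x), following steps 1 to 4 of the construction."""
    # Step 1: B = ceil(L / n); pad with zeros; append the length block x_{B+1}.
    B = -(-len(x) // N)
    padded = x + bytes(B * N - len(x))
    blocks = [padded[i * N:(i + 1) * N] for i in range(B)]
    bit_length = 8 * len(x)
    blocks.append(bit_length.to_bytes(N, "big"))  # L encoded as an n-bit string

    # Step 2: z_0 := 0^n (the IV).
    z = bytes(N)

    # Step 3: z_i := h^s(z_{i-1} || x_i) for i = 1, ..., B + 1.
    for block in blocks:
        z = h(s, z + block)

    # Step 4: output z_{B+1}.
    return z
```

Note that the length block is what separates inputs that differ only in trailing zeros: `b"abc"` and `b"abc\x00"` pad to the same first block but have different values of L.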
THEOREM 5.4 If (Gen, h) is collision resistant, then so is (Gen, H).
FIGURE 5.1: The Merkle–Damgård transform.
PROOF We show that for any s, a collision in $H^s$ yields a collision in $h^s$. Let x and x′ be two different strings of length L and L′, respectively, such that $H^s(x) = H^s(x')$. Let $x_1, \ldots, x_B$ be the B blocks of the padded x, and let $x'_1, \ldots, x'_{B'}$ be the B′ blocks of the padded x′. Recall that $x_{B+1} = L$ and $x'_{B'+1} = L'$. There are two cases to consider:
1. Case 1: $L \neq L'$. In this case, the last step of the computation of $H^s(x)$ is $z_{B+1} := h^s(z_B \| L)$, and the last step of the computation of $H^s(x')$ is $z'_{B'+1} := h^s(z'_{B'} \| L')$. Since $H^s(x) = H^s(x')$ it follows that $h^s(z_B \| L) = h^s(z'_{B'} \| L')$. However, $L \neq L'$ and so $z_B \| L$ and $z'_{B'} \| L'$ are two different strings that collide under $h^s$.
2. Case 2: $L = L'$. This means that B = B′. Let $z_0, \ldots, z_{B+1}$ be the values defined during the computation of $H^s(x)$, let $I_i \stackrel{\text{def}}{=} z_{i-1} \| x_i$ denote the ith input to $h^s$, and set $I_{B+2} \stackrel{\text{def}}{=} z_{B+1}$. Define $I'_1, \ldots, I'_{B+2}$ analogously with respect to x′. Let N be the largest index for which $I_N \neq I'_N$. Since $|x| = |x'|$ but $x \neq x'$, there is an i with $x_i \neq x'_i$ and so such an N certainly exists. Because
$$I_{B+2} = z_{B+1} = H^s(x) = H^s(x') = z'_{B+1} = I'_{B+2},$$
we have $N \leq B + 1$. By maximality of N, we have $I_{N+1} = I'_{N+1}$ and in particular $z_N = z'_N$. But this means that $I_N, I'_N$ are a collision in $h^s$.
We leave it as an exercise to turn the above into a formal reduction.
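The proof is constructive: given any collision in $H^s$, it tells us where to look for a collision in $h^s$. The sketch below follows the two cases literally. It assumes a hypothetical toy compression function with n = 16 bits (SHA-256 truncated to 2 bytes) so that collisions in H can actually be found by brute force; the helper names `md_trace`, `extract_collision`, and `find_collision` are invented for this illustration.

```python
import hashlib

N = 2  # toy block length n in bytes (16 bits), so collisions are easy to find


def h(s: bytes, block: bytes) -> bytes:
    """Toy compression function h^s: 2n bits in, n bits out (assumption)."""
    return hashlib.sha256(s + block).digest()[:N]


def md_trace(s: bytes, x: bytes):
    """Return ([I_1, ..., I_{B+1}], H^s(x)): every input fed to h^s, and the output."""
    B = -(-len(x) // N)
    padded = x + bytes(B * N - len(x))
    blocks = [padded[i * N:(i + 1) * N] for i in range(B)]
    blocks.append((8 * len(x)).to_bytes(N, "big"))  # length block x_{B+1} = L
    z, inputs = bytes(N), []
    for block in blocks:
        inputs.append(z + block)   # I_i = z_{i-1} || x_i
        z = h(s, z + block)
    return inputs, z


def extract_collision(s: bytes, x: bytes, x_prime: bytes):
    """Map a collision (x, x') in H^s to a collision in h^s, as in the proof."""
    I, out = md_trace(s, x)
    I_p, out_p = md_trace(s, x_prime)
    assert x != x_prime and out == out_p, "input must be a collision in H^s"
    if len(x) != len(x_prime):
        # Case 1: the final inputs z_B || L and z'_{B'} || L' differ in L.
        return I[-1], I_p[-1]
    # Case 2: same length; take the largest index at which the inputs differ.
    for i in reversed(range(len(I))):
        if I[i] != I_p[i]:
            return I[i], I_p[i]
    raise AssertionError("unreachable: x != x' forces some input to differ")


def find_collision(s: bytes, candidates):
    """Birthday search for a collision in H^s among distinct candidate inputs."""
    seen = {}
    for x in candidates:
        digest = md_trace(s, x)[1]
        if digest in seen:
            return seen[digest], x
        seen[digest] = x
    return None
```

For example, `extract_collision(s, *find_collision(s, (i.to_bytes(4, "big") for i in range(2**17))))` exercises Case 2, since all candidates have the same length; feeding candidates of different lengths exercises Case 1.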
5.3 Message Authentication Using Hash Functions
In the previous chapter, we presented two constructions of message authentication codes for arbitrary-length messages. The first approach was generic, but inefficient. The second, CBC-MAC, was based on pseudorandom functions. Here we will see another approach, which we call “hash-and-MAC,” that relies on collision-resistant hashing along with any message authentication code. We then discuss a standardized and widely used construction called HMAC that can be viewed as a specific instantiation of this approach.
5.3.1 Hash-and-MAC
The idea behind the hash-and-MAC approach is simple. First, an arbitrarily long message m is hashed down to a fixed-length string $H^s(m)$ using a collision-resistant hash function. Then, a (fixed-length) MAC is applied to the result. See Construction 5.5 for a formal description.
CONSTRUCTION 5.5
Let Π = (Mac,Vrfy) be a MAC for messages of length l(n), and let ΠH = (GenH,H) be a hash function with output length l(n). Construct a MAC Π′ = (Gen′, Mac′, Vrfy′) for arbitrary-length messages as follows:
• Gen′: on input 1n, choose uniform k ∈ {0, 1}n and run GenH (1n) to obtain s; the key is k′ := ⟨k, s⟩.
• Mac′: on input a key ⟨k,s⟩ and a message m ∈ {0,1}∗, output t ← Mack(Hs(m)).
• Vrfy′: on input a key ⟨k, s⟩, a message $m \in \{0,1\}^*$, and a MAC tag t, output 1 if and only if $\mathsf{Vrfy}_k(H^s(m), t) = 1$.
The hash-and-MAC paradigm.
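As an informal illustration of the paradigm, the sketch below wires a hash function and a fixed-length MAC together as in the construction. Both components are stand-ins chosen for convenience and are assumptions of this example, not part of the text: the keyed hash is SHA-256 applied to s∥m, and the fixed-length MAC is BLAKE2b in its keyed mode (available in Python's `hashlib`), restricted to 32-byte inputs.

```python
import hashlib
import hmac
import secrets


def gen():
    """Gen': choose a uniform MAC key k and a hash key s; the key is <k, s>."""
    k = secrets.token_bytes(32)
    s = secrets.token_bytes(16)
    return k, s


def H(s: bytes, m: bytes) -> bytes:
    """Stand-in keyed hash H^s with 32-byte output (assumption)."""
    return hashlib.sha256(s + m).digest()


def mac_fixed(k: bytes, y: bytes) -> bytes:
    """Stand-in fixed-length MAC for 32-byte messages (keyed BLAKE2b)."""
    assert len(y) == 32
    return hashlib.blake2b(y, key=k, digest_size=32).digest()


def vrfy_fixed(k: bytes, y: bytes, t: bytes) -> bool:
    """Canonical verification, using a constant-time comparison."""
    return hmac.compare_digest(mac_fixed(k, y), t)


def mac(key, m: bytes) -> bytes:
    """Mac': t <- Mac_k(H^s(m))."""
    k, s = key
    return mac_fixed(k, H(s, m))


def vrfy(key, m: bytes, t: bytes) -> bool:
    """Vrfy': output 1 iff Vrfy_k(H^s(m), t) = 1."""
    k, s = key
    return vrfy_fixed(k, H(s, m), t)
```

The fixed-length MAC only ever sees 32-byte digests, however long m is; this is the domain extension the construction provides.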
Construction 5.5 is secure if Π is a secure MAC for fixed-length messages and (GenH, H) is collision resistant. Intuitively, since the hash function is collision resistant, authenticating $H^s(m)$ is as good as authenticating m itself: if the sender can ensure that the receiver obtains the correct value $H^s(m)$, collision resistance guarantees that the attacker cannot find a different message m′ that hashes to the same value. A bit more formally, say a sender uses Construction 5.5 to authenticate some set of messages Q, and an attacker A is then able to forge a valid tag on a new message $m^* \notin Q$. There are two possible cases:
Case 1: there is a message $m \in Q$ such that $H^s(m^*) = H^s(m)$. Then A has found a collision in $H^s$, contradicting the collision resistance of (GenH, H).
Case 2: for every message $m \in Q$ it holds that $H^s(m^*) \neq H^s(m)$. Let $H^s(Q) \stackrel{\text{def}}{=} \{H^s(m) \mid m \in Q\}$. Then $H^s(m^*) \notin H^s(Q)$. In this case, A has forged a valid tag on the “new message” $H^s(m^*)$ with respect to the fixed-length message authentication code Π. This contradicts the assumption that Π is a secure MAC.
We now turn the above into a formal proof.
THEOREM 5.6 If Π is a secure MAC for messages of length l and ΠH is collision resistant, then Construction 5.5 is a secure MAC (for arbitrary-length messages).
PROOF Let Π′ denote Construction 5.5, and let A′ be a ppt adversary attacking Π′. In an execution of experiment $\mathsf{Mac\text{-}forge}_{A',\Pi'}(n)$, let k′ = ⟨k, s⟩ denote the MAC key, let Q denote the set of messages whose tags were requested by A′, and let $(m^*, t)$ be the final output of A′. We assume without loss of generality that $m^* \notin Q$. Define coll to be the event that, in experiment $\mathsf{Mac\text{-}forge}_{A',\Pi'}(n)$, there is an $m \in Q$ for which $H^s(m^*) = H^s(m)$. We have
$$\begin{aligned}
\Pr[\mathsf{Mac\text{-}forge}_{A',\Pi'}(n) = 1]
&= \Pr[\mathsf{Mac\text{-}forge}_{A',\Pi'}(n) = 1 \wedge \mathsf{coll}] + \Pr[\mathsf{Mac\text{-}forge}_{A',\Pi'}(n) = 1 \wedge \overline{\mathsf{coll}}] \\
&\leq \Pr[\mathsf{coll}] + \Pr[\mathsf{Mac\text{-}forge}_{A',\Pi'}(n) = 1 \wedge \overline{\mathsf{coll}}]. \qquad (5.1)
\end{aligned}$$
We show that both terms in Equation (5.1) are negligible, thus completing the proof. Intuitively, the first term is negligible by collision resistance of ΠH, and the second term is negligible by security of Π.
Consider the following algorithm C for finding a collision in ΠH:
Algorithm C:
The algorithm is given s as input (with n implicit).
• Choose uniform $k \in \{0,1\}^n$.
• Run A′($1^n$). When A′ requests a tag on the ith message $m_i \in \{0,1\}^*$, compute $t_i \leftarrow \mathsf{Mac}_k(H^s(m_i))$ and give $t_i$ to A′.
• When A′ outputs $(m^*, t)$, then if there exists an i for which $H^s(m^*) = H^s(m_i)$, output $(m^*, m_i)$.
It is clear that C runs in polynomial time. Let us analyze its behavior. When the input to C is generated by running GenH($1^n$) to obtain s, the view of A′ when run as a subroutine by C is distributed identically to the view of A′ in experiment $\mathsf{Mac\text{-}forge}_{A',\Pi'}(n)$. In particular, the tags given to A′ by C have the same distribution as the tags that A′ receives in $\mathsf{Mac\text{-}forge}_{A',\Pi'}(n)$. Since C outputs a collision exactly when coll occurs, we have
$$\Pr[\mathsf{Hash\text{-}coll}_{C,\Pi_H}(n) = 1] = \Pr[\mathsf{coll}].$$
Because ΠH is collision resistant, we conclude that Pr[coll] is negligible.
We now proceed to prove that the second term in Equation (5.1) is negligible. Consider the following adversary A attacking Π in $\mathsf{Mac\text{-}forge}_{A,\Pi}(n)$:
Adversary A:
The adversary is given access to a MAC oracle $\mathsf{Mac}_k(\cdot)$.
• Compute GenH($1^n$) to obtain s.
• Run A′($1^n$). When A′ requests a tag on the ith message $m_i \in \{0,1\}^*$, then: (1) compute $\hat{m}_i := H^s(m_i)$; (2) obtain a tag $t_i$ on $\hat{m}_i$ from the MAC oracle; and (3) give $t_i$ to A′.
• When A′ outputs $(m^*, t)$, then output $(H^s(m^*), t)$.
Clearly A runs in polynomial time. Consider experiment $\mathsf{Mac\text{-}forge}_{A,\Pi}(n)$. In that experiment, the view of A′ when run as a subroutine by A is distributed identically to its view in experiment $\mathsf{Mac\text{-}forge}_{A',\Pi'}(n)$. Furthermore, whenever $\mathsf{Mac\text{-}forge}_{A',\Pi'}(n) = 1$ and coll does not occur, A outputs a valid forgery. (In that case t is a valid tag on $H^s(m^*)$ in scheme Π with respect to k. The fact that coll did not occur means that $H^s(m^*)$ was never asked by A to its own MAC oracle and so this is indeed a forgery.) Therefore,
$$\Pr[\mathsf{Mac\text{-}forge}_{A,\Pi}(n) = 1] = \Pr[\mathsf{Mac\text{-}forge}_{A',\Pi'}(n) = 1 \wedge \overline{\mathsf{coll}}],$$
and security of Π implies that the former probability is negligible. This concludes the proof of the theorem.
5.3.2 HMAC
All the constructions of message authentication codes we have seen so far are ultimately based on some block cipher. Is it possible to construct a secure MAC (for arbitrary-length messages) based directly on a hash function? A first thought might be to define $\mathsf{Mac}_k(m) = H(k \| m)$; we might expect that if H is a “good” hash function then it should be difficult for an attacker to predict the value of $H(k \| m')$ given the value of $H(k \| m)$, for any $m' \neq m$, assuming k is chosen at random (and unknown to the attacker). Unfortunately, if H is constructed using the Merkle–Damgård transform, as most real-world hash functions are, then a MAC designed in this way is completely insecure, as you are asked to show in Exercise 5.10.
Instead, we can try using two layers of hashing. See Construction 5.7 for a standardized scheme called HMAC based on this idea.
CONSTRUCTION 5.7
Let (GenH, H) be a hash function constructed by applying the Merkle–Damgård transform to a compression function (GenH, h) taking inputs of length n + n′. (See text.) Let opad and ipad be fixed constants of length n′. Define a MAC as follows:
• Gen: on input $1^n$, run GenH($1^n$) to obtain a key s. Also choose uniform $k \in \{0,1\}^{n'}$. Output the key ⟨s, k⟩.
• Mac: on input a key ⟨s, k⟩ and a message $m \in \{0,1\}^*$, output
$$t := H^s\Big( (k \oplus \mathsf{opad}) \,\big\|\, H^s\big( (k \oplus \mathsf{ipad}) \,\|\, m \big) \Big).$$
• Vrfy: on input a key ⟨s, k⟩, a message $m \in \{0,1\}^*$, and a tag t, output 1 if and only if $t = H^s\big( (k \oplus \mathsf{opad}) \,\|\, H^s( (k \oplus \mathsf{ipad}) \,\|\, m ) \big)$.
HMAC.
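The two layers of hashing can be written out directly and checked against a standard implementation. The sketch below instantiates the construction with SHA-256, which is an assumption of this example: SHA-256 is unkeyed (there is no s), its block length n′ is 512 bits (64 bytes), and the constants ipad and opad are the bytes 0x36 and 0x5C repeated, as specified for standardized HMAC. The key is taken to be exactly n′ bits, matching the construction; the function name `hmac_sha256` is chosen here for illustration.

```python
import hashlib
import hmac

BLOCK = 64                      # n' in bytes for SHA-256
IPAD = bytes([0x36]) * BLOCK    # ipad: fixed constant of length n'
OPAD = bytes([0x5C]) * BLOCK    # opad: fixed constant of length n'


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def hmac_sha256(k: bytes, m: bytes) -> bytes:
    """t := H((k xor opad) || H((k xor ipad) || m)) with H = SHA-256."""
    assert len(k) == BLOCK, "this sketch assumes a key of exactly n' bits"
    inner = hashlib.sha256(xor(k, IPAD) + m).digest()
    return hashlib.sha256(xor(k, OPAD) + inner).digest()


def vrfy(k: bytes, m: bytes, t: bytes) -> bool:
    """Canonical verification with a constant-time comparison."""
    return hmac.compare_digest(hmac_sha256(k, m), t)
```

For a 64-byte key the result agrees with `hmac.new(k, m, hashlib.sha256).digest()` from the Python standard library, which in addition handles keys of other lengths. In practice one would call the library routine rather than this sketch.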
FIGURE 5.2: HMAC, pictorially.
Why should we have any confidence that HMAC is secure? One reason is that we can view HMAC as a specific instantiation of the hash-and-MAC paradigm from the previous section. To see this, we will look “under the hood” at what happens when a message is authenticated; see Figure 5.2. We must also specify parameters more carefully and go into a bit more detail regarding the way the Merkle–Damgård transform is implemented in practice.
Say (GenH, H) is constructed based on a compression function (GenH, h) in which h maps inputs of length n + n′ to outputs of length n (where, formally, n′ is a function of n). When we described the Merkle–Damgård transform in Section 5.2, we assumed n′ = n, but that need not always be the case. We also said that the length of the message being hashed was encoded as an extra message block that is appended to the message. In practice, the length is instead encoded in a portion of a block using l < n′ bits. That is, computation of $H^s(x)$ begins by padding x with zeros to a string of length exactly l less than a multiple of n′; it then appends the length L = |x|, encoded using exactly l bits. The hash of the resulting sequence of n′-bit blocks $x_1, \ldots$ is then computed as in Construction 5.3. We will assume that n + l ≤ n′. This means, in particular, that if we hash an input x of length n′ + n then the padded result (including the length) will be exactly 2n′ bits long. The proof of Theorem 5.4, showing that (GenH, H) is collision resistant if (GenH, h) is collision resistant, remains unchanged.
Coming back to HMAC, and looking at Figure 5.2, we can see that the general form of HMAC involves hashing an arbitrary-length message down to a short string $y \stackrel{\text{def}}{=} H^s((k \oplus \mathsf{ipad}) \| m)$, and then computing the (secretly keyed) function $H^s((k \oplus \mathsf{opad}) \| y)$ of the result. But we can say more than this. Note first that the “inner” computation
$$\tilde{H}^s(m) \stackrel{\text{def}}{=} H^s((k \oplus \mathsf{ipad}) \| m)$$
is collision resistant (assuming h is), for any value of $k \oplus \mathsf{ipad}$. Moreover, the first step in the “outer” computation $H^s((k \oplus \mathsf{opad}) \| y)$ is to compute a value $k_{out} \stackrel{\text{def}}{=} h^s(IV \| (k \oplus \mathsf{opad}))$. Then, we evaluate $h^s(k_{out} \| \hat{y})$ where $\hat{y}$ refers to the padded value of y (i.e., including the length of $(k \oplus \mathsf{opad}) \| y$, which is always n′ + n bits, encoded using exactly l bits). Thus, if we treat $k_{out}$ as uniform (we will be more formal about this below) and assume that
$$\widetilde{\mathsf{Mac}}_k(y) \stackrel{\text{def}}{=} h^s(k \| \hat{y}) \qquad (5.2)$$
is a secure fixed-length MAC, then HMAC can be viewed as an instantiation of the hash-and-MAC approach with
$$\mathsf{HMAC}_{s,k}(m) = \widetilde{\mathsf{Mac}}_{k_{out}}(\tilde{H}^s(m)) \qquad (5.3)$$
(where $k_{out} = h^s(IV \| (k \oplus \mathsf{opad}))$). Because of the way the compression function h is typically designed (see Section 6.3.1), the assumption that $\widetilde{\mathsf{Mac}}$ is a secure fixed-length MAC is a reasonable one.
The roles of ipad and opad. Given the above, one might wonder why it is necessary to incorporate $k$ in the “inner” computation $H^s((k \oplus \mathsf{ipad}) \,\|\, m)$. (In particular, for the hash-and-MAC approach to be secure we require collision resistance in the first step, which does not require any secret key.) The reason is that this allows security of HMAC to be based on the potentially weaker assumption that $(\mathsf{Gen}_H, H)$ is weakly collision resistant, where weak collision resistance is defined by the following experiment: a key $s$ is generated using $\mathsf{Gen}_H$ and a uniform secret $k_{in} \in \{0,1\}^n$ is chosen. Then the adversary is allowed to interact with a “hash oracle” that returns $H^s_{k_{in}}(m)$ in response to the query $m$, where $H^s_{k_{in}}$ refers to computation of $H^s$ using the Merkle–Damgård transform applied to $h^s$, but using the secret value $k_{in}$ as the $IV$. (Refer again to Figure 5.2.) The adversary succeeds if it can output distinct values $m, m'$ such that $H^s_{k_{in}}(m) = H^s_{k_{in}}(m')$, and we say that $(\mathsf{Gen}_H, H)$ is weakly collision resistant if every ppt $A$ succeeds in this experiment with only negligible probability. If $(\mathsf{Gen}_H, H)$ is collision resistant then it is clearly weakly collision resistant; the latter, however, is a weaker condition that is potentially easier to satisfy. This is a good example of sound security engineering. This defensive design strategy paid off when it was discovered that the hash function MD5 (see Section 6.3.2) was not collision resistant. The collision-finding attacks on MD5 did not violate weak collision resistance, and HMAC-MD5 was not broken even though MD5 was. This gave developers time to replace MD5 in HMAC implementations, without immediate fear of attack. (Despite this, HMAC-MD5 should no longer be used now that weaknesses in MD5 are known.)
The above discussion suggests that independent keys should be used in the outer and inner computations. For reasons of efficiency, a single key $k$ is used for HMAC, but the key is used in combination with ipad and opad to derive two other keys. Define
$$G^s(k) \stackrel{\text{def}}{=} h^s(IV \,\|\, (k \oplus \mathsf{opad})) \,\|\, h^s(IV \,\|\, (k \oplus \mathsf{ipad})) = k_{out} \,\|\, k_{in}. \tag{5.4}$$
If we assume that $G^s$ is a pseudorandom generator for any $s$, then $k_{out}$ and $k_{in}$ can be treated as independent and uniform keys when $k$ is uniform. Security of HMAC then reduces to the security of the following construction:
$$\mathsf{Mac}_{s,k_{in},k_{out}}(m) = h^s\big(k_{out} \,\|\, H^s_{k_{in}}(m)\big).$$
(Compare to Equation (5.3).) As noted earlier, this construction can be proven secure (using a variant of the proof for the hash-and-MAC approach) if H is weakly collision resistant and the MAC defined in Equation (5.2) is a secure fixed-length MAC.
THEOREM 5.8 Assume $G^s$ as defined in Equation (5.4) is a pseudorandom generator for any $s$, the MAC defined in Equation (5.2) is a secure fixed-length MAC for messages of length $n$, and $(\mathsf{Gen}_H, H)$ is weakly collision resistant. Then HMAC is a secure MAC (for arbitrary-length messages).
HMAC in practice. HMAC is an industry standard and is widely used in practice. It is highly efficient and easy to implement, and is supported by a proof of security based on assumptions that are believed to hold for practical hash functions. The importance of HMAC is partially due to the timeliness of its appearance. Before the introduction of HMAC, many practitioners refused to use CBC-MAC (with the claim that it was “too slow”) and instead used heuristic constructions that were insecure. HMAC provided a standardized, secure way of doing message authentication based on hash functions.
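To make the two-layer structure concrete, here is a minimal Python sketch of HMAC instantiated with SHA-256. The function name `hmac_sha256` is an assumption for illustration. The byte constants 0x36 and 0x5C and the zero-padding of the key to a full block follow the standardized HMAC specification rather than anything stated in this section. The result is compared against Python's standard `hmac` module; real code should call that module directly.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    """Compute H((k XOR opad) || H((k XOR ipad) || m)) with H = SHA-256."""
    block = hashlib.sha256().block_size        # 64 bytes for SHA-256
    if len(key) > block:
        key = hashlib.sha256(key).digest()     # overlong keys are hashed first
    key = key.ljust(block, b"\x00")            # pad the key to one full block
    k_ipad = bytes(b ^ 0x36 for b in key)      # k XOR ipad
    k_opad = bytes(b ^ 0x5C for b in key)      # k XOR opad
    inner = hashlib.sha256(k_ipad + msg).digest()    # inner hash of the message
    return hashlib.sha256(k_opad + inner).digest()   # outer hash of the short result

key, msg = b"an example key", b"an example message"
tag = hmac_sha256(key, msg)
# The sketch agrees with the standard-library implementation.
assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())
```

Because the padded key occupies exactly one block, the first compression-function call of each layer depends only on the key, which is what allows the derived values $k_{in}$ and $k_{out}$ to be viewed as keys.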
5.4 Generic Attacks on Hash Functions
What is the best security we can hope for a hash function H to provide? We explore this question by showing two attacks that are generic in the sense that they apply to arbitrary hash functions. The existence of these attacks implies lower bounds on the output length of H needed to achieve some desired level of security, and therefore has important practical ramifications.
5.4.1 Birthday Attacks for Finding Collisions
Let $H : \{0,1\}^* \to \{0,1\}^\ell$ be a hash function. (Here and in the rest of the chapter, we drop explicit mention of the hash key $s$ since it is not directly relevant. One can also view $s$ as being generated and fixed before these algorithms are applied.) There is a trivial collision-finding attack running in time $O(2^\ell)$: simply evaluate $H$ on $2^\ell + 1$ distinct inputs; by the pigeonhole principle, two of the outputs must be equal. Can we do better?
Generalizing the above algorithm, say we choose $q$ distinct inputs $x_1, \ldots, x_q$, compute $y_i := H(x_i)$, and check whether any two of the $y_i$ values are equal. What is the probability that this algorithm finds a collision? As we have just said, if $q > 2^\ell$ then a collision occurs with probability 1. What is the probability of a collision when $q$ is smaller? It is somewhat difficult to analyze this probability exactly, and so we will instead analyze an idealized case in which $H$ is treated as a random function. That is, for each $i$ we assume that the value $y_i = H(x_i)$ is uniformly distributed in $\{0,1\}^\ell$ and independent of any of the previous output values $\{y_j\}_{j < i}$.
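The generic collision search just described can be sketched in Python as follows. The names `truncated_hash` and `find_collision` are illustrative assumptions, and SHA-256 truncated to ℓ = 24 bits is used purely as a toy stand-in for $H$ so that the search finishes quickly.

```python
import hashlib
from itertools import count

def truncated_hash(x: bytes, ell: int) -> int:
    """First ell bits of SHA-256(x), as an integer (toy stand-in for H)."""
    digest = int.from_bytes(hashlib.sha256(x).digest(), "big")
    return digest >> (256 - ell)

def find_collision(ell: int):
    """Hash distinct inputs until two outputs coincide.

    Returns (x, x_prime, q) where q is the number of evaluations used."""
    seen = {}                                  # output value -> input that produced it
    for q in count(1):
        x = q.to_bytes(8, "big")               # distinct inputs 1, 2, 3, ...
        y = truncated_hash(x, ell)
        if y in seen:
            return seen[y], x, q
        seen[y] = x

x1, x2, q = find_collision(ell=24)
print(f"collision after {q} evaluations")
```

By the pigeonhole principle the loop must stop within $2^\ell + 1$ evaluations. If the truncated hash behaves like a random function, one would expect it to stop far earlier, after roughly $2^{\ell/2}$ evaluations.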
[…] Continuing in this way, the probability that $H^{(t-1)}(SP_i)$ is new is at least
$$\left(1 - \frac{i \cdot t}{N}\right)^{t} = \left[\left(1 - \frac{i \cdot t}{N}\right)^{\frac{N}{i \cdot t}}\right]^{\frac{i \cdot t^2}{N}} \approx e^{-it^2/N}.$$
The thing to notice here is that when $it^2 \le N/2$, this probability is at least 1/2; on the other hand, once $it^2 > N$ the probability is rather small. Considering the last row, when $i = s$, this means that we will not gain much additional coverage if $st^2 > N$. A good setting of the parameters is thus $st^2 = N/2$. Assuming this, the expected number of distinct points in the table is
$$\sum_{i=1}^{s} \sum_{j=0}^{t-1} \Pr\big[H^{(j)}(SP_i) \text{ is new}\big] \ge \sum_{i=1}^{s} \sum_{j=0}^{t-1} \frac{1}{2} = \frac{st}{2}.$$
The probability that $x$ is “covered” is then at least $\frac{st}{2N} = \frac{1}{4t}$.
This gives a weak time/space tradeoff, in which we can use more space (and consequently less time) at the expense of decreasing the probability of inverting $y$. But we can do better by generating $T = 4t$ “independent” tables. (This increases both the space and time by a factor of $T$.) As long as we can treat the probabilities of $x$ being in each of the associated tables as independent, the probability that at least one of these tables contains $x$ is
$$1 - \Pr[\text{no table contains } x] = 1 - \left(1 - \frac{1}{4t}\right)^{4t} \approx 1 - e^{-1} \approx 0.63.$$
The only remaining question is how to generate an independent table. (Note that generating a table exactly as before is the same as adding $s$ additional rows to our original table, which we have already seen does not help.) We can do this for the $i$th such table by applying some function $F_i$ after every evaluation of $H$, where $F_1, \ldots, F_T$ are all distinct. (A good choice might be to set $F_i(x) = x \oplus c_i$ for some fixed constant $c_i$ that is different in each table.) Let $H_i \stackrel{\text{def}}{=} F_i \circ H$, i.e., $H_i(x) = F_i(H(x))$. Then for the $i$th table we again choose $s$ random starting points, but for each such point we now compute $H_i(SP), H_i^{(2)}(SP)$, and so on. Upon receiving a value $y = H(x)$ to invert, the attacker first computes $y' = F_i(y)$ and then checks if any of $y', H_i(y'), \ldots, H_i^{(t-1)}(y')$ corresponds to an endpoint in the $i$th table; this is repeated for $i = 1, \ldots, T$. (We omit further details.) While it is difficult to argue independence formally, this approach leads to good results in practice.
Choosing parameters. Summarizing the above discussion, we see that as long as $st^2 = N/2$ we have an algorithm that stores $O(s \cdot T) = O(s \cdot t) = O(N/t)$ points during a preprocessing phase, and can then invert $y$ with constant probability in time $O(t \cdot T) = O(t^2)$. One setting of the parameters is $t = N^{1/3} = 2^{\ell/3}$, in which case we have an algorithm storing $O(2^{2\ell/3})$ points that finds preimages using $O(2^{2\ell/3})$ hash computations. If a hash function with 80 bits of output is used, then this is feasible in practice.
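The following toy Python sketch puts the pieces together for a 12-bit function, using the parameter choice $t = N^{1/3}$, $st^2 = N/2$, $T = 4t$. Everything here is an illustrative assumption rather than a prescribed implementation: the names `build_table` and `invert`, the use of truncated SHA-256 as a stand-in for a random function on ℓ-bit strings, and the choice of random distinct constants $c_i$ for $F_i(x) = x \oplus c_i$.

```python
import hashlib
import random

ELL = 12                 # toy output length in bits (assumption)
N = 1 << ELL             # N = 2**ELL

def H(x: int) -> int:
    """Toy function on ELL-bit integers: SHA-256 truncated to ELL bits."""
    d = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(d[:2], "big") >> (16 - ELL)

def build_table(c: int, s: int, t: int, rng: random.Random) -> dict:
    """Table for H_c(x) = H(x) XOR c: maps each endpoint to its starting points."""
    table = {}
    for _ in range(s):
        sp = rng.randrange(N)
        z = sp
        for _ in range(t):               # after the loop, z = H_c^(t)(sp)
            z = H(z) ^ c
        table.setdefault(z, []).append(sp)
    return table

def invert(y: int, tables, t: int):
    """Return some x with H(x) == y if one of the tables covers it, else None."""
    for c, table in tables:
        z = y ^ c                        # y' = F_c(y)
        for m in range(t):               # here z = H_c^(m)(y')
            for sp in table.get(z, ()):
                x = sp
                for _ in range(t - 1 - m):   # walk forward from the starting point
                    x = H(x) ^ c
                if H(x) == y:            # discard false alarms
                    return x
            z = H(z) ^ c
    return None

t = 16                                   # t = N**(1/3)
s = N // (2 * t * t)                     # so that s * t**2 = N / 2
T = 4 * t                                # number of tables
rng = random.Random(2024)
tables = [(c, build_table(c, s, t, rng)) for c in rng.sample(range(N), T)]

y = H(rng.randrange(N))
print(invert(y, tables, t))              # a preimage of y, or None if y is not covered
```

The explicit check `H(x) == y` is needed because a chain started at $y'$ can merge into a stored chain without $x$ lying on it. The tables hold $s \cdot T = 512$ (start, end) pairs here, and each inversion attempt costs on the order of $t \cdot T = 1024$ evaluations of `H` plus the work of re-walking matching rows.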
Handling different domain and range. In practice, it is common to be faced with a situation in which the domain and range of $H$ are different. One example is in the context of password cracking (see Section 5.6.3), where an attacker has $H(pw)$ but $|pw| \ll \ell$. In the general case, say $x$ is chosen from some domain $D$ which may be larger or smaller than $\{0,1\}^\ell$. While it is, of course, possible to artificially expand the domain/range to make them match, this will not be useful for the attack described above. To see why, consider the password example. For the attack to succeed we want $pw$ to be in some table of values generated during preprocessing. If we generate each row of the table by simply computing $H(SP), H^{(2)}(SP), \ldots$, for $SP \in D$, then none of these values (except possibly $SP$ itself) will be equal to $pw$.
We can address this by applying a function $F_i$, as before, between each evaluation of $H$, though now we choose $F_i$ mapping $\{0,1\}^\ell$ to $D$. This solves the above issue, since $F_i(H(SP)), (F_i \circ H)^{(2)}(SP), \ldots$ now all lie in $D$.
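As an illustration of such a map, the sketch below takes $D$ to be the set of 4-letter lowercase passwords and sends a SHA-256 digest back into $D$. The domain, the names `F` and `chain`, and the way the table index $i$ is mixed in are all assumptions chosen for this example.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase
PW_LEN = 4                                   # assumed toy domain D: 4 lowercase letters
DOMAIN_SIZE = len(ALPHABET) ** PW_LEN

def H(pw: str) -> bytes:
    """The function to invert; its range (256-bit strings) is far larger than D."""
    return hashlib.sha256(pw.encode()).digest()

def F(i: int, digest: bytes) -> str:
    """Map a digest to an element of D; adding the table index i makes each F_i distinct."""
    v = (int.from_bytes(digest, "big") + i) % DOMAIN_SIZE
    chars = []
    for _ in range(PW_LEN):
        v, r = divmod(v, len(ALPHABET))      # peel off one base-26 digit per letter
        chars.append(ALPHABET[r])
    return "".join(chars)

def chain(i: int, sp: str, t: int) -> list:
    """SP, (F_i o H)(SP), (F_i o H)^(2)(SP), ...; every entry lies in D."""
    points = [sp]
    for _ in range(t):
        points.append(F(i, H(points[-1])))
    return points

print(chain(1, "abcd", 5))
```

Every point on the chain is now a candidate password, so a stored table can actually cover the unknown $pw$.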
Applications to key-recovery attacks. Time/space tradeoffs give attacks on cryptographic primitives other than hash functions. One canonical application (in fact, the application originally considered by Hellman) is an attack on an arbitrary block cipher $F$ that leads to recovery of the key. Define $H(k) \stackrel{\text{def}}{=} F_k(x)$ where $x$ is some arbitrary, but fixed, input that will be used for building the table. If an attacker can obtain $F_k(x)$ for an unknown key $k$, either via a chosen-plaintext attack or by choosing $x$ such that $F_k(x)$ is likely to be obtained in a known-plaintext attack, then by inverting $H$ the attacker learns (a candidate value for) $k$. Note that it is possible for the key length of $F$ to differ from its block length, but in this case we can use the technique just described for handling $H$ with different domain and range.
5.5 The Random-Oracle Model
There are several examples of constructions based on cryptographic hash functions that cannot be proven secure based only on the assumption that the hash function is collision or preimage resistant. (We will see some in the following section.) In many cases, there appears to be no simple and reasonable assumption regarding the hash function that would be sufficient for proving the construction secure.
Faced with this situation, there are several options. One is to look for schemes that can be proven secure based on some reasonable assumption about the underlying hash function. This is a good approach, but it leaves open the question of what to do until such schemes are found. Also, provably secure constructions may be significantly less efficient than other approaches that have not been proven secure. (This is a prominent issue that we will encounter in the setting of public-key cryptography.)
Another possibility, of course, is to use an existing cryptosystem even if it has no justification for its security other than, perhaps, the fact that the designers tried to attack it and were unsuccessful. This flies in the face of everything we have said about the importance of the rigorous, modern approach to cryptography, and it should be clear that this is unacceptable.
An approach that has been hugely successful in practice, and which offers a “middle ground” between a fully rigorous proof of security on the one hand and no proof whatsoever on the other, is to introduce an idealized model in which to prove the security of cryptographic schemes. Although the idealization may not be an accurate reflection of reality, we can at least derive some measure of confidence in the soundness of a scheme’s design from a proof within the idealized model. As long as the model is reasonable, such proofs are certainly better than no proofs at all.
The most popular example of this approach is the random-oracle model, which treats a cryptographic hash function H as a truly random function. (We have already seen an example of this in our discussion of time/space tradeoffs, although there we were analyzing an attack rather than a construction.) More specifically, the random-oracle model posits the existence of a public, random function H that can be evaluated only by “querying” an oracle—which can be thought of as a “black box”—that returns H(x) when given input x. (We will discuss how this is to be interpreted in the following section.) To differentiate things, the model we have been using until now (where no random oracle is present) is often called the “standard model.”
No one claims that a random oracle exists, although there have been suggestions that a random oracle could be implemented in practice using a trusted party (i.e., some server on the Internet). Rather, the random-oracle model provides a formal methodology that can be used to design and validate cryptographic schemes using the following two-step approach:
1. First, a scheme is designed and proven secure in the random-oracle model. That is, we assume the world contains a random oracle, and construct and analyze a cryptographic scheme within this model. Standard cryptographic assumptions of the type we have seen until now may be utilized in the proof of security as well.
2. When we want to implement the scheme in the real world, a random oracle is not available. Instead, the random oracle is instantiated with an appropriately designed cryptographic hash function $\hat{H}$. (We return to this point at the end of this section.) That is, at each point where the scheme dictates that a party should query the oracle for the value $H(x)$, the party instead computes $\hat{H}(x)$ on its own.
The hope is that the cryptographic hash function used in the second step is “sufficiently good” at emulating a random oracle, so that the security proof given in the first step will carry over to the real-world instantiation of the scheme. The difficulty here is that there is no theoretical justification for this hope, and in fact there are (contrived) schemes that can be proven secure in the random-oracle model but are insecure no matter how the random oracle is instantiated in the second step. Furthermore, it is not clear (mathematically or heuristically) what it means for a hash function to be “sufficiently good” at emulating a random oracle, nor is it clear that this is an achievable goal. In particular, no concrete instantiation $\hat{H}$ can ever behave like a random function, since $\hat{H}$ is deterministic and fixed. For these reasons, a proof of security in the random-oracle model should be viewed as providing evidence that a scheme has no “inherent design flaws,” but is not a rigorous proof that any real-world instantiation of the scheme is secure. Further discussion on how to interpret proofs in the random-oracle model is given in Section 5.5.2.
5.5.1 The Random-Oracle Model in Detail
Before continuing, let us pin down exactly what the random-oracle model entails. A good way to think about the random-oracle model is as follows: The “oracle” is simply a box that takes a binary string as input and returns a binary string as output. The internal workings of the box are unknown and inscrutable. Everyone—honest parties as well as the adversary—can interact with the box, where such interaction consists of feeding in a binary string x as input and receiving a binary string y as output; we refer to this as querying the oracle on x, and call x itself a query made to the oracle. Queries to the oracle are assumed to be private so that if some party queries the oracle on input x then no one else learns x, or even learns that this party queried the oracle at all. This makes sense, because calls to the oracle correspond (in the real-world instantiation) to local evaluations of a cryptographic hash function.
An important property of this “box” is that it is consistent. That is, if the box ever outputs $y$ for a particular input $x$, then it always outputs the same answer $y$ when given the same input $x$ again. This means that we can view the box as implementing a well-defined function $H$; i.e., we define the function $H$ in terms of the input/output characteristics of the box. For convenience, we thus speak of “querying $H$” rather than querying the box. No one “knows” the entire function $H$ (except the box itself); at best, all that is known are the values of $H$ on the strings that have been explicitly queried thus far.
We have already discussed in Chapter 3 what it means to choose a random function $H$. We only reiterate here that there are two equivalent ways to think about the uniform selection of $H$: either picture $H$ being chosen “in one shot” uniformly from the set of all functions on some specified domain and range, or imagine generating outputs for $H$ “on-the-fly,” as needed. Specifically, in the second case we can view the function as being defined by a table that is initially empty. When the oracle receives a query $x$ it first checks whether $x = x_i$ for some pair $(x_i, y_i)$ in the table; if so, the corresponding value $y_i$ is returned. Otherwise, a uniform string $y \in \{0,1\}^\ell$ is chosen (for some specified $\ell$), the answer $y$ is returned, and the oracle stores $(x, y)$ in its table. This second viewpoint is often conceptually easier to reason about, and is also technically easier to deal with if $H$ is defined over an infinite domain (e.g., $\{0,1\}^*$).
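The “on-the-fly” view translates almost directly into code. The sketch below is one way to write it in Python; the class name `LazyRandomOracle` and the choice to measure the output length in bytes (so $\ell$ = 8 · `out_bytes`) are assumptions for illustration.

```python
import secrets

class LazyRandomOracle:
    """A random function on byte strings, sampled lazily as queries arrive."""

    def __init__(self, out_bytes: int = 32):
        self.out_bytes = out_bytes
        self.table = {}                      # initially empty table of (x, y) pairs

    def query(self, x: bytes) -> bytes:
        if x not in self.table:              # first time x is queried:
            self.table[x] = secrets.token_bytes(self.out_bytes)  # choose y uniformly
        return self.table[x]                 # repeated queries get the same answer

H = LazyRandomOracle()
y = H.query(b"some input")
assert H.query(b"some input") == y           # consistency
```

Since the table only grows with the queries actually made, the same code works for inputs of any length, which reflects the remark above about infinite domains.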
When we defined pseudorandom functions in Section 3.5.1, we also considered algorithms having oracle access to a random function. Lest there be any confusion, we note that the usage of a random function there is very different from the usage of a random function here. There, a random function was used as a way of defining what it means for a (concrete) keyed function to be pseudorandom. In the random-oracle model, in contrast, the random function is used as part of a construction itself and must somehow be instantiated in the real world if we want a concrete realization of the construction. A pseudorandom function is not a random oracle because it is only pseudorandom if the key is secret. However, in the random-oracle model all parties need to be able to compute the function; thus there can be no secret key.
Definitions and Proofs in the Random-Oracle Model
Definitions in the random-oracle model are slightly different from their counterparts in the standard model because the probability spaces considered in each case are not the same. In the standard model a scheme Π is secure if for all ppt adversaries A the probability of some event is below some threshold, where this probability is taken over the random choices of the parties running Π and those of the adversary A. Assuming the honest parties who use Π in the real world make random choices as directed by the scheme, satisfying a definition of this sort guarantees security for real-world usage of Π.
In the random-oracle model, in contrast, a scheme Π may rely on an oracle H. As before, Π is secure if for all ppt adversaries A the probability of some event is below some threshold, but now this probability is taken over random choice of H as well as the random choices of the parties running Π and those of the adversary A. When using Π in the real world, some (instantiation of) H must be fixed. Unfortunately, security of Π is not guaranteed for any particular choice of H. This indicates one reason why it is difficult to argue that any concrete instantiation of the oracle H by a deterministic function yields a secure scheme. (An additional, technical, difficulty is that once a concrete function H is fixed, the adversary A is no longer restricted to querying H as an oracle but can instead look at and use the code of H in the course of its attack.)
Proofs in the random-oracle model can exploit the fact that H is chosen at random, and that the only way to evaluate H(x) is to explicitly query x to H. Three properties in particular are especially useful; we sketch them informally here, and show some simple applications of them below and in the next section, but caution that a full understanding will likely have to wait until we present formal proofs in the random-oracle model in later chapters.
The first useful property of the random-oracle model is:
If x has not been queried to H, then the value of H(x) is uniform.
This may seem superficially similar to the guarantee provided by a pseudo- random generator, but is actually much stronger. If G is a pseudorandom generator then G(x) is pseudorandom to an observer assuming x is chosen uniformly at random and is completely unknown to the observer. If H is a random oracle, however, then H(x) is truly uniform to an observer as long as the observer has not queried x. This is true even if x is known, or if x is not uniform but is hard to guess. (For example, if x is an n-bit string where the first half of x is known and the last half is random then G(x) might be easy to distinguish from random but H(x) will not be.)
The remaining two properties relate explicitly to proofs by reduction in the random-oracle model. (It may be helpful here to review Section 3.3.2.) As part of the reduction, the random oracle that the adversary A interacts with must be simulated. That is: A will submit queries to, and receive answers from, what it believes to be the oracle, but the reduction itself must now answer these queries. This turns out to give a lot of power. For starters:
If A queries x to H, the reduction can see this query and learn x.
This is sometimes called “extractability.” (This does not contradict the fact, mentioned earlier, that queries to the random oracle are “private.” While that is true in the random-oracle model itself, here we are using A as a subroutine within a reduction that is simulating the random oracle for A.) Finally:
The reduction can set the value of H(x) (i.e., the response to query x) to a value of its choice, as long as this value is correctly distributed, i.e., uniform.
This is called “programmability.” There is no counterpart to extractability or programmability once H is instantiated with any concrete function.
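The two reduction-side properties can be pictured with a small simulated oracle. This is only a sketch of the idea, not a formal reduction: the class `SimulatedOracle`, its `program` method, and the toy adversary are hypothetical names introduced here.

```python
import secrets

class SimulatedOracle:
    """A random oracle as simulated by a reduction for an adversary A."""

    def __init__(self, out_bytes: int = 32):
        self.out_bytes = out_bytes
        self.table = {}
        self.queries = []                    # extractability: every query of A is recorded

    def program(self, x: bytes, y: bytes) -> None:
        """Programmability: fix H(x) := y. The caller must choose y uniformly."""
        if x in self.table:
            raise ValueError("H(x) is already defined and cannot be changed")
        if len(y) != self.out_bytes:
            raise ValueError("programmed value has the wrong length")
        self.table[x] = y

    def query(self, x: bytes) -> bytes:
        self.queries.append(x)               # the reduction sees x
        if x not in self.table:
            self.table[x] = secrets.token_bytes(self.out_bytes)
        return self.table[x]

oracle = SimulatedOracle()
challenge = secrets.token_bytes(32)          # a uniform value the reduction wants to embed
oracle.program(b"target", challenge)

def adversary(H):
    """A toy adversary that happens to query the programmed point."""
    return H(b"target")

assert adversary(oracle.query) == challenge  # A receives the programmed value
assert oracle.queries == [b"target"]         # and the reduction learned A's query
```

From the adversary's point of view the answers are still uniform and consistent, so it cannot tell that it is talking to a simulation.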
Simple Illustrations of the Random-Oracle Model
At this point some examples may be helpful. The examples given here are relatively simple, and do not use the full power that the random-oracle model affords. Rather, these examples are presented merely to provide a gentle introduction to the model. In what follows, we assume a random oracle mapping $\ell_{in}$-bit inputs to $\ell_{out}$-bit outputs, where $\ell_{in}, \ell_{out} > n$ and $n$ is the security parameter (so $\ell_{in}, \ell_{out}$ are functions of $n$).
A random oracle as a pseudorandom generator. We first show that, for $\ell_{out} > \ell_{in}$, a random oracle can be used as a pseudorandom generator. (We do not say that a random oracle is a pseudorandom generator, since a random oracle is not a fixed function.) Formally, we claim that for any ppt adversary $A$, there is a negligible function $\mathsf{negl}$ such that
$$\left| \Pr\big[A^{H(\cdot)}(y) = 1\big] - \Pr\big[A^{H(\cdot)}(H(x)) = 1\big] \right| \le \mathsf{negl}(n),$$
where in the first case the probability is taken over uniform choice of $H$, uniform choice of $y \in \{0,1\}^{\ell_{out}(n)}$, and the randomness of $A$, and in the second case the probability is taken over uniform choice of $H$, uniform choice of $x \in \{0,1\}^{\ell_{in}(n)}$, and the randomness of $A$. We have explicitly indicated that $A$ has oracle access to $H$ in each case; once $H$ has been chosen then $A$ can freely make queries to it.
Let $S$ denote the set of points on which $A$ has queried $H$; of course, $|S|$ is polynomial in $n$. Observe that in the second case, the probability that $x \in S$ is negligible. This holds since $A$ starts with no information about $x$ (note that $H(x)$ by itself reveals nothing about $x$ because $H$ is a random function), and because $S$ is exponentially smaller than $\{0,1\}^{\ell_{in}}$. Moreover, conditioned on $x \notin S$ in the second case, $A$'s input in each case is a uniform string that is independent of the answers to $A$'s queries.
A random oracle as a collision-resistant hash function. If $\ell_{out} < \ell_{in}$, a random oracle is collision resistant. That is, the success probability of any ppt adversary $A$ in the following experiment is negligible:
1. A random function H is chosen.
2. A succeeds if it outputs distinct x,x′ with H(x) = H(x′).
To see this, assume without loss of generality that $A$ only outputs values $x, x'$ that it had previously queried to the oracle, and that $A$ never makes the same query to the oracle twice. Letting the oracle queries of $A$ be $x_1, \ldots, x_q$, with $q = \mathrm{poly}(n)$, it is clear that the probability that $A$ succeeds is upper-bounded by the probability that $H(x_i) = H(x_j)$ for some $i \ne j$. But this is exactly equal to the probability that if we pick $q$ strings $y_1, \ldots, y_q \in \{0,1\}^{\ell_{out}}$ independently and uniformly at random, we have $y_i = y_j$ for some $i \ne j$. This is exactly the birthday problem, and so using the results of Appendix A.4 we have that $A$ succeeds with negligible probability $O(q^2/2^{\ell_{out}})$.
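For readers who want to see where a bound of this form comes from, one standard route is a union bound over all pairs of queries. The derivation below is a sketch of that argument, and the numerical values in the closing example are assumed for illustration only.

```latex
\Pr[\exists\, i \ne j : y_i = y_j]
  \;\le\; \sum_{1 \le i < j \le q} \Pr[y_i = y_j]
  \;=\; \binom{q}{2} \cdot 2^{-\ell_{out}}
  \;=\; \frac{q(q-1)}{2^{\ell_{out}+1}}
  \;\le\; \frac{q^2}{2^{\ell_{out}}}.
```

For instance, with $q = 2^{20}$ queries and $\ell_{out} = 128$ the bound is at most $2^{40}/2^{128} = 2^{-88}$.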
Constructing a pseudorandom function from a random oracle. It is also rather easy to construct a pseudorandom function in the random-oracle model. Suppose $\ell_{in}(n) = 2n$ and $\ell_{out}(n) = n$, and define
$$F_k(x) \stackrel{\text{def}}{=} H(k \,\|\, x),$$
where $|k| = |x| = n$. In Exercise 5.11 you are asked to show that this is a pseudorandom function, namely, that for any polynomial-time $A$ the success probability of $A$ in the following experiment is not more than 1/2 plus a negligible function:
1. A function $H$ and values $k \in \{0,1\}^n$ and $b \in \{0,1\}$ are chosen uniformly.
2. If $b = 0$, the adversary $A$ is given access to an oracle for $F_k(\cdot) = H(k \,\|\, \cdot)$. If $b = 1$, then $A$ is given access to a random function mapping $n$-bit inputs to $n$-bit outputs. (This random function is independent of $H$.)
3. A outputs a bit b′, and succeeds if b′ = b.
In step 2, A can access H in addition to the function oracle provided to it by the experiment. (A pseudorandom function in the random-oracle model must be indistinguishable from a random function that is independent of H.)
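The experiment above can be sketched in Python as follows. All names (`LazyFunction`, `prf_experiment`, `guessing_adversary`) and the toy size $n$ = 128 bits are assumptions for illustration, and the sample adversary is just one naive strategy rather than an optimal attack.

```python
import secrets

N_BYTES = 16                                 # assumed toy size: n = 128 bits

class LazyFunction:
    """A uniform function with N_BYTES-byte outputs, sampled on demand."""
    def __init__(self):
        self.table = {}
    def __call__(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = secrets.token_bytes(N_BYTES)
        return self.table[x]

def prf_experiment(adversary) -> bool:
    """Run the experiment once; return True if the adversary outputs b."""
    H = LazyFunction()                       # the random oracle
    k = secrets.token_bytes(N_BYTES)         # uniform key
    b = secrets.randbelow(2)                 # uniform bit
    if b == 0:
        oracle = lambda x: H(k + x)          # F_k(x) = H(k || x)
    else:
        oracle = LazyFunction()              # random function independent of H
    return adversary(oracle, H) == b         # A may query both its oracle and H

def guessing_adversary(oracle, H) -> int:
    """Guess a key k' and test whether H(k' || x) matches the oracle at x."""
    x = bytes(N_BYTES)
    k_guess = secrets.token_bytes(N_BYTES)
    return 0 if H(k_guess + x) == oracle(x) else 1

trials = 1000
wins = sum(prf_experiment(guessing_adversary) for _ in range(trials))
print(wins / trials)                         # close to 1/2 for this adversary
```

Unless the adversary happens to query $H$ on an input beginning with the actual key $k$, the values $H(k \,\|\, x)$ it sees through the oracle are fresh uniform strings, so direct access to $H$ does not help it.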
An interesting aspect of all the above claims is that they make no computational assumptions; they hold even for computationally unbounded adversaries as long as those adversaries are limited to making polynomially many queries to the oracle. This has no counterpart in the real world, where we have seen that computational assumptions are necessary.
5.5.2 Is the Random-Oracle Methodology Sound?
Schemes designed in the random-oracle model are implemented in the real world by instantiating H with some concrete function. With the mechanics of the random-oracle model behind us, we turn to a more fundamental question:
What do proofs of security in the random-oracle model guarantee as far as security of any real-world instantiation?
This question does not have a definitive answer: there is currently debate within the cryptographic community regarding how to interpret proofs in the random-oracle model, and an active area of research is to determine what, precisely, a proof of security in the random-oracle model implies vis-a-vis the real world. We can only hope to give a flavor of both sides of the debate.
Objections to the random-oracle model. The starting point for arguments against using random oracles is simple: as we have already noted, there is no formal or rigorous justification for believing that a proof of security for some scheme Π in the random-oracle model says anything about the security of Π in the real world, once the random oracle $H$ has been instantiated with any particular hash function $\hat{H}$. This is more than just theoretical uneasiness. A little thought shows that no concrete hash function can ever act as a “true” random oracle. For example, in the random-oracle model the value $H(x)$ is “completely random” if $x$ was not explicitly queried. The counterpart would be to require that $\hat{H}(x)$ is random (or pseudorandom) if $\hat{H}$ was not explicitly evaluated on $x$. How are we to interpret this in the real world? It is not even clear what it means to “explicitly evaluate” $\hat{H}$: what if an adversary knows some shortcut for computing $\hat{H}$ that does not involve running the actual code for $\hat{H}$? Moreover, $\hat{H}(x)$ cannot possibly be random (or even pseudorandom) since once the adversary learns the description of $\hat{H}$, the value of that function on all inputs is immediately determined.
Limitations of the random-oracle model become clearer once we examine the proof techniques introduced earlier. Recall that one proof technique is to use the fact that a reduction can “see” the queries that an adversary $A$ makes to the random oracle. If we replace the random oracle by a particular hash function $\hat{H}$, this means we must provide a description of $\hat{H}$ to the adversary at the beginning of the experiment. But then $A$ can evaluate $\hat{H}$ on its own, without making any explicit queries, and so a reduction will no longer have the ability to “see” any queries made by $A$. (In fact, as noted previously, the notion of $A$ performing explicit evaluations of $\hat{H}$ may not be true and certainly cannot be formally defined.) Likewise, proofs of security in the random-oracle model allow the reduction to choose the outputs of $H$ as it wishes, something that is clearly not possible when a concrete function is used.
Even if we are willing to overlook the above theoretical concerns, a practical problem is that we do not currently have a very good understanding of what it means for a concrete hash function to be “sufficiently good” at instantiating a random oracle. For concreteness, say we want to instantiate the random oracle using some appropriate modification of SHA-1 (SHA-1 is a cryptographic hash function discussed in Section 6.3.3). While for some particular scheme Π it might be reasonable to assume that Π is secure when instantiated using SHA-1, it is much less reasonable to assume that SHA-1 can take the place of a random oracle in every scheme designed in the random-oracle model. Indeed, as we have said earlier, we know that SHA-1 is not a random oracle. And it is not hard to design a scheme that is secure in the random-oracle model, but is insecure when the random oracle is replaced by SHA-1.
We emphasize that an assumption of the form “SHA-1 acts like a random oracle” is qualitatively different from assumptions such as “SHA-1 is collision resistant” or “AES is a pseudorandom function.” The problem lies partly with the fact that there is no satisfactory definition of what the first statement means, while we do have such definitions for the latter two statements.
Because of this, using the random-oracle model to prove security of a scheme is qualitatively different from, e.g., introducing a new cryptographic assumption in order to prove a scheme secure in the standard model. Therefore, proofs of security in the random-oracle model are less satisfying than proofs of security in the standard model.
Support for the random-oracle model. Given all the problems with the random-oracle model, why use it at all? More to the point: why has the random-oracle model been so influential in the development of modern cryptography (especially current practical usage of cryptography), and why does it continue to be so widely used? As we will see, the random-oracle model enables the design of substantially more efficient schemes than those we know how to construct in the standard model. As such, there are few (if any) public-key cryptosystems used today having proofs of security in the standard model, while there are numerous deployed schemes having proofs of security in the random-oracle model. In addition, proofs in the random-oracle model are almost universally recognized as lending confidence to the security of schemes being considered for standardization.
The fundamental reason for this is the belief that:
A proof of security in the random-oracle model is significantly better than no proof at all.
Although some disagree, we offer the following in support of this assertion:
• A proof of security for a scheme in the random-oracle model indicates that the scheme’s design is “sound,” in the sense that the only possible attacks on a real-world instantiation of the scheme are those that arise due to a weakness in the hash function used to instantiate the random oracle. Thus, if a “good enough” hash function is used to instantiate the random oracle, we should have confidence in the security of the scheme. Moreover, if a given instantiation of the scheme is successfully attacked, we can simply replace the hash function being used with a “better” one.
• Importantly, there have been no successful real-world attacks on schemes proven secure in the random-oracle model, when the random oracle was instantiated properly. (We do not include here attacks on “contrived” schemes, but remark that great care must be taken in instantiating the random oracle, as indicated by Exercise 5.10.) This gives evidence to the usefulness of the random-oracle model in designing practical schemes.
Nevertheless, the above ultimately represent only intuitive speculation as to the usefulness of proofs in the random-oracle model and—all else being equal—proofs without random oracles are preferable.
Instantiating the Random Oracle
Properly instantiating a random oracle is subtle, and a full discussion is beyond the scope of this book. Here we only alert the reader that using an “off-the-shelf” cryptographic hash function without modification is not, generally speaking, a sound approach. For one thing, most cryptographic hash functions are constructed using the Merkle–Damgård paradigm (cf. Section 5.2), which can be distinguished easily from a random oracle when variable-length inputs are allowed. (See Exercise 5.10.) Also, in some constructions it is necessary for the output of the random oracle to lie in a certain range (e.g., the oracle should output elements of some group), which results in additional complications.
5.6 Additional Applications of Hash Functions
We conclude this chapter with a brief discussion of some additional applications of cryptographic hash functions in cryptography and computer security.
5.6.1 Fingerprinting and Deduplication
When using a collision-resistant hash function H, the hash (or digest) of a file serves as a unique identifier for that file. (If any other file is found to have the same identifier, this implies a collision in H). The hash H(x) of a file x is like a fingerprint, and one can check whether two files are equal by comparing their digests. This simple idea has many applications.
• Virus fingerprinting: Virus scanners identify viruses and block or quarantine them. One of the most basic steps toward this goal is to store a database containing the hashes of known viruses, and then to look up the hash of a downloaded application or email attachment in this database. Since only a short string needs to be recorded (and/or distributed) for each virus, the overhead involved is feasible.
• Deduplication: Data deduplication is used to eliminate duplicate copies of data, especially in the context of cloud storage where multiple users rely on a single cloud service to store their data. The observation here is that if multiple users wish to store the same file (e.g., a popular video), then the file only needs to be stored once and need not be uploaded separately by each user. Deduplication can be achieved by first having a user upload a hash of the new file they want to store; if a file with this hash is already stored in the cloud, then the cloud-storage provider can simply add a pointer to the existing file to indicate that this specific user has also stored this file. This saves both communication and storage, and the soundness of the methodology follows from the collision resistance of the hash function.
• Peer-to-peer (P2P) file sharing: In P2P file-sharing systems, tables are held by servers to provide a file-lookup service. These tables contain the hashes of the available files, once again providing a unique identifier without using much memory.
It may be surprising that a small digest can uniquely identify every file in the world. But this is the guarantee provided by collision-resistant hash functions, which makes them useful in the settings above.
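The deduplication idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 (from the standard `hashlib` module) stands in for the collision-resistant hash H, and the names `fingerprint` and `DedupStore` are hypothetical, not part of any real storage service.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (stand-in for H(x))."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Toy content-addressed store: one stored copy per distinct digest."""

    def __init__(self):
        self.blobs = {}   # digest -> file contents (stored once)
        self.owners = {}  # digest -> set of users who "stored" the file

    def has(self, digest: str) -> bool:
        """A client can ask this before uploading the full file."""
        return digest in self.blobs

    def put(self, user: str, data: bytes) -> str:
        digest = fingerprint(data)
        if digest not in self.blobs:
            self.blobs[digest] = data  # first upload: keep the contents
        # Later uploads only add a pointer from the user to the existing file.
        self.owners.setdefault(digest, set()).add(user)
        return digest


store = DedupStore()
d1 = store.put("alice", b"popular video bytes")
d2 = store.put("bob", b"popular video bytes")
```

Both users end up pointing at the same stored copy. The soundness of treating equal digests as equal files rests entirely on the collision resistance of the hash function.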
5.6.2 Merkle Trees
Consider a client who uploads a file x to a server. When the client later retrieves x, it wants to make sure that the server returns the original, unmodified file. The client could simply store x and check that the retrieved file is equal to x, but that defeats the purpose of using the server in the first place. We are looking for a solution in which the storage of the client is small.
A natural solution is to use the “fingerprinting” approach described above. The client can locally store the short digest h := H(x); when the server returns a candidate file x′ the client need only check that H(x′) =? h.
What happens if we want to extend this solution to multiple files x1, ..., xt? There are two obvious ways of doing this. One is to simply hash each file independently; the client will locally store the digests h1, ..., ht, and verify retrieved files as before. This has the disadvantage that the client’s storage grows linearly in t. Another possibility is to hash all the files together. That is, the client can compute h := H(x1, ..., xt) and store only h. The drawback now is that when the client wants to retrieve and verify correctness of the ith file xi, it needs to retrieve all the files in order to recompute the digest.
Merkle trees, introduced by Ralph Merkle, give a tradeoff between these extremes. A Merkle tree computed over input values x1, ..., xt is simply a binary tree of depth log t in which the inputs are placed at the leaves, and the value of each internal node is the hash of the values of its two children; see Figure 5.5. (We assume t is a power of 2; if not, then we can fix some input values to null or use an incomplete binary tree, depending on the application.)
FIGURE 5.5: A Merkle tree. (The leaves are x1, ..., x8; the level above holds H(x1, x2), H(x3, x4), H(x5, x6), H(x7, x8); the next level holds h1...4 and h5...8; the root is h1...8.)
Fixing some hash function H, we denote by MT_t the function that takes t input values x1, ..., xt, computes the resulting Merkle tree, and outputs the value of the root of the tree. (A keyed hash function yields a keyed function MT_t in the obvious way.) We have:
THEOREM 5.11 Let (Gen_H, H) be collision resistant. Then (Gen_H, MT_t) is also collision resistant for any fixed t.
Merkle trees thus provide an alternative to the Merkle–Damgård transform for achieving domain extension for collision-resistant hash functions. (As described, however, Merkle trees are not collision resistant if the number of input values t is allowed to vary.)
Merkle trees provide an efficient solution to our original problem, since they allow verification of any of the original t inputs using O(log t) communication. The client computes h := MT_t(x1, ..., xt), uploads x1, ..., xt to the server, and stores h (along with the number of files t) locally. When the client retrieves the ith file, the server sends xi along with a “proof” πi that this is the correct value. This proof consists of the values of the nodes in the Merkle tree adjacent to the path from xi to the root. From these values the client can recompute the value of the root and verify that it is equal to the stored value h. As an example, consider the Merkle tree in Figure 5.5. The client computes h1...8 := MT_8(x1, ..., x8), uploads x1, ..., x8 to the server, and stores h1...8 locally. When the client retrieves x3, the server sends x3 along with x4, h1...2 = H(x1, x2), and h5...8 = H(H(x5, x6), H(x7, x8)). (If files are large we may wish to avoid sending any file other than the one the client has requested. That can easily be done if we define the Merkle tree over the hashes of the files rather than the files themselves. We omit the details.) The client computes h′1...4 := H(h1...2, H(x3, x4)) and h′1...8 := H(h′1...4, h5...8), and then verifies that h1...8 =? h′1...8.
If H is collision resistant, it is infeasible for the server to send an incorrect file (and any proof) that will cause verification to succeed. Using this approach, the client’s local storage is constant (independent of the number of files t), and the communication from server to client is proportional to log t.
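The retrieval protocol can be made concrete with a short Python sketch. It is an illustration under assumptions rather than a reference implementation: SHA-256 stands in for H, the two children of a node are length-prefixed before hashing so that the encoding of the pair is unambiguous (an encoding choice of ours), indices are 0-based, and the function names `build_levels`, `prove`, and `verify` are hypothetical.

```python
import hashlib


def H(left: bytes, right: bytes) -> bytes:
    """Hash a pair of values; each is length-prefixed (our encoding choice)."""
    enc = (len(left).to_bytes(8, "big") + left +
           len(right).to_bytes(8, "big") + right)
    return hashlib.sha256(enc).digest()


def build_levels(leaves):
    """Return all tree levels: levels[0] is the leaves, levels[-1] is [root]."""
    t = len(leaves)
    assert t > 0 and t & (t - 1) == 0, "t is assumed to be a power of 2"
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([H(cur[j], cur[j + 1]) for j in range(0, len(cur), 2)])
    return levels


def prove(levels, i):
    """Server side: sibling values along the path from leaf i to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[i ^ 1])  # the node adjacent to the path at this level
        i //= 2
    return proof


def verify(root, t, i, x_i, proof):
    """Client side: recompute the root from x_i and the proof, then compare."""
    if not 0 <= i < t or len(proof) != t.bit_length() - 1:
        return False
    node = x_i
    for sibling in proof:
        node = H(node, sibling) if i % 2 == 0 else H(sibling, node)
        i //= 2
    return node == root


files = [b"x1", b"x2", b"x3", b"x4", b"x5", b"x6", b"x7", b"x8"]
levels = build_levels(files)
root = levels[-1][0]      # the only value the client keeps (besides t)
proof = prove(levels, 2)  # proof for x3 (index 2)
```

For x3 the proof consists of x4, H(x1, x2), and H(H(x5, x6), H(x7, x8)), matching the example in the text: three values for eight files, i.e., log t values in general.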
5.6.3 Password Hashing
One of the most common and important uses of hash functions in computer security is for password protection. Consider a user typing in a password before using their laptop. To authenticate the user, some form of the user’s password must be stored somewhere on their laptop. If the user’s password is stored in the clear, then an adversary who steals the laptop can read the user’s password off the hard drive and then log in as that user. (It may seem pointless to try to hide one’s password from an attacker who can already read the contents of the hard drive. However, files on the hard drive may be encrypted with a key derived from the user’s password, and would thus only be accessible after the password is entered. In addition, the user is likely to use the same password at other sites.)
This risk can be mitigated by storing a hash of the password instead of the password itself. That is, the hard drive stores the value hpw = H(pw) in a password file; later, when the user enters their password pw, the operating system checks whether H(pw) =? hpw before granting access. The same basic approach is also used for password-based authentication on the web. Now, if an attacker steals the hard drive (or breaks into a web server), all it obtains is the hash of the password and not the password itself.
If the password is chosen from some relatively small space D of possibilities (e.g., D might be a dictionary of English words, in which case |D| ≈ 80,000), an attacker can enumerate all possible passwords pw1, pw2, ... ∈ D and, for each candidate pwi, check whether H(pwi) = hpw. We would like to claim that an attacker can do no better than this. (This would also ensure that the adversary could not learn the password of any user who chose a strong password from a large space.) Unfortunately, preimage resistance (i.e., one-wayness) of H is not sufficient to imply what we want. For one thing, preimage resistance only says that H(x) is hard to invert when x is chosen uniformly from a large domain like {0,1}^n. It says nothing about the hardness of inverting H when x is chosen from some other space, or when x is chosen according to some other distribution. Moreover, preimage resistance says nothing about the concrete amount of time needed to find a preimage. For example, a hash function H for which computing x ∈ {0,1}^n given H(x) requires time 2^{n/2} could still qualify as preimage resistant, yet this would mean that a 30-bit password could be recovered in only 2^15 time.
If we model H as a random oracle, then we can formally prove the security we want, namely, recovering pw from hpw (assuming pw is chosen uniformly from D) requires |D|/2 evaluations of H, on average.
The above discussion assumes no preprocessing is done by the attacker. As we have seen in Section 5.4.3, though, preprocessing can be used to generate large tables that enable inversion (even of a random function!) faster than exhaustive search. This is a significant concern in practice: even if a user chooses their password as a random combination of 8 alphanumeric characters (giving a password space of size N = 62^8 ≈ 2^47.6), there is an attack using time and space N^{2/3} ≈ 2^32 that will be highly effective. The tables only need to be generated once, and can be used to crack hundreds of thousands of passwords in case of a server breach. Such attacks are routinely carried out in practice.
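The figures quoted above can be checked with a line of arithmetic, using log_2 62 ≈ 5.954 (62 being the number of upper- and lower-case letters plus digits):

```latex
\[
  N = 62^{8} = 2^{8 \log_2 62} \approx 2^{8 \cdot 5.954} \approx 2^{47.6},
  \qquad
  N^{2/3} \approx 2^{(2/3) \cdot 47.6} \approx 2^{31.8} \approx 2^{32}.
\]
```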
Mitigation. We briefly describe two mechanisms used to mitigate the threat of password cracking; further discussion can be found in texts on computer security. One technique is to use “slow” hash functions, or to slow down existing hash functions by using multiple iterations (i.e., computing H^(I)(pw) for I ≫ 1). This has the effect of slowing down legitimate users by a factor of I, which is not a problem if I is set to some “moderate” value (e.g., 1,000). On the other hand, it has a significant impact on an adversary attempting to crack thousands of passwords at once.
A second mechanism is to introduce a salt. When a user registers their password, the laptop/server will generate a long random value s (a “salt”) unique to that user, and store (s, hpw = H(s, pw)) instead of merely storing H(pw) as before. Since s is unknown to the attacker in advance, preprocessing is ineffective and the best an attacker can do is to wait until it obtains the password file and then do a linear-time exhaustive search over the domain D as discussed before. Note also that since a different salt is used for each stored password, a separate brute-force search is needed to recover each password.
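Both mitigations can be combined, as in the following Python sketch. It is illustrative only and rests on several assumptions of ours: SHA-256 stands in for H, the salt is a fixed 16 bytes so that the concatenation of salt and password is unambiguous, I = 1,000 is taken from the example above, and the function names are hypothetical. A deployed system would more likely rely on a vetted password-hashing function such as PBKDF2 or bcrypt.

```python
import hashlib
import hmac
import os

ITERATIONS = 1000  # the "moderate" value of I from the text (illustrative)
SALT_LEN = 16      # fixed salt length in bytes (our assumption)


def slow_hash(salt: bytes, pw: str, iterations: int = ITERATIONS) -> bytes:
    """Hash salt||pw once, then re-hash the result iterations-1 more times."""
    digest = hashlib.sha256(salt + pw.encode("utf-8")).digest()
    for _ in range(iterations - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


def register(pw: str):
    """Return the pair (s, hpw) that would be written to the password file."""
    salt = os.urandom(SALT_LEN)  # fresh random salt for each user
    return salt, slow_hash(salt, pw)


def check(pw: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the salted, iterated hash and compare in constant time."""
    return hmac.compare_digest(slow_hash(salt, pw), stored)


salt_a, hpw_a = register("correct horse")
salt_b, hpw_b = register("correct horse")  # same password, different user
```

Because each entry has its own salt, two users with the same password get different stored values, so a table precomputed for one salt is of no use against another.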
5.6.4 Key Derivation
All the symmetric-key cryptosystems we have seen require a uniformly distributed bit-string for the secret key. Often, however, it is more convenient for two parties to rely on shared information such as a password or biometric data that is not uniformly distributed. (Jumping ahead, in Chapter 10 we will see how parties can interact to generate a high-entropy shared secret that is not uniformly distributed.) The parties could try to use their shared information directly as a secret key, but in general this will not be secure (since, e.g., private-key schemes all assume a uniformly distributed key). Moreover, the shared data may not even have the correct format to be used as a secret key (it may be too long, for example).
Truncating the shared secret, or mapping it in some other ad hoc way to a string of the correct length, may lose a significant amount of entropy. (We define one notion of entropy more formally below, but for now one can think of entropy as the logarithm of the size of the space of possible shared secrets.) For example, imagine two parties share a password composed of 28 random uppercase letters, and want to use a cryptosystem with a 128-bit key. Since there are 26 possibilities for each character, there are 26^28 > 2^130 possible passwords. If the password is shared in ASCII format, each character is stored using 8 bits, and so the total length of the password is 224 bits. If the parties truncate their password to the first 128 bits, they will be using only the first 16 characters of their password. However, this will not be a uniformly distributed 128-bit string! In fact, the ASCII representations of the letters A–Z lie between 0x41 and 0x5A; in particular, the first 3 bits of every byte are always 010. This means that 37.5% of the bits of the resulting key will be fixed, and the 128-bit key the parties derive will have only about 75 bits of entropy (i.e., there are only 2^75 or so possibilities for the key).
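The numbers in this example are easy to reproduce. The following Python snippet is just a worked calculation; the variable names are ours.

```python
import math
import string

bits_per_letter = math.log2(26)              # entropy of one uniform uppercase letter
total_entropy = 28 * bits_per_letter         # entropy of the full 28-letter password
kept_letters = 128 // 8                      # letters that survive truncation to 128 bits
kept_entropy = kept_letters * bits_per_letter

# Every ASCII uppercase letter lies in 0x41..0x5A, so its top three bits are 010.
top_bits = {ord(c) >> 5 for c in string.ascii_uppercase}
fixed_fraction = 3 / 8                       # 3 fixed bits out of every 8

print(f"full password: {total_entropy:.1f} bits")
print(f"truncated key: {kept_entropy:.1f} bits, {fixed_fraction:.1%} of bits fixed")
```

The full password carries a little over 131 bits of entropy, while the truncated 128-bit string carries only about 75.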
What we need is a generic solution for deriving a key from a high-entropy (but not necessarily uniform) shared secret. Before continuing, we define the notion of entropy we consider here.
DEFINITION 5.12 A probability distribution $\mathcal{X}$ has m bits of min-entropy if for every fixed value x it holds that $\Pr_{X \leftarrow \mathcal{X}}[X = x] \le 2^{-m}$. That is, even the most likely outcome occurs with probability at most $2^{-m}$.
The uniform distribution over a set of size S has min-entropy log S. A distribution in which one element occurs with probability 1/10 and 90 elements each occur with probability 1/100 has min-entropy log 10 ≈ 3.3. The min-entropy of a distribution measures the probability with which an attacker can guess a value sampled from that distribution; the attacker’s best strategy is to guess the most likely value, and so if the distribution has min-entropy m the attacker guesses correctly with probability at most 2^{-m}. This explains why min-entropy (rather than other notions of entropy) is useful in our context. An extension of min-entropy, called computational min-entropy, is defined as above except that the distribution is only required to be computationally indistinguishable from a distribution with the given min-entropy. (The notion of computational indistinguishability is formally defined in Section 7.8.)
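As a quick check of the two examples just given, min-entropy can be computed directly from a list of probabilities. The helper name `min_entropy` below is ours, and logarithms are taken base 2.

```python
import math


def min_entropy(probs):
    """Min-entropy in bits: -log2 of the probability of the likeliest outcome."""
    assert abs(math.fsum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -math.log2(max(probs))


uniform = [1 / 16] * 16             # uniform over a set of size S = 16
skewed = [1 / 10] + [1 / 100] * 90  # one likely element plus 90 unlikely ones

print(min_entropy(uniform))         # log2(16)
print(round(min_entropy(skewed), 2))
```

Note that the skewed distribution has 91 possible outcomes, yet its min-entropy is only log 10: the single likely element determines how well an attacker can guess.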
A key-derivation function provides a way to obtain a uniformly distributed string from any distribution with high (computational) min-entropy. It is not hard to see that if we model a hash function H as a random oracle, then H serves as a good key-derivation function. Consider an attacker’s uncertainty about H(X), where X is sampled from a distribution with min-entropy m (as a technical point, we require the distribution to be independent of H). Each of the attacker’s queries to H can be viewed as a “guess” for the value of X; by assumption on the min-entropy of the distribution, an attacker making q queries to H will query X to H with probability at most q·2^{-m}. If the attacker does not query X to H, then H(X) is a uniform string.
It is also possible to design key-derivation functions, without relying on the random-oracle model, using keyed hash functions called (strong) extractors. The key for the extractor must be uniform, but need not be kept secret. One standard for this is called HKDF; see the references at the end of the chapter.
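To give a feel for what such a function looks like, here is a minimal Python sketch of the extract-then-expand structure used by HKDF, instantiated with HMAC-SHA256 from the standard library. It is a sketch under assumptions, not a substitute for a vetted implementation: the function names are ours, and the public `salt` plays the role of the extractor key described above (uniform, but not secret).

```python
import hashlib
import hmac

HASH_LEN = 32  # output length of SHA-256 in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Extract step: condense the input keying material into one short key."""
    if not salt:
        salt = bytes(HASH_LEN)  # an all-zero salt is used when none is given
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Expand step: stretch the extracted key to the requested length."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # Each block is HMAC(prk, previous block || info || counter byte).
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


shared_secret = b"ABCDEFGHIJKLMNOPQRSTUVWXYZAB"  # 28 uppercase letters, as in the example
salt = bytes(range(16))                          # public value (illustrative only)
prk = hkdf_extract(salt, shared_secret)
key = hkdf_expand(prk, b"encryption key", 16)    # a 128-bit key
```

Unlike truncation, every character of the shared secret influences the derived key, and the `info` string lets the parties derive several independent-looking keys from the same secret.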
5.6.5 Commitment Schemes
A commitment scheme allows one party to “commit” to a message m by sending a commitment value com, while obtaining the following seemingly contradictory properties:
• Hiding: the commitment reveals nothing about m.
• Binding: it is infeasible for the committer to output a commitment com that it can later “open” as two different messages m, m′. (In this sense, com truly “commits” the committer to some well-defined value.)
A commitment scheme can be seen as a digital envelope: sealing a message in an envelope and handing it over to another party provides privacy (until the envelope is opened) and binding (since the envelope is sealed).
Formally, a (non-interactive) commitment scheme is defined by a randomized algorithm Gen that outputs public parameters params and an algorithm Com that takes params and a message m ∈ {0,1}^n and outputs a commitment com; we will make the randomness used by Com explicit, and denote it by r. A sender commits to m by choosing uniform r, computing com := Com(params, m; r), and sending it to a receiver. The sender can later decommit com and reveal m by sending m, r to the receiver; the receiver verifies this by checking that Com(params, m; r) =? com.
Hiding, informally, means that com reveals nothing about m; binding means that it is impossible to output a commitment com that can be opened two different ways. We define these properties formally now.
The commitment hiding experiment Hiding_{A,Com}(n):
1. Parameters params ← Gen(1^n) are generated.
2. The adversary A is given input params, and outputs a pair of messages m0, m1 ∈ {0,1}^n.
3. A uniform b ∈ {0,1} is chosen and com ← Com(params, m_b; r) is computed.
4. The adversary A is given com and outputs a bit b′.
5. The output of the experiment is 1 if and only if b′ = b.
The commitment binding experiment Binding_{A,Com}(n):
1. Parameters params ← Gen(1^n) are generated.
2. A is given input params and outputs (com, m, r, m′, r′).
3. The output of the experiment is defined to be 1 if and only if m ≠ m′ and Com(params, m; r) = com = Com(params, m′; r′).
DEFINITION 5.13 A commitment scheme Com is secure if for all ppt adversaries A there is a negligible function negl such that

$$\Pr[\mathsf{Hiding}_{A,\mathsf{Com}}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n)$$

and

$$\Pr[\mathsf{Binding}_{A,\mathsf{Com}}(n) = 1] \le \mathsf{negl}(n).$$
It is easy to construct a secure commitment scheme from a random oracle H. To commit to a message m, the sender chooses uniform r ∈ {0,1}^n and outputs com := H(m∥r). (In the random-oracle model, Gen and params are not needed since H, in effect, serves as the public parameters of the scheme.) Intuitively, hiding follows from the fact that an adversary queries H(⋆∥r) with only negligible probability (since r is a uniform n-bit string); if it never makes a query of this form then H(m∥r) reveals nothing about m. Binding follows from the fact that H is collision resistant.
Commitment schemes can be constructed without random oracles (in fact, from one-way functions), but the details are beyond the scope of this book.
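The random-oracle construction com := H(m∥r) is simple enough to sketch in Python. The sketch makes assumptions that should be kept in mind: SHA-256 is used in place of the random oracle (which, as discussed earlier in the chapter, is a heuristic step), n is fixed at 128 bits so that m∥r parses unambiguously, and the names `commit` and `verify_opening` are hypothetical.

```python
import hashlib
import hmac
import os

N_BYTES = 16  # n = 128 bits for both the message and the randomness


def commit(m: bytes):
    """Return (com, r), where com = H(m || r) for fresh uniform r."""
    assert len(m) == N_BYTES
    r = os.urandom(N_BYTES)
    com = hashlib.sha256(m + r).digest()
    return com, r


def verify_opening(com: bytes, m: bytes, r: bytes) -> bool:
    """Receiver's check on decommitment: recompute H(m || r) and compare."""
    if len(m) != N_BYTES or len(r) != N_BYTES:
        return False
    return hmac.compare_digest(hashlib.sha256(m + r).digest(), com)


m = (42).to_bytes(N_BYTES, "big")  # e.g., a sealed bid of 42
com, r = commit(m)                 # the sender sends com now and keeps (m, r)
accepted = verify_opening(com, m, r)
```

Opening the same commitment to a different message would require a second pair (m′, r′) hashing to the same value, i.e., a collision in the hash function.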
References and Additional Reading
Collision-resistant hash functions were formally defined by Damgård [52]. Additional discussion regarding notions of security for hash functions besides collision resistance can be found in [120, 150]. The Merkle–Damgård transform was introduced independently by Damgård and Merkle [53, 123].
HMAC was introduced by Bellare et al. [14] and later standardized [131].
The small-space birthday attack described in Section 5.4.2 relies on a cycle-finding algorithm of Floyd. Related algorithms and results are described at http://en.wikipedia.org/wiki/Cycle_detection. The idea for finding meaningful collisions using the small-space attack is by Yuval [180]. The possibility of parallelizing collision-finding attacks, which can offer significant speedups in practice, is discussed in [170]. Time/space tradeoffs for function inversion were introduced by Hellman [87], with practical improvements (not discussed here) given by Rivest (unpublished) and by Oechslin [134].
The first formal treatment of the random-oracle model was given by Bellare and Rogaway [21], although the idea of using a “random-looking” function in cryptographic applications had been suggested previously, most notably by Fiat and Shamir [65]. Proper instantiation of a random oracle based on concrete cryptographic hash functions is discussed in [21, 22, 23, 48]. The seminal negative result concerning the random-oracle model is that of Canetti et al. [41], who show (contrived) schemes that are secure in the random-oracle model but are insecure for any concrete instantiation of the random oracle.
Merkle trees were introduced in [121]. Key-derivation functions used in practice include HKDF, PBKDF2, and bcrypt. See [109] for a formal treatment of the problem and an analysis of HKDF.
Exercises
5.1 Provide formal definitions for second preimage resistance and preimage resistance. Prove that any hash function that is collision resistant is second preimage resistant, and any hash function that is second preimage resistant is preimage resistant.
5.2 Let (Gen1, H1) and (Gen2, H2) be two hash functions. Define (Gen, H) so that Gen runs Gen1 and Gen2 to obtain keys s1 and s2, respectively. Then define $H^{s_1,s_2}(x) = H_1^{s_1}(x) \,\|\, H_2^{s_2}(x)$.
(a) Prove that if at least one of (Gen1,H1) and (Gen2,H2) is collision resistant, then (Gen,H) is collision resistant.
(b) Determine whether an analogous claim holds for second preimage resistance and preimage resistance, respectively. Prove your answer in each case.
5.3 Let (Gen, H) be a collision-resistant hash function. Is (Gen, Ĥ) defined by $\hat{H}^s(x) \stackrel{\text{def}}{=} H^s(H^s(x))$ necessarily collision resistant?
5.4 Provide a formal proof of Theorem 5.4 (i.e., describe the reduction).
5.5 Generalize the Merkle–Damgård transform (Construction 5.3) for the case when the fixed-length hash function h has input length n + κ (with κ > 0) and output length n, and the length of the input to H should be encoded as an ℓ-bit value (as discussed in Section 5.3.2). Prove collision resistance of (Gen, H), assuming collision resistance of (Gen, h).
5.6 For each of the following modifications to the Merkle–Damgård transform (Construction 5.3), determine whether the result is collision resistant. If yes, provide a proof; if not, demonstrate an attack.
(a) Modify the construction so that the input length is not included at all (i.e., output zB and not zB+1 = hs(zB∥L)). (Assume the resulting hash function is only defined for inputs whose length is an integer multiple of the block length.)
(b) Modify the construction so that instead of outputting z = hs(zB∥L), the algorithm outputs zB∥L.
(c) Instead of using an IV, just start the computation from x1. That is, define z1 := x1 and then compute zi := hs(zi−1∥xi) for i = 2, …, B+1 and output zB+1 as before.
(d) Instead of using a fixed IV, set z0 := L and then compute zi := hs(zi−1∥xi) for i = 1, …, B and output zB.
5.7 Assume collision-resistant hash functions exist. Show a construction of a fixed-length hash function (Gen, h) that is not collision resistant, but such that the hash function (Gen, H) obtained from the Merkle–Damgård transform to (Gen, h) as in Construction 5.3 is collision resistant.
5.8 Prove or disprove: if (Gen, h) is preimage resistant, then so is the hash function (Gen, H) obtained by applying the Merkle–Damgård transform to (Gen, h) as in Construction 5.3.
5.9 Prove or disprove: if (Gen, h) is second preimage resistant, then so is the hash function (Gen, H) obtained by applying the Merkle–Damgård transform to (Gen, h) as in Construction 5.3.
5.10 Before HMAC, it was common to define a MAC for arbitrary-length messages by Mac_{s,k}(m) = H^s(k∥m) where H is a collision-resistant hash function.
(a) Show that this is never a secure MAC when H is constructed via the Merkle–Damgård transform. (Assume the hash key s is known to the attacker, and only k is kept secret.)
(b) Prove that this is a secure MAC if H is modeled as a random oracle.
5.11 Prove that the construction of a pseudorandom function given in Section 5.5.1 is secure in the random-oracle model.
5.12 Prove Theorem 5.11.
5.13 Show how to find a collision in the Merkle tree construction if t is not fixed. Specifically, show how to find two sets of inputs x1, …, xt and x′1, …, x′2t such that MT_t(x1, …, xt) = MT_{2t}(x′1, …, x′2t).
5.14 Consider the scenario introduced in Section 5.6.2 in which a client stores files on a server and wants to verify that files are returned unmodified.
(a) Provide a formal definition of security for this setting.
(b) Formalize the construction based on Merkle trees as discussed in Section 5.6.2.
(c) Prove that your construction is secure relative to your definition under the assumption that (Gen_H, H) is collision resistant.
5.15 Prove that the commitment scheme discussed in Section 5.6.5 is secure in the random-oracle model.
Chapter 6
Practical Constructions of Symmetric-Key Primitives
In previous chapters we have demonstrated how secure encryption schemes and message authentication codes can be constructed from cryptographic primitives such as pseudorandom generators, pseudorandom permutations, and hash functions. One question we have not yet addressed, however, is how these cryptographic primitives are constructed in the first place, or even whether they exist at all! In the next chapter we will study this question from a theoretical vantage point, and show constructions of pseudorandom generators and pseudorandom permutations based on quite weak assumptions. (It turns out that hash functions are more difficult to construct, and appear to require stronger assumptions. We will see a provably secure construction of hash functions in Section 8.4.2.) In this chapter, our focus will be on comparatively heuristic, but far more efficient, constructions of these primitives that are widely used in practice.
As just mentioned, the constructions that we will explore in this chapter are heuristic in the sense that they cannot be proven secure based on any weaker assumption. These constructions are, however, based on a number of design principles, some of which can be justified by theoretical analysis. Perhaps more importantly, many of these constructions have withstood years of public scrutiny and attempted cryptanalysis and, given this, it is quite reasonable to assume the security of these constructions.
In some sense there is no fundamental difference between assuming, say, that factoring is hard and assuming that AES (a block cipher we will study in detail later in this chapter) is a pseudorandom permutation. There is, however, a significant qualitative difference between these assumptions.¹ The primary difference is that the former assumption is more believable since it seemingly relates to a weaker requirement: the assumption that large integers are hard to factor is arguably more natural than the assumption that AES with a uniform key is indistinguishable from a random permutation. Other relevant differences between the assumptions are that factoring has been studied much longer than the problem of distinguishing AES from a random permutation, and was recognized as a hard problem well before the advent of cryptographic schemes based on it.
¹ It should be clear that the discussion in this paragraph is informal, as we cannot formally argue about any of this when we cannot even prove that factoring is hard in the first place!
To summarize, it is reasonable to assume that the recommended constructions described in this chapter are secure, and people are comfortable relying on such assumptions in practice. Still, it would be preferable to base security of cryptographic primitives on weaker and more long-standing assumptions. As we will see in Chapter 7, this is (in principle) possible; unfortunately, the constructions we will see there are orders of magnitude less efficient than the constructions described here, and as such are not useful in practice.
The Aim of This Chapter
The main aims of this chapter are (1) to present some design principles used in the construction of modern cryptographic primitives, and (2) to introduce the reader to some popular constructions used extensively in the real world. We caution that:
• It is not the aim of this chapter to teach readers how to design new cryptographic primitives. On the contrary, we believe that the design of new primitives requires significant expertise and effort, and is not something to be attempted lightly. Those who are interested in developing additional expertise in this area are advised to read the more advanced references included at the end of the chapter.
• It is not our intent to present all the low-level details of the various primitives we discuss here, and our descriptions should not be relied upon for implementation. In fact, our descriptions are sometimes purposefully inaccurate, as we omit certain details that are not relevant to the broader conceptual point we are trying to emphasize.
6.1 Stream Ciphers
Recall from Section 3.3.1 that a stream cipher is defined by two deterministic algorithms (Init, GetBits). The Init algorithm takes as input a key k and an (optional) initialization vector IV and returns some initial state st. The GetBits algorithm can be used to generate an infinite stream of bits y1, y2, ... based on st. The main requirement of a stream cipher is that it should behave like a pseudorandom generator, namely, when k is chosen uniformly at random the resulting sequence y1, y2, ... should be indistinguishable from a sequence of uniform and independent bits by any computationally bounded attacker. (In Section 3.6.1 we noted that stream ciphers must sometimes satisfy stronger security requirements. We do not explicitly address this here.)
We have already pointed out (see the end of Section 3.5.1) that stream ciphers can be constructed easily from block ciphers, which are a stronger primitive. The primary motivation for using the dedicated stream-cipher constructions introduced in this section is efficiency, especially in resource-constrained environments (e.g., in hardware where there may be a desire to keep the number of gates small). Attacks have been shown against several recent constructions of stream ciphers, however, and their security appears much more tenuous than is the case for block ciphers. We therefore recommend using block ciphers (possibly in stream-cipher mode) when possible.
6.1.1 Linear-Feedback Shift Registers
We begin by discussing linear-feedback shift registers (LFSRs). These have been used historically for pseudorandom-number generation, as they are extremely efficient to implement in hardware, and generate output having good statistical properties. By themselves, however, they do not give cryptographically strong pseudorandom generators, and in fact we will show an easy key-recovery attack on LFSRs. Nevertheless, LFSRs can be used as a component in building stream ciphers with better security.
FIGURE 6.1: A linear-feedback shift register.
An LFSR consists of an array of n registers $s_{n-1}, \ldots, s_0$ along with a feedback loop specified by a set of n feedback coefficients $c_{n-1}, \ldots, c_0$. (See Figure 6.1.) The size of the array is called the degree of the LFSR. Each register stores a single bit, and the state st of the LFSR at any point in time is simply the set of bits contained in the registers. The state of the LFSR is updated in each of a series of “clock ticks” by shifting the values in all the registers to the right, and setting the new value of the left-most register equal to the XOR of some subset of the current registers, with the subset determined by the feedback coefficients. That is, if the state at some time t is $s^{(t)}_{n-1}, \ldots, s^{(t)}_0$, then the state after the next clock tick is $s^{(t+1)}_{n-1}, \ldots, s^{(t+1)}_0$ with

\[ s^{(t+1)}_i := s^{(t)}_{i+1}, \qquad i = 0, \ldots, n-2 \]
\[ s^{(t+1)}_{n-1} := \bigoplus_{i=0}^{n-1} c_i \, s^{(t)}_i. \]

Figure 6.1 shows a degree-4 LFSR with $c_0 = c_2 = 1$ and $c_1 = c_3 = 0$.
At each clock tick, the LFSR outputs the value of the right-most register s0.
If the initial state of the LFSR is $s^{(0)}_{n-1}, \ldots, s^{(0)}_0$, the first n bits of the output stream are exactly $s^{(0)}_0, \ldots, s^{(0)}_{n-1}$. The next output bit is $s^{(1)}_{n-1} = \bigoplus_{i=0}^{n-1} c_i \, s^{(0)}_i$. If we denote the output bits by $y_1, y_2, \ldots$, where $y_i = s^{(i-1)}_0$, then

\[ y_i = s^{(0)}_{i-1}, \qquad i = 1, \ldots, n \]
\[ y_i = \bigoplus_{j=0}^{n-1} c_j \, y_{i-n+j}, \qquad i > n. \]

As an example using the LFSR from Figure 6.1, if the initial state is (0, 0, 1, 1) then the states for the first few time periods are

(0,0,1,1)
(1,0,0,1)
(1,1,0,0)
(1,1,1,0)
(1,1,1,1)
and the output (which can be read off the right-most column of the above) is the stream of bits 1,1,0,0,1,….
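The update rule and the worked example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `lfsr_stream` and the list layout (index 0 holds the right-most register $s_0$) are choices made here, not notation from the text.

```python
def lfsr_stream(coeffs, state, nbits):
    """Return the first nbits output bits of an LFSR.

    coeffs[i] is the feedback coefficient c_i and state[i] is register s_i,
    so index 0 is the right-most register (the one that is output).
    """
    state = list(state)
    out = []
    for _ in range(nbits):
        out.append(state[0])                 # output the right-most register s_0
        feedback = 0
        for c, s in zip(coeffs, state):
            feedback ^= c & s                # XOR of the registers selected by c_i
        state = state[1:] + [feedback]       # shift right; feedback enters on the left
    return out


# LFSR of Figure 6.1: c0 = c2 = 1, c1 = c3 = 0.
# Initial state (s3, s2, s1, s0) = (0, 0, 1, 1), stored as [s0, s1, s2, s3].
print(lfsr_stream([1, 0, 1, 0], [1, 1, 0, 0], 5))   # [1, 1, 0, 0, 1]
```

The printed bits agree with the right-most column of the state list above.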
The state of the LFSR consists of n bits; thus, the LFSR can cycle through at most $2^n$ possible states before repeating. When the states repeat the output bits repeat, and this means that the output sequence will begin repeating after at most $2^n$ output bits have been generated. A maximum-length LFSR cycles through all $2^n - 1$ nonzero states before repeating. (Note that if the all-0 state is ever realized then the LFSR remains in that state forever, which is why we exclude it.) Whether an LFSR is maximal length or not depends only on the feedback coefficients; if it is maximal length then, once it is initialized in any nonzero state, it will cycle through all $2^n - 1$ nonzero states. It is well understood how to set the feedback coefficients to obtain a maximal-length LFSR, although the details are beyond the scope of this book.
Reconstruction attacks. The output of a maximal-length LFSR of degree n has good statistical properties; for example, every n-bit string occurs with roughly equal frequency in the output stream of the LFSR. Nevertheless, LFSRs are not good pseudorandom generators for cryptographic purposes because their output is predictable. This follows from the fact that an attacker can reconstruct the entire state of a degree-n LFSR after observing at most 2n output bits. To see this, assume both the initial state and the feedback coefficients of some LFSR are unknown. The first n output bits y1, . . . , yn of the LFSR exactly reveal the initial state. Given the next n output bits yn+1, . . . , y2n, the attacker can set up a system of n linear equations in the n
unknowns $c_0, \ldots, c_{n-1}$:

\[ y_{n+1} = c_{n-1} \, y_n \oplus \cdots \oplus c_0 \, y_1 \]
\[ \vdots \]
\[ y_{2n} = c_{n-1} \, y_{2n-1} \oplus \cdots \oplus c_0 \, y_n. \]
One can show that the above equations are linearly independent (modulo 2) for a maximal-length LFSR, and so uniquely determine the feedback coefficients. (The solution can be found efficiently using linear algebra.) With the feedback coefficients known, all subsequent output bits of the LFSR can be easily computed.
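The reconstruction attack can be sketched as Gaussian elimination modulo 2 on the n equations above. The helper names (`solve_gf2`, `recover_lfsr`) and the degree-4 example with $c_0 = c_1 = 1$, $c_2 = c_3 = 0$ are assumptions made for illustration; that LFSR has period 15 and is therefore maximal length.

```python
def lfsr_stream(coeffs, state, nbits):
    """coeffs[i] = c_i, state[i] = s_i (index 0 is the right-most register)."""
    state, out = list(state), []
    for _ in range(nbits):
        out.append(state[0])
        feedback = 0
        for c, s in zip(coeffs, state):
            feedback ^= c & s
        state = state[1:] + [feedback]
    return out


def solve_gf2(rows, rhs):
    """Solve A c = b over GF(2); return the unique solution, or None if A is singular."""
    n = len(rows[0])
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    rank = 0
    for col in range(n):
        pivot = next((i for i in range(rank, len(aug)) if aug[i][col]), None)
        if pivot is None:
            return None                      # equations not linearly independent
        aug[rank], aug[pivot] = aug[pivot], aug[rank]
        for i in range(len(aug)):
            if i != rank and aug[i][col]:
                aug[i] = [a ^ b for a, b in zip(aug[i], aug[rank])]
        rank += 1
    return [aug[i][n] for i in range(n)]


def recover_lfsr(y):
    """Given 2n output bits (y[0] is y_1), return (coefficients, initial state)."""
    n = len(y) // 2
    # Row k encodes y_{n+1+k} = c_0 y_{1+k} xor ... xor c_{n-1} y_{n+k}.
    rows = [[y[k + j] for j in range(n)] for k in range(n)]
    rhs = [y[n + k] for k in range(n)]
    return solve_gf2(rows, rhs), y[:n]


secret_coeffs, secret_state = [1, 1, 0, 0], [1, 0, 0, 0]
observed = lfsr_stream(secret_coeffs, secret_state, 8)      # 2n = 8 output bits
coeffs, state = recover_lfsr(observed)
print(coeffs, state)
```

Once `coeffs` and `state` are known, calling `lfsr_stream(coeffs, state, m)` reproduces the first m output bits for any m, which is exactly why the output is predictable.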
6.1.2 Adding Nonlinearity
The linear relationships between the output bits of an LFSR are exactly what enable an easy attack. To thwart such attacks, we must introduce some nonlinearity, i.e., some operations other than XORs. There are several different approaches to doing so, and we only explore some of them here.
Nonlinear feedback. One obvious way to modify LFSRs is to make the feedback loop nonlinear. A nonlinear-feedback shift register (FSR) will again consist of an array of registers, each containing a single bit. As before, the state of the FSR is updated in each of a series of clock ticks by shifting the values in all the registers to the right; now, however, the new value of the left-most register is a nonlinear function of the current registers. In other words, if the state at some time t is $s^{(t)}_0, \ldots, s^{(t)}_{n-1}$ then the state after the next clock tick is $s^{(t+1)}_0, \ldots, s^{(t+1)}_{n-1}$ with

\[ s^{(t+1)}_i := s^{(t)}_{i+1}, \qquad i = 0, \ldots, n-2 \]
\[ s^{(t+1)}_{n-1} := g\bigl(s^{(t)}_0, \ldots, s^{(t)}_{n-1}\bigr) \]

for some nonlinear function g. As before, the FSR outputs the value of the right-most register $s_0$ at each clock tick.
It is possible to design nonlinear FSRs with maximal length and such that the output has good statistical properties.
Nonlinear combination generators. Another approach is to introduce nonlinearity in the output sequence. In the most basic case, we could have an LFSR as before (where the new value of the left-most register is again computed as a linear function of the current registers), but where the output at each clock tick is a nonlinear function g of all the current registers, rather than just the right-most register. It is important here that g be balanced in the sense that Pr[g(s0,…,sn−1) = 1] ≈ 1/2 (where the probability is over uniform choice of s0,…,sn−1); otherwise, although it might be difficult to
reconstruct the entire state of the LFSR based on the output, the output stream will be biased and hence easily distinguishable from uniform.
A variant of the above is to use several LFSRs (with each individual output stream computed, as before, by simply taking the value of the right-most register of each LFSR), and to generate the actual output stream by combining the output of the individual LFSRs in some nonlinear way. This yields what is known as a (nonlinear) combination generator. The individual LFSRs need not have the same degree, and in fact the cycle length of the combination generator will be maximized if they do not have the same degree. Here, care must be taken to ensure that the output stream of the combination generator is not too highly correlated with any of the output streams of the individual LFSRs; high correlation can lead to attacks on the individual LFSRs, thereby defeating the purpose of using several LFSRs in the construction.
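To make the correlation concern concrete, the following sketch examines one classical three-input combining function, commonly known as the Geffe combiner. It is not discussed in the text and is used here only as an illustration: the first input selects which of the other two inputs is passed through. The function names are hypothetical.

```python
from itertools import product


def combine(x1, x2, x3):
    """Nonlinear combiner: x1 selects between x2 (if x1 = 1) and x3 (if x1 = 0)."""
    return (x1 & x2) ^ ((1 ^ x1) & x3)


def agreement(index):
    """Fraction of the 8 input triples on which the output equals input number index."""
    triples = list(product((0, 1), repeat=3))
    return sum(combine(*t) == t[index] for t in triples) / len(triples)


balance = sum(combine(*t) for t in product((0, 1), repeat=3)) / 8
print(balance)                                       # 0.5: the combiner is balanced
print(agreement(0), agreement(1), agreement(2))      # 0.5 0.75 0.75
```

The combiner is balanced, yet its output agrees with each of the second and third inputs three quarters of the time. This is the kind of high correlation with an individual LFSR that the paragraph above warns against.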
6.1.3 Trivium
To illustrate the ideas from the previous section, we briefly describe the stream cipher Trivium. This stream cipher was selected as part of the portfolio of the eSTREAM project, a European effort completed in 2008 whose goal was to identify new stream ciphers. Trivium was designed to have a simple description and to admit a compact hardware implementation.
FIGURE 6.2: A schematic illustration of Trivium with (from top to bottom) three coupled, nonlinear FSRs A, B, and C.
Trivium uses three coupled, nonlinear FSRs denoted by A, B, and C and having degree 93, 84, and 111, respectively. (See Figure 6.2.) The state st of Trivium is simply the 288 bits comprising the values in all the registers
of these FSRs. The GetBits algorithm for Trivium works as follows: At each clock tick, the output of each FSR is the XOR of its right-most register and one additional register; the output of Trivium is the XOR of the output bits of the three FSRs. The FSRs are coupled: at each clock tick, the new value of the left-most register of each FSR is computed as a function of one of the registers in the same FSR and a subset of the registers from a second FSR. (The feedback function for A depends on one register of A and four registers of C; the feedback function for B depends on one register of B and four registers of A; and the feedback function for C depends on one register of C and four registers of B.) The feedback function in each case is nonlinear.
The Init algorithm of Trivium accepts an 80-bit key and an 80-bit IV. The key is loaded into the 80 left-most registers of A, and the IV is loaded into the 80 left-most registers of B. The remaining registers are set to 0, except for the three right-most registers of C, which are set to 1. Then GetBits is run 4 · 288 times (with output discarded), and the resulting state is taken as $st_0$.
To date, no cryptanalytic attacks better than exhaustive key search are known against the full Trivium cipher.
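The structure described above can be sketched in Python. The specific tap positions below are not given in the text; they follow the published Trivium specification as best recalled here and should be treated as an assumption. The sketch has not been validated against official test vectors, and bit-ordering conventions for loading the key and IV differ between implementations, so it should not be used as a reference implementation.

```python
def trivium_step(s):
    """One clock tick on the 288-bit state s (s[0] is the left-most register of A).

    Returns (new_state, output_bit). Registers: A = s[0:93], B = s[93:177], C = s[177:288].
    """
    t1 = s[65] ^ s[92]            # output of A: right-most register and one other
    t2 = s[161] ^ s[176]          # output of B
    t3 = s[242] ^ s[287]          # output of C
    z = t1 ^ t2 ^ t3              # output of Trivium
    t1 ^= (s[90] & s[91]) ^ s[170]      # feedback into B: four registers of A, one of B
    t2 ^= (s[174] & s[175]) ^ s[263]    # feedback into C: four registers of B, one of C
    t3 ^= (s[285] & s[286]) ^ s[68]     # feedback into A: four registers of C, one of A
    new_state = [t3] + s[0:92] + [t1] + s[93:176] + [t2] + s[177:287]
    return new_state, z


def trivium_init(key_bits, iv_bits):
    """Load an 80-bit key and 80-bit IV, then discard 4 * 288 output bits."""
    assert len(key_bits) == 80 and len(iv_bits) == 80
    a = list(key_bits) + [0] * 13         # 93 registers
    b = list(iv_bits) + [0] * 4           # 84 registers
    c = [0] * 108 + [1, 1, 1]             # 111 registers, three right-most set to 1
    s = a + b + c
    for _ in range(4 * 288):
        s, _discarded = trivium_step(s)
    return s


def trivium_bits(key_bits, iv_bits, n):
    s = trivium_init(key_bits, iv_bits)
    out = []
    for _ in range(n):
        s, z = trivium_step(s)
        out.append(z)
    return out


key = [1, 0] * 40
print(trivium_bits(key, [0] * 80, 16))
```

The AND of two adjacent registers in each feedback function is the only nonlinear operation; everything else is a shift or an XOR, which is what keeps a hardware implementation compact.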
6.1.4 RC4
LFSRs are efficient when implemented in hardware, but have poor performance in software. For this reason, alternate designs of stream ciphers have been explored. A prominent example is RC4, which was designed by Ron Rivest in 1987. RC4 is remarkable for its speed and simplicity, and resisted serious attack for several years. It is widely used today, and we discuss it for this reason; we caution the reader, however, that recent attacks have shown serious cryptographic weaknesses in RC4 and it should no longer be used.
ALGORITHM 6.1
Init algorithm for RC4
Input: 16-byte key k
Output: Initial state (S, i, j)
(Note: All addition is done modulo 256)
for i = 0 to 255:
    S[i] := i
    k[i] := k[i mod 16]
j := 0
for i = 0 to 255:
    j := j + S[i] + k[i]
    Swap S[i] and S[j]
i := 0, j := 0
return (S, i, j)
The state of RC4 is a 256-byte array S, which always contains a permutation of the elements 0,…,255, along with two values i,j ∈ {0,…,255}. The Init
algorithm for RC4 is presented as Algorithm 6.1. For simplicity we assume a 16-byte (128-bit) key k, although the algorithm can handle keys between 1 byte and 256 bytes long. We index the bytes of S as S[0], . . . , S[255], and the bytes of the key as k[0],…,k[15].
During initialization, S is first set to the identity permutation (i.e., with S[i] = i for all i) and k is expanded to 256 bytes by repetition. Then each entry of S is swapped at least once with another entry of S in a “pseudorandom” location. The indices i, j are set to 0, and (S, i, j) is output as the initial state.
The state is then used to generate a sequence of output bits, as shown in Algorithm 6.2. The index i is simply incremented (modulo 256), and j is changed in some “pseudorandom” way. Entries S[i] and S[j] are swapped, and the value of S at position S[i] + S[j] (again computed modulo 256) is output. Note that each entry of S is swapped with some other entry of S (possibly itself) at least once every 256 iterations, ensuring good “mixing” of the permutation S.
ALGORITHM 6.2
GetBits algorithm for RC4
Input: Current state (S, i, j)
Output: Output byte y; updated state (S, i, j)
(Note: All addition is done modulo 256)
i := i + 1
j := j + S[i]
Swap S[i] and S[j]
t := S[i] + S[j]
y := S[t]
return (S, i, j), y
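Algorithms 6.1 and 6.2 translate almost line for line into Python. The sketch below is for study only (RC4 should no longer be used, as noted above). The function names are choices made here, and the key expansion is written as `key[i % len(key)]` so that any key length from 1 to 256 bytes works, rather than only the 16-byte case.

```python
def rc4_init(key):
    """Algorithm 6.1: build the initial state (S, i, j) from the key bytes."""
    S = list(range(256))                          # identity permutation
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256  # key expanded by repetition
        S[i], S[j] = S[j], S[i]
    return S, 0, 0


def rc4_getbyte(state):
    """Algorithm 6.2: return (updated state, output byte)."""
    S, i, j = state
    i = (i + 1) % 256
    j = (j + S[i]) % 256
    S[i], S[j] = S[j], S[i]
    t = (S[i] + S[j]) % 256
    return (S, i, j), S[t]


def rc4_keystream(key, n):
    state = rc4_init(key)
    out = bytearray()
    for _ in range(n):
        state, y = rc4_getbyte(state)
        out.append(y)
    return bytes(out)


keystream = rc4_keystream(b"Key", 9)
ciphertext = bytes(p ^ k for p, k in zip(b"Plaintext", keystream))
print(ciphertext.hex())   # bbf316e8d940af0ad3
```

The key "Key" and message "Plaintext" form a widely circulated RC4 test vector; encryption is simply the XOR of the message with the output stream.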
RC4 was not designed to take an IV as input; however, in practice an IV is often incorporated by simply concatenating it with the actual key k′ before initialization. That is, a random IV of the desired length is chosen, k is set to be equal to the concatenation of IV and k′ (this can be done by either prepending or appending IV), and then Init is run as in Algorithm 6.1 to generate an initial state. Output bits are then produced using Algorithm 6.2 exactly as before. Assuming RC4 is being used in unsynchronized mode (see Section 3.6.1), the IV would then be sent in the clear to the receiver (who presumably already has the actual key k′), thus enabling them to generate the same initial state and hence the same output stream. This method of incorporating an IV is used in the Wired Equivalent Privacy (WEP) encryption standard for protecting communications in 802.11 wireless networks.
One should be concerned by this relatively ad hoc way of modifying RC4 to accept an IV. Even if RC4 were a secure stream cipher when using (only) a key as originally designed, there is no reason to believe that it should be secure when modified to use an IV in this way. Indeed, contrary to the key, the IV is revealed to an attacker (since it is sent in the clear); furthermore, using different IVs with the same fixed key k′ (as would be done when using RC4 in unsynchronized mode) means that related values k are being used to initialize the state of RC4. As we will see below, both of these issues lead to attacks when RC4 is used in this fashion.
Attacks on RC4. Although RC4 is pervasive in modern systems, various attacks on RC4 have been known for several years. Due to this, RC4 should no longer be used; instead, a more modern stream cipher or block cipher should be used in its place.
We begin by demonstrating a simple statistical attack on RC4 that does not rely on the honest parties’ using an IV. The attack exploits the fact that the second output byte of RC4 is (slightly) biased toward 0. Let $S_t$ denote the state of the array S after t iterations of GetBits, with $S_0$ denoting the initial state. Treating $S_0$ (heuristically) as a uniform permutation of {0,…,255}, with probability $1/256 \cdot (1 - 1/255) \approx 1/256$, it holds that $S_0[2] = 0$ and $X \stackrel{\rm def}{=} S_0[1] \neq 2$. Assume for a moment that this is the case. In the first iteration of GetBits, the value of i is incremented to 1, and j is set equal to $S_0[i] = S_0[1] = X$. Then $S_0[1]$ and $S_0[X]$ are swapped, so that at the end of the iteration we have $S_1[X] = S_0[1] = X$. In the second iteration, i is incremented to 2 and j is assigned the value

\[ j + S_1[i] = X + S_1[2] = X + S_0[2] = X, \]

since $S_0[2] = 0$. Then $S_1[2]$ and $S_1[X]$ are swapped, so that $S_2[X] = S_1[2] = S_0[2] = 0$ and $S_2[2] = S_1[X] = X$. Finally, the value of $S_2$ at position $S_2[i] + S_2[j] = S_2[2] + S_2[X] = X$ is output; this is exactly the value $S_2[X] = 0$.

When $S_0[2] \neq 0$ the second output byte is uniformly distributed. Overall, then, the probability that the second output byte is 0 is

\[ \Pr[S_0[2] = 0 \text{ and } S_0[1] \neq 2] + \frac{1}{256} \cdot \bigl(1 - \Pr[S_0[2] = 0 \text{ and } S_0[1] \neq 2]\bigr) = \frac{1}{256} + \frac{1}{256} \cdot \Bigl(1 - \frac{1}{256}\Bigr) \approx \frac{2}{256}, \]

or twice what would be expected for a uniform value.
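The argument above can be checked mechanically. The sketch below (function names are hypothetical) runs two iterations of GetBits on an arbitrary initial permutation, confirms that the second output byte is 0 whenever $S_0[2] = 0$ and $S_0[1] \neq 2$, and provides a simple estimator for the overall frequency of 0 under the heuristic that $S_0$ is a uniform permutation.

```python
import random


def second_output_byte(S0):
    """Run two iterations of GetBits starting from array S0 with i = j = 0."""
    S = list(S0)
    i = j = 0
    y = None
    for _ in range(2):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        y = S[(S[i] + S[j]) % 256]
    return y


def random_state_with_condition(rng):
    """A random permutation forced to satisfy S[2] = 0 and S[1] != 2."""
    while True:
        S = list(range(256))
        rng.shuffle(S)
        z = S.index(0)
        S[2], S[z] = S[z], S[2]       # move the value 0 into position 2
        if S[1] != 2:
            return S


def estimate_zero_frequency(trials, seed=0):
    """Fraction of random initial permutations whose second output byte is 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        S = list(range(256))
        rng.shuffle(S)
        hits += second_output_byte(S) == 0
    return hits / trials


rng = random.Random(1)
print(all(second_output_byte(random_state_with_condition(rng)) == 0 for _ in range(100)))
print(estimate_zero_frequency(50000), 1 / 256)
```

With enough trials the estimate should settle near 2/256 ≈ 0.0078 rather than 1/256 ≈ 0.0039, although any single run is subject to sampling noise.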
By itself the above might not be viewed as a particularly serious attack, although it does seem to indicate underlying structural problems with RC4. A more serious attack against RC4 is possible when an IV is incorporated by prepending it to the key. This attack can be used to recover the key, regardless of its length, and is thus more serious than a distinguishing attack such as the one described above. Importantly, this attack can be used to completely break the WEP encryption standard mentioned earlier, and was influential in getting the standard replaced.
The core of the attack is a way to extend knowledge of the first n bytes of k to knowledge of the first (n + 1) bytes of k. Note that when an IV is
prepended to the actual key k′ (so k = IV ∥k′), the first few bytes of k are given to the attacker for free! If the IV is n bytes long, then the adversary can use this attack to first recover the (n + 1)st byte of k (which is the first byte of the real key k′), then the next byte of k, and so on, until it deduces the entire key.
Assume the IV is 3 bytes long, as is the case for WEP. The attacker waits until the first two bytes of the IV have a specific form. The attack can be carried out with several possibilities for the first two bytes of the IV, but we look at the case where the IV takes the form IV = (3,255,X) for X an arbitrary byte. This means, of course, that k[0] = 3, k[1] = 255, and k[2] = X in Algorithm 6.1. One can check that after the first four iterations of the second loop of Init, we have
S[0]=3, S[1]=0, S[3]=X+6+k[3]. (6.1)
In the next 252 iterations of the Init algorithm, i is always greater than 3. So the values of S[0], S[1], and S[3] are not subsequently modified as long as j never takes on the values 0, 1, or 3. If we (heuristically) treat j as taking on a uniform value in each iteration, this means that S[0], S[1], and S[3] are not subsequently modified with probability $(253/256)^{252} \approx 0.05$, or 5% of the time. Assuming this is the case, the first byte output by GetBits will be S[3] = X + 6 + k[3]; since X is known, this reveals k[3].
So, the attacker knows that 5% of the time the first byte of the output is correlated with k[3] as described above. (This is much better than random guessing, which is correct 1/256 = 0.4% of the time.) Thus, by collecting sufficiently many samples of the first byte of the output—for several IVs having the correct form—the attacker gets a high-confidence estimate for k[3].
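The first step of this attack can be simulated as follows. The sketch checks Equation (6.1) for one concrete IV and then tallies, over the 256 IVs of the form (3, 255, X), the candidate values of k[3] suggested by the first output byte. The function names and the 13-byte secret key are illustrative assumptions, and this covers only the recovery of k[3], not the full key-recovery procedure.

```python
from collections import Counter


def rc4_init_rounds(key, rounds=256):
    """Second loop of Algorithm 6.1, optionally stopped after the given number of rounds."""
    S = list(range(256))
    j = 0
    for i in range(rounds):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S


def first_output_byte(key):
    S = rc4_init_rounds(key)
    i, j = 1, S[1]                    # first iteration of GetBits from i = j = 0
    S[i], S[j] = S[j], S[i]
    return S[(S[i] + S[j]) % 256]


def votes_for_k3(secret_key):
    """Tally candidates for k[3] over all IVs of the form (3, 255, X)."""
    votes = Counter()
    for x in range(256):
        key = bytes([3, 255, x]) + secret_key       # k = IV || k'
        y = first_output_byte(key)
        votes[(y - x - 6) % 256] += 1               # solve y = X + 6 + k[3] for k[3]
    return votes


# Equation (6.1) for X = 10 and k[3] = 0x42: S[3] = 10 + 6 + 0x42 = 82.
S = rc4_init_rounds(bytes([3, 255, 10, 0x42]) + bytes(12), rounds=4)
print(S[0], S[1], S[3])              # 3 0 82

secret = bytes.fromhex("42a1c3d4e5f60718293a4b5c6d")   # hypothetical 13-byte key k'
print(votes_for_k3(secret).most_common(3))
```

Since the relation holds only about 5% of the time, the correct value is expected to stand out by count rather than appear in every sample; with only 256 samples the margin is modest, which is why a real attacker collects as many suitable IVs as possible.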
6.2 Block Ciphers
Recall from Section 3.5.1 that a block cipher is an efficient, keyed permutation $F : \{0,1\}^n \times \{0,1\}^l \rightarrow \{0,1\}^l$. This means the function $F_k$ defined by $F_k(x) \stackrel{\rm def}{=} F(k, x)$ is a bijection (i.e., a permutation), and moreover $F_k$ and its inverse $F_k^{-1}$ are efficiently computable given k. We refer to n as the key length and l as the block length of F, and here we explicitly allow them to differ. The key length and block length are now fixed constants, whereas in Chapter 3 they were viewed as functions of a security parameter. This puts us in the setting of concrete security rather than asymptotic security.2 The concrete security requirements for block ciphers are quite stringent, and a block cipher is generally only considered “good” if the best known attack (without preprocessing) has time complexity roughly equivalent to a brute-force search for the key. Thus, if a cipher with key length n = 256 can be broken in time $2^{128}$, the cipher is (generally) considered insecure even though a $2^{128}$-time attack is still infeasible. In contrast, in an asymptotic setting an attack of complexity $2^{n/2}$ is not considered efficient since it requires exponential time (and thus a cipher where such an attack is possible might still satisfy the definition of being a pseudorandom permutation). In the concrete setting, however, we must worry about the actual complexity of the attack (rather than its asymptotic behavior). Furthermore, there is a concern that existence of such an attack may indicate some more fundamental weakness in the design of the cipher.
2 Although a block cipher with fixed key length has no “security parameter” to speak of, we still view security as depending on the length of the key and thus denote this value by n. Viewing the key length as a parameter makes sense when comparing block ciphers with different key lengths, or when using a block cipher that supports keys of different lengths.
Block ciphers are designed to behave, at a minimum, as (strong) pseudorandom permutations; see Definition 3.28. Modeling block ciphers as pseudorandom permutations allows proofs of security for constructions based on block ciphers, and also makes explicit the necessary requirements of a block cipher. A solid understanding of what block ciphers are supposed to achieve is instrumental in their design. The view that block ciphers should be modeled as pseudorandom permutations has, at least in the recent past, served as a major influence in their design. As an example, the call for proposals for the recent Advanced Encryption Standard (AES) that we will encounter later in this chapter stated the following evaluation criterion:
The security provided by an algorithm is the most important factor…. Algorithms will be judged on the following factors…
• The extent to which the algorithm output is indistinguishable from a random permutation …
Modern block ciphers are suitable for all the constructions using pseudorandom permutations (or pseudorandom functions) we have seen in this book.
Often, block ciphers are designed (and assumed) to satisfy even stronger security properties, as we discuss briefly in Section 6.3.1.
Notwithstanding the fact that block ciphers are not, on their own, encryption schemes, the standard terminology for attacks on a block cipher F is:
• In a known-plaintext attack, the attacker is given pairs of inputs/outputs $\{(x_i, F_k(x_i))\}$ (for an unknown key k), with the $\{x_i\}$ outside the attacker’s control.
• In a chosen-plaintext attack, the attacker is given $\{F_k(x_i)\}$ (again, for an unknown key k) for a series of inputs $\{x_i\}$ chosen by the attacker.
• In a chosen-ciphertext attack, the attacker is given $\{F_k(x_i)\}$ for $\{x_i\}$ chosen by the attacker, as well as $\{F_k^{-1}(y_i)\}$ for chosen $\{y_i\}$.
Besides using the above to distinguish Fk from a uniform permutation, we will also be interested in key-recovery attacks in which the attacker is able to recover the key k after interacting with Fk. (This is stronger than being able to distinguish Fk from uniform.)
With respect to this taxonomy, a pseudorandom permutation cannot be distinguished from a uniform permutation under a chosen-plaintext attack, while a strong pseudorandom permutation cannot be so distinguished even under a chosen-ciphertext attack.
6.2.1 Substitution-Permutation Networks
A block cipher must behave like a random permutation. There are $(2^l)!$ permutations on l-bit strings, so representing an arbitrary permutation with an l-bit block length requires $\log((2^l)!) \approx l \cdot 2^l$ bits. This is impractical for l > 20 and infeasible for l > 50. (Looking ahead, modern block ciphers have block lengths l ≥ 128.) The challenge when designing a block cipher is to construct a set of permutations with a concise description (namely, a short key) that behaves like a random permutation. In particular, just as evaluating a random permutation at two inputs that differ in only a single bit should yield two (almost) independent outputs (they are not completely independent since they cannot be equal), so too changing one bit of the input to $F_k(\cdot)$, where k is uniform and unknown to an attacker, should yield an (almost) independent result. This implies that a one-bit change in the input should “affect” every bit of the output. (Note that this does not mean that all the output bits will be changed; that would be different behavior than one would expect for a random permutation. Rather, we just mean informally that each bit of the output is changed with probability roughly half.) This takes some work to achieve.
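The size estimate can be evaluated numerically. The sketch below uses the log-gamma function to compute $\log_2((2^l)!)$ without forming the factorial; the function name is a choice made here for illustration.

```python
import math


def bits_for_arbitrary_permutation(l):
    """log2((2^l)!): bits needed to index one permutation of the l-bit strings."""
    return math.lgamma(2 ** l + 1) / math.log(2)


for l in (8, 20, 50):
    exact = bits_for_arbitrary_permutation(l)
    print(l, f"{exact:.4g} bits", f"(compare l * 2^l = {l * 2 ** l:.4g})")
```

For l = 8 this gives roughly 1684 bits; for l = 20 it is already about 2 · 10^7 bits (a few megabytes), and for l = 50 it is on the order of 10^16 bits, which illustrates why such block lengths are impractical and infeasible, respectively, for an explicitly stored permutation.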
The confusion-diffusion paradigm. In addition to his work on perfect secrecy, Shannon also introduced a basic paradigm for constructing concise, random-looking permutations. The basic idea is to construct a random-looking permutation F with a large block length from many smaller random (or random-looking) permutations $\{f_i\}$ with small block length. Let us see how this works on the most basic level. Say we want F to have a block length of 128 bits. We can define F as follows: the key k for F will specify 16 permutations $f_1, \ldots, f_{16}$ that each have an 8-bit (1-byte) block length.3 Given an input $x \in \{0,1\}^{128}$, we parse it as 16 bytes $x_1 \cdots x_{16}$ and then set

\[ F_k(x) = f_1(x_1) \| \cdots \| f_{16}(x_{16}). \tag{6.2} \]

These round functions $\{f_i\}$ are said to introduce confusion into F.
3 An arbitrary permutation on 8 bits can be represented using $\log((2^8)!) \approx 1684$ bits, so the length of the key for F is about 16 · 1684 bits, or roughly 3.3 kbytes. This is much smaller than the $\approx 128 \cdot 2^{128}$ bits that would be required to specify an arbitrary permutation on 128 bits.
It should be immediately clear, however, that F as defined above will not be pseudorandom. Specifically, if x and x′ differ only in their first bit then Fk(x) and Fk(x′) will differ only in their first byte (regardless of the key k). In contrast, if F were a truly random permutation then changing the first bit of the input would be expected to affect all bytes of the output.
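The lack of diffusion in Equation (6.2) is easy to observe experimentally. In the sketch below, the key is a list of 16 random byte permutations generated from a seed; the names `make_key`, `F`, and `differing_bytes` are hypothetical helpers, not notation from the text.

```python
import random


def make_key(seed):
    """A key for F: 16 independent random permutations of {0, ..., 255}."""
    rng = random.Random(seed)
    perms = []
    for _ in range(16):
        p = list(range(256))
        rng.shuffle(p)
        perms.append(p)
    return perms


def F(key, x):
    """Equation (6.2): apply the i-th permutation to the i-th byte of the 16-byte block x."""
    return bytes(key[i][x[i]] for i in range(16))


def differing_bytes(a, b):
    return [i for i in range(len(a)) if a[i] != b[i]]


key = make_key(7)
x = bytes(range(16))
x_flipped = bytes([x[0] ^ 0x80]) + x[1:]      # flip the first bit of the input
print(differing_bytes(F(key, x), F(key, x_flipped)))   # [0]
```

Whatever key is chosen, only byte 0 of the output changes, so a distinguisher needs just two queries.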
For this reason, a diffusion step is introduced whereby the bits of the output are permuted, or “mixed,” using a mixing permutation. This has the effect of spreading a local change (e.g., a change in the first byte) throughout the entire block. The confusion/diffusion steps—together called a round—are repeated multiple times. This helps ensure that changing a single bit of the input will affect all the bits of the output.
As an example, a two-round block cipher following this approach would operate as follows. First, confusion is introduced by computing the intermediate result $f_1(x_1) \| \cdots \| f_{16}(x_{16})$ as in Equation (6.2). The bits of the result are then “shuffled,” or re-ordered, to give x′. Then $f'_1(x'_1) \| \cdots \| f'_{16}(x'_{16})$ is computed (where $x' = x'_1 \cdots x'_{16}$), using possibly different functions $f'_i$, and the bits of the result are permuted to give output x′′. The $\{f_i\}$, $\{f'_i\}$, and the mixing permutation(s) could be random and dependent on the key, as we have described above. In practice, however, they are specially designed and fixed, and the key is incorporated in a different way, as we will describe below.
Substitution-permutation networks. A substitution-permutation network (SPN) can be viewed as a direct implementation of the confusion-diffusion paradigm. The difference is that now the round functions have a particular form rather than being chosen from the set of all possible permutations on some domain. Specifically, rather than having (a portion of) the key k specify an arbitrary permutation f, we instead fix a public “substitution function” (i.e., permutation) S called an S-box, and then let k define the function f given by f(x) = S(k ⊕ x).
To see how this works concretely, consider an SPN with a 64-bit block length based on a collection of 8-bit (1-byte) S-boxes S1,…,S8. (See Figure 6.3.) Evaluating the cipher proceeds in a series of rounds, where in each round we apply the following sequence of operations to the 64-bit input x of that round (the input to the first round is just the input to the cipher):
1. Key mixing: Set x := x ⊕ k, where k is the current-round sub-key;
2. Substitution: Set $x := S_1(x_1) \| \cdots \| S_8(x_8)$, where $x_i$ is the ith byte of x;
3. Permutation: Permute the bits of x to obtain the output of the round.
The output of each round is fed as input to the next round. After the last round there is a final key-mixing step, and the result is the output of the cipher. (By Kerckhoffs’ principle, we assume the S-boxes and the mixing permutation(s) are public and known to any attacker. This means that without a final key-mixing step, the last substitution and permutation steps would offer no additional security since they do not depend on the key.) Figure 6.4 shows
FIGURE 6.3: A single round of a substitution-permutation network.
the high-level structure of an SPN with a 16-bit block length and a different set of 4-bit S-boxes used in each round.
Different sub-keys (or round keys) are used in each round. The actual key of the block cipher is sometimes called the master key. The round sub-keys are derived from the master key according to a key schedule. The key schedule is often simple and may work by just taking different subsets of the bits of the master key, although more complex schedules can also be defined. An r-round SPN has r (full) rounds of key mixing, S-box substitution, and application of a mixing permutation, followed by a final key-mixing step. (This means that in an r-round SPN, r + 1 sub-keys are used.)
Any SPN is invertible (given the key). To see this, we show that given the output of the SPN and the key it is possible to recover the input. It suffices to show that a single round can be inverted; this implies the entire SPN can be inverted by working from the final round back to the beginning. But inverting a single round is easy: the mixing permutation can easily be inverted since it is just a re-ordering of bits. Since the S-boxes are permutations (i.e., one-to-one), these too can be inverted. The result can then be XORed with the appropriate sub-key to obtain the original input. Therefore:
PROPOSITION 6.3 Let F be a keyed function defined by an SPN in which the S-boxes are all permutations. Then regardless of the key schedule and the number of rounds, Fk is a permutation for any k.
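A toy SPN makes the round structure and its inversion concrete. The sketch below uses a 16-bit block, one 4-bit S-box applied to each nibble, and a bit transposition as the mixing permutation. The S-box table, the permutation, and the sub-key values are arbitrary illustrative choices (not taken from the text or from any standardized cipher), and the sub-keys are supplied directly rather than derived from a master key.

```python
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(v) for v in range(16)]


def substitute(x, box):
    """Apply a 4-bit S-box to each of the four nibbles of the 16-bit value x."""
    return sum(box[(x >> (4 * i)) & 0xF] << (4 * i) for i in range(4))


def permute(x):
    """Mixing permutation: bit j of nibble i moves to bit i of nibble j.

    This transposition is its own inverse.
    """
    y = 0
    for p in range(16):
        if (x >> p) & 1:
            y |= 1 << ((p % 4) * 4 + p // 4)
    return y


def encrypt(subkeys, x):
    """r = len(subkeys) - 1 full rounds, followed by a final key-mixing step."""
    for k in subkeys[:-1]:
        x = permute(substitute(x ^ k, SBOX))    # key mixing, substitution, permutation
    return x ^ subkeys[-1]


def decrypt(subkeys, y):
    """Undo the final key mixing, then invert the rounds from last to first."""
    y ^= subkeys[-1]
    for k in reversed(subkeys[:-1]):
        y = substitute(permute(y), INV_SBOX) ^ k
    return y


subkeys = [0x3A94, 0xD63F, 0x1B2C, 0x77E0]      # 3 rounds use 3 + 1 sub-keys
c = encrypt(subkeys, 0x1234)
print(hex(c), hex(decrypt(subkeys, c)))
```

Decryption never needs to "solve" anything: each step of a round is a bijection with a public inverse, which is the content of Proposition 6.3.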
FIGURE 6.4: A substitution-permutation network.
The number of rounds, along with the exact choices of the S-boxes, mixing permutations, and key schedule, are what ultimately determine whether a given block cipher is trivially breakable or highly secure. We now discuss a basic principle behind the design of the S-boxes and mixing permutations.
The avalanche effect. As noted repeatedly, an important property in any block cipher is that a small change in the input must “affect” every bit of the output. We refer to this as the avalanche effect. One way to induce the avalanche effect in a substitution-permutation network is to ensure that the following two properties hold (and sufficiently many rounds are used):
1. The S-boxes are designed so that changing a single bit of the input to an S-box changes at least two bits in the output of the S-box.
2. The mixing permutations are designed so that the output bits of any given S-box are used as input to multiple S-boxes in the next round.
To see how this yields the avalanche effect, at least heuristically, assume that the S-boxes are all such that changing a single bit of the input of the S-box results in a change in exactly two bits of the output of the S-box, and that the mixing permutations are chosen as required above. For concreteness, assume the S-boxes have input/output size of 8 bits, and that the block length of the cipher is 128 bits. Consider now what happens when the block cipher is applied to two inputs that differ in a single bit:
1. After the first round, the intermediate values differ in exactly two bit-positions. This is because XORing the current sub-key maintains the 1-bit difference in the intermediate values, and so the inputs to all the S-boxes except one are identical. In the one S-box where the inputs differ, the output of the S-box causes a 2-bit difference. The mixing permutation applied to the results changes the positions of these differences, but maintains a 2-bit difference.
2. The mixing permutation applied at the end of the first round spreads the two bit-positions where the intermediate results differ into two different S-boxes in the second round. This remains true even after the appropriate sub-key is XORed with the result of the previous round. So, in the second round there are now two S-boxes that receive inputs differing by a single bit. Thus, at the end of the second round the intermediate values differ in 4 bits.
3. Continuing the same argument, we expect 8 bits of the intermediate value to be affected after the 3rd round, 16 bits to be affected after the 4th round, and all 128 bits of the output to be affected at the end of the 7th round.
The last point is not quite precise and it is certainly possible that there will be fewer differences than expected at the end of some round. (In fact, this must be the case because the outputs should not differ in all their bits, either.) For this reason, it is customary to use many more than 7 rounds. However, the above analysis gives a lower bound on the number of rounds: if fewer than 7 rounds are used then there must be some set of output bits that are not affected by a single-bit change in the input, implying that it will be possible to distinguish the cipher from a random permutation.
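The growth of a one-bit difference can be observed on a toy SPN. The sketch below assumes a 16-bit block with 4-bit S-boxes; the S-box table, the mixing permutation, and the random sub-keys are illustrative choices only, and no claim is made that this S-box satisfies the two design properties listed above. It reports the average number of output bits that change when one input bit is flipped, as a function of the number of rounds.

```python
import random

SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def substitute(x):
    """Apply the 4-bit S-box to each nibble of the 16-bit value x."""
    return sum(SBOX[(x >> (4 * i)) & 0xF] << (4 * i) for i in range(4))


def permute(x):
    """Bit j of nibble i moves to bit i of nibble j."""
    y = 0
    for p in range(16):
        if (x >> p) & 1:
            y |= 1 << ((p % 4) * 4 + p // 4)
    return y


def encrypt(subkeys, x):
    for k in subkeys[:-1]:
        x = permute(substitute(x ^ k))
    return x ^ subkeys[-1]


def average_flipped_bits(rounds, trials=500, seed=0):
    """Average Hamming distance between outputs for inputs differing in one bit."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        subkeys = [rng.randrange(1 << 16) for _ in range(rounds + 1)]
        x = rng.randrange(1 << 16)
        x_flipped = x ^ (1 << rng.randrange(16))
        total += bin(encrypt(subkeys, x) ^ encrypt(subkeys, x_flipped)).count("1")
    return total / trials


for r in range(1, 6):
    print(r, average_flipped_bits(r))
```

With a single round the difference is confined to the outputs of one S-box, so at most 4 of the 16 output bits can change; for a cipher behaving like a random permutation one would expect the average to approach half the block length as rounds are added.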
One might expect that the “best” way to design S-boxes would be to choose them at random (subject to the restriction that they are permutations). Interestingly, this turns out not to be the case, at least if we want to satisfy the design criterion mentioned earlier. Consider the case of an S-box operating on 4-bit inputs and let x and x′ be two distinct values. Let y = S(x), and now consider choosing uniform y′ ≠ y as the value of S(x′). There are 4 strings that differ from y in only 1 bit, and so with probability 4/15 we will choose y′ that does not differ from y in two or more bits. The problem is compounded when we consider all pairs of inputs that differ in a single bit.
We conclude based on this example that, as a general rule, the S-boxes must be designed carefully rather than being chosen blindly at random. Random S-boxes are also not good for defending against attacks like the ones we will show in Section 6.2.6.
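The probability 4/15 in the example can be checked by direct enumeration. The following is a minimal sketch; the helper names are ours, and fixing y = 0 is an assumption that loses no generality because the count is the same for every y.

```python
from fractions import Fraction


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def prob_one_bit_difference(n_bits: int = 4) -> Fraction:
    """Probability that a uniform y' != y differs from y in exactly one bit."""
    y = 0  # any fixed y gives the same count
    others = [v for v in range(2 ** n_bits) if v != y]
    close = [v for v in others if hamming(y, v) == 1]
    return Fraction(len(close), len(others))


print(prob_one_bit_difference(4))  # 4 of the 15 possible values of y'
```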
If a block cipher should also be strongly pseudorandom, then the avalanche effect must also apply to its inverse. That is, changing a single bit of the output should affect every bit of the input. For this it is useful if the S-boxes are designed so that changing a single bit of the output of an S-box changes at least two bits of the input to the S-box. Achieving the avalanche effect in both directions is another reason for further increasing the number of rounds.
Attacking Reduced-Round SPNs
Experience, along with many years of cryptanalytic effort, indicates that substitution-permutation networks are a good choice for constructing pseudorandom permutations as long as care is taken in the choice of the S-boxes, the mixing permutations, and the key schedule. The Advanced Encryption Standard, described in Section 6.2.5, is similar in structure to the substitution-permutation network described above, and is widely believed to be a strong pseudorandom permutation.
The strength of a cipher F constructed in this way depends heavily on the number of rounds. In order to obtain more insight into substitution-permutation networks, we will demonstrate attacks on SPNs having very few rounds. These attacks are straightforward, but are worth seeing as they demonstrate conclusively why a large number of rounds is needed.
A trivial case. We first consider a trivial case where F consists of one full round and no final key-mixing step. We show that an adversary given only a single input/output pair (x,y) can easily learn the secret key k for which y = Fk(x). The adversary begins with the output value y and then inverts the mixing permutation and the S-boxes. It can do this, as noted before, because the full specification of the mixing permutation and the S-boxes is public. The intermediate value that the adversary computes is exactly x ⊕ k (assuming, without loss of generality, that the master key is used as the sub-key in the only round of the network). Since the adversary also knows the input x, it can immediately derive the secret key k. This is therefore a complete break.
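The trivial attack can be illustrated on a toy cipher. The sketch below assumes a 16-bit block, four copies of one 4-bit S-box, and a simple bit transposition as the mixing permutation; these parameters and all names are hypothetical choices made for illustration only.

```python
# Public components of the toy SPN (hypothetical, chosen only for illustration).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8, 0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(v) for v in range(16)]
PERM = [(i % 4) * 4 + i // 4 for i in range(16)]   # bit i moves to position PERM[i]
INV_PERM = [PERM.index(i) for i in range(16)]


def substitute(x: int, box: list) -> int:
    """Apply a 4-bit S-box to each of the four nibbles of x."""
    return sum(box[(x >> (4 * j)) & 0xF] << (4 * j) for j in range(4))


def permute(x: int, perm: list) -> int:
    """Move bit i of x to position perm[i]."""
    return sum(((x >> i) & 1) << perm[i] for i in range(16))


def one_round(k: int, x: int) -> int:
    """One full round (key mixing, S-boxes, mixing permutation), no final key mixing."""
    return permute(substitute(x ^ k, SBOX), PERM)


def recover_key(x: int, y: int) -> int:
    """Invert the public steps to obtain x XOR k, then XOR with the known input x."""
    return substitute(permute(y, INV_PERM), INV_SBOX) ^ x


secret = 0xBEEF
x = 0x1234
print(hex(recover_key(x, one_round(secret, x))))
```

A single input/output pair suffices because every step after the key mixing is public and invertible.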
Although this is a trivial attack, it demonstrates that in any substitution-permutation network there is no security gained by performing S-box substitution or applying a mixing permutation after the final sub-key mixing.
Attacking a one-round SPN. Now we have one full round followed by a key-mixing step. For concreteness, we assume a 64-bit block length and S-boxes with 8-bit (1-byte) input/output length. We assume independent 64-bit sub-keys k1,k2 are used for the two key-mixing steps, and so the master key k1∥k2 of the SPN is 128 bits long.
A first observation is that we can extend the attack from the trivial case above to give a key-recovery attack here using much less than $2^{128}$ work. The idea is as follows: Given a single input/output pair (x, y) as before, the attacker enumerates over all possible values for the second-round sub-key k2. For each such value, the attacker can invert the final key-mixing step to get a candidate intermediate value y′. We have seen above that given an input x and an output y′ of a (full) SPN round, a unique possible sub-key k1 can be easily identified. Thus, for each possible choice of k2 the attacker derives a unique corresponding k1 for which k1∥k2 can be the master key. In this way, the attacker obtains (in $2^{64}$ time) a list of $2^{64}$ possibilities for the master key. These can be narrowed down using additional input/output pairs in roughly $2^{64}$ additional time; see also below.
A better attack is possible by noting that individual bits of the output depend on only part of the master key. Fix some given input/output pair (x, y) as before. Now, the adversary will enumerate over all possible values for the first byte of k2. It can XOR each such value with the first byte of y to obtain a candidate value for the output of the first S-box. Inverting this S-box, the attacker learns a candidate value for the input to that S-box. Since the input to that S-box is the XOR of 8 bits of x and 8 bits of k1 (where the positions of those bits depend on the first-round mixing permutation and are known to the attacker), this yields a candidate value for 8 bits of k1.
To summarize: for each candidate value of the first byte of k2, there is a unique possible corresponding value for some 8 bits of k1. Put differently, this means that for some 16 bits of the master key, the attacker has reduced the number of possible values for those bits from $2^{16}$ to $2^8$. The attacker can tabulate all those feasible values in $2^8$ time. This can be repeated for each byte of k2, giving 8 lists, each containing $2^8$ values, that together characterize the possible values of the entire master key. The attacker has thus reduced the number of possible master keys to $(2^8)^8 = 2^{64}$, as in the earlier attack. The total time to do this, however, is now $8 \cdot 2^8 = 2^{11}$, a dramatic improvement.
The attacker can use additional input/output pairs to further reduce the space of possible keys. Consider the list of $2^8$ feasible values for some set of 16 bits of the master key. The attacker knows that the correct value from that list must be consistent with any additional input/output pairs the attacker learns. Heuristically, any incorrect value from the list is consistent with some additional input/output pair (x′,y′) with probability no better than random guessing; since each 16-bit value from the table can be used to compute 1 byte of the output given the input x′, we expect that an incorrect value will be consistent with the actual output y′ with probability $2^{-8}$. A small number of additional input/output pairs will thus suffice to narrow down all the tables to just a single value each, at which point the entire master key is known.
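The byte-by-byte attack can be sketched on a toy instance with the stated parameters (64-bit block, 8-bit S-boxes, independent 64-bit sub-keys). The S-box and mixing permutation below are hypothetical, generated from a fixed seed, and all names are ours. As a simplifying assumption of this sketch, the attacker first undoes the public mixing permutation on y, so that each guess concerns the 8 bits of k2 that meet the output of one S-box.

```python
import random

_rng = random.Random(2015)                 # fixed seed: reproducible toy components
SBOX = list(range(256))
_rng.shuffle(SBOX)                         # hypothetical public 8-bit S-box
INV_SBOX = [SBOX.index(v) for v in range(256)]
PERM = list(range(64))
_rng.shuffle(PERM)                         # hypothetical public bit permutation
INV_PERM = [PERM.index(i) for i in range(64)]


def substitute(x: int) -> int:
    """Apply the S-box to each of the eight bytes of x."""
    return sum(SBOX[(x >> (8 * j)) & 0xFF] << (8 * j) for j in range(8))


def permute(x: int, perm: list) -> int:
    """Move bit i of x to position perm[i]."""
    return sum(((x >> i) & 1) << perm[i] for i in range(64))


def encrypt(k1: int, k2: int, x: int) -> int:
    """One full round under k1 followed by a final key-mixing step with k2."""
    return permute(substitute(x ^ k1), PERM) ^ k2


def candidates(x: int, y: int, j: int) -> set:
    """The 2^8 pairs (byte j of k1, byte j of un-permuted k2) consistent with (x, y)."""
    xb = (x >> (8 * j)) & 0xFF
    ub = (permute(y, INV_PERM) >> (8 * j)) & 0xFF   # S-box output XOR key byte
    return {(INV_SBOX[ub ^ guess] ^ xb, guess) for guess in range(256)}


def recover(pairs: list) -> tuple:
    """Intersect the per-byte candidate lists over all pairs; return (k1, k2)."""
    k1 = k2_unpermuted = 0
    for j in range(8):
        cands = candidates(*pairs[0], j)
        for x, y in pairs[1:]:
            cands &= candidates(x, y, j)
        if len(cands) != 1:
            raise ValueError("more input/output pairs are needed")
        (a, g), = cands
        k1 |= a << (8 * j)
        k2_unpermuted |= g << (8 * j)
    return k1, permute(k2_unpermuted, PERM)
```

Each call to `candidates` costs $2^8$ S-box inversions, so one pair costs $8 \cdot 2^8 = 2^{11}$ in total. Note that additional pairs narrow a list only if their inputs differ in the byte in question.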
There is an important lesson to be learned here. The attack is possible since different parts of the key can be isolated from other parts. Thus, further diffusion is needed to make sure that all the bits of the key affect all of the bits of the output. Multiple rounds are needed for this to take place.
Attacking a two-round SPN. It is possible to extend the above ideas to give a better-than-brute-force attack on a two-round SPN using independent sub-keys in each round; we leave this as an exercise.
Instead, we simply note that a two-round SPN will not be a good pseudorandom permutation. Here we rely on the fact, mentioned earlier, that the avalanche effect does not occur after only two rounds (of course, this depends on the block length of the cipher and the input/output length of the S-boxes, but with reasonable parameters this will be the case). An attacker can distinguish a two-round SPN from a uniform permutation if it learns the result of evaluating the SPN on two inputs that differ in a single bit: in a two-round SPN many bits of the two outputs will be the same, something not expected to occur for a random permutation.
6.2.2 Feistel Networks
Feistel networks offer another approach for constructing block ciphers. An advantage of Feistel networks over substitution-permutation networks is that the underlying functions used in a Feistel network (in contrast to the S-boxes used in SPNs) need not be invertible. A Feistel network thus gives a way to construct an invertible function from non-invertible components. This is important because a good block cipher should have “unstructured” behavior (so it looks random), yet requiring all the components of a construction to be invertible inherently introduces structure. Requiring invertibility also introduces an additional constraint on S-boxes, making them harder to design.
A Feistel network operates in a series of rounds. In each round, a keyed round function is applied in the manner described below. Round functions need not be invertible. They will typically be constructed from components like S-boxes and mixing permutations, but a Feistel network can deal with any round functions irrespective of their design.
In a balanced Feistel network (the only type we will consider), the ith round function $\hat{f}_i$ takes as input a sub-key $k_i$ and an $l/2$-bit string and outputs an $l/2$-bit string. As in the case of SPNs, a master key k is used to derive sub-keys for each round. When some master key is chosen, thereby determining each sub-key $k_i$, we define $f_i : \{0,1\}^{l/2} \to \{0,1\}^{l/2}$ via $f_i(R) \stackrel{\text{def}}{=} \hat{f}_i(k_i, R)$. Note that the round functions $\hat{f}_i$ are fixed and publicly known, but the $f_i$ depend on the master key and so are not known to the attacker.
The ith round of a Feistel network operates as follows. The input to the round is divided into two halves denoted Li−1 and Ri−1 (the “left” and “right” halves, respectively). If the block length of the cipher is l bits, then Li−1 and
Ri−1 each has length l/2. The output (Li,Ri) of the round is
Li := Ri−1 and Ri := Li−1 ⊕ fi(Ri−1). (6.3)
In an r-round Feistel network, the l-bit input to the network is parsed as (L0,R0), and the output is the l-bit value (Lr,Rr) obtained after applying all r rounds. A three-round Feistel network is shown in Figure 6.5.
FIGURE 6.5: A three-round Feistel network.
Inverting a Feistel network. A Feistel network is invertible regardless of the $\{f_i\}$ (and thus regardless of the round functions $\{\hat{f}_i\}$). To show this we need only show that each round of the network can be inverted if the $\{f_i\}$ are known. Given the output (Li,Ri) of the ith round, we can compute (Li−1, Ri−1) as follows: first set Ri−1 := Li. Then compute
Li−1 := Ri ⊕ fi(Ri−1).
This gives the value (Li−1,Ri−1) that was the input of this round (i.e., it computes the inverse of Equation (6.3)). Note that fi is evaluated only in the forward direction, so it need not be invertible. We thus have:
PROPOSITION 6.4 Let F be a keyed function defined by a Feistel network. Then regardless of the round functions $\{\hat{f}_i\}$ and the number of rounds, $F_k$ is an efficiently invertible permutation for all k.
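The round rule of Equation (6.3) and its inverse translate directly into code. The sketch below is illustrative only: the round function is a truncated SHA-256 hash, a hypothetical choice made because it is clearly not invertible, and the names are ours.

```python
import hashlib


def round_fn(sub_key: bytes, half: bytes) -> bytes:
    """Hypothetical keyed round function; truncating a hash makes it non-invertible."""
    return hashlib.sha256(sub_key + half).digest()[:len(half)]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


def feistel_encrypt(sub_keys: list, block: bytes) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for k in sub_keys:
        # L_i := R_{i-1},  R_i := L_{i-1} XOR f_i(R_{i-1})
        left, right = right, xor(left, round_fn(k, right))
    return left + right


def feistel_decrypt(sub_keys: list, block: bytes) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for k in reversed(sub_keys):
        # R_{i-1} := L_i,  L_{i-1} := R_i XOR f_i(R_{i-1})
        left, right = xor(right, round_fn(k, left)), left
    return left + right


keys = [b"k1", b"k2", b"k3"]
ciphertext = feistel_encrypt(keys, b"8 bytes!")
print(feistel_decrypt(keys, ciphertext))
```

Decryption evaluates `round_fn` only in the forward direction, with the sub-keys taken in reverse order.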
As in the case of substitution-permutation networks, attacks on Feistel networks are possible when the number of rounds is too low. We will see such attacks when we discuss DES in the next section. Theoretical results concerning the security of Feistel networks are discussed in Section 7.6.
6.2.3 DES – The Data Encryption Standard
The Data Encryption Standard, or DES, was developed in the 1970s by IBM (with help from the National Security Agency) and adopted in 1977 as
a Federal Information Processing Standard for the US. In its basic form, DES is no longer considered secure due to its short key length of 56 bits, which makes it vulnerable to brute-force attacks. Nevertheless, it remains in wide use today in the strengthened form of triple-DES, described in Section 6.2.4.
DES is of great historical significance. It has undergone intensive scrutiny within the cryptographic community, arguably more than any other cryptographic algorithm in history. The common consensus is that, apart from its key length, DES is an extremely well-designed cipher. Indeed, even after many years, the best known attack on DES in practice is an exhaustive search over all $2^{56}$ possible keys. (As we will see, there are important theoretical attacks on DES that require less computation; however, these attacks assume certain conditions that seem difficult to realize in practice.)
In this section, we provide a high-level overview of the main components of DES. We stress that we will not provide a full specification that is correct in every detail, and some parts of the design will be omitted from our description. Our aim is to present the basic ideas underlying the construction of DES, and not all the low-level details; the reader interested in such details can consult the references at the end of this chapter.
The Design of DES
The DES block cipher is a 16-round Feistel network with a block length of 64 bits and a key length of 56 bits. The same round function $\hat{f}$ is used in each of the 16 rounds. The round function takes a 48-bit sub-key and, as expected for a (balanced) Feistel network, a 32-bit input (namely, half a block). The key schedule of DES is used to derive a sequence of 48-bit sub-keys k1, ..., k16 from the 56-bit master key. The key schedule of DES is relatively simple, with each sub-key ki being a permuted subset of 48 bits of the master key. For our purposes, it suffices to note that the 56 bits of the master key are divided into two halves, a “left half” and a “right half,” each containing 28 bits. (This division occurs after an initial permutation is applied to the key, but we ignore this in our description.) In each round, the left-most 24 bits of the sub-key are taken as some subset of the 28 bits in the left half of the master key, and the right-most 24 bits of the round sub-key are taken as some subset of the 28 bits in the right half of the master key. We stress that the entire key schedule (including the manner in which bits are divided into the left and right halves, and which bits are used in forming each sub-key ki) is fixed and public, and the only secret is the master key itself.
The DES round function. The DES round function $\hat{f}$, sometimes called the DES mangler function, is constructed using a paradigm we have previously analyzed: it is (essentially) just a substitution-permutation network! In more detail, computation of $\hat{f}(k_i, R)$ with $k_i \in \{0,1\}^{48}$ and $R \in \{0,1\}^{32}$ proceeds as follows: first, R is expanded to a 48-bit value R′. This is carried out by simply duplicating half the bits of R; we denote this by R′ := E(R) where
FIGURE 6.6: The DES mangler function.
E is called the expansion function. Following this, computation proceeds exactly as in our earlier discussion of SPNs: The expanded value R′ is XORed with ki, which is also 48 bits long, and the resulting value is divided into 8 blocks, each of which is 6 bits long. Each block is passed through a (different) S-box that takes a 6-bit input and yields a 4-bit output; concatenating the output from the 8 S-boxes gives a 32-bit result. A mixing permutation is then applied to the bits of this result to obtain the final output. See Figure 6.6.
One difference as compared to our original discussion of SPNs is that the S-boxes here are not invertible; indeed, they cannot be invertible since their inputs are longer than their outputs. Further discussion regarding the structural details of the S-boxes is given below.
We stress once again that everything in the above description (including the S-boxes themselves as well as the mixing permutation) is publicly known. The only secret is the master key which is used to derive all the sub-keys.
The S-boxes and the mixing permutation. The eight S-boxes that form the “core” of $\hat{f}$ are a crucial element of the DES construction and were very carefully designed. Studies of DES have shown that if the S-boxes were slightly modified, DES would have been much more vulnerable to attack.
This should serve as a warning to anyone who wishes to design a block cipher: seemingly arbitrary choices are not arbitrary at all, and if not made correctly may render the entire construction insecure.
Recall that each S-box maps a 6-bit input to a 4-bit output. Each S-box can be viewed as a table with 4 rows and 16 columns, where each cell of the table contains a 4-bit entry. A 6-bit input can be viewed as indexing one of the $2^6 = 64 = 4 \times 16$ cells of the table in the following way: The first and last input bits are used to choose the table row, and bits 2–5 are used to choose the table column. The 4-bit entry at some position of the table represents the output value for the input associated with that position.
The DES S-boxes have the following properties (among others):
1. Each S-box is a 4-to-1 function. (That is, exactly 4 inputs are mapped to each possible output.) This follows from the properties below.
2. Each row in the table contains each of the 16 possible 4-bit strings exactly once.
3. Changing one bit of any input to an S-box always changes at least two bits of the output.
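The row/column indexing rule and the first two properties can be made concrete in a short sketch. The table below is the first DES S-box as it appears in the published standard; readers should check it against the standard before relying on it. The function name is ours.

```python
from collections import Counter

# First DES S-box (S1): 4 rows, 16 columns, 4-bit entries.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(table: list, six_bits: int) -> int:
    """Index the table: first and last input bits pick the row, bits 2-5 the column."""
    row = (((six_bits >> 5) & 1) << 1) | (six_bits & 1)
    col = (six_bits >> 1) & 0xF
    return table[row][col]


# Input 011011: row bits are 0 and 1 (row 1), column bits are 1101 (column 13).
print(sbox_lookup(S1, 0b011011))

# Property 2: every row contains each 4-bit value exactly once.
print(all(sorted(row) == list(range(16)) for row in S1))

# Property 1: each output has exactly 4 preimages among the 64 inputs.
print(Counter(sbox_lookup(S1, v) for v in range(64)))
```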
The mixing permutation was also designed carefully. In particular it has the property that the four output bits from any S-box will affect the input to six S-boxes in the next round. (This is possible because of the expansion function that is applied in the next round before the S-boxes are computed.)
The DES avalanche effect. The design of the mangler function ensures that DES exhibits a strong avalanche effect. In order to see this, we will trace the difference between the intermediate values in a DES computation of two inputs that differ by just a single bit. Let us denote the two inputs to the cipher by (L0, R0) and (L′0, R0′ ), where we assume that R0 = R0′ and so the single-bit difference occurs in the left half of the inputs (it may help to refer to Equation (6.3) and Figure 6.6 in what follows). After the first round the intermediate values (L1, R1) and (L′1, R1′ ) still differ by only a single bit, although now this difference is in the right half. In the second round of DES,
the right half of each input is run through $\hat{f}$. Assuming that the bit where R1
and R1′ differ is not duplicated in the expansion step, the intermediate values before applying the S-boxes still differ by only a single bit. By property 3 of the S-boxes, the intermediate values after the S-box computation differ in at least two bits. The result is that the intermediate values (L2,R2) and (L′2, R2′ ) differ in three bits: there is a 1-bit difference between L2 and L′2 (carried over from the difference between R1 and R1′ ) and a 2-bit difference between R2 and R2′ .
The mixing permutation spreads the two-bit difference between R2 and R2′ such that, in the following round, each of the two bits is used as input to a different S-box, resulting in a difference of at least 4 bits in the right halves of the intermediate values. (If either or both of the two bits in which R2 and R2′
differ are duplicated by E, the difference may be even greater.) There is also now a 2-bit difference in the left halves. As with a substitution-permutation network, we have an exponential effect and so after 7 rounds we expect all 32 bits in the right half to be affected (and after 8 rounds all 32 bits in the left half will be affected as well).
DES has 16 rounds, and so the avalanche effect occurs very early in the computation. This ensures that the computation of DES on similar inputs yields independent-looking outputs.
Attacks on Reduced-Round DES
A useful exercise for understanding more about the DES construction and its security is to look at the behavior of DES with only a few rounds. We will show attacks on one-, two-, and three-round variants of DES (recall that the real DES has 16 rounds). DES with three rounds or fewer cannot be a pseudorandom function because three rounds are not enough for the avalanche effect to occur. Thus, we will be interested in demonstrating more difficult (and more damaging) key-recovery attacks which compute the key k using only a relatively small number of input/output pairs computed using that key. Some of the attacks are similar to those we have seen in the context of substitution-permutation networks; here, however, we will see how they are applied to a concrete block cipher rather than to an abstract design.
The attacks below will be known-plaintext attacks in which the adversary knows some plaintext/ciphertext pairs {(xi, yi)} with yi = DESk(xi) for some secret key k. When we describe the attacks, we will focus on a particular input/output pair (x,y) and will describe the information about the key that the adversary can derive from this pair. Continuing to use the notation developed earlier, we denote the left and right halves of the input x as L0 and R0, respectively, and let Li,Ri denote the left and right halves after the ith round. Recall that E denotes the DES expansion function, ki denotes the sub-key used in round i, and $f_i(R) = \hat{f}(k_i, R)$ denotes the actual function being applied in the Feistel network in the ith round.
One-round DES. Say we are given an input/output pair (x,y). In one-round DES, we have y = (L1,R1), where L1 = R0 and R1 = L0 ⊕ f1(R0). We therefore know an input/output pair for f1; specifically, we know that f1(R0) = R1 ⊕ L0. By applying the inverse of the mixing permutation to the output R1 ⊕ L0, we obtain the intermediate value consisting of the outputs from all the S-boxes, where the first 4 bits are the output from the first S-box, the next 4 bits are the output from the second S-box, and so on.
Consider the (known) 4-bit output of the first S-box. Since each S-box is a 4-to-1 function, this means there are exactly four possible inputs to this S-box that would result in the given output, and similarly for all the other S-boxes; each such input is 6 bits long. The input to the S-boxes is simply the XOR of E(R0) with the sub-key k1. Since R0, and hence E(R0), is known, we can
compute a set of four possible values for each 6-bit portion of k1. This means we have reduced the number of possible keys k1 from $2^{48}$ to $4^{48/6} = 4^8 = 2^{16}$ (since there are four possibilities for each of the eight 6-bit portions of k1). This is already a small number and so we can just try all the possibilities on a different input/output pair (x′, y′) to find the right key. We thus obtain the key using only two known plaintexts in time roughly $2^{16}$.
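The step that yields four candidates per 6-bit portion of k1 can be sketched for a single S-box. The table is the first DES S-box as published in the standard (readers should check it before reuse); the example values for the key portion and for the corresponding 6 bits of E(R0) are arbitrary, and the helper names are ours.

```python
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(table: list, six_bits: int) -> int:
    """First and last input bits select the row; the middle four bits select the column."""
    row = (((six_bits >> 5) & 1) << 1) | (six_bits & 1)
    return table[row][(six_bits >> 1) & 0xF]


def key_portion_candidates(table: list, expanded_bits: int, sbox_output: int) -> list:
    """All 6-bit key portions k with S(expanded_bits XOR k) == sbox_output."""
    return [inp ^ expanded_bits for inp in range(64)
            if sbox_lookup(table, inp) == sbox_output]


true_portion = 0b101010            # arbitrary secret 6 bits of k1
expanded = 0b011011                # arbitrary known 6 bits of E(R0)
observed = sbox_lookup(S1, expanded ^ true_portion)
candidates = key_portion_candidates(S1, expanded, observed)
print(len(candidates), true_portion in candidates)
```

Repeating this for all eight S-boxes leaves $4^8 = 2^{16}$ candidate sub-keys, which a second input/output pair can be used to test.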
Two-round DES. In two-round DES, the output y is equal to (L2, R2) where
L1 = R0
R1 = L0 ⊕ f1(R0)
L2 = R1 = L0 ⊕ f1(R0)
R2 = L1 ⊕ f2(R1).
L0,R0,L2, and R2 are known from the given input/output pair (x,y), and thus we also know L1 = R0 and R1 = L2. This means that we know the input/output of both f1 and f2, and so the same method used in the attack on one-round DES can be used here to determine both k1 and k2 in time roughly $2 \cdot 2^{16}$. This attack works even if k1 and k2 are completely independent keys, although in fact the key schedule of DES ensures that many of the bits of k1 and k2 are equal (which can be used to further speed up the attack).
Three-round DES. Referring to Figure 6.5, the output value y is now equal to (L3,R3). Since L1 = R0 and R2 = L3, the only unknown values in the figure are R1 and L2 (which are equal).
Now we no longer have the input/output to any round function fi. For example, the output value of f2 is equal to L1 ⊕ R2, where both of these values are known. However, we do not know the value R1 that is input to f2. Similarly, we can determine the inputs to f1 and f3 but not the outputs of those functions. Thus, the attack we used to break one-round and two-round DES will not work here.
Instead of relying on full knowledge of the input and output of one of the round functions, we will use knowledge of a certain relation between the inputs and outputs of f1 and f3. Observe that the output of f1 is equal to L0 ⊕ R1 = L0 ⊕ L2, and the output of f3 is equal to L2 ⊕ R3. Therefore,
f1(R0) ⊕ f3(R2) = (L0 ⊕ L2) ⊕ (L2 ⊕ R3) = L0 ⊕ R3,
where both L0 and R3 are known. That is, the XOR of the outputs of f1 and f3 is known. Furthermore, the input to f1 is R0 and the input to f3 is L3, both of which are known. We conclude that we can determine the inputs to f1 and f3, and the XOR of their outputs. We now describe an attack that finds the secret key based on this information.
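The relation between the outputs of f1 and f3 holds for any round functions, which can be checked on a generic three-round Feistel network. In the sketch below the round function is a truncated hash standing in for the DES mangler function; this stand-in and all names are assumptions made for illustration.

```python
import hashlib


def f(sub_key: int, half: int) -> int:
    """Hypothetical 32-bit round function keyed by a 48-bit sub-key."""
    data = sub_key.to_bytes(6, "big") + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def three_round_feistel(sub_keys: tuple, left: int, right: int) -> tuple:
    """Apply L_i := R_{i-1}, R_i := L_{i-1} XOR f_i(R_{i-1}) for each sub-key."""
    for k in sub_keys:
        left, right = right, left ^ f(k, right)
    return left, right


keys = (0x111111111111, 0x222222222222, 0x333333333333)
L0, R0 = 0x01234567, 0x89ABCDEF
L3, R3 = three_round_feistel(keys, L0, R0)

# The inputs R0 and L3 (= R2) are known to the attacker, as is L0 XOR R3.
print(f(keys[0], R0) ^ f(keys[2], L3) == L0 ^ R3)
```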
Recall that the key schedule of DES has the property that the master key is divided into a “left half,” which we denote by kL, and a “right half” kR, each containing 28 bits. Furthermore, the 24 left-most bits of the sub-key used in
each round are taken only from kL, and the 24 right-most bits of each sub-key are taken only from kR. This means that kL affects only the inputs to the first four S-boxes in any round, while kR affects only the inputs to the last four S-boxes. Since the mixing permutation is known, we also know which bits of the output of each round function come out of each S-box.
The idea behind the attack is to separately traverse the key space for each half of the master key, giving an attack with complexity roughly $2 \cdot 2^{28}$ rather than complexity $2^{56}$. Such an attack will be possible if we can verify a guess of half the master key, and we now show how this can be done. Say we guess some value for kL, the left half of the master key. We know the input R0 of f1, and so using our guess of kL we can compute the input to the first four S-boxes. This means that we can compute half the output bits of f1 (the mixing permutation spreads out the bits we know, but since the mixing permutation is known we know exactly which bits these are). Likewise, we can compute the same locations in the output of f3 by using the known input L3 to f3 and the same guess for kL. Finally, we can compute the XOR of these output values and check whether they match the appropriate bits in the known value of the XOR of the outputs of f1 and f3. If they are not equal, then our guess for kL is incorrect. A correct guess for kL will always pass this test, and so will not be eliminated, but an incorrect guess is expected to pass this test only with probability roughly $2^{-16}$ (since we check equality of 16 bits in two computed values). There are $2^{28}$ possible values for kL, so if each incorrect value remains a viable candidate with probability $2^{-16}$ then we expect to be left with only $2^{28} \cdot 2^{-16} = 2^{12}$ possibilities for kL after the above.
By performing the above for each half of the master key, we obtain in time $2 \cdot 2^{28}$ approximately $2^{12}$ candidates for the left half and $2^{12}$ candidates for the right half. Since each combination of the left half and right half is possible, we have $2^{24}$ candidate keys overall and can run a brute-force search over this set using an additional input/output pair (x′,y′). (An alternative approach which is more efficient is to simply repeat the previous attack using the $2^{12}$ remaining candidates for each half-key.) The time complexity of the attack is roughly $2 \cdot 2^{28} + 2^{24} < 2^{30}$, which is much less than $2^{56}$.
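For reference, the arithmetic behind these estimates can be collected in one place. The counts are the heuristic expectations stated above, not exact values.

```latex
\begin{align*}
\text{expected survivors per half-key:}\quad & 2^{28}\cdot 2^{-16} = 2^{12},\\
\text{combined candidate keys:}\quad & 2^{12}\cdot 2^{12} = 2^{24},\\
\text{total work:}\quad & 2\cdot 2^{28} + 2^{24} = 2^{29} + 2^{24} < 2^{30},\\
\text{speed-up over exhaustive search:}\quad & 2^{56}/2^{30} = 2^{26}\ \text{(at least)}.
\end{align*}
```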
Security of DES
After almost 30 years of intensive study, the best known practical attack on DES is still an exhaustive search through its key space. (We discuss some important theoretical attacks in Section 6.2.6. These attacks require a large number of input/output pairs, which would be difficult to obtain in an attack on any real-world system using DES.) Unfortunately, the 56-bit key length of DES is short enough that an exhaustive search through all $2^{56}$ possible keys is now feasible. Already in the late 1970s there were strong objections to the choice of such a short key for DES. Back then, the objection was theoretical, as the computational power needed to search through that many
keys was generally unavailable.4 The practicality of a brute-force attack on DES, however, was demonstrated in 1997 when the first of a set of DES challenges set up by RSA Security was solved by the DESCHALL project using thousands of computers coordinated across the Internet; the computation took 96 days. A second challenge was broken the following year in just 41 days by the distributed.net project. A significant breakthrough came in 1998 when the third challenge was solved in just 56 hours. This impressive feat was achieved via a special-purpose DES-breaking machine called Deep Crack that was built by the Electronic Frontier Foundation at a cost of $250,000. In 1999, a DES challenge was solved in just over 22 hours as a combined effort of Deep Crack and distributed.net. The current state-of-the-art is the DES cracking box by PICO Computing, which uses 48 FPGAs and can find a DES key in approximately 23 hours.
The time/space tradeoffs discussed in Section 5.4.3 show that exhaustive key-search attacks can be accelerated using pre-computation and additional memory. Due to the short key length of DES, time/space tradeoffs can be especially effective. Specifically, using pre-processing it is possible to generate a table a few terabytes large that then enables recovery of a DES key with high probability from a single input/output pair using approximately $2^{38}$ DES evaluations (which can be computed in mere minutes). The bottom line is that DES has a key that is far too short, and cannot be considered secure for any serious application today.
A secondary cause for concern is the relatively short block length of DES. A short block length is problematic because the concrete security of many constructions based on block ciphers depends on the block length, even if the cipher used is “perfect.” For example, the proof of security for CTR mode (cf. Theorem 3.32) shows that even when a completely random function is used an attacker can break the security of this encryption scheme with probability $2q^2/2^l$ if it obtains q plaintext/ciphertext pairs. In the case of DES where l = 64, this means that if an attacker obtains only $q = 2^{30}$ plaintext/ciphertext pairs, security is compromised with high probability. Obtaining plaintext/ciphertext pairs is relatively easy if an adversary eavesdrops on the encryption of messages containing known headers, redundancies, etc.
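To get a feel for the numbers, the expression $2q^2/2^l$ can be evaluated exactly. The sketch below simply plugs values into that expression; the function name is ours, and the 128-bit comparison is included only to show how strongly the value depends on the block length.

```python
from fractions import Fraction


def ctr_bound(q: int, block_bits: int) -> Fraction:
    """Evaluate 2*q^2 / 2^l, capped at 1 since it is used as a probability."""
    return min(Fraction(1), Fraction(2 * q * q, 2 ** block_bits))


print(ctr_bound(2 ** 30, 64))    # block length of DES
print(ctr_bound(2 ** 30, 128))   # a 128-bit block, for comparison
```

With l = 64 and q = $2^{30}$ the expression evaluates to $2^{-3}$, which is far from negligible; with a 128-bit block the same q gives $2^{-67}$.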
The insecurity of DES has nothing to do with its design per se, but rather is due to its short key length (and, to a lesser extent, its short block length). This is a great tribute to the designers of DES, who seem to have succeeded in constructing an almost “perfect” block cipher (besides its too-short key). Since DES itself seems not to have significant structural weaknesses, it makes sense to use DES as a building block for constructing block ciphers with longer keys. We discuss this further in Section 6.2.4.
4In 1977, it was estimated that a computer that could crack DES in one day would cost $20 million to build.
The replacement for DES, the Advanced Encryption Standard (AES) covered later in this chapter, was explicitly designed to address concerns regarding the short key length and block length of DES. AES supports 128-, 192-, or 256-bit keys, and a block length of 128 bits.
Better-than-brute-force attacks on DES were first shown in the early 1990s by Biham and Shamir, who developed a technique called differential cryptanalysis. Their attack takes time $2^{37}$ and requires $2^{47}$ chosen plaintexts. While the attack was a breakthrough from a theoretical standpoint, it does not appear to be of much practical concern since it is hard to imagine a realistic scenario where an adversary can obtain this many encryptions of chosen plaintexts.
Interestingly, the work of Biham and Shamir indicated that the DES S-boxes had been specifically designed to be resistant to differential cryptanalysis, suggesting that the technique of differential cryptanalysis was known (but not publicly revealed) by the designers of DES. After Biham and Shamir announced their result, this suspicion was confirmed.
Linear cryptanalysis was developed by Matsui in the mid-1990s and was also applied successfully to DES. The advantage of this attack is that it uses known plaintexts rather than chosen plaintexts. Nevertheless, the number of plaintext/ciphertext pairs required (about $2^{43}$) is still huge.
We briefly describe differential and linear cryptanalysis in Section 6.2.6.
6.2.4 3DES: Increasing the Key Length of a Block Cipher
The main weakness of DES is its short key. It thus makes sense to try to design a block cipher with a larger key length using DES as a building block. Some approaches to doing so are discussed in this section. Although we refer to DES frequently throughout the discussion, and DES is the most prominent block cipher to which these techniques have been applied, everything we say here applies generically to any block cipher.
Internal modifications vs. “black-box” constructions. There are two general approaches one could take to constructing another cipher based on DES. The first approach would be to somehow modify the internal structure of DES, while increasing the key length. For example, one could leave the round function untouched and simply use a 128-bit master key with a different key schedule (still choosing a 48-bit sub-key in each round). Or, one could change the S-boxes themselves and use a larger sub-key in each round. The disadvantage of such approaches is that by modifying DES, in even the smallest way, we lose the confidence we have gained in DES by virtue of the fact that it has remained resistant to attack for so many years. Cryptographic constructions are very sensitive, and even mild, seemingly insignificant changes can render a construction completely insecure.$^5$ Changing the internals of a block cipher is therefore not recommended.
$^5$ In fact, various results to this effect have been shown for DES; e.g., changing the S-boxes or the mixing permutation can make DES much more vulnerable to attack.
An alternative approach that does not suffer from the above problem is to use DES as a “black box” and not touch its internal structure at all. In this approach we treat DES as a “perfect” block cipher with a 56-bit key, and construct a new block cipher that only invokes the original, unmodified DES. Since DES itself is not tampered with, this is a much more prudent approach, and is the one we will pursue here.
Double Encryption
Let $F$ be a block cipher with an $n$-bit key length and $l$-bit block length. Then a new block cipher $F'$ with a key of length $2n$ can be defined by
$$F'_{k_1,k_2}(x) \stackrel{\text{def}}{=} F_{k_2}(F_{k_1}(x)),$$
where $k_1$ and $k_2$ are independent keys. For the case where $F$ is DES, we obtain a cipher $F'$ called 2DES that takes a 112-bit key; if exhaustive key search were the best available attack, a key length of 112 bits would be sufficient since an attack requiring time $2^{112}$ is completely out of reach. Unfortunately, we now show an attack on $F'$ that runs in time roughly $2^n$, significantly less than the $2^{2n}$ time one would hope would be necessary to carry out an exhaustive search for a $2n$-bit key. This means that the new block cipher is essentially no better than the old one, even though it has a key that is twice as long.$^6$
The attack is called a “meet-in-the-middle attack,” for reasons that will soon become clear. Say the adversary is given a single input/output pair $(x, y)$, where $y = F'_{k_1^*,k_2^*}(x) = F_{k_2^*}(F_{k_1^*}(x))$ for unknown $k_1^*, k_2^*$. The adversary can narrow down the set of possible keys in the following way:
1. For each $k_1 \in \{0,1\}^n$, compute $z := F_{k_1}(x)$ and store $(z, k_1)$ in a list $L$.
2. For each $k_2 \in \{0,1\}^n$, compute $z := F^{-1}_{k_2}(y)$ and store $(z, k_2)$ in a list $L'$.
3. Entries $(z_1, k_1) \in L$ and $(z_2, k_2) \in L'$ are a match if $z_1 = z_2$. For each such match, add $(k_1, k_2)$ to a set $S$. (Matches can be found easily after sorting $L$ and $L'$ by their first components.)
See Figure 6.7 for a graphical depiction of the attack.
FIGURE 6.7: A meet-in-the-middle attack.
$^6$ This is not quite true since a brute-force attack on $F$ can be carried out in time $2^n$ and constant memory, whereas the attack we show on $F'$ requires $2^n$ time and $2^n$ memory. Nevertheless, the attack illustrates that $F'$ does not achieve the desired level of security.
The attack takes time $O(n \cdot 2^n)$, and requires space $O((n + l) \cdot 2^n)$. The set $S$ output by this algorithm contains exactly those values $(k_1, k_2)$ for which
$$F_{k_1}(x) = F^{-1}_{k_2}(y) \qquad (6.4)$$
or, equivalently, for which $y = F'_{k_1,k_2}(x)$. In particular, $(k_1^*, k_2^*) \in S$. On the other hand, a pair $(k_1, k_2) \neq (k_1^*, k_2^*)$ is (heuristically) expected to satisfy Equation (6.4) with probability $2^{-l}$ if we treat $F_{k_1}(x)$ and $F^{-1}_{k_2}(y)$ as uniform $l$-bit strings, and so the expected size of $S$ is $2^{2n} \cdot 2^{-l} = 2^{2n-l}$. Using another few input/output pairs, and taking the intersection of the sets that are obtained, the correct $(k_1^*, k_2^*)$ can be identified with very high probability.
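The three steps of the meet-in-the-middle attack can be sketched in Python on a toy cipher. Everything about the cipher below (the names `F`, `F_inv`, `round_key`, the constants, the 8-bit key and 16-bit block) is a hypothetical stand-in chosen only so that the search finishes instantly; it is not DES and has no security. The sketch also uses a hash table in place of the sorted lists $L$ and $L'$, which is an implementation choice and not part of the description above.

```python
# Toy 16-bit block cipher with an 8-bit key (hypothetical, NOT secure).
MASK = 0xFFFF
MULT = 0x9E37                       # odd, hence invertible modulo 2**16
MULT_INV = pow(MULT, -1, 1 << 16)   # modular inverse (Python 3.8+)

def rotl(v, r):
    return ((v << r) | (v >> (16 - r))) & MASK

def rotr(v, r):
    return ((v >> r) | (v << (16 - r))) & MASK

def round_key(k, r):
    # Assumed toy key schedule: repeat the key byte and add a round constant.
    return (k * 0x0101 + r * 0x1357) & MASK

def F(k, x):
    for r in range(4):
        x = rotl(((x ^ round_key(k, r)) * MULT) & MASK, 7)
    return x

def F_inv(k, y):
    for r in reversed(range(4)):
        y = ((rotr(y, 7) * MULT_INV) & MASK) ^ round_key(k, r)
    return y

def double_encrypt(k1, k2, x):
    return F(k2, F(k1, x))

def meet_in_the_middle(x, y, key_bits=8):
    """Return the set S of all (k1, k2) with F(k1, x) == F_inv(k2, y)."""
    forward = {}                                  # step 1: z -> list of k1
    for k1 in range(1 << key_bits):
        forward.setdefault(F(k1, x), []).append(k1)
    matches = set()
    for k2 in range(1 << key_bits):               # step 2: decrypt y under k2
        z = F_inv(k2, y)
        for k1 in forward.get(z, []):             # step 3: collect matches
            matches.add((k1, k2))
    return matches

secret = (0x3A, 0xC5)
pairs = [(x, double_encrypt(*secret, x)) for x in (0x1234, 0xBEEF, 0x0F0F)]
# Intersect the candidate sets obtained from a few input/output pairs.
candidates = set.intersection(*(meet_in_the_middle(x, y) for x, y in pairs))
print(sorted(candidates))
```

Each call performs about $2 \cdot 2^8$ cipher evaluations instead of the $2^{16}$ needed to try every key pair, mirroring the $2^n$ versus $2^{2n}$ gap discussed above.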
Triple Encryption
The obvious generalization of the preceding approach is to apply the block cipher three times in succession. Two variants of this approach are common:
Variant 1: three keys. Choose three independent keys $k_1, k_2, k_3$ and define
$$F''_{k_1,k_2,k_3}(x) \stackrel{\text{def}}{=} F_{k_3}(F^{-1}_{k_2}(F_{k_1}(x))).$$
Variant 2: two keys. Choose two independent keys $k_1, k_2$ and then define
$$F''_{k_1,k_2}(x) \stackrel{\text{def}}{=} F_{k_1}(F^{-1}_{k_2}(F_{k_1}(x))).$$
Before comparing the security of the two alternatives we note that the middle invocation of $F$ is reversed. If $F$ is a sufficiently good cipher this makes no difference as far as security is concerned, since if $F$ is a strong pseudorandom permutation then $F^{-1}$ must be too. The reason for reversing the second application of $F$ is to obtain backward compatibility: if one sets $k_1 = k_2 = k_3$, the resulting function is equivalent to a single invocation of $F$ using key $k_1$.
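The backward-compatibility property is easy to check exhaustively on a small example. The sketch below uses a hypothetical 8-bit keyed permutation (`F`, `F_inv`, and the constant 167 are illustrative assumptions, not any real cipher) and builds both triple-encryption variants from it.

```python
# Toy 8-bit keyed permutation (hypothetical, NOT secure).
ODD = 167                       # odd, so multiplication is invertible mod 256
ODD_INV = pow(ODD, -1, 256)     # modular inverse (Python 3.8+)

def F(k, x):
    return (((x ^ k) * ODD) + k) % 256

def F_inv(k, y):
    return ((((y - k) % 256) * ODD_INV) % 256) ^ k

def triple_3key(k1, k2, k3, x):
    # Variant 1: encrypt, decrypt, encrypt under three keys.
    return F(k3, F_inv(k2, F(k1, x)))

def triple_2key(k1, k2, x):
    # Variant 2: the same structure with k3 = k1.
    return triple_3key(k1, k2, k1, x)

# With k1 = k2 = k3 the middle decryption cancels the first encryption.
compatible = all(triple_3key(k, k, k, x) == F(k, x)
                 for k in range(256) for x in range(256))
print(compatible)
```

Had the middle step been a forward encryption, setting all keys equal would give $F_k(F_k(F_k(x)))$, which in general differs from $F_k(x)$.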
Security of the first variant. The key length of the first variant is $3n$, and so we might hope that the best attack on this cipher would require time $2^{3n}$. However, the cipher is susceptible to a meet-in-the-middle attack just as in the case of double encryption, though the attack now takes time $2^{2n}$. This is the best known attack. Thus, although this variant is not as secure as one might have hoped, it obtains sufficient security for all practical purposes even for $n = 56$ (assuming, of course, the original cipher $F$ has no weaknesses).
Security of the second variant. The key length of this variant is $2n$ and so the best we can hope for is security against attacks running in time $2^{2n}$. There is no known attack with better time complexity when the adversary is given only a small number of input/output pairs. (See Exercise 6.13 for an attack using $2^n$ chosen plaintexts.) Thus, two-key triple encryption is a reasonable choice in practice.
Triple-DES (3DES). Triple-DES (or 3DES) is based on a triple invocation of DES using two or three keys, as described above. 3DES was standardized in 1999, and is widely used today. Its main drawbacks are its relatively small block length and the fact that it is relatively slow since it requires 3 full block-cipher operations. Since the minimum recommended key length nowadays is 128 bits, 2-key 3DES is no longer recommended (due to its key length of only 112 bits). These drawbacks have led to the replacement of DES/triple-DES by the Advanced Encryption Standard, presented in the next section.
6.2.5 AES – The Advanced Encryption Standard
In January 1997, the United States National Institute of Standards and Technology (NIST) announced that it would hold a competition to select a new block cipher, to be called the Advanced Encryption Standard, or AES, to replace DES. The competition began with an open call for teams to submit candidate block ciphers for evaluation. A total of 15 different algorithms were submitted from all over the world, including contributions from many of the best cryptographers and cryptanalysts. Each team’s candidate cipher was intensively analyzed by members of NIST, the public, and (especially) the other teams. Two workshops were held, one in 1998 and one in 1999, to discuss and analyze the various submissions. Following the second workshop, NIST narrowed the field down to 5 “finalists” and the second round of the competition began. A third AES workshop was held in April 2000, inviting additional scrutiny on the five finalists. In October 2000, NIST announced that the winning algorithm was Rijndael (a block cipher designed by the Belgian cryptographers Vincent Rijmen and Joan Daemen), although NIST conceded that any of the 5 finalists would have made an excellent choice. In particular, no serious security vulnerabilities were found in any of the 5 finalists, and the selection of a “winner” was based in part on properties such as efficiency, performance in hardware, flexibility, etc.
The process of selecting AES was ingenious because any group that submitted an algorithm, and was therefore interested in having its algorithm adopted, had strong motivation to find attacks on the other submissions. In this way, the world’s best cryptanalysts focused their attention on finding even the slightest weaknesses in the candidate ciphers submitted to the competition. After only a few years each candidate algorithm was already subjected to intensive study, thus increasing our confidence in the security of the winning algorithm. Of course, the longer the algorithm is used and studied without being broken, the more our confidence will continue to grow. Today, AES is widely used and no significant security weaknesses have been discovered.
The AES construction. In this section, we present the high-level structure of Rijndael/AES. (Technically speaking, Rijndael and AES are not the same thing but the differences are unimportant for our discussion here.) As with DES, we will not present a full specification and our description should not be used as a basis for implementation. Our aim is only to provide a general idea of how the algorithm works.
The AES block cipher has a 128-bit block length and can use 128-, 192-, or 256-bit keys. The length of the key affects the key schedule (i.e., the sub-key that is used in each round) as well as the number of rounds, but does not affect the high-level structure of each round.
In contrast to DES, which uses a Feistel structure, AES is essentially a substitution-permutation network. During computation of the AES algorithm, a 4-by-4 array of bytes called the state is modified in a series of rounds. The state is initially set equal to the input to the cipher (note that the input is 128 bits, which is exactly 16 bytes). The following operations are then applied to the state in a series of four stages during each round:
Stage 1 – AddRoundKey: In every round of AES, a 128-bit sub-key is derived from the master key, and is interpreted as a 4-by-4 array of bytes. The state array is updated by XORing it with this sub-key.
Stage 2 – SubBytes: In this step, each byte of the state array is replaced by another byte according to a single fixed lookup table S. This substitution table (or S-box) is a bijection over $\{0,1\}^8$.
Stage 3 – ShiftRows: In this step, the bytes in each row of the state array are shifted to the left as follows: the first row of the array is untouched, the second row is shifted one place to the left, the third row is shifted two places to the left, and the fourth row is shifted three places to the left. All shifts are cyclic so that, e.g., in the second row the first byte becomes the fourth byte.
Stage 4 – MixColumns: In this step, an invertible transformation is applied to the four bytes in each column. (Technically speaking, this is a linear transformation—i.e., matrix multiplication—over an appropriate field.) This transformation has the property that if two inputs differ in b > 0 bytes, then applying the transformation yields two outputs differing in at least 5 − b bytes.
In the final round, MixColumns is replaced with AddRoundKey. This prevents an adversary from simply inverting the last three stages, which do not depend on the key.
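Two of the stages are simple enough to sketch directly. The Python fragment below illustrates only AddRoundKey and ShiftRows on a state held as a list of four rows; that row-based representation, the function names, and the sample values are assumptions made for illustration. SubBytes and MixColumns are omitted, and, in keeping with the caveat above, this is not a basis for implementing AES.

```python
def add_round_key(state, subkey):
    """XOR each byte of the 4-by-4 state with the matching sub-key byte."""
    return [[s ^ k for s, k in zip(state_row, key_row)]
            for state_row, key_row in zip(state, subkey)]

def shift_rows(state):
    """Cyclically shift row r of the state r places to the left."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def inverse_shift_rows(state):
    """Undo shift_rows by shifting row r of the state r places to the right."""
    return [row[-r:] + row[:-r] if r else list(row)
            for r, row in enumerate(state)]

# A sample state whose entries make the byte movement easy to follow.
state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
shifted = shift_rows(state)
for row in shifted:
    print(row)
```

ShiftRows involves no key material, so anyone can invert it; the same holds for SubBytes and MixColumns, which is why the cipher must end with a key-dependent AddRoundKey step.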
By viewing stages 3 and 4 together as a “mixing” step, we see that each round of AES has the structure of a substitution-permutation network: the round sub-key is first XORed with the input to the current round; next, a small, invertible function is applied to “chunks” of the resulting value; finally, the bits of the result are mixed in order to obtain diffusion. The only difference is that, unlike our previous description of substitution-permutation networks, here the mixing step does not consist of a simple shuffling of the bits but is instead carried out using a shuffling plus an invertible linear transformation. (Simplifying things a bit and looking at a trivial 3-bit example, shuffling the bits of $x = x_1 \| x_2 \| x_3$ might, e.g., map $x$ to $x' = x_2 \| x_1 \| x_3$. An invertible linear transformation might map $x$ to $x_1 \oplus x_2 \,\|\, x_2 \oplus x_3 \,\|\, x_1 \oplus x_2 \oplus x_3$.)
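The trivial 3-bit example can be checked by enumeration. The following sketch (function names are illustrative) confirms that both the bit shuffle and the linear map are bijections on 3-bit strings, and that the second map is linear with respect to XOR.

```python
from itertools import product

def shuffle(x1, x2, x3):
    # Reorders bits only: x1 || x2 || x3  ->  x2 || x1 || x3.
    return (x2, x1, x3)

def linear(x1, x2, x3):
    # Each output bit is an XOR of input bits.
    return (x1 ^ x2, x2 ^ x3, x1 ^ x2 ^ x3)

def xor(a, b):
    return tuple(u ^ v for u, v in zip(a, b))

inputs = list(product((0, 1), repeat=3))

# A map on 3-bit strings is invertible iff its 8 outputs are all distinct.
shuffle_is_bijection = len({shuffle(*x) for x in inputs}) == 8
linear_is_bijection = len({linear(*x) for x in inputs}) == 8

# Linearity over XOR: T(a xor b) == T(a) xor T(b) for all a, b.
linear_is_linear = all(linear(*xor(a, b)) == xor(linear(*a), linear(*b))
                       for a in inputs for b in inputs)

print(shuffle_is_bijection, linear_is_bijection, linear_is_linear)
```

Unlike the shuffle, the linear map lets a single flipped input bit change several output bits (flipping $x_2$ changes all three), which is the kind of diffusion the mixing step is meant to provide.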
The number of rounds depends on the key length. Ten rounds are used for a 128-bit key, 12 rounds for a 192-bit key, and 14 rounds for a 256-bit key.
Security of AES. As we have mentioned, the AES cipher was subject to intense scrutiny during the selection process and has continued to be studied ever since. To date, there are no practical cryptanalytic attacks that are significantly better than an exhaustive search for the key.
We conclude that, as of today, AES constitutes an excellent choice for any cryptographic scheme that requires a (strong) pseudorandom permutation. It is free, standardized, efficient, and highly secure.
6.2.6 *Differential and Linear Cryptanalysis
Block ciphers are relatively complicated, and as such are difficult to analyze. Nevertheless, one should not be fooled into thinking that a complicated cipher is difficult to break. On the contrary, it is very hard to construct a secure block cipher and surprisingly easy to find attacks on most constructions (no matter how complicated they appear). This should serve as a warning that non-experts should not try to construct new ciphers. Given the availability of triple-DES and AES, it is hard to justify using anything else.
In this section we describe two tools that are now a standard part of the cryptanalyst’s toolbox. Our goal here is to give a taste of some advanced cryptanalysis, as well as to reinforce the idea that designing a secure block cipher involves careful choice of its components.
Differential cryptanalysis. This technique, which can lead to a chosen-plaintext attack on a block cipher, was first presented in the late 1980s by Biham and Shamir, who used it to attack DES in 1993. The basic idea behind the attack is to tabulate specific differences in the input that lead to specific differences in the output with probability greater than would be expected for a random permutation. Specifically, say the differential $(\Delta_x, \Delta_y)$ occurs in some keyed permutation $F'$ with probability $p$ if for uniform inputs $x_1$ and $x_2$ satisfying $x_1 \oplus x_2 = \Delta_x$, and uniform choice of key $k$, the probability that $F'_k(x_1) \oplus F'_k(x_2) = \Delta_y$ is $p$. For any fixed $(\Delta_x, \Delta_y)$ and $x_1, x_2$ satisfying $x_1 \oplus x_2 = \Delta_x$, if we choose a uniform function $f : \{0,1\}^l \to \{0,1\}^l$, we have $\Pr[f(x_1) \oplus f(x_2) = \Delta_y] = 2^{-l}$. In a weak block cipher, however, there may be differentials that occur with significantly higher probability. This can be leveraged to give a full key-recovery attack, as we now show for SPNs.
We describe the basic idea, and then work through a concrete example. Let $F$ be a block cipher with $l$-bit block length which is an $r$-round SPN, and let $F'_k(x)$ denote the intermediate result in a computation of $F_k(x)$ after applying the key-mixing step of round $r$. (That is, $F'$ excludes the S-box substitution and mixing permutation of the last round, as well as the final key-mixing step.) Say there is a differential $(\Delta_x, \Delta_y)$ in $F'$ that occurs with probability $p \gg 2^{-l}$. It is possible to exploit this high-probability differential to learn bits of the final mixing sub-key $k_{r+1}$. The high-level idea is as follows: let $\{(x_1^i, x_2^i)\}_{i=1}^{L}$ be a collection of $L$ pairs of random inputs with differential $\Delta_x$, i.e., with $x_1^i \oplus x_2^i = \Delta_x$ for all $i$. Using a chosen-plaintext attack, obtain the values $y_1^i = F_k(x_1^i)$ and $y_2^i = F_k(x_2^i)$ for all $i$. Now, for all possible bitstrings $k^* \in \{0,1\}^l$, compute $\tilde{y}_1^i = F'_k(x_1^i)$ and $\tilde{y}_2^i = F'_k(x_2^i)$, assuming the value of the final sub-key $k_{r+1}$ is $k^*$. This is done by inverting the final key-mixing step using $k^*$, and then inverting the mixing permutation and S-boxes of round $r$, which do not depend on the master key. When $k^* = k_{r+1}$, we expect that a $p$-fraction of the pairs will satisfy $\tilde{y}_1^i \oplus \tilde{y}_2^i = \Delta_y$. On the other hand, when $k^* \neq k_{r+1}$ we may heuristically expect only a $2^{-l}$-fraction of the pairs to yield this differential. By setting $L$ large enough, the correct value of the final sub-key $k_{r+1}$ can be determined.
This works, but is not very efficient since in each step we enumerate over $2^l$ possible values. We can do better by guessing portions of $k_{r+1}$ at a time. More concretely, assume the S-boxes in $F$ have 1-byte input/output length, and focus on the first byte of $\Delta_y$, which we assume is nonzero. It is possible to verify if the differential holds in that byte by guessing only 8 bits of $k_{r+1}$, namely, the 8 bits that correspond (after the round-$r$ mixing permutation) to the output of the first S-box. Thus, proceeding as above, we can learn these 8 bits by enumerating over all possible values for those bits, and seeing which value yields the desired differential in the first byte with the highest probability. Incorrect guesses for those 8 bits yield the expected differential in that byte with (heuristic) probability $2^{-8}$, but the correct guess will give the expected differential with probability roughly $p + 2^{-8}$; this is because with probability $p$ the differential holds on the entire block (so in particular for the first byte), and when this is not the case then we can treat the differential in the first byte as random. Note that different differentials may be needed to learn different portions of $k_{r+1}$.
In practice, various optimizations are performed to improve the effectiveness of the above test or, more specifically, to increase the gap between the probability that an incorrect guess yields the differential vs. the probability that a correct guess does. One optimization is to use a low-weight differential in which $\Delta_y$ has many zero bytes in the positions that enter the S-boxes in round $r$. Any pairs $\tilde{y}_1, \tilde{y}_2$ satisfying such a differential have equal values entering many of the S-boxes in round $r$, and so will result in output values $y_1, y_2$ that are equal in the corresponding bit-positions (depending on the final mixing permutation). This means that when performing the test described earlier, one can simply discard any pairs $(y_1^i, y_2^i)$ that do not agree in those bit-positions (since the corresponding intermediate values $(\tilde{y}_1, \tilde{y}_2)$ cannot possibly satisfy the differential, for any choice of the final sub-key). This significantly improves the effectiveness of the attack.
Once $k_{r+1}$ is known, the attacker can “peel off” the final key-mixing step, as well as the mixing permutation and S-box substitution steps of round $r$ (since these do not depend on the master key), and then apply the same attack, using a different differential, to find the $r$th-round sub-key $k_r$, and so on, until it learns all sub-keys (or, equivalently, the entire master key).
A worked example. We work through a “toy” example, illustrating also how a good differential can be found. We use a four-round SPN with a block length of 16 bits, based on a single S-box with 4-bit input/output length. The S-box is defined as follows (the table shows how each 4-bit input is mapped to a 4-bit output):
Input:  0000 0001 0010 0011 0100 0101 0110 0111
Output: 0000 1011 0101 0001 0110 1000 1101 0100

Input:  1000 1001 1010 1011 1100 1101 1110 1111
Output: 1111 0111 0010 1100 1001 0011 1110 1010
The mixing permutation, showing where each bit is moved for each of the 16 bits in a block, is as follows:
In:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Out:  7  2  3  8 12  5 11  9 10  1 14 13  4  6 16 15
FIGURE 6.8: The effect of the input difference ∆x = 1111 in our S-box.
We first find a differential in the S-box. Let S(x) denote the output of the S-box on input x. Consider the differential ∆x = 1111. Then, for example, we have S(0000) ⊕ S(1111) = 0000 ⊕ 1010 = 1010 and so in this case a difference of 1111 in the inputs leads to a difference of 1010 in the outputs. Let us see if this relation holds frequently. We have S(0001) = 1011 and S(0001 ⊕ 1111) = S(1110) = 1110, and so here a difference of 1111 in the inputs does not lead to a difference of 1010 in the outputs. However, S(0100) = 0110 and S(0100 ⊕ 1111) = S(1011) = 1100 and so in this case, a difference of 1111 in the inputs yields a difference of 1010 in the outputs. In Figure 6.8 we have tabulated results for all possible inputs. We see that half the time a difference of 1111 in the inputs yields a difference of 1010 in the outputs. Thus, (1111, 1010) is a differential in S that occurs with probability 1/2.
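The tabulation for the input difference 1111 can be reproduced with a few lines of Python. The list `SBOX` transcribes the S-box table given above; the function and variable names are illustrative.

```python
# The example S-box from the text: SBOX[x] is the 4-bit output for input x.
SBOX = [0x0, 0xB, 0x5, 0x1, 0x6, 0x8, 0xD, 0x4,
        0xF, 0x7, 0x2, 0xC, 0x9, 0x3, 0xE, 0xA]

def difference_rows(dx):
    """For every input x, list x, x ^ dx, both S-box outputs, and their XOR."""
    return [(x, x ^ dx, SBOX[x], SBOX[x ^ dx], SBOX[x] ^ SBOX[x ^ dx])
            for x in range(16)]

rows = difference_rows(0b1111)
for x, x2, s, s2, dy in rows:
    print(f"{x:04b} {x2:04b}  {s:04b} {s2:04b}  {dy:04b}")

# Count the inputs for which the output difference is 1010.
hits = sum(1 for row in rows if row[4] == 0b1010)
print(hits, "of 16 inputs give output difference 1010")
```

Because each unordered pair $\{x, x \oplus \Delta_x\}$ is visited twice, every count produced this way is even.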
FIGURE 6.9: Differentials in our S-box.
This same process can be carried out for all $2^4$ input differences $\Delta_x$ to calculate the probability of every differential. Namely, for each pair $(\Delta_x, \Delta_y)$ we tabulate the number of 4-bit inputs $x$ for which $S(x) \oplus S(x \oplus \Delta_x) = \Delta_y$. We have done this for our example S-box in Figure 6.9. (For conciseness we represent $(\Delta_x, \Delta_y)$ using hexadecimal notation.) The table should be read as follows: entry $(i, j)$ counts how many inputs with difference $i$ map to outputs with difference $j$. Observe, for example, that there are 8 inputs with difference 0xF = 1111 that map to output 0xA = 1010, as we have shown above. This is the highest-probability differential (apart from the trivial differential (0, 0)). But there are other differentials of interest: an input difference of 0x4 = 0100 maps to an output difference of 0x6 = 0110 with probability 6/16 = 3/8, and there are several differentials with probability 4/16 = 1/4.
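Output Difference ∆y (columns), Input Difference ∆x (rows)

      0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
 0   16  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 1    0  0  0  0  4  0  0  0  2  2  2  2  0  0  4  0
 2    0  0  0  0  0  2  0  2  0  2  2  4  2  2  0  0
 3    0  2  2  4  0  4  0  0  0  0  0  0  0  2  2  0
 4    0  0  0  2  2  2  6  0  2  0  0  0  2  0  0  0
 5    0  2  2  0  0  0  0  0  4  0  0  0  4  2  2  0
 6    0  2  0  2  0  0  0  0  0  2  0  2  0  4  0  4
 7    0  2  0  0  2  4  2  2  0  2  0  0  0  2  0  0
 8    0  0  0  2  0  0  0  2  0  0  0  2  2  2  2  4
 9    0  2  0  2  2  2  0  4  0  2  2  0  0  0  0  0
 A    0  0  4  0  2  0  2  4  2  0  2  0  0  0  0  0
 B    0  0  2  0  0  0  2  0  0  2  0  0  4  2  4  0
 C    0  0  0  0  0  0  0  0  4  4  0  4  0  0  0  4
 D    0  4  2  2  0  0  2  2  0  0  0  0  0  0  0  4
 E    0  2  4  2  4  0  0  0  0  0  0  0  2  0  2  0
 F    0  0  0  0  0  2  2  0  2  0  8  2  0  0  0  0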
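The whole table of Figure 6.9 can be recomputed from the S-box. In the sketch below, `SBOX` transcribes the example S-box from the text and `difference_table` is an illustrative name; entry `[dx][dy]` counts the inputs `x` with `SBOX[x] ^ SBOX[x ^ dx] == dy`.

```python
# The example S-box from the text: SBOX[x] is the 4-bit output for input x.
SBOX = [0x0, 0xB, 0x5, 0x1, 0x6, 0x8, 0xD, 0x4,
        0xF, 0x7, 0x2, 0xC, 0x9, 0x3, 0xE, 0xA]

def difference_table(sbox):
    """Return table[dx][dy] = number of x with sbox[x] ^ sbox[x ^ dx] == dy."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table

DDT = difference_table(SBOX)

print("    " + " ".join(f"{dy:2X}" for dy in range(16)))
for dx, row in enumerate(DDT):
    print(f"{dx:2X}  " + " ".join(f"{count:2d}" for count in row))

# Largest entry outside the trivial row dx = 0.
best = max((DDT[dx][dy], dx, dy) for dx in range(1, 16) for dy in range(16))
print(best)
```

Every row sums to 16, since each of the 16 inputs contributes to exactly one output difference; dividing an entry by 16 gives the probability of the corresponding differential in $S$.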
[Figure 6.10 annotations: input difference 0000 1100 0000 0000; round 1: 0xC = 1100 maps to 0x8 = 1000 w.p. 1/4; round 2: 0x1 = 0001 maps to 0x4 = 0100 w.p. 1/4; round 3: 0x8 = 1000 maps to 0xF = 1111 w.p. 1/4; resulting difference 0110 0011 0000 0000.]
FIGURE 6.10: Tracing differentials through a four-round SPN that uses the S-box and mixing permutation given in the text.
We now extend this to find a good differential for the first three rounds of the SPN. Consider evaluating the SPN on two inputs that have a differential of 0000 1100 0000 0000, and tracing the differential between the intermediate values at each step of this evaluation. (Refer to Figure 6.10, which shows the four full rounds of the SPN plus the final key-mixing step. For clarity, the figure omits the mixing permutation in the 4th round; that mixing permutation just has the effect of shuffling the bits of the differential, and so can easily be taken into account in the attack.) The key-mixing step in the first round does not affect the differential, and so the inputs to the second S-box in the first round have differential 1100. We see from Figure 6.9 that a difference of 0xC = 1100 in the inputs to the S-box yields a difference of 0x8 = 1000 in the outputs of the S-box with probability 1/4. So with probability 1/4 the differential in the output of the 2nd S-box after round 1 is a single bit which is moved by the mixing permutation from the 5th position to the 12th position. (The inputs to the other S-boxes are equal, so their outputs are equal and the differential of the outputs is 0000.) Assuming this to be the case, the input difference to the third S-box in the second round is 0x1 = 0001 (once again, the key-mixing step in the second round does not affect the differential); using Figure 6.9 we have that with probability 1/4 the output difference from that S-box is 0x4 = 0100. Thus, once again there is just a single output bit that is different, and it is moved from the 10th position to the first position by the mixing permutation. Finally, consulting Figure 6.9 yet again, we see that an input difference of 0x8 = 1000 to the S-box results in an output difference of 0xF = 1111 with probability 1/4. The bits in positions 1, 2, 3, and 4 are then moved by the mixing permutation to positions 7, 2, 3, and 8.
Overall, then, we see that an input difference of $\Delta_x$ = 0000 1100 0000 0000 yields the output difference $\Delta_y$ = 0110 0011 0000 0000 after three rounds with probability at least $\frac{1}{4} \cdot \frac{1}{4} \cdot \frac{1}{4} = \frac{1}{64}$.$^7$ (We multiply the probabilities since we heuristically assume independence of each round.) For a random function, the probability that any given differential occurs is just $2^{-16} = 1/65536$. Thus, the differential we have found occurs with probability significantly higher than what would be expected for a random function. Observe also that we have found a low-weight differential.
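The three-round trail can be traced mechanically. The sketch below assumes bit positions are numbered 1 to 16 from the left (so position 1 is the most significant bit of a 16-bit integer), transcribes the S-box and mixing permutation from the text, and takes as input the difference chosen after the S-boxes in each round. Function names are illustrative.

```python
from fractions import Fraction

SBOX = [0x0, 0xB, 0x5, 0x1, 0x6, 0x8, 0xD, 0x4,
        0xF, 0x7, 0x2, 0xC, 0x9, 0x3, 0xE, 0xA]
# PERM[i - 1] is the position to which the bit in position i is moved.
PERM = [7, 2, 3, 8, 12, 5, 11, 9, 10, 1, 14, 13, 4, 6, 16, 15]

def ddt_count(dx, dy):
    """Number of 4-bit inputs x with SBOX[x] ^ SBOX[x ^ dx] == dy."""
    return sum(1 for x in range(16) if SBOX[x] ^ SBOX[x ^ dx] == dy)

def permute(diff):
    """Apply the mixing permutation to a 16-bit difference."""
    out = 0
    for i, target in enumerate(PERM, start=1):
        if (diff >> (16 - i)) & 1:
            out |= 1 << (16 - target)
    return out

def nibbles(diff):
    return [(diff >> shift) & 0xF for shift in (12, 8, 4, 0)]

def follow_trail(dx, after_sboxes):
    """Trace a trail; after_sboxes[r] is the difference chosen after round r's S-boxes."""
    prob = Fraction(1)
    diff = dx
    for chosen in after_sboxes:
        for d_in, d_out in zip(nibbles(diff), nibbles(chosen)):
            prob *= Fraction(ddt_count(d_in, d_out), 16)  # heuristic independence
        diff = permute(chosen)
    return diff, prob

dy, prob = follow_trail(0x0C00, [0x0800, 0x0040, 0xF000])
print(f"{dy:016b}", prob)
```

Inactive S-boxes contribute a factor of 1, since a zero input difference always gives a zero output difference. The product is only a lower bound on the true probability of the differential, for the reason given in footnote 7.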
We can use this differential to find the first 8 bits of the final sub-key $k_5$. As discussed earlier, we begin by letting $\{(x_1^i, x_2^i)\}_{i=1}^{L}$ be a set of $L$ pairs of random inputs with differential $\Delta_x$. Using a chosen-plaintext attack, we then obtain the values $y_1^i = F_k(x_1^i)$ and $y_2^i = F_k(x_2^i)$ for all $i$. Now, for all possible values for the initial 8 bits of $k_5$, we compute the initial 8 bits of $\tilde{y}_1^i, \tilde{y}_2^i$, the intermediate values after the key-mixing step of the 4th round. (We can do this because we only need to invert the two left-most S-boxes of the 4th round in order to derive those 8 bits.) When we guess the correct value for the initial 8 bits of $k_5$, we expect the 8-bit differential 0110 0011 to occur with probability at least 1/64. Heuristically, an incorrect guess yields the expected differential only with probability $2^{-8} = 1/256$. By setting $L$ large enough, we can (with high probability) identify the correct value.
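The key-recovery step can be sketched end to end on the toy SPN. Several details below are assumptions made for illustration and are not fixed by the text: the five sub-keys are taken to be independent 16-bit values (no key schedule is specified), round 4 has no mixing permutation (following the simplification in Figure 6.10), and pairs whose ciphertexts differ in the last 8 bits are discarded, which is the low-weight filtering optimization described earlier. All function names are hypothetical.

```python
import random

SBOX = [0x0, 0xB, 0x5, 0x1, 0x6, 0x8, 0xD, 0x4,
        0xF, 0x7, 0x2, 0xC, 0x9, 0x3, 0xE, 0xA]
INV_SBOX = [SBOX.index(v) for v in range(16)]
PERM = [7, 2, 3, 8, 12, 5, 11, 9, 10, 1, 14, 13, 4, 6, 16, 15]

def substitute(state, box):
    return sum(box[(state >> s) & 0xF] << s for s in (12, 8, 4, 0))

def permute(state):
    out = 0
    for i, target in enumerate(PERM, start=1):
        if (state >> (16 - i)) & 1:
            out |= 1 << (16 - target)
    return out

def f_prime(subkeys, x):
    """Three full rounds followed by the key-mixing step of round 4."""
    for r in range(3):
        x = permute(substitute(x ^ subkeys[r], SBOX))
    return x ^ subkeys[3]

def encrypt(subkeys, x):
    """Round-4 S-boxes (no mixing permutation) and the final key mixing."""
    return substitute(f_prime(subkeys, x), SBOX) ^ subkeys[4]

def undo_last_round_top_byte(y, guess):
    """First 8 bits of the value entering the round-4 S-boxes, under a key guess."""
    u = (y >> 8) ^ guess
    return (INV_SBOX[u >> 4] << 4) | INV_SBOX[u & 0xF]

def recover_top_byte(oracle, num_pairs, rng):
    dx, dy_top = 0x0C00, 0x63          # differential from the text
    counts = [0] * 256
    for _ in range(num_pairs):
        x1 = rng.randrange(1 << 16)
        y1, y2 = oracle(x1), oracle(x1 ^ dx)
        if (y1 ^ y2) & 0x00FF:         # filter: last two S-boxes must be inactive
            continue
        for guess in range(256):
            if (undo_last_round_top_byte(y1, guess)
                    ^ undo_last_round_top_byte(y2, guess)) == dy_top:
                counts[guess] += 1
    return max(range(256), key=lambda g: counts[g])

if __name__ == "__main__":
    rng = random.Random(2015)
    subkeys = [rng.randrange(1 << 16) for _ in range(5)]
    guess = recover_top_byte(lambda x: encrypt(subkeys, x), 4000, rng)
    print(f"guessed {guess:02x}, actual {subkeys[4] >> 8:02x}")
```

With a few thousand pairs the correct byte would be expected to receive the most votes, but this is a heuristic expectation, not a guarantee; the outcome depends on the sub-keys and on the random inputs chosen.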
Differential attacks in practice. Differential cryptanalysis is very powerful, and has been used to attack real ciphers. A prominent example is FEAL-8, which was proposed as an alternative to DES in 1987. A differential attack on FEAL-8 was found that requires just 1,000 chosen plaintexts. In 1991, it took less than 2 minutes using this attack to find the entire key. Today, any proposed cipher is tested for resistance to differential cryptanalysis.
A differential attack was the first attack on DES to require less time than a simple brute-force search. While an interesting theoretical result, the attack is not of significant concern in practice since it requires $2^{47}$ chosen plaintexts. It is very difficult for an attacker to obtain this many chosen plaintext/ciphertext pairs in most real-world applications. Interestingly, small modifications to the S-boxes of DES make the cipher much more vulnerable to differential attacks. Personal testimony of the DES designers (after differential attacks were discovered in the outside world) has confirmed that the S-boxes of DES were designed specifically to thwart differential attacks.
$^7$ This is a lower bound on the probability of the differential, since there may be other differences in the intermediate values that result in the same difference in the outputs.
Linear cryptanalysis. Linear cryptanalysis was developed by Matsui in the early 1990s. We will only describe the technique at a high level. The basic idea is to consider linear relationships between the input and output that hold with higher probability than would be expected for a random function. In more detail, say that bit positions $i_1, \ldots, i_{\mathrm{in}}$ and $i'_1, \ldots, i'_{\mathrm{out}}$ have linear bias $\varepsilon$ if, for uniform $x$ and $k$, and $y \stackrel{\text{def}}{=} F_k(x)$, it holds that
$$\left| \Pr\left[x_{i_1} \oplus \cdots \oplus x_{i_{\mathrm{in}}} \oplus y_{i'_1} \oplus \cdots \oplus y_{i'_{\mathrm{out}}} = 0\right] - \frac{1}{2} \right| = \varepsilon,$$
where $x_i, y_i$ denote the $i$th bits of $x$ and $y$. For a random function and any fixed set of bit positions, we expect the bias to be close to 0. Matsui showed how to use a large enough bias in a cipher $F$ to find the secret key. Besides giving another method for attacking ciphers, an important feature of this attack is that it does not require chosen plaintexts, but rather known plaintexts suffice. This is very significant, since an encrypted file can provide a huge amount of known plaintext, whereas gathering encryptions of chosen plaintexts is much more difficult. Matsui showed that DES can be broken with just $2^{43}$ known plaintext/ciphertext pairs.
Impact on block-cipher design. Modern block ciphers are designed and evaluated based, in part, on their resistance to differential and linear cryptanalysis. When constructing a block cipher, designers choose S-boxes and other components so as to minimize differential probabilities and linear biases. We remark that it is not possible to eliminate all high-probability differentials in an S-box: any S-box will have some differential that occurs more frequently than others. Still, these deviations can be minimized. Moreover, increasing the number of rounds (and choosing the mixing permutation carefully) can both reduce the differential probabilities as well as make it more difficult for cryptanalysts to find any differentials to exploit.
6.3 Hash Functions
Recall from Chapter 5 that the primary security requirement for a hash function $H$ is collision resistance; that is, it should be difficult to find a collision, or distinct inputs $x, x'$ such that $H(x) = H(x')$. (We drop mention of any key here, since real-world hash functions are generally unkeyed.) If the hash function has $l$-bit output length, then the best we can hope for is that it should be infeasible to find a collision using substantially fewer than $2^{l/2}$ invocations of $H$. (See Section 5.4.1.) We would also like the hash function to achieve (second) preimage resistance against attacks running in time much less than $2^l$, although we do not consider such attacks in our discussion here.
Hash functions are generally constructed in two steps. First, a compression function (i.e., a fixed-length hash function) $h$ is designed; next, some mechanism is used to extend $h$ so as to handle arbitrary input lengths. In Section 5.2 we have already shown one approach, the Merkle–Damgård transform, for the second step. Here, we explore a technique for designing the underlying compression function. We also discuss some hash functions used in practice. A theoretical construction of a compression function based on a number-theoretic assumption is given in Section 8.4.2.
6.3.1 Hash Functions from Block Ciphers
Perhaps surprisingly, it is possible to build a collision-resistant compression function from a block cipher that satisfies certain additional properties. There are several ways to do this; one of the most common is via the Davies–Meyer construction. Let $F$ be a block cipher with $n$-bit key length and $l$-bit block length. We can then define the compression function $h : \{0,1\}^{n+l} \to \{0,1\}^l$ by $h(k, x) \stackrel{\text{def}}{=} F_k(x) \oplus x$. (See Figure 6.11.)
FIGURE 6.11: The Davies–Meyer construction.
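The construction itself is a one-liner once a block cipher is available. The sketch below uses a hypothetical 8-bit keyed permutation (`F`, `F_inv`, and the constant 167 are illustrative assumptions with no security) purely to show the data flow. As an aside not taken from the text, it also shows why the feed-forward XOR with $x$ matters: if the compression function were simply $F_k(x)$, anyone able to compute $F^{-1}$ could produce a collision immediately.

```python
# Toy 8-bit keyed permutation (hypothetical, NOT secure).
ODD = 167                       # odd, so multiplication is invertible mod 256
ODD_INV = pow(ODD, -1, 256)     # modular inverse (Python 3.8+)

def F(k, x):
    return (((x ^ k) * ODD) + k) % 256

def F_inv(k, y):
    return ((((y - k) % 256) * ODD_INV) % 256) ^ k

def davies_meyer(k, x):
    """h(k, x) = F_k(x) XOR x: the block input is fed forward into the output."""
    return F(k, x) ^ x

def without_feed_forward(k, x):
    """A flawed variant that omits the XOR with x (for comparison only)."""
    return F(k, x)

# Collision for the flawed variant: choose any second key k2 and decrypt.
k, x, k2 = 0x3A, 0x12, 0xC5
x2 = F_inv(k2, without_feed_forward(k, x))
print(without_feed_forward(k, x) == without_feed_forward(k2, x2))
print(f"h(k, x) = {davies_meyer(k, x):02x}")
```

The same decryption trick does not directly yield a collision for `davies_meyer`, because the value to be matched, $h \oplus x'$, depends on the unknown $x'$ itself.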
We do not know how to prove collision resistance of the resulting compression function based only on the assumption that $F$ is a strong pseudorandom permutation, and in fact there are reasons to believe such a proof is not possible. We can, however, prove collision resistance if we are willing to model $F$ as an ideal cipher. The ideal-cipher model is a strengthening of the random-oracle model (see Section 5.5), in which we posit that all parties have access to an oracle for a random keyed permutation $F : \{0,1\}^n \times \{0,1\}^l \to \{0,1\}^l$ as well as its inverse $F^{-1}$ (i.e., such that $F^{-1}(k, F(k, x)) = x$ for all $k, x$). Another way to think of this is that each key $k \in \{0,1\}^n$ specifies an independent, uniform permutation $F(k, \cdot)$ on $l$-bit strings. As in the random-oracle model, the only way to compute $F$ (or $F^{-1}$) is to explicitly query the oracle with $(k, x)$ and receive back $F(k, x)$ (or $F^{-1}(k, x)$).
Analyzing constructions in the ideal-cipher model comes with all the advantages and disadvantages of working in the random-oracle model, as discussed at length in Section 5.5. We only add here that the ideal-cipher model implies the absence of related-key attacks, in the sense that (as we have just said) the permutations $F(k, \cdot)$ and $F(k', \cdot)$ must behave independently even if, for example, $k$ and $k'$ differ in only a single bit. In addition, there can be no “weak keys” $k$ (say, the all-0 key) for which $F(k, \cdot)$ is easily distinguishable from random. It also means that $F(k, \cdot)$ should “behave randomly” even when $k$ is known. For any real-world cipher $F$, these properties do not necessarily hold (and are not even well defined) even if $F$ is a strong pseudorandom permutation, and the reader may note that we have not discussed these properties in any of our analysis of real-world block-cipher constructions. (In fact, DES and triple-DES do not satisfy these properties.) Any block cipher being used to instantiate an ideal cipher must be evaluated with respect to these more stringent requirements.
We prove the following theorem in a concrete setting, but the proof could be adapted easily for the asymptotic setting as well.
THEOREM 6.5 If F is modeled as an ideal cipher, then the Davies–Meyer construction yields a collision-resistant compression function. Concretely, any attacker making $q < 2^{l/2}$ queries to its ideal-cipher oracles finds a collision with probability at most $q^2/2^l$.
PROOF To be clear, we consider here the probabilistic experiment in which F is sampled at random (more precisely, for each $k \in \{0,1\}^n$ the function $F(k,\cdot) : \{0,1\}^l \to \{0,1\}^l$ is chosen uniformly from the set $\mathsf{Perm}_l$ of permutations on l-bit strings) and then the attacker is given oracle access to F and $F^{-1}$. The attacker then tries to find a colliding pair (k, x), (k′, x′), i.e., for which $F(k,x) \oplus x = F(k',x') \oplus x'$. No computational bounds are placed on the attacker other than bounding the number of oracle queries it makes. We assume that if the attacker outputs a colliding pair (k, x), (k′, x′) then it has previously made the oracle queries necessary to compute the values F(k,x) and F(k′,x′). We also assume the attacker never makes the same query more than once, and never queries $F^{-1}(k,y)$ once it has learned that y = F(k,x) (and vice versa). All these assumptions are without loss of generality.
Consider the ith query the attacker makes to its oracles. A query $(k_i, x_i)$ to F reveals only the hash value $h_i \stackrel{\text{def}}{=} h(k_i, x_i) = F(k_i, x_i) \oplus x_i$; similarly, a query to $F^{-1}$ giving the result $x_i = F^{-1}(k_i, y_i)$ yields only the hash value $h_i \stackrel{\text{def}}{=} h(k_i, x_i) = y_i \oplus F^{-1}(k_i, y_i)$. The attacker does not obtain a collision unless $h_i = h_j$ for some $i \neq j$.
Fix i, j with i > j and consider the probability that $h_i = h_j$. At the time of the ith query, the value of $h_j$ is fixed. A collision between $h_i$ and $h_j$ is obtained on the ith query only if the attacker queries $(k_i, x_i)$ to F and obtains the result $F(k_i, x_i) = h_j \oplus x_i$, or queries $(k_i, y_i)$ to $F^{-1}$ and obtains the result $F^{-1}(k_i, y_i) = h_j \oplus y_i$. Either event occurs with probability at most $1/(2^l - (i-1))$ since, for example, $F(k_i, x_i)$ is uniform over $\{0,1\}^l$ except that it cannot be equal to any value $F(k_i, x)$ already defined by the attacker’s (at most) i − 1 previous oracle queries using key $k_i$. Since $i \le q < 2^{l/2}$, we have $2^l - (i-1) > 2^l - 2^{l/2} \ge 2^l/2$ (for l ≥ 2), and so the probability that $h_i = h_j$ is at most $2/2^l$.
Taking a union bound over all $\binom{q}{2} < q^2/2$ distinct pairs i, j bounds the collision probability by $(q^2/2) \cdot (2/2^l) = q^2/2^l$, which is the result stated in the theorem.
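To make the experiment in the proof concrete, the following is a minimal Python sketch that is not part of the original text. It simulates an ideal cipher by lazy sampling (each key gets its own permutation, filled in only when queried) and runs a simple birthday-style search for a Davies–Meyer collision. The class name IdealCipher, the helper names, and the toy parameters (16-bit blocks and keys) are assumptions chosen purely for illustration.

```python
import random


class IdealCipher:
    """Lazily sampled ideal cipher: an independent random permutation per key."""

    def __init__(self, block_bits, seed=0):
        self.l = block_bits
        self.rng = random.Random(seed)   # seeded only to make the demo repeatable
        self.fwd = {}                    # key -> {x: y}
        self.inv = {}                    # key -> {y: x}

    def _fresh(self, used):
        # Uniform l-bit value not yet used under this key (rejection sampling).
        while True:
            v = self.rng.getrandbits(self.l)
            if v not in used:
                return v

    def F(self, k, x):
        f, g = self.fwd.setdefault(k, {}), self.inv.setdefault(k, {})
        if x not in f:
            y = self._fresh(g)           # y must differ from all outputs under k
            f[x], g[y] = y, x
        return f[x]

    def F_inv(self, k, y):
        f, g = self.fwd.setdefault(k, {}), self.inv.setdefault(k, {})
        if y not in g:
            x = self._fresh(f)           # x must differ from all inputs under k
            f[x], g[y] = y, x
        return g[y]


def davies_meyer(cipher, k, x):
    """h(k, x) = F(k, x) XOR x."""
    return cipher.F(k, x) ^ x


def find_collision(cipher, key_bits, max_queries, seed=1):
    """Query random (k, x) pairs until two distinct inputs share a hash value."""
    rng = random.Random(seed)
    seen = {}                            # hash value -> first input producing it
    for q in range(1, max_queries + 1):
        k, x = rng.getrandbits(key_bits), rng.getrandbits(cipher.l)
        h = davies_meyer(cipher, k, x)
        if h in seen and seen[h] != (k, x):
            return q, seen[h], (k, x)
        seen[h] = (k, x)
    return None


if __name__ == "__main__":
    cipher = IdealCipher(block_bits=16)
    print(find_collision(cipher, key_bits=16, max_queries=1 << 16))
```

With l = 16 one would expect a collision after a number of queries on the order of $2^{l/2} = 256$, which is the regime in which the bound $q^2/2^l$ stops being small. The attacker here only uses forward queries; the inverse oracle is included because the model grants it.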
Davies–Meyer and DES. As we have mentioned above, one must take care when instantiating the Davies–Meyer construction with any concrete block cipher, since the cipher must satisfy additional properties (beyond being a strong pseudorandom permutation) in order for the resulting construction to be secure. In Exercise 6.21 we explore what goes wrong when DES is used in the Davies–Meyer construction.
This should serve as a warning that the proof of security for the Davies–Meyer construction in the ideal-cipher model does not necessarily translate into real security when instantiated with a real cipher. Nevertheless, as we will describe below, this paradigm has been used to construct practical hash functions that have resisted attack (but only when the block cipher at the center of the construction was designed specifically for this purpose).
In conclusion, the Davies–Meyer construction is a useful paradigm for constructing collision-resistant compression functions. However, it should not be applied to block ciphers designed for encryption, like DES and AES.
6.3.2 MD5
MD5 is a hash function with a 128-bit output length. It was designed in 1991 and for some time was believed to be collision resistant. Over a period of several years, various weaknesses began to be found in MD5 but these did not appear to lead to any easy way to find collisions. Shockingly, in 2004 a team of Chinese cryptanalysts presented a new method for finding collisions in MD5; they were easily able to convince others that their approach was correct by demonstrating an explicit collision! Since then, the attack has been improved and today collisions can be found in under a minute on a desktop PC. In addition, the attacks have been extended so that even “controlled collisions” (e.g., two postscript files generating arbitrary viewable content) can be found.
Due to these attacks, MD5 should not be used anywhere cryptographic security is needed. We mention MD5 only because it is still found in legacy code.
6.3.3 SHA-0, SHA-1, and SHA-2
The Secure Hash Algorithm (SHA) refers to a series of cryptographic hash functions standardized by NIST. Perhaps the most well known of these is SHA-1, which was introduced in 1995. This algorithm has a 160-bit output length and supplanted a predecessor called SHA-0, which was withdrawn due to unspecified flaws discovered in that algorithm.
At the time of this writing, an explicit collision has yet to be found in SHA-1. However, theoretical analysis over the past few years indicates that collisions in SHA-1 can be found using significantly fewer than the $2^{80}$ hash-function evaluations that would be necessary using a birthday attack, and it is conjectured that a collision will be found soon. It is therefore recommended to migrate to SHA-2, which does not currently appear to have the same weaknesses. SHA-2 comprises two related functions: SHA-256 and SHA-512, with 256- and 512-bit output lengths, respectively.
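The relationship between output length and the generic birthday bound can be checked with a short Python sketch using the standard hashlib module. The helper names output_bits and birthday_exponent are hypothetical and introduced here only for illustration; the sketch reports generic bounds and says nothing about dedicated attacks.

```python
import hashlib


def output_bits(name):
    """Output length in bits of a hash function known to hashlib."""
    return hashlib.new(name).digest_size * 8


def birthday_exponent(name):
    """A generic birthday attack needs about 2^(output_bits / 2) evaluations."""
    return output_bits(name) // 2


for name in ("sha1", "sha256", "sha512"):
    print("%s: %d-bit output, generic birthday bound about 2^%d"
          % (name, output_bits(name), birthday_exponent(name)))
```

For SHA-1 the helper returns the exponent 80 quoted above; the point of the paragraph is that the best known analysis of SHA-1 does better than this generic figure.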
All hash functions in the SHA family are constructed using the same basic design, which incorporates components we have already seen: A compression function is first defined by applying the Davies–Meyer construction to a block cipher, and this is then extended to support arbitrary length inputs using the Merkle–Damgård transform. One interesting thing here is that the block cipher in each case was designed specifically for building the compression function. In fact, it was only retroactively that the underlying components in the compression functions were isolated and analyzed as the block ciphers SHACAL-1 (for SHA-1) and SHACAL-2 (for SHA-2). These ciphers are themselves intriguing, as they have large block lengths (160 and 256 bits, respectively) and 512-bit keys.
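The two-layer design just described can be sketched in Python. This is a toy illustration under stated assumptions, not SHA-1 or SHA-2: toy_cipher is a hypothetical 4-round Feistel network standing in for a purpose-built cipher such as SHACAL, the block and key sizes are arbitrary, and the padding rule (a 0x80 byte, zero bytes, then the bit length) is one common way to realize Merkle–Damgård padding rather than the exact rule of any standard.

```python
import hashlib

BLOCK_BYTES = 8    # toy block length l = 64 bits (size of the chaining value)
KEY_BYTES = 16     # toy key length n = 128 bits (one message block per call)


def xor_bytes(a, b):
    return bytes(u ^ v for u, v in zip(a, b))


def toy_cipher(key, block):
    """Hypothetical keyed permutation on 8-byte blocks: a 4-round Feistel network."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        # Round function: truncated SHA-256 of key, round number, and right half.
        f_out = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor_bytes(left, f_out)
    return left + right


def davies_meyer(key, x):
    """Compression function h(k, x) = F_k(x) XOR x."""
    return xor_bytes(toy_cipher(key, x), x)


def md_hash(message):
    """Merkle-Damgard iteration: each message block is used as the cipher key."""
    padded = message + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % KEY_BYTES)
    padded += (8 * len(message)).to_bytes(8, "big")   # message length in bits
    state = bytes(BLOCK_BYTES)                        # fixed all-zero IV
    for i in range(0, len(padded), KEY_BYTES):
        state = davies_meyer(padded[i:i + KEY_BYTES], state)
    return state


if __name__ == "__main__":
    print(md_hash(b"abc").hex())
```

Note how the message block plays the role of the key and the chaining value plays the role of the cipher input, which is why ciphers used this way need large keys relative to their block length.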
6.3.4 SHA-3 (Keccak)
In the aftermath of the collision attack on MD5 and the theoretical weaknesses found in SHA-1, NIST announced in late 2007 a public competition to design a new cryptographic hash function to be called SHA-3. Submitted algorithms were required to support at least 256- and 512-bit output lengths. As in the case of the AES competition from roughly 10 years earlier, the competition was completely open and transparent; anyone could submit an algorithm for consideration, and the public was invited to submit their opinions on any of the candidates. The 51 first-round candidates were narrowed down to 14 in December 2008, and these were further reduced to five finalists in 2010. The remaining candidates were subject to intense scrutiny by the cryptographic community over the next two years. In October 2012, NIST announced the selection of Keccak as the winner of the competition. As of the time of this writing, this algorithm is undergoing standardization as the next-generation replacement for SHA-2.
Keccak is unusual in several respects. (Interestingly, one of the reasons Keccak was chosen is that its structure is very different from that of SHA-1 and SHA-2.) At its core, it is based on an unkeyed permutation f with a large block length of 1600 bits; this is radically different from, e.g., the Davies–Meyer construction, which relies on a keyed permutation. Furthermore, Keccak does not use the Merkle–Damgård transform to handle arbitrary input lengths. Instead, it uses a newer approach called the sponge construction. Keccak, and the sponge construction more generally, can be analyzed in the random-permutation model in which we postulate that parties
have access to an oracle for a random permutation $f : \{0,1\}^l \to \{0,1\}^l$ (and possibly its inverse). This is weaker than the ideal-cipher model; indeed, we can easily obtain a random permutation in the ideal-cipher model by simply fixing the key to the cipher to be any constant value.
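For readers who want to experiment, the following Python sketch uses the SHA-3 functions exposed by the standard hashlib module (available in Python 3.6 and later). It postdates the standardization process mentioned above and is offered only as an illustration: it shows the 256- and 512-bit output lengths required of submissions, and uses SHAKE128, an extendable-output function built from the same sponge, to show that the caller can choose the output length. The message and variable names are arbitrary.

```python
import hashlib

msg = b"introduction to modern cryptography"

# Fixed-length SHA-3 instances, matching the two required output lengths.
d256 = hashlib.sha3_256(msg).digest()
d512 = hashlib.sha3_512(msg).digest()

# SHAKE128 lets the caller squeeze as many output bytes as desired.
short = hashlib.shake_128(msg).digest(16)
longer = hashlib.shake_128(msg).digest(48)

print(len(d256) * 8, len(d512) * 8, len(short) * 8, len(longer) * 8)
# A shorter SHAKE output is a prefix of a longer one for the same input.
print(longer[:16] == short)
```

The prefix relationship in the last line reflects how a sponge produces output incrementally from its internal state after the whole input has been absorbed.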
It will be fascinating to watch the new hash standard evolve, and to see how quickly developers adapt from SHA-1/SHA-2 to the newer SHA-3.
References and Additional Reading
Additional information on LFSRs and stream ciphers can be found in the Handbook of Applied Cryptography [120] or the more recent text by Paar and Pelzl [135]. Further details regarding eSTREAM, as well as a detailed specification of Trivium, can be found at http://www.ecrypt.eu.org/stream. See the work of AlFardan et al. [9] for a recent survey of attacks on RC4.
The confusion-diffusion paradigm and substitution-permutation networks were introduced by Shannon [154] and Feistel [64]. See the thesis of Heys [90] for further information regarding SPN design. Better generic attacks on three-round SPNs than what we have shown here are known [31]. Miles and Viola [126] give a theoretical analysis of SPNs.
Feistel networks were first described in [64]. A theoretical analysis of Feistel networks was given by Luby and Rackoff [116]; see Chapter 7.
More details on DES, AES, and block-cipher constructions in general can be found in the text by Knudsen and Robshaw [106]. The meet-in-the-middle attack on double encryption is due to Diffie and Hellman [59]. The attack on two-key triple encryption mentioned in the text (and explored in Exercise 6.13) is by Merkle and Hellman [124]. Theoretical analysis of the security of double and triple encryption can be found in [6, 24].
DESX is another technique for increasing the effective key length of DES. The secret key consists of values $k_i, k_o \in \{0,1\}^{64}$ and $k \in \{0,1\}^{56}$, and the cipher is defined by
$\mathrm{DESX}_{k_i,k,k_o}(x) \stackrel{\text{def}}{=} k_o \oplus \mathrm{DES}_k(x \oplus k_i).$
This methodology was first studied by Even and Mansour [63] in a slightly different context. Its application to DES was proposed in unpublished work by Rivest, and its security was later analyzed by Kilian and Rogaway [105, 149].
Differential cryptanalysis was introduced by Biham and Shamir [29] and its application to DES is described in a book by those authors [30]. Coppersmith [45] describes design principles of the DES S-boxes in light of the public discovery of differential cryptanalysis. Linear cryptanalysis was discovered by Matsui [118], who shows its application to DES there. For more information on these advanced cryptanalytic techniques, we refer the reader to the tutorial on differential and linear cryptanalysis by Heys [91] or to the aforementioned book by Knudsen and Robshaw [106].
For further information about MD5 and SHA-1 see [120]. Note, however, that their treatment pre-dates the attacks by Wang et al. [175, 174]. Constructions of compression functions from block ciphers are analyzed in [143, 33]. The sponge construction is described and analyzed by Bertoni et al. [28]. For additional details about the SHA-3 competition, see the NIST webpage at http://csrc.nist.gov/groups/ST/hash/sha-3/index.html.
Exercises
6.1 Assume a degree-6 LFSR with $c_0 = c_5 = 1$ and $c_1 = c_2 = c_3 = c_4 = 0$.
(a) What are the first 10 bits output by this LFSR if it starts in initial
state (1, 1, 1, 1, 1, 1)?
(b) Is this LFSR maximal length?
6.2 In this question we consider a nonlinear combination generator, where we have a degree-n LFSR but the output at each time step is not s0 but instead g(s0, . . . , sn−1) for some nonlinear function g. Assume the feedback coefficients of the LFSR are known, but its initial state is not. Show that each of the following choices of g does not yield a good pseudorandom generator:
(a) $g(s_0, \ldots, s_{n-1}) = s_0 \wedge s_1$.
(b) $g(s_0, \ldots, s_{n-1}) = (s_0 \wedge s_1) \oplus s_2$.
6.3 Let F be a block cipher with n-bit key length and block length. Say there is a key-recovery attack on F that succeeds with probability 1 using n chosen plaintexts and minimal computational effort. Prove formally that F cannot be a pseudorandom permutation.
6.4 In our attack on a one-round SPN, we considered a block length of 64 bits and 16 S-boxes that each take a 4-bit input. Repeat the analysis for the case of 8 S-boxes, each taking an 8-bit input. What is the complexity of the attack now? Repeat the analysis again with a 128-bit block length and 16 S-boxes that each take an 8-bit input.
6.5 Consider a modified SPN where instead of carrying out the key-mixing, substitution, and permutation steps in alternating order for r (full) rounds, the cipher instead first applies r rounds of key mixing, then carries out r rounds of substitution, and finally applies r mixing permutations. Analyze the security of this construction.
6.6 In this question we assume a two-round SPN with 64-bit block length.
(a) Assume independent 64-bit sub-keys are used in each round, so the master key is 192 bits long. Show a key-recovery attack using much less than $2^{192}$ time.
(b) Assume the first and third sub-keys are equal, and the second sub-key is independent, so the master key is 128 bits long. Show a key-recovery attack using much less than $2^{128}$ time.
6.7 What is the output of an r-round Feistel network when the input is (L0,R0) in each of the following two cases:
(a) Each round function outputs all 0s, regardless of the input.
(b) Each round function is the identity function.
6.8 Let $\mathrm{Feistel}_{f_1,f_2}(\cdot)$ denote a two-round Feistel network using functions $f_1$ and $f_2$ (in that order). Show that if $\mathrm{Feistel}_{f_1,f_2}(L_0,R_0) = (L_2,R_2)$, then $\mathrm{Feistel}_{f_2,f_1}(R_2,L_2) = (R_0,L_0)$.
6.9 For this exercise, rely on the description of DES given in this chapter, but use the fact that in the actual construction of DES the two halves of the output of the final round of the Feistel network are swapped. That is, if the output of the final round of the Feistel network is (L16,R16), then the output of DES is (R16,L16).
(a) Show that the only difference between computation of $\mathrm{DES}_k$ and $\mathrm{DES}_k^{-1}$ is the order in which sub-keys are used. (Rely on the previous exercise.)
(b) Show that for $k = 0^{56}$ it holds that $\mathrm{DES}_k(\mathrm{DES}_k(x)) = x$.
Hint: Consider the sub-keys generated from this key.
(c) Find three other DES keys with the same property. These keys are known as weak keys for DES. (Note: the keys you find will differ from the actual weak keys of DES because of differences in our presentation.)
(d) Do these 4 weak keys represent a serious vulnerability in the use of triple-DES as a pseudorandom permutation? Explain.
6.10 Show that DES has the property that $\mathrm{DES}_k(x) = \overline{\mathrm{DES}_{\bar{k}}(\bar{x})}$ for every key k and input x (where $\bar{z}$ denotes the bitwise complement of z). (This is called the complementarity property of DES.) Does this represent a serious vulnerability in the use of triple-DES as a pseudorandom permutation? Explain.
6.11 Describe attacks on the following modifications to DES:
(a) Each sub-key is 32 bits long, and the round function simply XORs the sub-key with the input to the round (i.e., $\hat{f}(k_i, R) = k_i \oplus R$). For this question, the key schedule is unimportant and you can treat the sub-keys $k_i$ as independent keys.
(b) Instead of using different sub-keys in every round, the same 48-bit sub-key is used in every round. Show how to distinguish the cipher from a random permutation in $\ll 2^{48}$ time.
Hint: Exercises 6.8 and 6.9 may help. . .
6.12 (This exercise relies on Exercise 6.9.) Our goal is to show that for any
weak key k of DES, it is easy to find an input x such that DESk(x) = x.
(a) Assume we evaluate DESk on input (L0, R0), and the output after 8 rounds of the Feistel network is (L8, R8) with L8 = R8. Show that the output of DESk(L0, R0) is (L0, R0). (Recall from Exercise 6.9 that DES swaps the two halves of the 16th round of the Feistel network before outputting the result.)
(b) Show how to find an input (L0,R0) with the property in part (a).
6.13 This question illustrates an attack on two-key triple encryption. Let F be a block cipher with n-bit block length and key length, and set
$F'_{k_1,k_2}(x) \stackrel{\text{def}}{=} F_{k_1}(F^{-1}_{k_2}(F_{k_1}(x))).$
(a) Assume that given a pair $(m_1, m_2)$ it is possible to find in constant time all keys $k_2$ such that $m_2 = F^{-1}_{k_2}(m_1)$. Show how to recover the entire key for F′ (with high probability) in time roughly $2^n$ using three known input/output pairs.
(b) In general, it will not be possible to find $k_2$ as above in constant time. However, show that by using a preprocessing step taking $2^n$ time it is possible, given $m_2$, to find in (essentially) constant time all keys $k_2$ such that $m_2 = F^{-1}_{k_2}(0^n)$.
(c) Assume $k_1$ is known and that the preprocessing step above has already been run. Show how to use the value $y = F'_{k_1,k_2}(x)$ for a single chosen plaintext x to determine $k_2$ in constant time.
(d) Put the above components together to devise an attack that recovers the entire key of F′ by running in roughly $2^n$ time and requesting the encryption of roughly $2^n$ chosen inputs.
6.14 Say the key schedule of DES is modified as follows: the left half of the master key is used to derive all the sub-keys in rounds 1–8, while the right half of the master key is used to derive all the sub-keys in rounds 9–16. Show an attack on this modified scheme that recovers the entire key in time roughly $2^{28}$.
6.15 Let $f : \{0,1\}^m \times \{0,1\}^l \to \{0,1\}^l$ and $g : \{0,1\}^n \times \{0,1\}^l \to \{0,1\}^l$ be secure block ciphers with m > n, and define $F_{k_1,k_2}(x) = f_{k_1}(g_{k_2}(x))$. Show a key-recovery attack on F using time $O(2^m)$ and space $O(l \cdot 2^n)$.
6.16 Define $\mathrm{DESY}_{k,k'}(x) = \mathrm{DES}_k(x \oplus k')$. The key length of DESY is 120 bits. Show a key-recovery attack on DESY taking time and space $\approx 2^{64}$.
6.17 Choose random S-boxes and mixing permutations for SPNs of different sizes, and develop differential attacks against them. We recommend trying five-round SPNs with 16-bit and 24-bit block lengths, using S-boxes with 4-bit input/output. Write code to compute the differential tables, and to carry out the attack.
6.18 Implement the time/space tradeoff for 40-bit DES (i.e., fix the first 16 bits of the key of DES to 0). Calculate the time and memory needed, and empirically estimate the probability of success. Experimentally verify the increase in success probability as the number of tables is increased. (Warning: this is a big project!)
6.19 For each of the following constructions of a compression function h from a block cipher F, either show an attack or prove collision resistance in the ideal-cipher model:
(a) h(k,x)=Fk(x).
(b) h(k,x)=Fk(x)⊕k⊕x.
(c) h(k,x)=Fk(x)⊕k.
6.20 Consider using DES to construct a compression function in the following way: Define $h : \{0,1\}^{112} \to \{0,1\}^{64}$ as $h(x_1, x_2) \stackrel{\text{def}}{=} \mathrm{DES}_{x_1}(\mathrm{DES}_{x_2}(0^{64}))$, where $|x_1| = |x_2| = 56$.
(a) Write down an explicit collision in h.
Hint: Use Exercise 6.9(a–b).
(b) Show how to find a preimage of an arbitrary value y (that is, $x_1, x_2$ such that $h(x_1 \| x_2) = y$) in roughly $2^{56}$ time.
(c) Show a more clever preimage attack that runs in roughly $2^{32}$ time and succeeds with high probability.
Hint: Rely on the results of Appendix A.4.
6.21 Let F be a block cipher for which it is easy to find fixed points for some key: namely, there is a key k for which it is easy to find inputs x for which Fk(x) = x. Find a collision in the Davies–Meyer construction when applied to F . (Consider this in light of Exercise 6.12.)
Chapter 7
*Theoretical Constructions of Symmetric-Key Primitives
In Chapter 3 we introduced the notion of pseudorandomness and defined some basic cryptographic primitives including pseudorandom generators, functions, and permutations. We showed in Chapters 3 and 4 that these primitives serve as the building blocks for all of private-key cryptography. As such, it is of great importance to understand these primitives from a theoretical point of view. In this chapter we formally introduce the concept of one-way functions (functions that are, informally, easy to compute but hard to invert) and show how pseudorandom generators, functions, and permutations can be constructed under the sole assumption that one-way functions exist.1 Moreover, we will see that one-way functions are necessary for “non-trivial” private-key cryptography. That is: the existence of one-way functions is equivalent to the existence of all (non-trivial) private-key cryptography. This is one of the major contributions of modern cryptography.
The constructions we show in this chapter should be viewed as complementary to the constructions of stream ciphers and block ciphers discussed in the previous chapter. The focus of the previous chapter was on how various cryptographic primitives are currently realized in practice, and the intent of that chapter was to introduce some basic approaches and design principles that are used. Somewhat disappointing, though, was the fact that none of the constructions we showed could be proven secure based on any weaker (i.e., more reasonable) assumptions. In contrast, in the present chapter we will prove that it is possible to construct pseudorandom permutations starting from the very mild assumption that one-way functions exist. This assumption is more palatable than assuming, say, that AES is a pseudorandom permutation, both because it is a qualitatively weaker assumption and also because we have a number of candidate, number-theoretic one-way functions that have been studied for many years, even before the advent of cryptography. (See the very beginning of Chapter 6 for further discussion of this point.) The downside, however, is that the constructions we show here are all far less efficient than those of Chapter 6, and thus are not actually used. It remains an important challenge for cryptographers to “bridge this gap” and develop provably secure constructions of pseudorandom generators, functions, and permutations whose efficiency is comparable to the best available stream ciphers and block ciphers.
1 This is not quite true since we are for the most part going to rely on one-way permutations in this chapter. But it is known that one-way functions suffice.
Collision-resistant hash functions. In contrast to the previous chapter, here we do not consider collision-resistant hash functions. The reason is that constructions of such hash functions from one-way functions are unknown and, in fact, there is evidence suggesting that such constructions are impossible. We will turn to provable constructions of collision-resistant hash functions— based on specific, number-theoretic assumptions—in Section 8.4.2.
A note regarding this chapter. The material in this chapter is somewhat more advanced than the material in the rest of this book. This material is not used explicitly elsewhere, and so this chapter can be skipped if desired. Having said this, we have tried to present the material in such a way that it is understandable (with effort) to an advanced undergraduate or beginning graduate student. We encourage all readers to peruse Sections 7.1 and 7.2, which introduce one-way functions and provide an overview of the rest of this chapter. We believe that familiarity with at least some of the topics covered here is important enough to warrant the effort.
7.1 One-Way Functions
In this section we formally define one-way functions, and then briefly discuss some candidates that are widely believed to satisfy this definition. (We will see more examples of conjectured one-way functions in Chapter 8.) We next introduce the notion of hard-core predicates, which can be viewed as encapsulating the hardness of inverting a one-way function and will be used extensively in the constructions that follow in subsequent sections.
7.1.1 Definitions
A one-way function $f : \{0,1\}^* \to \{0,1\}^*$ is easy to compute, yet hard to invert. The first condition is easy to formalize: we will simply require that f be computable in polynomial time. Since we are ultimately interested in building cryptographic schemes that are hard for a probabilistic polynomial-time adversary to break except with negligible probability, we will formalize the second condition by requiring that it be infeasible for any probabilistic polynomial-time algorithm to invert f (that is, to find a preimage of a given value y) except with negligible probability. A technical point is that this probability is taken over an experiment in which y is generated by choosing a uniform element x of the domain of f and then setting y := f(x) (rather than choosing y uniformly from the range of f). The reason for this should become clear from the constructions we will see in the remainder of the chapter.
Let $f : \{0,1\}^* \to \{0,1\}^*$ be a function. Consider the following experiment defined for any algorithm A and any value n for the security parameter:
The inverting experiment $\mathsf{Invert}_{A,f}(n)$
1. Choose uniform $x \in \{0,1\}^n$, and compute y := f(x).
2. A is given $1^n$ and y as input, and outputs x′.
3. The output of the experiment is defined to be 1 if f(x′) = y, and 0 otherwise.
We stress that A need not find the original preimage x; it suffices for A to find any value x′ for which f(x′) = y = f(x). We give the security parameter $1^n$ to A in the second step to stress that A may run in time polynomial in the security parameter n, regardless of the length of y.
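The experiment can be written down directly as a short Python sketch. This is an illustration under stated assumptions: bit strings are represented as Python strings of '0' and '1', the integer n stands in for the unary input $1^n$, and the names uniform_bits, drop_second_half, and pad_adversary are hypothetical. The example function is deliberately not one-way, so that the two remarks above (any preimage suffices, and A learns n even when y is short) are visible.

```python
import secrets


def uniform_bits(n):
    """A uniform string in {0,1}^n, represented as a Python str."""
    return "".join(secrets.choice("01") for _ in range(n))


def invert_experiment(f, adversary, n):
    """One run of the inverting experiment; returns 1 on success, 0 otherwise."""
    x = uniform_bits(n)
    y = f(x)
    x_prime = adversary(n, y)        # n plays the role of the unary input 1^n
    return 1 if f(x_prime) == y else 0


def drop_second_half(x):
    """Toy function that is easy to invert: keep only the first half of x."""
    return x[: len(x) // 2]


def pad_adversary(n, y):
    """Outputs some preimage of y of length n, usually not the original x."""
    return y + "0" * (n - len(y))


if __name__ == "__main__":
    runs = [invert_experiment(drop_second_half, pad_adversary, 16) for _ in range(100)]
    print(sum(runs) / len(runs))
```

Here the adversary always wins although it almost never recovers the original x, and it needs n to know how long its answer should be.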
We can now define what it means for a function f to be one-way.
DEFINITION 7.1 A function $f : \{0,1\}^* \to \{0,1\}^*$ is one-way if the following two conditions hold:
1. (Easy to compute:) There exists a polynomial-time algorithm $M_f$ computing f; that is, $M_f(x) = f(x)$ for all x.
2. (Hard to invert:) For every probabilistic polynomial-time algorithm A, there is a negligible function negl such that
$\Pr[\mathsf{Invert}_{A,f}(n) = 1] \le \mathsf{negl}(n).$
Notation. In this chapter we will often make the probability space more explicit by subscripting (part of) it in the probability notation. For example, we can succinctly express the second requirement in the definition above as follows: For every probabilistic polynomial-time algorithm A, there exists a negligible function negl such that
$\Pr_{x \leftarrow \{0,1\}^n}\left[A(1^n, f(x)) \in f^{-1}(f(x))\right] \le \mathsf{negl}(n).$
(Recall that $x \leftarrow \{0,1\}^n$ means that x is chosen uniformly from $\{0,1\}^n$.) The probability above is also taken over the randomness used by A, which here is left implicit.
Successful inversion of one-way functions. A function that is not one-way is not necessarily easy to invert all the time (or even “often”). Rather, the negation of the second condition of Definition 7.1 is that there exists a probabilistic polynomial-time algorithm A and a non-negligible function γ such that A inverts f(x) with probability at least γ(n) (where the probability is taken over uniform choice of $x \in \{0,1\}^n$ and the randomness of A). This means, in turn, that there exists a positive polynomial p(·) such that for infinitely many values of n, algorithm A inverts f with probability at least 1/p(n). Thus, if there exists an A that inverts f with probability $n^{-10}$ for all even values of n (but always fails to invert f when n is odd), then f is not one-way, even though A only succeeds on half the values of n, and only succeeds with probability $n^{-10}$ (for values of n where it succeeds at all).
Exponential-time inversion. Any one-way function can be inverted at any point y in exponential time, by simply trying all values $x \in \{0,1\}^n$ until a value x is found such that f(x) = y. Thus, the existence of one-way functions is inherently an assumption about computational complexity and computational hardness. That is, it concerns a problem that can be solved in principle but is assumed to be hard to solve efficiently.
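The exhaustive search just described is easy to sketch in Python. The function toy_f below is a hypothetical stand-in (a truncated SHA-256 of the bit string) chosen only so that there is something to invert; it is not claimed to be one-way in the formal sense. The counter makes the cost visible: up to $2^n$ evaluations of f.

```python
import hashlib
from itertools import product


def toy_f(x):
    """Hypothetical stand-in function on bit strings: truncated SHA-256."""
    return hashlib.sha256(x.encode()).hexdigest()[:8]


def brute_force_invert(f, n, y):
    """Try all 2^n strings of length n; return a preimage and the work done."""
    evaluations = 0
    for bits in product("01", repeat=n):
        x = "".join(bits)
        evaluations += 1
        if f(x) == y:
            return x, evaluations
    return None, evaluations


if __name__ == "__main__":
    y = toy_f("101100111000")
    print(brute_force_invert(toy_f, 12, y))
```

Doubling n squares the worst-case work, which is why this generic attack does not contradict one-wayness.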
One-way permutations. We will often be interested in one-way functions with additional structural properties. We say a function f is length-preserving if |f(x)| = |x| for all x. A one-way function that is length-preserving and one-to-one is called a one-way permutation. If f is a one-way permutation, then any value y has a unique preimage $x = f^{-1}(y)$. Nevertheless, it is still hard to find x in polynomial time.
One-way function/permutation families. The above definitions of one-way functions and permutations are convenient in that they consider a single function over an infinite domain and range. However, most candidate one-way functions and permutations do not fit neatly into this framework. Instead, there is an algorithm that generates some set of parameters I which define a function $f_I$; one-wayness here means essentially that $f_I$ should be one-way with all but negligible probability over choice of I. Because each value of I defines a different function, we now refer to families of one-way functions (resp., permutations). We give the definition now, and refer the reader to the next section for a concrete example. (See also Section 8.4.1.)
DEFINITION 7.2 A tuple Π = (Gen, Samp, f) of probabilistic polynomial-time algorithms is a function family if the following hold:
1. The parameter-generation algorithm Gen, on input $1^n$, outputs parameters I with |I| ≥ n. Each value of I output by Gen defines sets $D_I$ and $R_I$ that constitute the domain and range, respectively, of a function $f_I$.
2. The sampling algorithm Samp, on input I, outputs a uniformly distributed element of $D_I$.
3. The deterministic evaluation algorithm f, on input I and $x \in D_I$, outputs an element $y \in R_I$. We write this as $y := f_I(x)$.
Π is a permutation family if for each value of I output by $\mathsf{Gen}(1^n)$, it holds that $D_I = R_I$ and the function $f_I : D_I \to D_I$ is a bijection.
Let Π be a function family. What follows is the natural analogue of the experiment introduced previously.
The inverting experiment $\mathsf{Invert}_{A,\Pi}(n)$:
1. $\mathsf{Gen}(1^n)$ is run to obtain I, and then Samp(I) is run to obtain a uniform $x \in D_I$. Finally, $y := f_I(x)$ is computed.
2. A is given I and y as input, and outputs x′.
3. The output of the experiment is 1 if $f_I(x') = y$, and 0 otherwise.
DEFINITION 7.3 A function/permutation family Π = (Gen, Samp, f) is one-way if for all probabilistic polynomial-time algorithms A there exists a negligible function negl such that
Pr[InvertA,Π(n) = 1] ≤ negl(n).
Throughout this chapter we work with one-way functions/permutations over an infinite domain (as in Definition 7.1), rather than working with families of one-way functions/permutations. This is primarily for convenience, and does not significantly affect any of the results. (See Exercise 7.7.)
7.1.2 Candidate One-Way Functions
One-way functions are of interest only if they exist. We do not know how to prove they exist unconditionally (this would be a major breakthrough in complexity theory), so we must conjecture or assume their existence. Such a conjecture is based on the fact that several natural computational problems have received much attention, yet still have no polynomial-time algorithm for solving them. Perhaps the most famous such problem is integer factorization, i.e., finding the prime factors of a large integer. It is easy to multiply two numbers and obtain their product, but difficult to take a number and find its factors. This leads us to define the function fmult(x, y) = x · y. If we do not place any restriction on the lengths of x and y, then fmult is easy to invert: with high probability x · y will be even, in which case (2, xy/2) is an inverse. This issue can be addressed by restricting the domain of fmult to equal-length primes x and y. We return to this idea in Section 8.2.
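The observation that unrestricted multiplication is easy to invert can be checked empirically with a small Python sketch. The helper names and the choice of 64-bit inputs are assumptions made for illustration; the random generator is seeded only so that the run is repeatable.

```python
import random


def f_mult(x, y):
    return x * y


def invert_even(z):
    """If z is even, (2, z // 2) is a valid preimage under f_mult."""
    return (2, z // 2) if z % 2 == 0 else None


def inversion_rate(n, trials, seed=0):
    """Fraction of uniform n-bit pairs (x, y) whose product is inverted this way."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        x, y = rng.getrandbits(n), rng.getrandbits(n)
        z = f_mult(x, y)
        guess = invert_even(z)
        if guess is not None and f_mult(*guess) == z:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print(inversion_rate(64, 2000))
```

The product is odd only when both factors are odd, so the trivial inverter should succeed on roughly three quarters of uniform inputs. Restricting the domain to equal-length primes removes exactly this shortcut.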
Another candidate one-way function, not relying directly on number theory, is based on the subset-sum problem and is defined by
$f_{ss}(x_1, \ldots, x_n, J) = \left(x_1, \ldots, x_n, \left[\sum_{j \in J} x_j \bmod 2^n\right]\right),$
where each $x_i$ is an n-bit string interpreted as an integer, and J is an n-bit string interpreted as specifying a subset of {1, . . . , n}. Inverting $f_{ss}$ on an output $(x_1, \ldots, x_n, y)$ requires finding a subset J′ ⊆ {1, . . . , n} such that $\sum_{j \in J'} x_j = y \bmod 2^n$. The subset-sum problem is NP-complete, but even P ≠ NP would not imply that $f_{ss}$ is one-way: P ≠ NP would mean that every polynomial-time algorithm fails to solve the subset-sum problem on at least one input, whereas for $f_{ss}$ to be a one-way function it is required that every polynomial-time algorithm fails to solve the subset-sum problem (at least for certain parameters) almost always. Thus, our belief that the function above is one-way is based on the lack of known algorithms to solve this problem even with “small” probability on random inputs, and not merely on the fact that the problem is NP-complete.
We conclude by showing a family of permutations that is believed to be one- way. Let Gen be a probabilistic polynomial-time algorithm that, on input 1n, outputs an n-bit prime p along with a special element g ∈ {2, . . . , p − 1}. (The element g should be a generator of Z∗p; see Section 8.3.3.) Let Samp be an algorithm that, given p and g, outputs a uniform integer x ∈ {1, . . . , p − 1}. Finally, define
fp,g(x) = [gx mod p].
(The fact that fp,g can be computed efficiently follows from the results in Appendix B.2.3.) It can be shown that this function is one-to-one, and thus a permutation. The presumed difficulty of inverting this function is based on the conjectured hardness of the discrete-logarithm problem; we will have much more to say about this in Section 8.3.
Finally, we remark that very efficient one-way functions can be obtained from practical cryptographic constructions such as SHA-1 or AES under the assumption that they are collision resistant or a pseudorandom permutation, respectively; see Exercises 7.4 and 7.5. (Technically speaking, they cannot sat- isfy the definition of one-wayness since they have fixed-length input/output and so we cannot look at their asymptotic behavior. Nevertheless, it is plau- sible to conjecture that they are one-way in a concrete sense.)
7.1.3 Hard-Core Predicates
By definition, a one-way function is hard to invert. Stated differently: given
y = f(x), the value x cannot be computed in its entirety by any polynomial-
time algorithm (except with negligible probability; we ignore this here). One
might get the impression that nothing about x can be determined from f(x)
in polynomial time. This is not necessarily the case. Indeed, it is possible for
n
xj = y mod 2 . Students who have studied NP-completeness may re-
j∈J′
call that this problem is N P -complete. But even P ̸= N P does not imply that
f(x) to “leak” a lot of information about x even if f is one-way. For a trivial def
example,letgbeaone-wayfunctionanddefinef(x1,x2) = (x1,g(x2)),where |x1| = |x2|. It is easy to show that f is also a one-way function (this is left as an exercise), even though it reveals half its input.
For our applications, we will need to identify a specific piece of information about x that is “hidden” by f(x). This motivates the notion of a hard-core predicate. A hard-core predicate hc : {0, 1}∗ → {0, 1} of a function f has the property that hc(x) is hard to compute with probability significantly better than 1/2 given f(x). (Since hc is a boolean function, it is always possible to compute hc(x) with probability 1/2 by random guessing.) Formally:
DEFINITION 7.4 A function hc : {0, 1}∗ → {0, 1} is a hard-core predicate of a function f if hc can be computed in polynomial time, and for every probabilistic polynomial-time algorithm A there is a negligible function negl such that
\[ \Pr_{x \leftarrow \{0,1\}^n} [A(1^n, f(x)) = \mathsf{hc}(x)] \le \frac{1}{2} + \mathsf{negl}(n), \]
where the probability is taken over the uniform choice of x in $\{0,1\}^n$ and the randomness of A.
We stress that hc(x) is efficiently computable given x (since the function hc can be computed in polynomial time); the definition requires that hc(x) is hard to compute given f(x). The above definition does not require f to be one-way; if f is a permutation, however, then it cannot have a hard-core predicate unless it is one-way. (See Exercise 7.13.)
Simple ideas don’t work. Consider the predicate $\mathsf{hc}(x) \stackrel{\text{def}}{=} \bigoplus_{i=1}^{n} x_i$ where $x_1, \ldots, x_n$ denote the bits of x. One might hope that this is a hard-core predicate of any one-way function f: if f cannot be inverted, then f(x) must hide at least one of the bits $x_i$ of its preimage x, which would seem to imply that the exclusive-or of all of the bits of x is hard to compute. Despite its appeal, this argument is incorrect. To see this, let g be a one-way function and define $f(x) \stackrel{\text{def}}{=} (g(x), \bigoplus_{i=1}^{n} x_i)$. It is not hard to show that f is one-way. However, it is clear that f(x) does not hide the value of $\mathsf{hc}(x) = \bigoplus_{i=1}^{n} x_i$ because this is part of its output; therefore, hc(x) is not a hard-core predicate of f. Extending this, one can show that for any fixed predicate hc, there is a one-way function f for which hc is not a hard-core predicate of f.
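The counterexample can be made concrete with a short Python sketch. It assumes, purely for illustration, that SHA-256 plays the role of the one-way function g (a heuristic stand-in, not something the text relies on); the names `xor_all`, `f`, and `predict_hc` are likewise hypothetical.

```python
import hashlib
from functools import reduce

def xor_all(bits):
    """hc(x): the exclusive-or of all bits of x (x given as a list of 0/1 ints)."""
    return reduce(lambda a, b: a ^ b, bits, 0)

def g(bits):
    """Stand-in for a one-way function g (assumption: SHA-256, used heuristically)."""
    return hashlib.sha256(bytes(bits)).hexdigest()

def f(bits):
    """f(x) = (g(x), x_1 XOR ... XOR x_n), as in the counterexample."""
    return (g(bits), xor_all(bits))

def predict_hc(fx):
    """Adversary: read hc(x) directly off the second component of f(x)."""
    return fx[1]
```

The adversary never touches the first component, so its success probability is 1 regardless of how hard g is to invert.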
Trivial hard-core predicates. Some functions have “trivial” hard-core predicates. For example, let f be the function that drops the last bit of its input (i.e., f(x1 ···xn) = x1 ···xn−1). It is hard to determine xn given f(x) since xn is independent of the output; thus, hc(x) = xn is a hard-core predicate of f. However, f is not one-way. When we use hard-core predicates for our constructions, it will become clear why trivial hard-core predicates of this sort are of no use.
7.2 From One-Way Functions to Pseudorandomness
The goal of this chapter is to show how to construct pseudorandom generators, functions, and permutations based on any one-way function/permutation. In this section, we give an overview of these constructions. Details are given in the sections that follow.
A hard-core predicate from any one-way function. The first step is to show that a hard-core predicate exists for any one-way function. Actually, it remains open whether this is true; we show something weaker that suffices for our purposes. Namely, we show that given a one-way function f we can construct a different one-way function g along with a hard-core predicate of g.
THEOREM 7.5 (Goldreich–Levin theorem) Assume one-way functions (resp., permutations) exist. Then there exists a one-way function (resp., permutation) g and a hard-core predicate hc of g.
Let f be a one-way function. Functions g and hc are constructed as follows: set $g(x, r) \stackrel{\text{def}}{=} (f(x), r)$, for |x| = |r|, and define
\[ \mathsf{hc}(x, r) \stackrel{\text{def}}{=} \bigoplus_{i=1}^{n} x_i \cdot r_i, \]
where xi (resp., ri) denotes the ith bit of x (resp., r). Notice that if r is uniform, then hc(x, r) outputs the exclusive-or of a random subset of the bits of x. (When ri = 1 the bit xi is included in the XOR, and otherwise it is not.) The Goldreich–Levin theorem essentially states that if f is a one-way function then f(x) hides the exclusive-or of a random subset of the bits of x.
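To make the "XOR of a random subset" reading concrete, here is a minimal Python sketch of the predicate. Bit strings are represented as lists of 0/1 integers, and the function names `gl` and `selected_subset_xor` are illustrative choices rather than notation from the text.

```python
def gl(x, r):
    """Inner product mod 2: XOR over i of x_i * r_i."""
    if len(x) != len(r):
        raise ValueError("x and r must have equal length")
    acc = 0
    for xi, ri in zip(x, r):
        acc ^= xi & ri
    return acc

def selected_subset_xor(x, r):
    """XOR of exactly those bits x_i for which r_i = 1."""
    acc = 0
    for xi, ri in zip(x, r):
        if ri == 1:
            acc ^= xi
    return acc
```

For x = 1011 and r = 1101 the selected bits are x_1, x_2, x_4 = 1, 0, 1, so both functions return 0; the two definitions agree on every input.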
Pseudorandom generators from one-way permutations. The next step is to show how a hard-core predicate of a one-way permutation can be used to construct a pseudorandom generator. (It is known that a hard-core predicate of a one-way function suffices, but the proof is extremely complicated and well beyond the scope of this book.) Specifically, we show:
THEOREM 7.6 Let f be a one-way permutation and let hc be a hard-core predicate of f. Then, $G(s) \stackrel{\text{def}}{=} f(s) \| \mathsf{hc}(s)$ is a pseudorandom generator with expansion factor l(n) = n + 1.
As intuition for why G as defined in the theorem constitutes a pseudorandom generator, note first that the initial n bits of the output of G(s) (i.e., the bits of f(s)) are truly uniformly distributed when s is uniformly distributed, by virtue of the fact that f is a permutation. Next, the fact that hc is a hard-core predicate of f means that hc(s) “looks random” (i.e., is pseudorandom) even given f(s) (assuming again that s is uniform). Putting these observations together, we see that the entire output of G is pseudorandom.
Pseudorandom generators with arbitrary expansion. The existence of a pseudorandom generator that stretches its seed by even a single bit (as we have just seen) is already highly non-trivial. But for applications (e.g., for efficient encryption of large messages as in Section 3.3), we need a pseudorandom generator with much larger expansion. Fortunately, we can obtain any polynomial expansion factor we want:
THEOREM 7.7 If there exists a pseudorandom generator with expansion factor l(n) = n+1, then for any polynomial poly there exists a pseudorandom generator with expansion factor poly(n).
We conclude that pseudorandom generators with arbitrary (polynomial) expansion can be constructed from any one-way permutation.
Pseudorandom functions/permutations from pseudorandom generators. Pseudorandom generators suffice for constructing EAV-secure private-key encryption schemes. For achieving CPA-secure private-key encryption (not to mention message authentication codes), however, we relied on pseudorandom functions. The following result shows that the latter can be constructed from the former:
THEOREM 7.8 If there exists a pseudorandom generator with expansion factor l(n) = 2n, then there exists a pseudorandom function.
In fact, we can do even more:
THEOREM 7.9 If there exists a pseudorandom function, then there exists a strong pseudorandom permutation.
Combining all the above theorems, as well as the results of Chapters 3 and 4, we have the following corollaries:
COROLLARY 7.10 Assuming the existence of one-way permutations, there exist pseudorandom generators with any polynomial expansion factor, pseudorandom functions, and strong pseudorandom permutations.
COROLLARY 7.11 Assuming the existence of one-way permutations, there exist CCA-secure private-key encryption schemes and secure message authentication codes.
As noted earlier, it is possible to obtain all these results based solely on the existence of one-way functions.
7.3 Hard-Core Predicates from One-Way Functions
In this section, we prove Theorem 7.5 by showing the following:
THEOREM 7.12 Let f be a one-way function and define g by $g(x, r) \stackrel{\text{def}}{=} (f(x), r)$, where |x| = |r|. Define $\mathsf{gl}(x, r) \stackrel{\text{def}}{=} \bigoplus_{i=1}^{n} x_i \cdot r_i$, where $x = x_1 \cdots x_n$ and $r = r_1 \cdots r_n$. Then gl is a hard-core predicate of g.
Due to the complexity of the proof, we prove three successively stronger results culminating in what is claimed in the theorem.
7.3.1 A Simple Case
We first show that if there exists a polynomial-time adversary A that always correctly computes gl(x,r) given g(x,r) = (f(x),r), then it is possible to invert f in polynomial time. Given the assumption that f is a one-way function, it follows that no such adversary A exists.
PROPOSITION 7.13 Let f and gl be as in Theorem 7.12. If there exists a polynomial-time algorithm A such that A(f(x),r) = gl(x,r) for all n and all x, r ∈ {0, 1}n, then there exists a polynomial-time algorithm A′ such that A′(1n,f(x)) = x for all n and all x ∈ {0,1}n.
PROOF We construct A′ as follows. A′(1n, y) computes xi := A(y, ei) for i = 1, . . . , n, where ei denotes the n-bit string with 1 in the ith position and 0 everywhere else. Then A′ outputs x = x1 ···xn. Clearly A′ runs in polynomial time.
In the execution of $A'(1^n, f(\hat{x}))$, the value $x_i$ computed by A′ satisfies
\[ x_i = A(f(\hat{x}), e^i) = \mathsf{gl}(\hat{x}, e^i) = \bigoplus_{j=1}^{n} \hat{x}_j \cdot e^i_j = \hat{x}_i. \]
Thus, $x_i = \hat{x}_i$ for all i and so A′ outputs the correct inverse $x = \hat{x}$.
If f is one-way, it is impossible for any probabilistic polynomial-time algorithm to invert f with non-negligible probability. Thus, we conclude that there is no polynomial-time algorithm that always correctly computes gl(x, r) from (f(x), r). This is a rather weak result that is very far from our ultimate goal of showing that gl(x,r) cannot be computed (with probability significantly better than 1/2) given (f(x),r).
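The inverter A′ from the proof of Proposition 7.13 is short enough to write out. The sketch below is only an illustration: the predictor `A` is passed in as an arbitrary callable, positions are 0-based, and the helper names are hypothetical.

```python
def gl(x, r):
    """gl(x, r): XOR over i of x_i * r_i, with bit strings as lists of 0/1 ints."""
    acc = 0
    for xi, ri in zip(x, r):
        acc ^= xi & ri
    return acc

def unit_vector(n, i):
    """e^i: the n-bit string with a 1 in position i (0-based here) and 0 elsewhere."""
    return [1 if j == i else 0 for j in range(n)]

def invert_with_perfect_predictor(A, n, y):
    """A'(1^n, y): recover each bit of the preimage as x_i := A(y, e^i)."""
    return [A(y, unit_vector(n, i)) for i in range(n)]
```

To exercise it, one can simulate a perfect predictor that secretly knows the preimage, for example `A = lambda y, r: gl(secret, r)`; the inverter then returns `secret` after exactly n queries.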
7.3.2 A More Involved Case
We now show that it is hard for any probabilistic polynomial-time algorithm A to compute gl(x, r) from (f(x), r) with probability significantly better than 3/4. We will again show that any such A would imply the existence of a polynomial-time algorithm A′ that inverts f with non-negligible probability. Notice that the strategy in the proof of Proposition 7.13 fails here because it may be that A never succeeds when $r = e^i$ (although it may succeed, say, on all other values of r). Furthermore, in the present case A′ does not know if the result A(f(x),r) is equal to gl(x,r) or not; the only thing A′ knows is that with high probability, algorithm A is correct. This further complicates the proof.
PROPOSITION 7.14 Let f and gl be as in Theorem 7.12. If there exists a probabilistic polynomial-time algorithm A and a polynomial p(·) such that
\[ \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{3}{4} + \frac{1}{p(n)} \]
for infinitely many values of n, then there exists a probabilistic polynomial-time algorithm A′ such that
\[ \Pr_{x \leftarrow \{0,1\}^n} [A'(1^n, f(x)) \in f^{-1}(f(x))] \ge \frac{1}{4 \cdot p(n)} \]
for infinitely many values of n.
PROOF The main observation underlying the proof of this proposition is that for every $r \in \{0,1\}^n$, the values $\mathsf{gl}(x, r \oplus e^i)$ and $\mathsf{gl}(x, r)$ together can be used to compute the ith bit of x. (Recall that $e^i$ denotes the n-bit string with 0s everywhere except the ith position.) This is true because
\[ \mathsf{gl}(x, r) \oplus \mathsf{gl}(x, r \oplus e^i) = \Bigl( \bigoplus_{j=1}^{n} x_j \cdot r_j \Bigr) \oplus \Bigl( \bigoplus_{j=1}^{n} x_j \cdot (r_j \oplus e^i_j) \Bigr) = x_i \cdot r_i \oplus x_i \cdot \bar{r}_i = x_i, \]
where $\bar{r}_i$ is the complement of $r_i$, and the second equality is due to the fact that for j ≠ i, the value $x_j \cdot r_j$ appears in both sums and so is canceled out.
The above demonstrates that if A answers correctly on both (f(x), r) and $(f(x), r \oplus e^i)$, then A′ can correctly compute $x_i$. Unfortunately, A′ does not know when A answers correctly and when it does not; A′ knows only that A answers correctly with “high” probability. For this reason, A′ will use multiple random values of r, using each one to obtain an estimate of $x_i$, and will then take the estimate occurring a majority of the time as its final guess for $x_i$.
As a preliminary step, we show that for many x’s the probability that A answers correctly for both (f(x),r) and (f(x),r ⊕ ei), when r is uniform, is sufficiently high. This allows us to fix x and then focus solely on uniform choice of r, which makes the analysis easier.
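CLAIM 7.15 Let n be such that
\[ \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{3}{4} + \frac{1}{p(n)}. \]
Then there exists a set $S_n \subseteq \{0,1\}^n$ of size at least $\frac{1}{2p(n)} \cdot 2^n$ such that for every $x \in S_n$ it holds that
\[ \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{3}{4} + \frac{1}{2p(n)}. \]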
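PROOF Let ε(n) = 1/p(n), and define $S_n \subseteq \{0,1\}^n$ to be the set of all x’s for which
\[ \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{3}{4} + \frac{\varepsilon(n)}{2}. \]
We have
\begin{align*}
\Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)]
  &= \frac{1}{2^n} \sum_{x \in \{0,1\}^n} \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \\
  &= \frac{1}{2^n} \sum_{x \in S_n} \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)]
   + \frac{1}{2^n} \sum_{x \notin S_n} \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \\
  &\le \frac{|S_n|}{2^n} + \frac{1}{2^n} \cdot \sum_{x \notin S_n} \Bigl( \frac{3}{4} + \frac{\varepsilon(n)}{2} \Bigr) \\
  &\le \frac{|S_n|}{2^n} + \Bigl( \frac{3}{4} + \frac{\varepsilon(n)}{2} \Bigr).
\end{align*}
Since $\frac{3}{4} + \varepsilon(n) \le \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)]$, straightforward algebra gives $|S_n| \ge \frac{\varepsilon(n)}{2} \cdot 2^n$.
The following now follows as an easy consequence.
CLAIM 7.16 Let n be such that
\[ \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{3}{4} + \frac{1}{p(n)}. \]
Then there exists a set $S_n \subseteq \{0,1\}^n$ of size at least $\frac{1}{2p(n)} \cdot 2^n$ such that for every $x \in S_n$ and every i it holds that
\[ \Pr_{r \leftarrow \{0,1\}^n} \bigl[ A(f(x), r) = \mathsf{gl}(x, r) \,\wedge\, A(f(x), r \oplus e^i) = \mathsf{gl}(x, r \oplus e^i) \bigr] \ge \frac{1}{2} + \frac{1}{p(n)}. \]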
PROOF Let ε(n) = 1/p(n), and take $S_n$ to be the set guaranteed by the previous claim. We know that for any $x \in S_n$ we have
\[ \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) \ne \mathsf{gl}(x, r)] \le \frac{1}{4} - \frac{\varepsilon(n)}{2}. \]
Fix i ∈ {1, …, n}. If r is uniformly distributed then so is $r \oplus e^i$; thus
\[ \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r \oplus e^i) \ne \mathsf{gl}(x, r \oplus e^i)] \le \frac{1}{4} - \frac{\varepsilon(n)}{2}. \]
We are interested in lower-bounding the probability that A outputs the correct answer for both gl(x, r) and $\mathsf{gl}(x, r \oplus e^i)$; equivalently, we want to upper-bound the probability that A fails to output the correct answer in either of these cases. Note that r and $r \oplus e^i$ are not independent, so we cannot just multiply the probabilities of failure. However, we can apply the union bound (see Proposition A.7) and sum the probabilities of failure. That is, the probability that A is incorrect on either gl(x, r) or $\mathsf{gl}(x, r \oplus e^i)$ is at most
\[ \Bigl( \frac{1}{4} - \frac{\varepsilon(n)}{2} \Bigr) + \Bigl( \frac{1}{4} - \frac{\varepsilon(n)}{2} \Bigr) = \frac{1}{2} - \varepsilon(n), \]
and so A is correct on both gl(x, r) and $\mathsf{gl}(x, r \oplus e^i)$ with probability at least 1/2 + ε(n). This proves the claim.
For the rest of the proof we set ε(n) = 1/p(n) and consider only those values of n for which
\[ \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{3}{4} + \varepsilon(n). \tag{7.1} \]
The previous claim states that for an ε(n)/2 fraction of inputs x, and any i, algorithm A answers correctly on both (f(x), r) and $(f(x), r \oplus e^i)$ with probability at least 1/2 + ε(n) over uniform choice of r, and from now on we focus only on such values of x. We construct a probabilistic polynomial-time algorithm A′ that inverts f(x) with probability at least 1/2 when $x \in S_n$. This suffices to prove Proposition 7.14 since then, for infinitely many values of n,
\begin{align*}
\Pr_{x \leftarrow \{0,1\}^n} [A'(1^n, f(x)) \in f^{-1}(f(x))]
  &\ge \Pr_{x \leftarrow \{0,1\}^n} [A'(1^n, f(x)) \in f^{-1}(f(x)) \mid x \in S_n] \cdot \Pr_{x \leftarrow \{0,1\}^n} [x \in S_n] \\
  &\ge \frac{1}{2} \cdot \frac{\varepsilon(n)}{2} = \frac{1}{4p(n)}.
\end{align*}
Algorithm A′, given as input $1^n$ and y, works as follows:
1. For i = 1, …, n do:
• Repeatedly choose a uniform $r \in \{0,1\}^n$ and compute $A(y, r) \oplus A(y, r \oplus e^i)$ as an “estimate” for the ith bit of the preimage of y. After doing this sufficiently many times (see below), let $x_i$ be the “estimate” that occurs a majority of the time.
2. Output $x = x_1 \cdots x_n$.
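The majority-vote inverter just described can be sketched in Python as follows. This is a simulation under stated assumptions, not part of the proof: the predictor is a hypothetical stand-in that secretly knows the preimage and errs on a fixed set of roughly 10% of the strings r (selected by a SHA-256 byte), and the number of samples is simply passed in rather than derived from the Chernoff bound.

```python
import hashlib
import random

def gl(x, r):
    """gl(x, r): XOR over i of x_i * r_i, with bit strings as lists of 0/1 ints."""
    acc = 0
    for xi, ri in zip(x, r):
        acc ^= xi & ri
    return acc

def make_noisy_predictor(secret, threshold):
    """Hypothetical stand-in for A: answers gl(secret, r), except that it errs on
    every r whose SHA-256 digest starts with a byte below `threshold`."""
    def A(y, r):
        wrong = hashlib.sha256(bytes(r)).digest()[0] < threshold
        return gl(secret, r) ^ (1 if wrong else 0)
    return A

def flip_bit(r, i):
    """Return r XOR e^i, i.e., r with its i-th bit (0-based) flipped."""
    out = list(r)
    out[i] ^= 1
    return out

def invert_by_majority(A, n, y, samples, rng):
    """A'(1^n, y): for each i, take the majority of A(y, r) XOR A(y, r XOR e^i)
    over `samples` independent uniform choices of r (use an odd sample count)."""
    x = []
    for i in range(n):
        ones = 0
        for _ in range(samples):
            r = [rng.randrange(2) for _ in range(n)]
            ones += A(y, r) ^ A(y, flip_bit(r, i))
        x.append(1 if 2 * ones > samples else 0)
    return x
```

With a threshold of 26 out of 256, a single estimate is wrong only when exactly one of its two queries errs, so a majority over about a hundred samples recovers each bit with overwhelming probability.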
We sketch an analysis of the probability that A′ correctly inverts its given input y. (We allow ourselves to be a bit laconic, since a full proof for a more difficult case is given in the following section.) Say $y = f(\hat{x})$ and recall that we assume here that n is such that Equation (7.1) holds and $\hat{x} \in S_n$. Fix some i. The previous claim implies that the estimate $A(y, r) \oplus A(y, r \oplus e^i)$ is equal to $\mathsf{gl}(\hat{x}, e^i)$ with probability at least $\frac{1}{2} + \varepsilon(n)$ over choice of r. By obtaining sufficiently many estimates and letting $x_i$ be the majority value, A′ can ensure that $x_i$ is equal to $\mathsf{gl}(\hat{x}, e^i)$ with probability at least $1 - \frac{1}{2n}$. Of course, we need to make sure that polynomially many estimates are enough. Fortunately, since ε(n) = 1/p(n) for some polynomial p and an independent value of r is used for obtaining each estimate, the Chernoff bound (cf. Proposition A.14) shows that polynomially many estimates suffice.
Summarizing, we have that for each i the value $x_i$ computed by A′ is incorrect with probability at most $\frac{1}{2n}$. A union bound thus shows that A′ is incorrect for some i with probability at most $n \cdot \frac{1}{2n} = \frac{1}{2}$. That is, A′ is correct for all i (and thus correctly inverts y) with probability at least $1 - \frac{1}{2} = \frac{1}{2}$. This completes the proof of Proposition 7.14.
A corollary of Proposition 7.14 is that if f is a one-way function, then for any polynomial-time algorithm A the probability that A correctly guesses gl(x, r) when given (f (x), r) is at most negligibly more than 3/4.
7.3.3 The Full Proof
We assume familiarity with the simplified proofs in the previous sections, and build on the ideas developed there. We rely on some terminology and standard results from probability theory discussed in Appendix A.3.
We prove the following proposition, which implies Theorem 7.12:
PROPOSITION 7.17 Let f and gl be as in Theorem 7.12. If there exists a probabilistic polynomial-time algorithm A and a polynomial p(·) such that
\[ \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{1}{2} + \frac{1}{p(n)} \]
for infinitely many values of n, then there exists a probabilistic polynomial-time algorithm A′ and a polynomial p′(·) such that
\[ \Pr_{x \leftarrow \{0,1\}^n} [A'(1^n, f(x)) \in f^{-1}(f(x))] \ge \frac{1}{p'(n)} \]
for infinitely many values of n.
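PROOF Once again we set ε(n) = 1/p(n) and consider only those values of n for which $\Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{1}{2} + \frac{1}{p(n)}$. The following is analogous to Claim 7.15 and is proved in the same way.
CLAIM 7.18 Let n be such that
\[ \Pr_{x, r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{1}{2} + \varepsilon(n). \]
Then there exists a set $S_n \subseteq \{0,1\}^n$ of size at least $\frac{\varepsilon(n)}{2} \cdot 2^n$ such that for every $x \in S_n$ it holds that
\[ \Pr_{r \leftarrow \{0,1\}^n} [A(f(x), r) = \mathsf{gl}(x, r)] \ge \frac{1}{2} + \frac{\varepsilon(n)}{2}. \tag{7.2} \]
If we start by trying to prove an analogue of Claim 7.16, the best we can claim here is that when $x \in S_n$ we have
\[ \Pr_{r \leftarrow \{0,1\}^n} \bigl[ A(f(x), r) = \mathsf{gl}(x, r) \,\wedge\, A(f(x), r \oplus e^i) = \mathsf{gl}(x, r \oplus e^i) \bigr] \ge \varepsilon(n) \]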
for any i. Thus, if we try to use A(f(x),r) ⊕ A(f(x),r ⊕ ei) as an estimate for xi, all we can claim is that this estimate will be correct with probability at least ε(n), which may not be any better than taking a random guess! We cannot claim that flipping the result gives a good estimate, either.
Instead, we design A′ so that it computes gl(x, r) and $\mathsf{gl}(x, r \oplus e^i)$ by invoking A only once. We do this by having A′ run $A(f(x), r \oplus e^i)$, and having A′ simply “guess” the value gl(x, r) itself. The naive way to do this would be to choose the r’s independently, as before, and to have A′ make an independent guess of gl(x, r) for each value of r. But then the probability that all such guesses are correct (which, as we will see, is necessary if A′ is to output the correct inverse) would be negligible because polynomially many r’s are used.
The crucial observation of the present proof is that A′ can generate the r’s in a pairwise-independent manner and make its guesses in a particular way so that with non-negligible probability all its guesses are correct. Specifically, in order to generate m values of r, we have A′ select l = ⌈log(m + 1)⌉ independent and uniformly distributed strings $s^1, \ldots, s^l \in \{0,1\}^n$. Then, for every nonempty subset I ⊆ {1, …, l}, we set $r^I := \bigoplus_{i \in I} s^i$. Since there are $2^l - 1$ nonempty subsets, this defines a collection of $2^{\lceil \log(m+1) \rceil} - 1 \ge m$ strings. Each such string is uniformly distributed. The strings are not independent, but they are pairwise independent. To see this, notice that for every two subsets I ≠ J there is an index j ∈ I ∪ J such that j ∉ I ∩ J. Without loss of generality, assume j ∉ I. Then the value of $s^j$ is uniform and independent of the value of $r^I$. Since $s^j$ is included in the XOR that defines $r^J$, this implies that $r^J$ is uniform and independent of $r^I$ as well.
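The subset-XOR construction of the strings r^I can be checked by brute force for tiny parameters. The following Python sketch is illustrative only (the names `subset_xors` and `is_pairwise_independent` are hypothetical, and subsets are indexed from 0): it builds all 2^l − 1 strings from l seeds and verifies exhaustively that every pair (r^I, r^J) is uniform over pairs of n-bit strings.

```python
from itertools import combinations, product

def xor_strings(strings, n):
    """Bitwise XOR of a collection of n-bit strings (sequences of 0/1 ints)."""
    out = [0] * n
    for s in strings:
        out = [a ^ b for a, b in zip(out, s)]
    return out

def subset_xors(seeds):
    """Map every nonempty subset I of {0, ..., l-1} to r^I = XOR of seeds[i], i in I."""
    l, n = len(seeds), len(seeds[0])
    result = {}
    for size in range(1, l + 1):
        for I in combinations(range(l), size):
            result[I] = tuple(xor_strings([seeds[i] for i in I], n))
    return result

def is_pairwise_independent(n, l):
    """Enumerate all choices of l seeds and check that, for every pair I != J,
    each value of (r^I, r^J) occurs equally often. Requires l >= 2."""
    all_strings = list(product([0, 1], repeat=n))
    counts = {}
    for seeds in product(all_strings, repeat=l):
        r = subset_xors(seeds)
        for I, J in combinations(sorted(r), 2):
            key = (I, J, r[I], r[J])
            counts[key] = counts.get(key, 0) + 1
    expected = len(all_strings) ** l // (len(all_strings) ** 2)
    subsets = 2 ** l - 1
    pairs = subsets * (subsets - 1) // 2
    return (len(counts) == pairs * len(all_strings) ** 2
            and all(c == expected for c in counts.values()))
```

The exhaustive check is only feasible for very small n and l, but it exercises exactly the argument in the text: two distinct nonempty subsets always differ in some seed, and that seed randomizes one string independently of the other.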
We now have the following two important observations:
1. Given $\mathsf{gl}(x, s^1), \ldots, \mathsf{gl}(x, s^l)$, it is possible to compute $\mathsf{gl}(x, r^I)$ for every subset I ⊆ {1, …, l}. This is because
\[ \mathsf{gl}(x, r^I) = \mathsf{gl}\bigl(x, \textstyle\bigoplus_{i \in I} s^i\bigr) = \bigoplus_{i \in I} \mathsf{gl}(x, s^i). \]
2. If A′ simply guesses the values of $\mathsf{gl}(x, s^1), \ldots, \mathsf{gl}(x, s^l)$ by choosing a uniform bit for each, then all these guesses will be correct with probability $1/2^l$. If m is polynomial in the security parameter n, then $1/2^l$ is not negligible, and so with non-negligible probability A′ correctly guesses all the values $\mathsf{gl}(x, s^1), \ldots, \mathsf{gl}(x, s^l)$.
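The identity in the first observation is just linearity of gl in its second argument over GF(2). Written out for two strings s and s′ (the general case follows by induction on the size of I), one way to see it is:

```latex
\begin{align*}
\mathsf{gl}(x, s \oplus s')
  &= \bigoplus_{j=1}^{n} x_j \cdot (s_j \oplus s'_j) \\
  &= \bigoplus_{j=1}^{n} \bigl( x_j \cdot s_j \,\oplus\, x_j \cdot s'_j \bigr) \\
  &= \Bigl( \bigoplus_{j=1}^{n} x_j \cdot s_j \Bigr) \oplus \Bigl( \bigoplus_{j=1}^{n} x_j \cdot s'_j \Bigr)
   = \mathsf{gl}(x, s) \oplus \mathsf{gl}(x, s').
\end{align*}
```

The second line uses distributivity of multiplication over addition modulo 2, and the third regroups the terms, which is allowed because XOR is commutative and associative.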
Combining the above yields a way of obtaining m = poly(n) uniform and pairwise-independent strings {rI} along with correct values for {gl(x,rI)} with non-negligible probability. These values can then be used to compute xi in the same way as in the proof of Proposition 7.14. Details follow.
The inversion algorithm A′. We now provide a full description of an algorithm A′ that receives inputs 1n,y and tries to compute an inverse of y. The algorithm proceeds as follows:
1. Set $l := \lceil \log(2n/\varepsilon(n)^2 + 1) \rceil$.
2. Choose uniform, independent $s^1, \ldots, s^l \in \{0,1\}^n$ and $\sigma^1, \ldots, \sigma^l \in \{0,1\}$.
3. For every nonempty subset I ⊆ {1, …, l}, compute $r^I := \bigoplus_{i \in I} s^i$ and $\sigma^I := \bigoplus_{i \in I} \sigma^i$.
4. For i = 1, …, n do:
(a) For every nonempty subset I ⊆ {1, …, l}, set $x^I_i := \sigma^I \oplus A(y, r^I \oplus e^i)$.
(b) Set $x_i := \mathsf{majority}_I \{x^I_i\}$ (i.e., take the bit that appeared a majority of the time in the previous step).
5. Output $x = x_1 \cdots x_n$.
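The five steps above translate almost line by line into Python. The sketch below is a simulation under explicit assumptions: the predictor is a hypothetical stand-in that knows the preimage and errs on a fixed set of roughly 15% of the strings r, and the optional `guess` argument (not part of the algorithm in the text) lets a caller supply the bits σ^i, for example to simulate the event that every guess is correct.

```python
import hashlib
import math
import random
from itertools import combinations

def gl(x, r):
    """gl(x, r): XOR over i of x_i * r_i, with bit strings as lists of 0/1 ints."""
    acc = 0
    for xi, ri in zip(x, r):
        acc ^= xi & ri
    return acc

def make_noisy_predictor(secret, threshold):
    """Hypothetical stand-in for A: answers gl(secret, r), except that it errs on
    every r whose SHA-256 digest starts with a byte below `threshold`."""
    def A(y, r):
        wrong = hashlib.sha256(bytes(r)).digest()[0] < threshold
        return gl(secret, r) ^ (1 if wrong else 0)
    return A

def goldreich_levin_invert(A, n, y, eps, rng, guess=None):
    """One run of A'(1^n, y). `guess(s)` supplies sigma for seed s; by default it
    is a uniform bit, as in step 2 of the algorithm."""
    if guess is None:
        guess = lambda s: rng.randrange(2)
    l = math.ceil(math.log2(2 * n / eps ** 2 + 1))                    # step 1
    seeds = [[rng.randrange(2) for _ in range(n)] for _ in range(l)]  # step 2
    sigmas = [guess(s) for s in seeds]
    subsets = [I for k in range(1, l + 1) for I in combinations(range(l), k)]
    r_of, sigma_of = {}, {}
    for I in subsets:                                                 # step 3
        r, sg = [0] * n, 0
        for i in I:
            r = [a ^ b for a, b in zip(r, seeds[i])]
            sg ^= sigmas[i]
        r_of[I], sigma_of[I] = r, sg
    x = []
    for i in range(n):                                                # step 4
        ones = 0
        for I in subsets:
            query = list(r_of[I])
            query[i] ^= 1                                             # r^I XOR e^i
            ones += sigma_of[I] ^ A(y, query)                         # x_i^I
        x.append(1 if 2 * ones > len(subsets) else 0)                 # majority
    return x                                                          # step 5
```

Passing `guess=lambda s: gl(secret, s)` conditions on all guesses being right, which is the event analyzed in the text; with the default random guesses a single run succeeds only with probability about 1/2^l, so a real inverter would either accept that success probability or try all 2^l guess vectors and test each candidate against y.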
It remains to compute the probability that A′ outputs x ∈ f−1(y). As in the proof of Proposition 7.14, we focus only on n as in Claim 7.18 and assume y = f(xˆ) for some xˆ ∈ Sn. Each σi represents a “guess” for the value of gl(xˆ,si). As noted earlier, with non-negligible probability all these guesses are correct; we show that conditioned on this event, A′ outputs x = xˆ with probability at least 1/2.
Assume $\sigma^i = \mathsf{gl}(\hat{x}, s^i)$ for all i. Then $\sigma^I = \mathsf{gl}(\hat{x}, r^I)$ for all I. Fix an index i ∈ {1, …, n} and consider the probability that A′ obtains the correct value $x_i = \hat{x}_i$. For any nonempty I we have $A(y, r^I \oplus e^i) = \mathsf{gl}(\hat{x}, r^I \oplus e^i)$ with probability at least $\frac{1}{2} + \varepsilon(n)/2$ over choice of $r^I$; this follows because $\hat{x} \in S_n$ and $r^I \oplus e^i$ is uniformly distributed. Thus, for any nonempty subset I we have $\Pr[x^I_i = \hat{x}_i] \ge \frac{1}{2} + \varepsilon(n)/2$. Moreover, the $\{x^I_i\}_{I \subseteq \{1,\ldots,l\}}$ are pairwise independent because the $\{r^I\}_{I \subseteq \{1,\ldots,l\}}$ (and hence the $\{r^I \oplus e^i\}_{I \subseteq \{1,\ldots,l\}}$) are pairwise independent. Since $x_i$ is defined to be the value that occurs a majority of the time among the $\{x^I_i\}_{I \subseteq \{1,\ldots,l\}}$, we can apply Proposition A.13 to obtain
\begin{align*}
\Pr[x_i \ne \hat{x}_i]
  &\le \frac{1}{4 \cdot (\varepsilon(n)/2)^2 \cdot (2^l - 1)} \\
  &\le \frac{1}{4 \cdot (\varepsilon(n)/2)^2 \cdot (2n/\varepsilon(n)^2)} \\
  &= \frac{1}{2n}.
\end{align*}
The above holds for all i, so by applying a union bound we see that the probability that xi ̸= xˆi for some i is at most 1/2. That is, xi = xˆi for all i (and hence x = xˆ) with probability at least 1/2.
Putting everything together: Let n be as in Claim 7.18 and $y = f(\hat{x})$. With probability at least ε(n)/2 we have $\hat{x} \in S_n$. All the guesses $\sigma^i$ are correct with probability at least
\[ \frac{1}{2^l} \ge \frac{1}{2 \cdot (2n/\varepsilon(n)^2 + 1)} > \frac{\varepsilon(n)^2}{5n} \]
for n sufficiently large. Conditioned on both the above, A′ outputs $x = \hat{x}$ with probability at least 1/2. The overall probability with which A′ inverts its input is thus at least $\varepsilon(n)^3/(20n) = 1/(20\,n\,p(n)^3)$ for infinitely many n. Since $20\,n\,p(n)^3$ is polynomial in n, this proves Proposition 7.17.
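For readers who want the arithmetic behind these bounds spelled out, here is one way to verify them. We write ε for ε(n) and use only that ⌈z⌉ < z + 1 and that ε ≤ 1/2 (the latter because 1/2 + ε lower-bounds a probability); this is a supplementary check rather than part of the original argument.

```latex
\begin{align*}
2^{l} &\le 2^{\log(2n/\varepsilon^2 + 1) + 1} = 2 \cdot \bigl( 2n/\varepsilon^2 + 1 \bigr), \\[2pt]
\frac{1}{2 \cdot (2n/\varepsilon^2 + 1)} &= \frac{\varepsilon^2}{4n + 2\varepsilon^2}
  > \frac{\varepsilon^2}{5n}
  \quad \text{whenever } 2\varepsilon^2 < n, \\[2pt]
\frac{\varepsilon}{2} \cdot \frac{\varepsilon^2}{5n} \cdot \frac{1}{2} &= \frac{\varepsilon^3}{20n}.
\end{align*}
```

The three factors in the last line are, in order, the probability that the preimage lies in S_n, the probability that all l guesses are correct, and the conditional success probability 1/2.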
7.4 Constructing Pseudorandom Generators
We first show how to construct pseudorandom generators that stretch their input by a single bit, under the assumption that one-way permutations exist. We then show how to extend this to obtain any polynomial expansion factor.
7.4.1 Pseudorandom Generators with Minimal Expansion
Let f be a one-way permutation with hard-core predicate hc. This means that hc(s) “looks random” given f(s), when s is uniform. Furthermore, since f is a permutation, f(s) itself is uniformly distributed. (Applying a permutation to a uniformly distributed value yields a uniformly distributed value.) So if s is a uniform n-bit string, the (n+1)-bit string f(s)∥hc(s) consists of a uniform n-bit string plus one additional bit that looks uniform even conditioned on the initial n bits; in other words, this (n + 1)-bit string is pseudorandom. Thus, the algorithm G defined by G(s) = f(s)∥hc(s) is a pseudorandom generator.
THEOREM 7.19 Let f be a one-way permutation with hard-core predicate hc. Then algorithm G defined by G(s) = f(s)∥hc(s) is a pseudorandom generator with expansion factor l(n) = n + 1.
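PROOF Let D be a probabilistic polynomial-time algorithm. We prove that there is a negligible function negl such that
\[ \Pr_{r \leftarrow \{0,1\}^{n+1}} [D(r) = 1] - \Pr_{s \leftarrow \{0,1\}^n} [D(G(s)) = 1] \le \mathsf{negl}(n). \tag{7.3} \]
A similar argument shows that there is a negligible function negl′ for which
\[ \Pr_{s \leftarrow \{0,1\}^n} [D(G(s)) = 1] - \Pr_{r \leftarrow \{0,1\}^{n+1}} [D(r) = 1] \le \mathsf{negl}'(n), \]
which completes the proof. Observe first that
\begin{align*}
\Pr_{r \leftarrow \{0,1\}^{n+1}} [D(r) = 1]
  &= \Pr_{r \leftarrow \{0,1\}^n,\, r' \leftarrow \{0,1\}} [D(r \| r') = 1] \\
  &= \Pr_{s \leftarrow \{0,1\}^n,\, r' \leftarrow \{0,1\}} [D(f(s) \| r') = 1] \\
  &= \frac{1}{2} \cdot \Pr_{s \leftarrow \{0,1\}^n} \bigl[ D\bigl(f(s) \| \mathsf{hc}(s)\bigr) = 1 \bigr]
   + \frac{1}{2} \cdot \Pr_{s \leftarrow \{0,1\}^n} \bigl[ D\bigl(f(s) \| \overline{\mathsf{hc}}(s)\bigr) = 1 \bigr],
\end{align*}
using the fact that f is a permutation for the second equality, and that a uniform bit r′ is equal to hc(s) with probability exactly 1/2 for the third equality. (Here $\overline{\mathsf{hc}}(s)$ denotes the complement of the bit hc(s).) Since
\[ \Pr_{s \leftarrow \{0,1\}^n} [D(G(s)) = 1] = \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \mathsf{hc}(s)) = 1] \]
(by definition of G), this means that Equation (7.3) is equivalent to
\[ \frac{1}{2} \cdot \Bigl( \Pr_{s \leftarrow \{0,1\}^n} \bigl[ D\bigl(f(s) \| \overline{\mathsf{hc}}(s)\bigr) = 1 \bigr] - \Pr_{s \leftarrow \{0,1\}^n} \bigl[ D\bigl(f(s) \| \mathsf{hc}(s)\bigr) = 1 \bigr] \Bigr) \le \mathsf{negl}(n). \]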
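Consider the following algorithm A that is given as input a value y = f(s) and tries to predict the value of hc(s):
1. Choose uniform r′ ∈ {0, 1}.
2. Run D(y∥r′). If D outputs 0, output r′; otherwise output $\bar{r}'$.
Clearly A runs in polynomial time. By definition of A, we have
\begin{align*}
\Pr_{s \leftarrow \{0,1\}^n} [A(f(s)) = \mathsf{hc}(s)]
  &= \frac{1}{2} \cdot \Pr_{s \leftarrow \{0,1\}^n} [A(f(s)) = \mathsf{hc}(s) \mid r' = \mathsf{hc}(s)]
   + \frac{1}{2} \cdot \Pr_{s \leftarrow \{0,1\}^n} [A(f(s)) = \mathsf{hc}(s) \mid r' \ne \mathsf{hc}(s)] \\
  &= \frac{1}{2} \cdot \Bigl( \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \mathsf{hc}(s)) = 0]
   + \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \overline{\mathsf{hc}}(s)) = 1] \Bigr) \\
  &= \frac{1}{2} \cdot \Bigl( \bigl( 1 - \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \mathsf{hc}(s)) = 1] \bigr)
   + \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \overline{\mathsf{hc}}(s)) = 1] \Bigr) \\
  &= \frac{1}{2} + \frac{1}{2} \cdot \Bigl( \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \overline{\mathsf{hc}}(s)) = 1]
   - \Pr_{s \leftarrow \{0,1\}^n} [D(f(s) \| \mathsf{hc}(s)) = 1] \Bigr).
\end{align*}
Since hc is a hard-core predicate of f, it follows that there exists a negligible function negl for which
\[ \frac{1}{2} \cdot \Bigl( \Pr_{s \leftarrow \{0,1\}^n} \bigl[ D\bigl(f(s) \| \overline{\mathsf{hc}}(s)\bigr) = 1 \bigr] - \Pr_{s \leftarrow \{0,1\}^n} \bigl[ D\bigl(f(s) \| \mathsf{hc}(s)\bigr) = 1 \bigr] \Bigr) \le \mathsf{negl}(n), \]
as desired.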
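The reduction from a distinguisher D to a predictor A can be sketched in a few lines of Python. Everything except the two-step predictor itself is a toy assumption chosen so the example runs: `toy_f` is the identity map (a permutation, but certainly not one-way), `toy_hc` is the parity of the input, and `toy_D` outputs 1 exactly when the final bit is inconsistent with the rest.

```python
import random

def make_predictor(D, rng):
    """Build A from D: choose uniform r', run D(y || r'), and output r' if D
    outputs 0 and the complement of r' otherwise."""
    def A(y):
        r_prime = rng.randrange(2)
        if D(y + [r_prime]) == 0:
            return r_prime
        return 1 - r_prime
    return A

def toy_f(s):
    """Toy permutation (assumption): the identity map, which is NOT one-way."""
    return list(s)

def toy_hc(s):
    """Toy predicate (assumption): parity of the bits of s."""
    acc = 0
    for b in s:
        acc ^= b
    return acc

def toy_D(t):
    """Toy distinguisher: output 1 iff the last bit differs from the parity of
    the first n bits, i.e., iff t cannot be of the form f(s) || hc(s)."""
    return 1 if t[-1] != toy_hc(t[:-1]) else 0
```

Because `toy_D` never outputs 1 on strings of the form f(s)∥hc(s) and always outputs 1 on f(s) followed by the complemented bit, the difference of probabilities in the derivation is 1 and the predictor is always correct, matching the formula 1/2 + 1/2 · 1.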
7.4.2 Increasing the Expansion Factor
We now show that the expansion factor of a pseudorandom generator can be increased by any desired (polynomial) amount. This means that the previous construction, with expansion factor l(n) = n + 1, suffices for constructing a pseudorandom generator with arbitrary (polynomial) expansion factor.
THEOREM 7.20 If there exists a pseudorandom generator G with expansion factor n + 1, then for any polynomial poly there exists a pseudorandom generator $\hat{G}$ with expansion factor poly(n).
PROOF We first consider constructing a pseudorandom generator $\hat{G}$ that outputs n + 2 bits. $\hat{G}$ works as follows: Given an initial seed $s \in \{0,1\}^n$, it computes $t_1 := G(s)$ to obtain n + 1 pseudorandom bits. The initial n bits of $t_1$ are then used again as a seed for G; the resulting n + 1 bits, concatenated with the final bit of $t_1$, yield the (n + 2)-bit output. (See Figure 7.1.) The second application of G uses a pseudorandom seed rather than a random one. The proof of security we give next shows that this does not impact the pseudorandomness of the output.
We now prove that $\hat{G}$ is a pseudorandom generator. Define three sequences of distributions $\{H_n^0\}_{n=1,\ldots}$, $\{H_n^1\}_{n=1,\ldots}$, and $\{H_n^2\}_{n=1,\ldots}$, where each of $H_n^0$, $H_n^1$, and $H_n^2$ is a distribution on strings of length n + 2. In distribution $H_n^0$, a uniform string $t_0 \in \{0,1\}^n$ is chosen and the output is $\hat{G}(t_0)$. In distribution $H_n^1$, a uniform string $t_1 \in \{0,1\}^{n+1}$ is chosen and parsed as $s_1 \| \sigma_1$ (where $s_1$ are the initial n bits of $t_1$ and $\sigma_1$ is the final bit). The output is $t_2 := G(s_1) \| \sigma_1$. In distribution $H_n^2$, the output is a uniform string $t_2 \in \{0,1\}^{n+2}$. We denote by $t_2 \leftarrow H_n^i$ the process of generating an (n + 2)-bit string $t_2$ according to distribution $H_n^i$.
Fix an arbitrary probabilistic polynomial-time distinguisher D. We first claim that there is a negligible function negl′ such that
\[ \Bigl| \Pr_{t_2 \leftarrow H_n^0} [D(t_2) = 1] - \Pr_{t_2 \leftarrow H_n^1} [D(t_2) = 1] \Bigr| \le \mathsf{negl}'(n). \tag{7.4} \]
To see this, consider the polynomial-time distinguisher D′ that, on input $t_1 \in \{0,1\}^{n+1}$, parses $t_1$ as $s_1 \| \sigma_1$ with $|s_1| = n$, computes $t_2 := G(s_1) \| \sigma_1$, and outputs $D(t_2)$. Clearly D′ runs in polynomial time. Observe that:
1. If $t_1$ is uniform, the distribution on $t_2$ generated by D′ is exactly that of distribution $H_n^1$. Thus,
\[ \Pr_{t_1 \leftarrow \{0,1\}^{n+1}} [D'(t_1) = 1] = \Pr_{t_2 \leftarrow H_n^1} [D(t_2) = 1]. \]
2. If $t_1 = G(s)$ for uniform $s \in \{0,1\}^n$, the distribution on $t_2$ generated by D′ is exactly that of distribution $H_n^0$. That is,
\[ \Pr_{s \leftarrow \{0,1\}^n} [D'(G(s)) = 1] = \Pr_{t_2 \leftarrow H_n^0} [D(t_2) = 1]. \]
Pseudorandomness of G implies that there is a negligible function negl′ with
\[ \Bigl| \Pr_{s \leftarrow \{0,1\}^n} [D'(G(s)) = 1] - \Pr_{t_1 \leftarrow \{0,1\}^{n+1}} [D'(t_1) = 1] \Bigr| \le \mathsf{negl}'(n). \]
Equation (7.4) follows.
We next claim that there is a negligible function negl′′ such that
\[ \Bigl| \Pr_{t_2 \leftarrow H_n^1} [D(t_2) = 1] - \Pr_{t_2 \leftarrow H_n^2} [D(t_2) = 1] \Bigr| \le \mathsf{negl}''(n). \tag{7.5} \]
To see this, consider the polynomial-time distinguisher D′′ that, on input $w \in \{0,1\}^{n+1}$, chooses uniform $\sigma_1 \in \{0,1\}$, sets $t_2 := w \| \sigma_1$, and outputs $D(t_2)$. If w is uniform then so is $t_2$; thus,
\[ \Pr_{w \leftarrow \{0,1\}^{n+1}} [D''(w) = 1] = \Pr_{t_2 \leftarrow H_n^2} [D(t_2) = 1]. \]
On the other hand, if w = G(s) for uniform $s \in \{0,1\}^n$, then $t_2$ is distributed exactly according to $H_n^1$ and so
\[ \Pr_{s \leftarrow \{0,1\}^n} [D''(G(s)) = 1] = \Pr_{t_2 \leftarrow H_n^1} [D(t_2) = 1]. \]
As before, pseudorandomness of G implies Equation (7.5). Putting everything together, we have
\begin{align*}
\Bigl| \Pr_{s \leftarrow \{0,1\}^n} [D(\hat{G}(s)) = 1] - \Pr_{r \leftarrow \{0,1\}^{n+2}} [D(r) = 1] \Bigr|
  &= \Bigl| \Pr_{t_2 \leftarrow H_n^0} [D(t_2) = 1] - \Pr_{t_2 \leftarrow H_n^2} [D(t_2) = 1] \Bigr| \tag{7.6} \\
  &\le \Bigl| \Pr_{t_2 \leftarrow H_n^0} [D(t_2) = 1] - \Pr_{t_2 \leftarrow H_n^1} [D(t_2) = 1] \Bigr|
   + \Bigl| \Pr_{t_2 \leftarrow H_n^1} [D(t_2) = 1] - \Pr_{t_2 \leftarrow H_n^2} [D(t_2) = 1] \Bigr| \\
  &\le \mathsf{negl}'(n) + \mathsf{negl}''(n),
\end{align*}
using Equations (7.4) and (7.5). Since D was an arbitrary polynomial-time distinguisher, this proves that $\hat{G}$ is a pseudorandom generator.
The general case. The same idea as above can be iteratively applied to generate as many pseudorandom bits as desired. Formally, say we wish to construct a pseudorandom generator $\hat{G}$ with expansion factor n + p(n), for some polynomial p. On input $s \in \{0,1\}^n$, algorithm $\hat{G}$ does (cf. Figure 7.1):
1. Set $t_0 := s$. For i = 1, …, p(n) do:
(a) Let $s_{i-1}$ be the first n bits of $t_{i-1}$, and let $\sigma_{i-1}$ denote the remaining i − 1 bits. (When i = 1, $s_0 = t_0$ and $\sigma_0$ is the empty string.)
(b) Set $t_i := G(s_{i-1}) \| \sigma_{i-1}$.
2. Output $t_{p(n)}$.
We show that $\hat{G}$ is a pseudorandom generator. The proof uses a common technique known as a hybrid argument. (Actually, even the case of p(n) = 2, above, used a simple hybrid argument.) The main difference with respect to the previous proof is a technical one. Previously, we could define and explicitly work with three sequences of distributions $\{H_n^0\}$, $\{H_n^1\}$, and $\{H_n^2\}$. Here that is not possible since the number of distributions to consider grows with n.
FIGURE 7.1: Increasing the expansion of a pseudorandom generator.
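The iterative construction of Ĝ can be sketched in Python as follows. This is only an illustration: bit strings are represented as Python strings of '0' and '1', and `toy_G` is a hypothetical stand-in for a generator with expansion factor n + 1, built from SHA-256 purely so that the code runs. Nothing here is claimed to be a proven pseudorandom generator.

```python
import hashlib


def toy_G(seed_bits: str) -> str:
    """Hypothetical stand-in for G: maps an n-bit string to n + 1 bits.

    Built from SHA-256 only so the sketch runs; it is not a proven
    pseudorandom generator.
    """
    n = len(seed_bits)
    out, counter = "", 0
    while len(out) < n + 1:
        digest = hashlib.sha256(f"{counter}:{seed_bits}".encode()).digest()
        out += "".join(f"{byte:08b}" for byte in digest)
        counter += 1
    return out[: n + 1]


def G_hat(seed_bits: str, p: int, G=toy_G) -> str:
    """Stretch an n-bit seed to n + p bits using p invocations of G."""
    n = len(seed_bits)
    t = seed_bits                           # t_0 := s
    for _ in range(p):                      # iterations i = 1, ..., p
        s_prev, sigma_prev = t[:n], t[n:]   # first n bits / remaining i-1 bits
        t = G(s_prev) + sigma_prev          # t_i := G(s_{i-1}) || sigma_{i-1}
    return t
```

Each iteration applies G only to the leading n bits and carries the bits already produced along unchanged, which is why the output grows by exactly one bit per invocation.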
For any $n$ and $0 \le j \le p(n)$, let $H_n^j$ be the distribution on strings of length $n + p(n)$ defined as follows: choose uniform $t_j \in \{0,1\}^{n+j}$, then run $\hat{G}$ starting from iteration $j + 1$ and output $t_{p(n)}$. (When $j = p(n)$ this means we simply choose uniform $t_{p(n)} \in \{0,1\}^{n+p(n)}$ and output it.) The crucial observation is that $H_n^0$ corresponds to outputting $\hat{G}(s)$ for uniform $s \in \{0,1\}^n$, while $H_n^{p(n)}$ corresponds to outputting a uniform $(n + p(n))$-bit string. Fixing any polynomial-time distinguisher $D$, this means that
\[ \left| \Pr_{s \leftarrow \{0,1\}^{n}}[D(\hat{G}(s)) = 1] - \Pr_{r \leftarrow \{0,1\}^{n+p(n)}}[D(r) = 1] \right| = \left| \Pr_{t \leftarrow H_n^0}[D(t) = 1] - \Pr_{t \leftarrow H_n^{p(n)}}[D(t) = 1] \right|. \tag{7.7} \]
We prove the above is negligible, hence $\hat{G}$ is a pseudorandom generator.
Fix $D$ as above, and consider the distinguisher $D'$ that does the following when given a string $w \in \{0,1\}^{n+1}$ as input:
1. Choose uniform $j \in \{1, \ldots, p(n)\}$.
2. Choose uniform $\sigma'_j \in \{0,1\}^{j-1}$. (When $j = 1$ then $\sigma'_j$ is the empty string.)
3. Set $t_j := w \| \sigma'_j$. Then run $\hat{G}$ starting from iteration $j + 1$ to compute $t_{p(n)} \in \{0,1\}^{n+p(n)}$. Output $D(t_{p(n)})$.
Clearly $D'$ runs in polynomial time. Analyzing the behavior of $D'$ is more complicated than before, although the underlying ideas are the same. Fix $n$ and say $D'$ chooses $j = j^*$. If $w$ is uniform, then $t_{j^*}$ is uniform and so the distribution on $t \stackrel{\mathrm{def}}{=} t_{p(n)}$ is exactly that of distribution $H_n^{j^*}$. That is,
\[ \Pr_{w \leftarrow \{0,1\}^{n+1}}[D'(w) = 1 \mid j = j^*] = \Pr_{t \leftarrow H_n^{j^*}}[D(t) = 1]. \]
Since each value for $j$ is chosen with equal probability,
\[
\begin{aligned}
\Pr_{w \leftarrow \{0,1\}^{n+1}}[D'(w) = 1]
&= \frac{1}{p(n)} \cdot \sum_{j^*=1}^{p(n)} \Pr_{w \leftarrow \{0,1\}^{n+1}}[D'(w) = 1 \mid j = j^*] \\
&= \frac{1}{p(n)} \cdot \sum_{j^*=1}^{p(n)} \Pr_{t \leftarrow H_n^{j^*}}[D(t) = 1].
\end{aligned}
\tag{7.8}
\]
On the other hand, say $D'$ chooses $j = j^*$ and $w = G(s)$ for uniform $s \in \{0,1\}^n$. Defining $t_{j^*-1} = s \| \sigma'_{j^*}$, we see that $t_{j^*-1}$ is uniform and so the experiment involving $D'$ is equivalent to running $\hat{G}$ from iteration $j^*$ to compute $t_{p(n)}$. That is, the distribution on $t \stackrel{\mathrm{def}}{=} t_{p(n)}$ is now exactly that of distribution $H_n^{j^*-1}$, and so
\[ \Pr_{s \leftarrow \{0,1\}^{n}}[D'(G(s)) = 1 \mid j = j^*] = \Pr_{t \leftarrow H_n^{j^*-1}}[D(t) = 1]. \]
Therefore,
\[
\begin{aligned}
\Pr_{s \leftarrow \{0,1\}^{n}}[D'(G(s)) = 1]
&= \frac{1}{p(n)} \cdot \sum_{j^*=1}^{p(n)} \Pr_{s \leftarrow \{0,1\}^{n}}[D'(G(s)) = 1 \mid j = j^*] \\
&= \frac{1}{p(n)} \cdot \sum_{j^*=1}^{p(n)} \Pr_{t \leftarrow H_n^{j^*-1}}[D(t) = 1] \\
&= \frac{1}{p(n)} \cdot \sum_{j^*=0}^{p(n)-1} \Pr_{t \leftarrow H_n^{j^*}}[D(t) = 1].
\end{aligned}
\tag{7.9}
\]
We can now analyze how well $D'$ distinguishes outputs of $G$ from random:
\[
\begin{aligned}
&\left| \Pr_{s \leftarrow \{0,1\}^{n}}[D'(G(s)) = 1] - \Pr_{w \leftarrow \{0,1\}^{n+1}}[D'(w) = 1] \right| \\
&\quad = \frac{1}{p(n)} \cdot \left| \sum_{j^*=0}^{p(n)-1} \Pr_{t \leftarrow H_n^{j^*}}[D(t) = 1] - \sum_{j^*=1}^{p(n)} \Pr_{t \leftarrow H_n^{j^*}}[D(t) = 1] \right| \\
&\quad = \frac{1}{p(n)} \cdot \left| \Pr_{t \leftarrow H_n^{0}}[D(t) = 1] - \Pr_{t \leftarrow H_n^{p(n)}}[D(t) = 1] \right|,
\end{aligned}
\tag{7.10}
\]
relying on Equations (7.8) and (7.9) for the first equality. (The second equality holds because the same terms are included in each sum, except for the first term of the left sum and the last term of the right sum.) Since G is a pseudorandom generator, the term on the left-hand side of Equation (7.10) is negligible; because p is polynomial, this implies that Equation (7.7) is negligible, completing the proof that Ĝ is a pseudorandom generator.
Putting it all together. Let f be a one-way permutation. Taking the pseudorandom generator with expansion factor n + 1 from Theorem 7.19, and increasing the expansion factor to n + l using the approach from the proof of Theorem 7.20, we obtain the following pseudorandom generator Gˆ:
\[ \hat{G}(s) = f^{(l)}(s) \,\|\, \mathsf{hc}(f^{(l-1)}(s)) \,\|\, \cdots \,\|\, \mathsf{hc}(s), \]
where $f^{(i)}(s)$ refers to $i$-fold iteration of $f$. Note that $\hat{G}$ uses $l$ evaluations of $f$, and generates one pseudorandom bit per evaluation using the hard-core predicate hc.
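To make the shape of this formula concrete, here is a toy sketch. Every parameter in it is an assumption chosen for illustration: the permutation is modular exponentiation x ↦ 5^x mod 23 on {1, ..., 22} (far too small to be one-way in any meaningful sense), and `hc` is an illustrative "upper half" predicate standing in for a hard-core predicate. The sketch works over integers rather than n-bit strings.

```python
P = 23     # toy prime (hypothetical; far too small for any security)
GEN = 5    # 5 is a primitive root modulo 23


def f(x: int) -> int:
    """Toy permutation on {1, ..., P-1}: x -> GEN^x mod P."""
    return pow(GEN, x, P)


def hc(x: int) -> int:
    """Illustrative predicate: 1 if x lies in the upper half of {1, ..., P-1}."""
    return 1 if x > (P - 1) // 2 else 0


def G_hat(s: int, l: int):
    """Return (f^(l)(s), [hc(f^(l-1)(s)), ..., hc(f(s)), hc(s)])."""
    bits = []
    x = s
    for _ in range(l):
        bits.append(hc(x))   # one output bit per evaluation of f
        x = f(x)
    bits.reverse()           # the formula lists hc(f^(l-1)(s)) first, hc(s) last
    return x, bits
```

The loop makes the cost claim visible: l iterations, each with one evaluation of f and one output bit.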
Connection to stream ciphers. Recall from Section 3.3.1 that a stream cipher (without an IV) is defined by algorithms (Init, GetBits), where Init takes a seed $s \in \{0,1\}^n$ and returns initial state st, and GetBits takes as input the current state st and outputs a bit σ and updated state st′. The construction $\hat{G}$ from the preceding proof fits nicely into this paradigm: take Init to be the trivial algorithm that outputs st = s, and define GetBits(st) to compute G(st), parse the result as st′∥σ with |st′| = n, and output the bit σ and updated state st′. (If we use this stream cipher to generate p(n) output bits starting from seed s, then we get exactly the final p(n) bits of $\hat{G}(s)$ in reverse order.) The preceding proof shows that this yields a pseudorandom generator.
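The correspondence between the stream cipher and Ĝ, including the reversed order of the output bits, can be checked with a small sketch. As before, `toy_G` is a hypothetical SHA-256-based stand-in for G (n bits to n + 1 bits), and the function names `init`, `get_bits`, and `keystream` are chosen here for illustration.

```python
import hashlib


def toy_G(seed_bits: str) -> str:
    """Hypothetical stand-in for G (n bits -> n + 1 bits, n + 1 <= 256)."""
    n = len(seed_bits)
    digest = hashlib.sha256(seed_bits.encode()).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[: n + 1]


def init(seed_bits: str) -> str:
    """Init: the trivial algorithm that outputs st = s."""
    return seed_bits


def get_bits(state: str):
    """GetBits: compute G(st), parse it as st' || sigma, return (sigma, st')."""
    out = toy_G(state)
    return out[-1], out[:-1]


def keystream(seed_bits: str, count: int) -> str:
    """Run GetBits `count` times and collect the output bits in order."""
    state, bits = init(seed_bits), []
    for _ in range(count):
        sigma, state = get_bits(state)
        bits.append(sigma)
    return "".join(bits)


def G_hat(seed_bits: str, p: int) -> str:
    """The iterated generator from the text, included for comparison."""
    n, t = len(seed_bits), seed_bits
    for _ in range(p):
        t = toy_G(t[:n]) + t[n:]
    return t
```

For any seed and any p, `keystream(seed, p)` equals the last p characters of `G_hat(seed, p)` read backwards, because Ĝ prepends each new block to the bits produced so far while the stream cipher emits them in order of generation.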
Hybrid arguments. A hybrid argument is a basic tool for proving indistinguishability when a basic primitive is (or several different primitives are) applied multiple times. Somewhat informally, the technique works by defining a series of intermediate “hybrid distributions” that bridge between two “extreme distributions” that we wish to prove indistinguishable. (In the proof above, these extreme distributions correspond to the output of $\hat{G}$ and a random string.) To apply the proof technique, three conditions should hold. First, the extreme distributions should match the original cases of interest. (In the proof above, $H_n^0$ was equal to the distribution induced by $\hat{G}$, while $H_n^{p(n)}$ was the uniform distribution.) Second, it must be possible to translate the capability of distinguishing consecutive hybrid distributions into breaking some underlying assumption. (Above, we essentially showed that distinguishing $H_n^i$ from $H_n^{i+1}$ was equivalent to distinguishing the output of G from random.) Finally, the number of hybrid distributions should be polynomial. See also Theorem 7.32.
7.5 Constructing Pseudorandom Functions
We now show how to construct a pseudorandom function from any (length-doubling) pseudorandom generator. Recall that a pseudorandom function is an efficiently computable, keyed function F that is indistinguishable from a truly random function in the sense described in Section 3.5.1. For simplicity, we restrict our attention here to the case where F is length preserving, meaning that for $k \in \{0,1\}^n$ the function $F_k$ maps n-bit inputs to n-bit outputs. A (length-preserving) pseudorandom function can be viewed, informally, as a pseudorandom generator with expansion factor $n \cdot 2^n$; given such a pseudorandom generator G we could define $F_k(i)$ (for $0 \le i < 2^n$) to be the ith n-bit block of G(k). The reason this does not work is that F must be efficiently computable; there are exponentially many blocks, and we need a way to compute the ith block without having to compute all other blocks.
We will do this by computing “blocks” of the output by walking down a binary tree. We exemplify the construction by first showing a pseudorandom function taking 2-bit inputs. Let G be a pseudorandom generator with expansion factor 2n. If we use G as in the proof of Theorem 7.20 we can obtain a pseudorandom generator $\hat{G}$ with expansion factor 4n that uses three invocations of G. (We produce n additional pseudorandom bits each time G is applied.) If we define $F'_k(i)$ (where $0 \le i < 4$ and i is encoded as a 2-bit binary string) to be the ith block of $\hat{G}(k)$, then computation of $F'_k(3)$ would require computing all of $\hat{G}$ and hence three invocations of G. We show how to construct a pseudorandom function F using only two invocations of G on any input.
Let G0 and G1 be functions denoting the first and second halves of the output of G; i.e., G(k) = G0(k) ∥ G1(k) where |G0(k)| = |G1(k)| = |k|. Define F as follows:
\[
\begin{aligned}
F_k(00) &= G_0(G_0(k)) & \qquad F_k(10) &= G_0(G_1(k)) \\
F_k(01) &= G_1(G_0(k)) & \qquad F_k(11) &= G_1(G_1(k)).
\end{aligned}
\]
We claim that the four strings above are pseudorandom even when viewed together. (This suffices to prove that F is pseudorandom.) Intuitively, this is because G0(k)∥G1(k) = G(k) is pseudorandom and hence indistinguishable from a uniform 2n-bit string k0∥k1. But then
G0(G0(k)) ∥ G1(G0(k)) ∥ G0(G1(k)) ∥ G1(G1(k)) is indistinguishable from
G0(k0) ∥ G1(k0) ∥ G0(k1) ∥ G1(k1) = G(k0) ∥ G(k1).
Since G is a pseudorandom generator, the above is indistinguishable from a uniform 4n-bit string. A formal proof uses a hybrid argument.
Generalizing this idea, we can obtain a pseudorandom function on n-bit inputs by defining
\[ F_k(x) = G_{x_n}(\cdots G_{x_1}(k) \cdots), \]
where x = x1 · · · xn; see Construction 7.21. The intuition for why this function is pseudorandom is the same as before, but the formal proof is complicated by the fact that there are now exponentially many inputs to consider.
It is useful to view this construction as defining, for each key $k \in \{0,1\}^n$, a complete binary tree of depth n in which each node contains an n-bit value. (See Figure 7.2, in which n = 3.) The root has value k, and for every internal node with value k′ its left child has value $G_0(k')$ and its right child has value $G_1(k')$. The result $F_k(x)$ for $x = x_1 \cdots x_n$ is then defined to be the value on the leaf node reached by traversing the tree according to the bits of x, where $x_i = 0$ means “go left” and $x_i = 1$ means “go right.” (The function is only defined for inputs of length n, and thus only values on the leaves are ever output.) The size of the tree is exponential in n. Nevertheless, to compute $F_k(x)$ the entire tree need not be constructed or stored; only n evaluations of G are needed.
FIGURE 7.2: Constructing a pseudorandom function.
CONSTRUCTION 7.21
Let G be a pseudorandom generator with expansion factor l(n) = 2n, and define $G_0, G_1$ as in the text. For $k \in \{0,1\}^n$, define the function $F_k : \{0,1\}^n \to \{0,1\}^n$ as:
\[ F_k(x_1 x_2 \cdots x_n) = G_{x_n}(\cdots(G_{x_2}(G_{x_1}(k)))\cdots). \]
A pseudorandom function from a pseudorandom generator.
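The tree walk in Construction 7.21 can be sketched as below. The generator `toy_G` is a hypothetical length-doubling stand-in built from SHA-256 (operating on bytes for convenience), not a proven pseudorandom generator. The construction fixes the input length to n = |k|; this sketch does not enforce that and simply walks one level per input bit, so that short examples such as the 2-bit case can be tried.

```python
import hashlib


def toy_G(k: bytes) -> bytes:
    """Hypothetical length-doubling stand-in for G (n bytes -> 2n bytes)."""
    n = len(k)
    out, counter = b"", 0
    while len(out) < 2 * n:
        out += hashlib.sha256(bytes([counter]) + k).digest()
        counter += 1
    return out[: 2 * n]


def G0(k: bytes) -> bytes:
    """First half of G(k)."""
    return toy_G(k)[: len(k)]


def G1(k: bytes) -> bytes:
    """Second half of G(k)."""
    return toy_G(k)[len(k):]


def F(k: bytes, x_bits: str) -> bytes:
    """Walk from the root k to the leaf addressed by x_bits."""
    value = k
    for bit in x_bits:                       # x_1 is applied first, x_n last
        value = G1(value) if bit == "1" else G0(value)
    return value
```

Only the nodes on the path from the root to the requested leaf are ever computed, which is the reason one call to G per input bit suffices.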
THEOREM 7.22 If G is a pseudorandom generator with expansion factor l(n) = 2n, then Construction 7.21 is a pseudorandom function.
PROOF We first show that for any polynomial t it is infeasible to distinguish t(n) uniform 2n-bit strings from t(n) pseudorandom strings; i.e., for any polynomial t and any ppt algorithm A, the following is negligible:
\[ \left| \Pr\!\left[A\!\left(r_1 \| \cdots \| r_{t(n)}\right) = 1\right] - \Pr\!\left[A\!\left(G(s_1) \| \cdots \| G(s_{t(n)})\right) = 1\right] \right|, \]
where the first probability is over uniform choice of $r_1, \ldots, r_{t(n)} \in \{0,1\}^{2n}$, and the second probability is over uniform choice of $s_1, \ldots, s_{t(n)} \in \{0,1\}^n$.
The proof is by a hybrid argument. Fix a polynomial t and a ppt algorithm A, and consider the following algorithm A′:
Distinguisher A′:
A′ is given as input a string $w \in \{0,1\}^{2n}$.
1. Choose uniform $j \in \{1, \ldots, t(n)\}$.
2. Choose uniform, independent values $r_1, \ldots, r_{j-1} \in \{0,1\}^{2n}$ and $s_{j+1}, \ldots, s_{t(n)} \in \{0,1\}^n$.
3. Output $A\!\left(r_1 \| \cdots \| r_{j-1} \| w \| G(s_{j+1}) \| \cdots \| G(s_{t(n)})\right)$.
For any $n$ and $0 \le i \le t(n)$, let $G_n^i$ denote the distribution on strings of length $2n \cdot t(n)$ in which the first $i$ “blocks” of length $2n$ are uniform and the remaining $t(n) - i$ blocks are pseudorandom. Note that $G_n^{t(n)}$ corresponds to the distribution in which all $t(n)$ blocks are uniform, while $G_n^0$ corresponds to the distribution in which all $t(n)$ blocks are pseudorandom. That is,
\[
\begin{aligned}
&\left| \Pr_{y \leftarrow G_n^{t(n)}}[A(y) = 1] - \Pr_{y \leftarrow G_n^{0}}[A(y) = 1] \right| \\
&\quad = \left| \Pr\!\left[A\!\left(r_1 \| \cdots \| r_{t(n)}\right) = 1\right] - \Pr\!\left[A\!\left(G(s_1) \| \cdots \| G(s_{t(n)})\right) = 1\right] \right|.
\end{aligned}
\tag{7.11}
\]
Say A′ chooses $j = j^*$. If its input $w$ is a uniform $2n$-bit string, then A is run on an input distributed according to $G_n^{j^*}$. If, on the other hand, $w = G(s)$ for uniform $s$, then A is run on an input distributed according to $G_n^{j^*-1}$. This means that
\[ \Pr_{r \leftarrow \{0,1\}^{2n}}[A'(r) = 1] = \frac{1}{t(n)} \cdot \sum_{j=1}^{t(n)} \Pr_{y \leftarrow G_n^{j}}[A(y) = 1] \]
and
\[ \Pr_{s \leftarrow \{0,1\}^{n}}[A'(G(s)) = 1] = \frac{1}{t(n)} \cdot \sum_{j=0}^{t(n)-1} \Pr_{y \leftarrow G_n^{j}}[A(y) = 1]. \]
Therefore,
\[
\begin{aligned}
&\left| \Pr_{r \leftarrow \{0,1\}^{2n}}[A'(r) = 1] - \Pr_{s \leftarrow \{0,1\}^{n}}[A'(G(s)) = 1] \right| \\
&\quad = \frac{1}{t(n)} \cdot \left| \Pr_{y \leftarrow G_n^{t(n)}}[A(y) = 1] - \Pr_{y \leftarrow G_n^{0}}[A(y) = 1] \right|.
\end{aligned}
\tag{7.12}
\]
Since G is a pseudorandom generator and A′ runs in polynomial time, we know that the left-hand side of Equation (7.12) must be negligible; because t(n) is polynomial, this implies that the left-hand side of Equation (7.11) is negligible as well.
Turning to the crux of the proof, we now show that F as in Construction 7.21 is a pseudorandom function. Let D be an arbitrary ppt distinguisher that is given $1^n$ as input. We show that D cannot distinguish between the case when it is given oracle access to a function that is equal to $F_k$ for a uniform k, or a function chosen uniformly from $\mathsf{Func}_n$. (See Section 3.5.1.) To do so, we use another hybrid argument. Here, we define a sequence of distributions over the values at the leaves of a complete binary tree of depth n. By associating each leaf with a string of length n as in Construction 7.21, we can equivalently view these as distributions over functions mapping n-bit inputs to n-bit outputs. For any $n$ and $0 \le i \le n$, let $H_n^i$ be the following distribution over the values at the leaves of a binary tree of depth n: first choose values for the nodes at level i independently and uniformly from $\{0,1\}^n$. Then for every node at level i or below with value k, its left child is given value $G_0(k)$ and its right child is given value $G_1(k)$. Note that $H_n^n$ corresponds to the distribution in which all values at the leaves are chosen uniformly and independently, and thus corresponds to choosing a uniform function from $\mathsf{Func}_n$, whereas $H_n^0$ corresponds to choosing a uniform key k in Construction 7.21 since in that case only the root (at level 0) is chosen uniformly. That is,
\[
\begin{aligned}
&\left| \Pr_{k \leftarrow \{0,1\}^{n}}[D^{F_k(\cdot)}(1^n) = 1] - \Pr_{f \leftarrow \mathsf{Func}_n}[D^{f(\cdot)}(1^n) = 1] \right| \\
&\quad = \left| \Pr_{f \leftarrow H_n^{0}}[D^{f(\cdot)}(1^n) = 1] - \Pr_{f \leftarrow H_n^{n}}[D^{f(\cdot)}(1^n) = 1] \right|.
\end{aligned}
\tag{7.13}
\]
We show that Equation (7.13) is negligible, completing the proof.
Let t = t(n) be a polynomial upper bound on the number of queries D makes to its oracle on input $1^n$. Define a distinguisher A that tries to distinguish t(n) uniform 2n-bit strings from t(n) pseudorandom strings, as follows:
Distinguisher A:
A is given as input a $2n \cdot t(n)$-bit string $w_1 \| \cdots \| w_{t(n)}$.
1. Choose uniform $j \in \{0, \ldots, n-1\}$. In what follows, A (implicitly) maintains a binary tree of depth n with n-bit values at (a subset of the) internal nodes at depth j + 1 and below.
2. Run $D(1^n)$. When D makes oracle query $x = x_1 \cdots x_n$, look at the prefix $x_1 \cdots x_j$. There are two cases:
• If D has never made a query with this prefix before, then use $x_1 \cdots x_j$ to reach a node v on the jth level of the tree. Take the next unused 2n-bit string w and set the value of the left child of node v to the left half of w, and the value of the right child of v to the right half of w.
• If D has made a query with prefix $x_1 \cdots x_j$ before, then node $x_1 \cdots x_{j+1}$ has already been assigned a value.
Using the value at node $x_1 \cdots x_{j+1}$, compute the value at the leaf corresponding to $x_1 \cdots x_n$ as in Construction 7.21, and return this value to D.
3. When execution of D is done, output the bit returned by D.
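The way A answers D's queries without ever building the full tree can be sketched as a lazily filled dictionary. This is an illustration under assumptions: `toy_G` is a hypothetical SHA-256-based stand-in for G on byte strings, node labels are bit-string prefixes, and the class and method names are invented here. The random choice of j and the interaction with D are left out; the sketch covers only step 2.

```python
import hashlib


def toy_G(k: bytes) -> bytes:
    """Hypothetical length-doubling stand-in for G (valid for len(k) <= 16)."""
    return hashlib.sha256(k).digest()[: 2 * len(k)]


def walk(value: bytes, bits: str) -> bytes:
    """Descend from a node holding `value`: G_0 on bit 0, G_1 on bit 1."""
    for bit in bits:
        out = toy_G(value)
        half = len(value)
        value = out[half:] if bit == "1" else out[:half]
    return value


class LazyTreeOracle:
    """Answers oracle queries as distinguisher A does, for a fixed level j."""

    def __init__(self, j: int, blocks):
        self.j = j
        self.unused = iter(blocks)   # the input strings w_1, w_2, ...
        self.values = {}             # node label (bit string) -> node value

    def query(self, x_bits: str) -> bytes:
        prefix = x_bits[: self.j]
        child = x_bits[: self.j + 1]
        if child not in self.values:
            # First query with this prefix: consume the next unused block
            # and assign its halves to the two children of node `prefix`.
            w = next(self.unused)
            half = len(w) // 2
            self.values[prefix + "0"] = w[:half]
            self.values[prefix + "1"] = w[half:]
        return walk(self.values[child], x_bits[self.j + 1:])
```

With j = 0 and the single block G(k), the oracle's answers coincide with F_k, matching the description of the hybrid at level 0. Each new prefix assigns exactly two nodes, so the dictionary never holds more than twice as many entries as queries made.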
A runs in polynomial time. It is important here that A does not need to store the entire binary tree of exponential size. Instead, it “fills in” the values of at most 2t(n) nodes in the tree. Say A chooses $j = j^*$. Observe that:
1. If A’s input is a uniform $2n \cdot t(n)$-bit string, then the answers it gives to D are distributed exactly as if D were interacting with a function chosen from distribution $H_n^{j^*+1}$. This holds because the values of the nodes at level $j^* + 1$ of the tree are uniform and independent.
2. If A’s input consists of t(n) pseudorandom strings (i.e., $w_i = G(s_i)$ for uniform seed $s_i$), then the answers it gives to D are distributed exactly as if D were interacting with a function chosen from distribution $H_n^{j^*}$. This holds because the values of the nodes at level $j^*$ of the tree (namely, the s-values) are uniform and independent. (These s-values are unknown to A, but this makes no difference.)
Proceeding as before, one can show that
\[
\begin{aligned}
&\left| \Pr\!\left[A\!\left(r_1 \| \cdots \| r_{t(n)}\right) = 1\right] - \Pr\!\left[A\!\left(G(s_1) \| \cdots \| G(s_{t(n)})\right) = 1\right] \right| \\
&\quad = \frac{1}{n} \cdot \left| \Pr_{f \leftarrow H_n^{0}}[D^{f(\cdot)}(1^n) = 1] - \Pr_{f \leftarrow H_n^{n}}[D^{f(\cdot)}(1^n) = 1] \right|.
\end{aligned}
\tag{7.14}
\]
We have shown earlier that Equation (7.14) must be negligible. The above thus implies that Equation (7.13) must be negligible as well.
7.6 Constructing (Strong) Pseudorandom Permutations
We next show how pseudorandom permutations and strong pseudorandom permutations can be constructed from any pseudorandom function. Recall from Section 3.5.1 that a pseudorandom permutation is a pseudorandom function that is also efficiently invertible, while a strong pseudorandom permutation is additionally hard to distinguish from a random permutation even by an adversary given oracle access to both the permutation and its inverse.
Feistel networks revisited. A Feistel network, introduced in Section 6.2.2, provides a way of constructing an invertible function from an arbitrary set of functions. A Feistel network operates in a series of rounds. The input to the ith round is a string of length 2n, divided into two n-bit halves Li−1 and Ri−1 (the “left half” and the “right half,” respectively). The output of the ith round is the 2n-bit string (Li,Ri) where
\[ L_i := R_{i-1} \quad \text{and} \quad R_i := L_{i-1} \oplus f_i(R_{i-1}) \]
for some efficiently computable (but not necessarily invertible) function $f_i$ mapping n-bit inputs to n-bit outputs. We denote by $\mathsf{Feistel}_{f_1,\ldots,f_r}$ the r-round Feistel network using functions $f_1, \ldots, f_r$. (That is, $\mathsf{Feistel}_{f_1,\ldots,f_r}(L_0, R_0)$ outputs the 2n-bit string $(L_r, R_r)$.) We saw in Section 6.2.2 that $\mathsf{Feistel}_{f_1,\ldots,f_r}$ is an efficiently invertible permutation regardless of the $\{f_i\}$.
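The round equations, and the fact that they can be undone whatever the round functions are, can be sketched as follows. The two round functions `f_const` and `f_hash` are hypothetical examples chosen because neither is invertible; halves are represented as equal-length byte strings.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(round_funcs, left: bytes, right: bytes):
    """One round per function: L_i := R_{i-1}, R_i := L_{i-1} XOR f_i(R_{i-1})."""
    for f in round_funcs:
        left, right = right, xor(left, f(right))
    return left, right


def feistel_inverse(round_funcs, left: bytes, right: bytes):
    """Undo the rounds in reverse: R_{i-1} = L_i, L_{i-1} = R_i XOR f_i(L_i)."""
    for f in reversed(round_funcs):
        left, right = xor(right, f(left)), left
    return left, right


# Hypothetical round functions on n-byte halves; neither is invertible.
def f_const(r: bytes) -> bytes:
    return b"\x00" * len(r)


def f_hash(r: bytes) -> bytes:
    return hashlib.sha256(r).digest()[: len(r)]
```

Inversion never needs to invert a round function: since $L_i = R_{i-1}$, the value $f_i(R_{i-1})$ can be recomputed from the round's output and XORed away.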
We can define a keyed permutation by using a Feistel network in which the $\{f_i\}$ depend on a key. For example, let $F : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n$ be a pseudorandom function, and define the keyed permutation $F^{(1)}$ as
\[ F^{(1)}_k(x) \stackrel{\mathrm{def}}{=} \mathsf{Feistel}_{F_k}(x). \]
(Note that $F^{(1)}_k$ has an n-bit key and maps 2n-bit inputs to 2n-bit outputs.)
Is $F^{(1)}$ pseudorandom? A little thought shows that it is decidedly not. For any key $k \in \{0,1\}^n$, the first n bits of the output of $F^{(1)}_k$ (that is, $L_1$) are equal to the last n bits of the input (i.e., $R_0$), something that occurs with only negligible probability for a random function.
Trying again, define $F^{(2)} : \{0,1\}^{2n} \times \{0,1\}^{2n} \to \{0,1\}^{2n}$ as follows:
\[ F^{(2)}_{k_1,k_2}(x) \stackrel{\mathrm{def}}{=} \mathsf{Feistel}_{F_{k_1},F_{k_2}}(x). \tag{7.15} \]
(Note that $k_1$ and $k_2$ are independent keys.) Unfortunately, $F^{(2)}$ is not pseudorandom either, as you are asked to show in Exercise 7.16.
Given this, it may be somewhat surprising that a three-round Feistel network is pseudorandom. Define the keyed permutation $F^{(3)}$, taking a key of length 3n and mapping 2n-bit inputs to 2n-bit outputs, as follows:
\[ F^{(3)}_{k_1,k_2,k_3}(x) \stackrel{\mathrm{def}}{=} \mathsf{Feistel}_{F_{k_1},F_{k_2},F_{k_3}}(x) \tag{7.16} \]
where, once again, $k_1$, $k_2$, and $k_3$ are independent. We have:
THEOREM 7.23 If F is a pseudorandom function, then $F^{(3)}$ is a pseudorandom permutation.
FIGURE 7.3: A three-round Feistel network, as used to construct a pseudorandom permutation from a pseudorandom function.
PROOF In the standard way, we can replace the pseudorandom functions used in the construction of F(3) with functions chosen uniformly at random instead. Pseudorandomness of F implies that this has only a negligible effect on the output of any probabilistic polynomial-time distinguisher interacting with F (3) as an oracle. We leave the details as an exercise.
Let D be a probabilistic polynomial-time distinguisher. In the remainder of the proof, we show the following is negligible:
\[ \left| \Pr[D^{\mathsf{Feistel}_{f_1,f_2,f_3}(\cdot)}(1^n) = 1] - \Pr[D^{\pi(\cdot)}(1^n) = 1] \right|, \]
where the first probability is taken over uniform and independent choice of $f_1, f_2, f_3$ from $\mathsf{Func}_n$, and the second probability is taken over uniform choice of π from $\mathsf{Perm}_{2n}$. Fix some value for the security parameter n, and let q = q(n) denote a polynomial upper bound on the number of oracle queries made by D. We assume without loss of generality that D never makes the same oracle query twice. Focusing on D’s interaction with $\mathsf{Feistel}_{f_1,f_2,f_3}(\cdot)$, let $(L_0^i, R_0^i)$ denote the ith query D makes to its oracle, and let $(L_1^i, R_1^i)$, $(L_2^i, R_2^i)$, and $(L_3^i, R_3^i)$ denote the intermediate values after rounds 1, 2, and 3, respectively, that result from that query. (See Figure 7.3.) Note that D chooses $(L_0^i, R_0^i)$ and sees the result $(L_3^i, R_3^i)$, but does not directly observe $(L_1^i, R_1^i)$ or $(L_2^i, R_2^i)$.
We say there is a collision at $R_1$ if $R_1^i = R_1^j$ for some distinct i, j. We first prove that a collision at $R_1$ occurs with only negligible probability. Consider any fixed, distinct i, j. If $R_0^i = R_0^j$ then $L_0^i \ne L_0^j$, but then
\[ R_1^i = L_0^i \oplus f_1(R_0^i) \ne L_0^j \oplus f_1(R_0^j) = R_1^j. \]
If $R_0^i \ne R_0^j$ then $f_1(R_0^i)$ and $f_1(R_0^j)$ are uniform and independent, so
\[ \Pr\!\left[L_0^i \oplus f_1(R_0^i) = L_0^j \oplus f_1(R_0^j)\right] = \Pr\!\left[f_1(R_0^j) = L_0^i \oplus f_1(R_0^i) \oplus L_0^j\right] = 2^{-n}. \]
Taking a union bound over all distinct i, j shows that the probability of a collision at $R_1$ is at most $q^2/2^n$.
Say there is a collision at $R_2$ if $R_2^i = R_2^j$ for some distinct i, j. We prove that conditioned on no collision at $R_1$, the probability of a collision at $R_2$ is negligible. The analysis is as above: consider any fixed i, j, and note that if there is no collision at $R_1$ then $R_1^i \ne R_1^j$. Thus $f_2(R_1^i)$ and $f_2(R_1^j)$ are uniform and independent, and therefore
\[ \Pr\!\left[L_1^i \oplus f_2(R_1^i) = L_1^j \oplus f_2(R_1^j) \mid \text{no collision at } R_1\right] = 2^{-n}. \]
(Note that $f_2$ is independent of $f_1$, making the above calculation easy.) Taking a union bound over all distinct i, j gives
\[ \Pr[\text{collision at } R_2 \mid \text{no collision at } R_1] \le q^2/2^n. \]
Note that $L_3^i = R_2^i = L_1^i \oplus f_2(R_1^i)$; so, conditioned on there being no collision at $R_1$, the values $L_3^1, \ldots, L_3^q$ are all independent and uniformly distributed in $\{0,1\}^n$. If we additionally condition on the event that there is no collision at $R_2$, then the values $L_3^1, \ldots, L_3^q$ are uniformly distributed among all sequences of q distinct values in $\{0,1\}^n$. Similarly, $R_3^i = L_2^i \oplus f_3(R_2^i)$; thus, conditioned on there being no collision at $R_2$, the values $R_3^1, \ldots, R_3^q$ are all uniformly distributed in $\{0,1\}^n$, independent of each other as well as $L_3^1, \ldots, L_3^q$.
To summarize: when querying $F^{(3)}$ (with uniform round functions) on a series of q distinct inputs, except with negligible probability the output values $(L_3^1, R_3^1), \ldots, (L_3^q, R_3^q)$ are distributed such that the $\{L_3^i\}$ are uniform and independent, but distinct, n-bit values, and the $\{R_3^i\}$ are uniform and independent n-bit values. In contrast, when querying a random permutation on a series of q distinct inputs, the output values $(L_3^1, R_3^1), \ldots, (L_3^q, R_3^q)$ are uniform and independent, but distinct, 2n-bit values. The best distinguishing attack for D, then, is to guess that it is interacting with a random permutation if $L_3^i = L_3^j$ for some distinct i, j. But that event occurs with negligible probability even in that case. This can be turned into a formal proof.
$F^{(3)}$ is not a strong pseudorandom permutation, as you are asked to demonstrate in Exercise 7.17. Fortunately, adding a fourth round does yield a strong pseudorandom permutation. The details are given as Construction 7.24.
THEOREM 7.25 If F is a pseudorandom function, then Construction 7.24 is a strong pseudorandom permutation that maps 2n-bit inputs to 2n-bit outputs (and uses a 4n-bit key).
CONSTRUCTION 7.24
Let F be a keyed, length-preserving function. Define the keyed permutation $F^{(4)}$ as follows:
• Inputs: A key $k = (k_1, k_2, k_3, k_4)$ with $|k_i| = n$, and an input $x \in \{0,1\}^{2n}$ parsed as $(L_0, R_0)$ with $|L_0| = |R_0| = n$.
• Computation:
1. Compute $L_1 := R_0$ and $R_1 := L_0 \oplus F_{k_1}(R_0)$.
2. Compute $L_2 := R_1$ and $R_2 := L_1 \oplus F_{k_2}(R_1)$.
3. Compute $L_3 := R_2$ and $R_3 := L_2 \oplus F_{k_3}(R_2)$.
4. Compute $L_4 := R_3$ and $R_4 := L_3 \oplus F_{k_4}(R_3)$.
5. Output $(L_4, R_4)$.
A strong pseudorandom permutation from any pseudorandom function.
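A runnable sketch of the four-round computation, together with its inverse, is given below. The keyed function `F` here is an assumption made for illustration: HMAC-SHA256 truncated to the input length is used as a stand-in for the length-preserving pseudorandom function, with n taken to be 16 bytes. The names `F4` and `F4_inverse` are likewise chosen here.

```python
import hashlib
import hmac


def F(k: bytes, x: bytes) -> bytes:
    """Hypothetical stand-in for F_k: HMAC-SHA256 truncated to len(x) <= 32 bytes."""
    return hmac.new(k, x, hashlib.sha256).digest()[: len(x)]


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def F4(keys, x: bytes) -> bytes:
    """Four Feistel rounds, one key per round; x is parsed as (L_0, R_0)."""
    n = len(x) // 2
    left, right = x[:n], x[n:]
    for k in keys:                               # k_1, k_2, k_3, k_4 in order
        left, right = right, xor(left, F(k, right))
    return left + right


def F4_inverse(keys, y: bytes) -> bytes:
    """Undo the rounds using the keys in reverse order."""
    n = len(y) // 2
    left, right = y[:n], y[n:]
    for k in reversed(keys):
        left, right = xor(right, F(k, left)), left
    return left + right
```

For example, with four 16-byte keys and a 32-byte input x, `F4_inverse(keys, F4(keys, x))` returns x. The inverse is what a strong pseudorandom permutation must additionally expose to the adversary as an oracle.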
7.7 Assumptions for Private-Key Cryptography
We have shown that (1) if there exist one-way permutations, then there exist pseudorandom generators; (2) if there exist pseudorandom generators, then there exist pseudorandom functions; and (3) if there exist pseudorandom functions, then there exist (strong) pseudorandom permutations. Although we did not prove it here, it is possible to construct pseudorandom generators from one-way functions. We thus have the following fundamental theorem:
THEOREM 7.26 If one-way functions exist, then so do pseudorandom generators, pseudorandom functions, and strong pseudorandom permutations.
All the private-key schemes we have studied in Chapters 3 and 4 can be constructed from pseudorandom generators/functions. We therefore have:
THEOREM 7.27 If one-way functions exist, then so do CCA-secure private-key encryption schemes and secure message authentication codes.
That is, one-way functions are sufficient for all private-key cryptography. Here, we show that one-way functions are also necessary.
Pseudorandomness implies one-way functions. We begin by showing that pseudorandom generators imply the existence of one-way functions:
PROPOSITION 7.28 If a pseudorandom generator exists, then so does a one-way function.
PROOF Let G be a pseudorandom generator with expansion factor l(n) = 2n. (By Theorem 7.20, we know that the existence of a pseudorandom generator implies the existence of one with this expansion factor.) We show that G itself is one-way. Efficient computability is straightforward (since G can be computed in polynomial time). We show that the ability to invert G can be translated into the ability to distinguish the output of G from uniform. Intuitively, this holds because the ability to invert G implies the ability to find the seed used by the generator.
Let A be an arbitrary probabilistic polynomial-time algorithm. We show that Pr[InvertA,G(n) = 1] is negligible (cf. Definition 7.1). To see this, consider the following ppt distinguisher D: on input a string w ∈ {0, 1}2n, run A(w) to obtain output s. If G(s) = w then output 1; otherwise, output 0.
We now analyze the behavior of D. First consider the probability that D outputs 1 when its input string w is uniform. Since there are at most $2^n$ values in the range of G (namely, the values $\{G(s)\}_{s \in \{0,1\}^n}$), the probability that w is in the range of G is at most $2^n/2^{2n} = 2^{-n}$. When w is not in the range of G, it is impossible for A to compute an inverse of w and thus impossible for D to output 1. We conclude that
\[ \Pr_{w \leftarrow \{0,1\}^{2n}}[D(w) = 1] \le 2^{-n}. \]
On the other hand, if w = G(s) for a seed $s \in \{0,1\}^n$ chosen uniformly at random then, by definition, A computes a correct inverse (and so D outputs 1) with probability exactly equal to $\Pr[\mathsf{Invert}_{A,G}(n) = 1]$. Thus,
\[ \left| \Pr_{w \leftarrow \{0,1\}^{2n}}[D(w) = 1] - \Pr_{s \leftarrow \{0,1\}^{n}}[D(G(s)) = 1] \right| \ge \Pr[\mathsf{Invert}_{A,G}(n) = 1] - 2^{-n}. \]
Since G is a pseudorandom generator, the above must be negligible. Since $2^{-n}$ is negligible, this implies that $\Pr[\mathsf{Invert}_{A,G}(n) = 1]$ is negligible as well and so G is one-way.
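The reduction in this proof can be exercised at toy scale. In the sketch below everything is an assumption made for illustration: the seed length is N = 8 bits so that all seeds can be enumerated, `toy_G` is a SHA-256-based stand-in mapping N bits to 2N bits, and the inverter `A` simply looks up a precomputed table, something that is only possible because the parameters are tiny.

```python
import hashlib

N = 8  # toy seed length in bits, small enough to enumerate every seed


def toy_G(s: int) -> int:
    """Hypothetical stand-in for G: an N-bit seed -> a 2N-bit string (as ints)."""
    digest = hashlib.sha256(s.to_bytes(1, "big")).digest()
    return int.from_bytes(digest[:2], "big")


# Inverter A: at this toy size every seed can be tabulated, so inversion is
# trivial. For a real G no efficient inverter is assumed to exist.
_TABLE = {toy_G(s): s for s in range(2 ** N)}


def A(w: int) -> int:
    return _TABLE.get(w, 0)   # arbitrary guess when w has no preimage


def D(w: int) -> int:
    """Output 1 exactly when A's answer is a preimage of w under G."""
    s = A(w)
    return 1 if toy_G(s) == w else 0
```

On every output of `toy_G` the distinguisher answers 1, while over all $2^{2N}$ possible inputs it answers 1 on at most $2^N$ of them, which is the counting step used in the proof.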
Non-trivial private-key encryption implies one-way functions. Proposition 7.28 does not imply that one-way functions are needed for constructing secure private-key encryption schemes, since it may be possible to construct the latter without relying on a pseudorandom generator. Furthermore, it is possible to construct perfectly secret encryption schemes (see Chapter 2), as long as the plaintext is no longer than the key. Thus, a proof that secure private-key encryption implies one-way functions requires more care.
PROPOSITION 7.29 If there exists an EAV-secure private-key encryption scheme that encrypts messages twice as long as its key, then a one-way function exists.
PROOF Let Π = (Enc, Dec) be a private-key encryption scheme that has indistinguishable encryptions in the presence of an eavesdropper and encrypts messages of length 2n when the key has length n. (We assume for simplicity that the key is chosen uniformly.) Say that when an n-bit key is used, Enc uses at most l(n) bits of randomness. Denote the encryption of a message m using key k and randomness r by Enck(m;r).
Define the following function f:
f(k,m,r) = Enck(m;r)∥m,
where |k| = n, |m| = 2n, and |r| = l(n). We claim that f is a one-way function. Clearly it can be efficiently computed; we show that it is hard to invert. Letting A be an arbitrary ppt algorithm, we show that $\Pr[\mathsf{Invert}_{A,f}(n) = 1]$ is negligible (cf. Definition 7.1).
Consider the following probabilistic polynomial-time adversary A′ attacking private-key encryption scheme Π (i.e., in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{\Pi,A'}(n)$):
Adversary A′($1^n$)
1. Choose uniform $m_0, m_1 \leftarrow \{0,1\}^{2n}$ and output them. Receive in return a challenge ciphertext c.
2. Run $A(c \| m_0)$ to obtain (k′, m′, r′). If $f(k', m', r') = c \| m_0$, output 0; else, output 1.
We now analyze the behavior of A′. When c is an encryption of $m_0$, then $c \| m_0$ is distributed exactly as $f(k, m_0, r)$ for uniform k, $m_0$, and r. Therefore, A outputs a valid inverse of $c \| m_0$ (and hence A′ outputs 0) with probability exactly equal to $\Pr[\mathsf{Invert}_{A,f}(n) = 1]$.
On the other hand, when c is an encryption of $m_1$ then c is independent of $m_0$. For any fixed value of the challenge ciphertext c, there are at most $2^n$ possible messages (one for each possible key) to which c can correspond. Since $m_0$ is a uniform 2n-bit string, this means the probability there exists some key k for which $\mathsf{Dec}_k(c) = m_0$ is at most $2^n/2^{2n} = 2^{-n}$. This gives an upper bound on the probability with which A can possibly output a valid inverse of $c \| m_0$ under f, and hence an upper bound on the probability with which A′ outputs 0 in that case.
Putting the above together, we have:
\[
\begin{aligned}
\Pr\!\left[\mathsf{PrivK}^{\mathsf{eav}}_{\Pi,A'}(n) = 1\right]
&= \frac{1}{2} \cdot \Pr[A' \text{ outputs } 0 \mid b = 0] + \frac{1}{2} \cdot \Pr[A' \text{ outputs } 1 \mid b = 1] \\
&\ge \frac{1}{2} \cdot \Pr[\mathsf{Invert}_{A,f}(n) = 1] + \frac{1}{2} \cdot \left(1 - 2^{-n}\right) \\
&= \frac{1}{2} + \frac{1}{2} \cdot \left(\Pr[\mathsf{Invert}_{A,f}(n) = 1] - 2^{-n}\right).
\end{aligned}
\]
Security of Π means that $\Pr[\mathsf{PrivK}^{\mathsf{eav}}_{\Pi,A'}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n)$ for some negligible function negl. This, in turn, implies that $\Pr[\mathsf{Invert}_{A,f}(n) = 1]$ is negligible, completing the proof that f is one-way.
Message authentication codes imply one-way functions. It is also true that message authentication codes satisfying Definition 4.2 imply the existence of one-way functions. As in the case of private-key encryption, a proof of this fact is somewhat subtle because unconditional message authentication codes do exist when there is an a priori bound on the number of messages that will be authenticated. (See Section 4.6.) Thus, a proof relies on the fact that Definition 4.2 requires security even when the adversary sees the authentication tags of an arbitrary (polynomial) number of messages. The proof is somewhat involved, so we do not give it here.
Discussion. We conclude that the existence of one-way functions is necessary and sufficient for all (non-trivial) private-key cryptography. In other words, one-way functions are a minimal assumption as far as private-key cryptography is concerned. Interestingly, this appears not to be the case for hash functions and public-key encryption, where one-way functions are known to be necessary but are not known (or believed) to be sufficient.
7.8 Computational Indistinguishability
The notion of computational indistinguishability is central to the theory of cryptography, and it underlies much of what we have seen in Chapter 3 and this chapter. Informally, two probability distributions are computationally indistinguishable if no efficient algorithm can tell them apart (or distinguish them). In more detail, consider two distributions X and Y over strings of some length l; that is, X and Y each assigns some probability to every string in $\{0,1\}^l$. When we say that some algorithm D cannot distinguish these two distributions, we mean that D cannot tell whether it is given a string sampled according to distribution X or whether it is given a string sampled according to distribution Y. Put differently, if we imagine D outputting “0” when it believes its input was sampled according to X and outputting “1” if it thinks its input was sampled according to Y, then the probability that D outputs “1” should be roughly the same regardless of whether D is provided with a sample from X or from Y. In other words, we want
$$\left| \Pr_{s \leftarrow X}[D(s) = 1] - \Pr_{s \leftarrow Y}[D(s) = 1] \right|$$
to be small.
This should be reminiscent of the way we defined pseudorandom generators and, indeed, we will soon formally redefine the notion of a pseudorandom generator using this terminology.
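The quantity above can be estimated empirically for toy distributions. The following is a minimal sketch, not part of the formal treatment: the two distributions, the two distinguishers, and all function names are illustrative assumptions, and a sampling estimate says nothing about what every efficient distinguisher can achieve.

```python
import random

def estimate_advantage(distinguisher, sample_x, sample_y, trials=20000, seed=0):
    """Estimate |Pr[D(s)=1 : s<-X] - Pr[D(s)=1 : s<-Y]| by sampling."""
    rng = random.Random(seed)
    hits_x = sum(distinguisher(sample_x(rng)) for _ in range(trials))
    hits_y = sum(distinguisher(sample_y(rng)) for _ in range(trials))
    return abs(hits_x - hits_y) / trials

ELL = 8  # string length l (illustrative choice)

def sample_uniform(rng):
    # X: the uniform distribution over {0,1}^ELL
    return [rng.randrange(2) for _ in range(ELL)]

def sample_even_parity(rng):
    # Y: uniform over strings of even parity (the last bit fixes the parity)
    bits = [rng.randrange(2) for _ in range(ELL - 1)]
    return bits + [sum(bits) % 2]

def parity_test(s):
    # D outputs 1 exactly when its input has odd parity
    return sum(s) % 2

def first_bit(s):
    # A weak distinguisher that only looks at the first bit
    return s[0]

adv_parity = estimate_advantage(parity_test, sample_uniform, sample_even_parity)
adv_first = estimate_advantage(first_bit, sample_uniform, sample_even_parity)
```

Here the parity test outputs 1 about half the time on X and never on Y, so its estimated advantage is close to 1/2, whereas the first-bit test has advantage close to 0. The two distributions are therefore far from indistinguishable, even though one particular distinguisher fails to separate them.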
The formal definition of computational indistinguishability refers to probability ensembles, which are infinite sequences of probability distributions. (This formalism is necessary for a meaningful asymptotic approach.) Although the notion can be generalized, for our purposes we consider probability ensembles in which the underlying distributions are indexed by natural numbers. If for every natural number n we have a distribution $X_n$, then $X = \{X_n\}_{n \in \mathbb{N}}$ is a probability ensemble. It is often the case that $X_n = Y_{t(n)}$ for some function t, in which case we write $\{Y_{t(n)}\}_{n \in \mathbb{N}}$ in place of $\{X_n\}_{n \in \mathbb{N}}$.
We will only be interested in efficiently sampleable probability ensembles. An ensemble $X = \{X_n\}_{n \in \mathbb{N}}$ is efficiently sampleable if there is a probabilistic polynomial-time algorithm S such that the random variables $S(1^n)$ and $X_n$ are identically distributed. That is, algorithm S is an efficient way of sampling X.
We can now formally define what it means for two ensembles to be computationally indistinguishable.
DEFINITION 7.30 Two probability ensembles $X = \{X_n\}_{n \in \mathbb{N}}$ and $Y = \{Y_n\}_{n \in \mathbb{N}}$ are computationally indistinguishable, denoted $X \stackrel{c}{\equiv} Y$, if for every probabilistic polynomial-time distinguisher D there exists a negligible function negl such that:
$$\left| \Pr_{x \leftarrow X_n}[D(1^n, x) = 1] - \Pr_{y \leftarrow Y_n}[D(1^n, y) = 1] \right| \le \mathsf{negl}(n).$$
In the definition, D is given the unary input $1^n$ so it can run in time polynomial in n. This is important when the outputs of $X_n$ and $Y_n$ may have length less than n. As shorthand in probability expressions, we will sometimes write X as a placeholder for a random sample from distribution X. That is, we would write $\Pr[D(1^n, X_n) = 1]$ in place of $\Pr_{x \leftarrow X_n}[D(1^n, x) = 1]$.
The relation of computational indistinguishability is transitive: if $X \stackrel{c}{\equiv} Y$ and $Y \stackrel{c}{\equiv} Z$, then $X \stackrel{c}{\equiv} Z$.
Pseudorandomness and pseudorandom generators. Pseudorandomness is just a special case of computational indistinguishability. For any integer l, let $U_l$ denote the uniform distribution over $\{0,1\}^l$. We can define a pseudorandom generator as follows:
DEFINITION 7.31 Let l(·) be a polynomial and let G be a (deterministic) polynomial-time algorithm where for all s it holds that |G(s)| = l(|s|). We say that G is a pseudorandom generator if the following two conditions hold:
1. (Expansion:) For every n it holds that l(n) > n.
2. (Pseudorandomness:) The ensemble $\{G(U_n)\}_{n \in \mathbb{N}}$ is computationally indistinguishable from the ensemble $\{U_{l(n)}\}_{n \in \mathbb{N}}$.
Many of the other definitions and assumptions in this book can also be cast as special cases or variants of computational indistinguishability.
Multiple samples. An important theorem regarding computational indistinguishability is that polynomially many samples of (efficiently sampleable) computationally indistinguishable ensembles are also computationally indistinguishable.
THEOREM 7.32 Let X and Y be efficiently sampleable probability ensembles that are computationally indistinguishable. Then, for every polynomial p, the ensemble $\overline{X} = \{(X_n^{(1)}, \ldots, X_n^{(p(n))})\}_{n \in \mathbb{N}}$ is computationally indistinguishable from the ensemble $\overline{Y} = \{(Y_n^{(1)}, \ldots, Y_n^{(p(n))})\}_{n \in \mathbb{N}}$.
For example, let G be a pseudorandom generator with expansion factor 2n, in which case the ensembles $\{G(U_n)\}_{n \in \mathbb{N}}$ and $\{U_{2n}\}_{n \in \mathbb{N}}$ are computationally indistinguishable. In the proof of Theorem 7.22 we showed that for any polynomial t the ensembles
$$\{(\underbrace{G(U_n), \ldots, G(U_n)}_{t(n)})\}_{n \in \mathbb{N}} \quad \text{and} \quad \{(\underbrace{U_{2n}, \ldots, U_{2n}}_{t(n)})\}_{n \in \mathbb{N}}$$
are also computationally indistinguishable. Theorem 7.32 is proved by a hybrid argument in exactly the same way.
References and Additional Reading
The notion of a one-way function was first proposed by Diffie and Hellman [58] and later formalized by Yao [179]. Hard-core predicates were introduced by Blum and Micali [37], and the fact that there exists a hard-core predicate for every one-way function was proved by Goldreich and Levin [79].
The first construction of pseudorandom generators (under a specific number-theoretic hardness assumption) was given by Blum and Micali [37]. The construction of a pseudorandom generator from any one-way permutation was given by Yao [179], and the result that pseudorandom generators can be constructed from any one-way function was shown by Håstad et al. [85]. Pseudorandom functions were defined and constructed by Goldreich, Goldwasser and Micali [78] and their extension to (strong) pseudorandom permutations was shown by Luby and Rackoff [116]. The fact that one-way functions are a necessary assumption for most of private-key cryptography was shown in [93]. The proof of Proposition 7.29 is from [72].
Our presentation is heavily influenced by Goldreich’s book [75], which is highly recommended for those interested in exploring the topics of this chapter in greater detail.
Exercises
7.1 Prove that if there exists a one-way function, then there exists a one-way function f such that $f(0^n) = 0^n$ for every n. Note that for infinitely many values y, it is easy to compute $f^{-1}(y)$. Why does this not contradict one-wayness?
7.2 Prove that if f is a one-way function, then the function g defined by $g(x_1, x_2) \stackrel{\rm def}{=} (f(x_1), x_2)$, where $|x_1| = |x_2|$, is also a one-way function. Observe that g reveals half of its input, but is nevertheless one-way.
7.3 Prove that if there exists a one-way function, then there exists a length-preserving one-way function.
Hint: Let f be a one-way function and let p(·) be a polynomial such that |f(x)| ≤ p(|x|). (Justify the existence of such a p.) Define $f'(x) \stackrel{\rm def}{=} f(x) \,\|\, 1 0^{p(|x|) - |f(x)|}$. Further modify f′ to get a length-preserving function that remains one-way.
7.4 Let (Gen, H) be a collision-resistant hash function, where H maps strings of length 2n to strings of length n. Prove that the function family (Gen, Samp, H) is one-way (cf. Definition 7.3), where Samp is the trivial algorithm that samples a uniform string of length 2n.
Hint: Choosing uniform $x \in \{0,1\}^{2n}$ and finding an inverse of $y = H^s(x)$ does not guarantee a collision. But it does yield a collision most of the time…
7.5 Let F be a (length-preserving) pseudorandom permutation.
(a) Show that the function $f(x, y) = F_x(y)$ is not one-way.
(b) Show that the function $f(y) = F_{0^n}(y)$ (where n = |y|) is not one-way.
(c) Prove that the function $f(x) = F_x(0^n)$ (where n = |x|) is one-way.
7.6 Let f be a length-preserving one-way function, and let hc be a hard-core predicate of f. Define G as G(x) = f(x)∥hc(x). Is G necessarily a pseudorandom generator? Prove your answer.
7.7 Prove that there exist one-way functions if and only if there exist one-way function families. Discuss why your proof does not carry over to the case of one-way permutations.
7.8 Let f be a one-way function. Is $g(x) \stackrel{\rm def}{=} f(f(x))$ necessarily a one-way function? What about $g'(x) \stackrel{\rm def}{=} f(x) \,\|\, f(f(x))$? Prove your answers.
7.9 Let Π = (Gen, Samp, f) be a function family. A function hc : {0, 1}∗ → {0, 1} is a hard-core predicate of Π if it is efficiently computable and if for every ppt algorithm A there is a negligible function negl such that
$$\Pr_{I \leftarrow \mathsf{Gen}(1^n),\; x \leftarrow \mathsf{Samp}(I)}[A(I, f_I(x)) = \mathsf{hc}(I, x)] \le \frac{1}{2} + \mathsf{negl}(n).$$
Prove a version of the Goldreich–Levin theorem for this setting, namely, if a one-way function (resp., permutation) family Π exists, then there exists a one-way function (resp., permutation) family Π′ and a hard-core predicate hc of Π′.
7.10 Show a construction of a pseudorandom generator from any one-way permutation family. You may use the result of the previous exercise.
7.11 This exercise is for students who have taken a course in complexity theory or are otherwise familiar with NP-completeness.
(a) Show that the existence of one-way functions implies P ≠ NP.
(b) Assume that P ≠ NP. Show that there exists a function f that is: (1) computable in polynomial time, (2) hard to invert in the worst case (i.e., for all probabilistic polynomial-time A, $\Pr_{x \leftarrow \{0,1\}^n}[f(A(f(x))) = f(x)] \ne 1$), but (3) is not one-way.
7.12 Let $x \in \{0,1\}^n$ and denote $x = x_1 \cdots x_n$. Prove that if there exists a one-way function, then there exists a one-way function f such that for every i there is an algorithm $A_i$ such that
$$\Pr_{x \leftarrow \{0,1\}^n}[A_i(f(x)) = x_i] \ge \frac{1}{2} + \frac{1}{2n}.$$
(This exercise demonstrates that it is not possible to claim that every one-way function hides at least one specific bit of the input.)
7.13 Show that if a one-to-one function f has a hard-core predicate, then f is one-way.
7.14 Show that if Construction 7.21 is modified in the natural way so that Fk(x) is defined for every nonempty string x of length at most n, then the construction is no longer a pseudorandom function.
7.15 Prove that if there exists a pseudorandom function that, using a key of length n, maps n-bit inputs to single-bit outputs, then there exists a pseudorandom function that maps n-bit inputs to n-bit outputs.
Hint: Use a key of length $n^2$, and prove your construction secure using a hybrid argument.
7.16 Prove that a two-round Feistel network using pseudorandom round functions (as in Equation (7.15)) is not pseudorandom.
7.17 Prove that a three-round Feistel network using pseudorandom round functions (as in Equation (7.16)) is not strongly pseudorandom.
Hint: This is significantly more difficult than the previous exercise. Use a distinguisher that makes two queries to the permutation and one query to its inverse.
7.18 Consider the keyed permutation $F^*$ defined by
$$F^*_k(x) \stackrel{\rm def}{=} \mathsf{Feistel}_{F_k, F_k, F_k}(x).$$
(Note that the same key is used in each round.) Show that $F^*$ is not pseudorandom.
7.19 Let G be a pseudorandom generator with expansion factor l(n) = n + 1.
Prove that G is a one-way function.
7.20 Let X, Y, Z be probability ensembles. Prove that if $X \stackrel{c}{\equiv} Y$ and $Y \stackrel{c}{\equiv} Z$, then $X \stackrel{c}{\equiv} Z$.
7.21 Prove Theorem 7.32.
7.22 Let $X = \{X_n\}_{n \in \mathbb{N}}$ and $Y = \{Y_n\}_{n \in \mathbb{N}}$ be computationally indistinguishable probability ensembles. Prove that for any probabilistic polynomial-time algorithm A, the ensembles $\{A(X_n)\}_{n \in \mathbb{N}}$ and $\{A(Y_n)\}_{n \in \mathbb{N}}$ are computationally indistinguishable.
Part III
Public-Key (Asymmetric) Cryptography
Chapter 8
Number Theory and Cryptographic Hardness Assumptions
Modern cryptosystems are invariably based on an assumption that some problem is hard. In Chapters 3 and 4, for example, we saw that private-key cryptography (both encryption schemes and message authentication codes) can be based on the assumption that pseudorandom permutations (a.k.a. block ciphers) exist. Recall, roughly, this means that there exists some keyed permutation F for which it is hard to distinguish in polynomial time between interactions with $F_k$ (for a uniform, unknown key k) and interactions with a truly random permutation.
On the face of it, the assumption that pseudorandom permutations exist seems quite strong and unnatural, and it is reasonable to ask whether this assumption is true or whether there is any evidence to support it. In Chapter 6 we explored how block ciphers are constructed in practice. The fact that existing constructions are resistant to attack serves as an indication that the existence of pseudorandom permutations is plausible. Still, it may be difficult to believe that there are no efficient distinguishing attacks on existing block ciphers. Moreover, the current state of our theory is such that we do not know how to prove the pseudorandomness of any of the existing practical constructions relative to any “simpler” or “more reasonable” assumption. All in all, this is not entirely a satisfying state of affairs.
In contrast, as mentioned in Chapter 3 (and investigated in detail in Chapter 7) it is possible to prove that pseudorandom permutations exist based on the much milder assumption that one-way functions exist. (Informally, a function is one-way if it is easy to compute but hard to invert; see Section 8.4.1.) Apart from a brief discussion in Section 7.1.2, however, we have not seen concrete examples of functions believed to be one-way.
One goal of this chapter is to introduce various problems believed to be “hard,” and to present conjectured one-way functions based on those problems.¹ This chapter can thus be viewed as a culmination of a “top down” approach to private-key cryptography. (See Figure 8.1.) That is, in Chapters 3 and 4 we have shown that private-key cryptography can be based on pseudorandom functions and permutations. We have then seen that the latter can be instantiated in practice using block ciphers, as explored in Chapter 6, or can be constructed in a rigorous fashion from any one-way function, as shown in Chapter 7. Here, we take this one step further and show how one-way functions can be based on certain hard mathematical problems.
¹ Recall we currently do not know how to prove that one-way functions exist, so the best we can do is base one-way functions on assumptions regarding the hardness of certain problems.
FIGURE 8.1: Private-key cryptography: a top-down approach.
The examples we explore are number theoretic in nature, and we therefore begin with a short introduction to number theory and group theory. Because we are also interested in problems that can be solved efficiently (even a one-way function needs to be easy to compute in one direction, and a cryptographic scheme must admit efficient algorithms for the honest parties), we also initiate a study of algorithmic number theory. Even the reader who is familiar with number theory or group theory is encouraged to read this chapter, since algorithmic aspects are typically ignored in a purely mathematical treatment of these topics.
A second goal of this chapter is to develop the material needed for public-key cryptography, whose study we will begin in Chapter 10. Strikingly, although in the private-key setting there exist efficient constructions of the necessary primitives (block ciphers and hash functions) without invoking any number theory, in the public-key setting all known constructions rely on hard number-theoretic problems. The material in this chapter thus serves both as a culmination of what we have studied so far with regard to private-key cryptography, as well as the foundation for public-key cryptography.
8.1 Preliminaries and Basic Group Theory
We begin with a review of prime numbers and basic modular arithmetic. Even the reader who has seen these topics before should skim the next two sections since some of the material may be new and we include proofs for most of the stated results.
8.1.1 Primes and Divisibility
The set of integers is denoted by Z. For a, b ∈ Z, we say that a divides b, written a | b, if there exists an integer c such that ac = b. If a does not divide b, we write a ∤ b. (We are primarily interested in the case where a, b, and c are all positive, although the definition makes sense even when one or more of these is negative or zero.) A simple observation is that if a | b and a | c then a | (Xb + Yc) for any X, Y ∈ Z.
If a | b and a is positive, we call a a divisor of b. If in addition a ∉ {1, b} then a is called a nontrivial divisor, or a factor, of b. A positive integer p > 1 is prime if it has no factors; i.e., it has only two divisors: 1 and itself. A positive integer greater than 1 that is not prime is called composite. By convention, the number 1 is neither prime nor composite.
A fundamental theorem of arithmetic is that every integer greater than 1 can be expressed uniquely (up to ordering) as a product of primes. That is, any positive integer N > 1 can be written as $N = \prod_i p_i^{e_i}$, where the $\{p_i\}$ are distinct primes and $e_i \ge 1$ for all i; furthermore, the $\{p_i\}$ (and $\{e_i\}$) are uniquely determined up to ordering.
We are familiar with the process of division with remainder from elementary school. The following proposition formalizes this notion.
PROPOSITION 8.1 Let a be an integer and let b be a positive integer. Then there exist unique integers q, r for which a = qb + r and 0 ≤ r < b.
Furthermore, given integers a and b as in the proposition it is possible to compute q and r in polynomial time; see Appendix B.1. (An algorithm’s running time is measured as a function of the length(s) of its input(s). An important point in the context of algorithmic number theory is that integer inputs are always assumed to be represented in binary. The running time of an algorithm taking as input an integer N is therefore measured in terms of ∥N∥, the length of the binary representation of N. Note that ∥N∥ = ⌊log N⌋ + 1.)
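As a small illustration of Proposition 8.1 and of the length ∥N∥, consider the following sketch. The function names are our own, and the built-in `divmod` stands in for the algorithm of Appendix B.1.

```python
def div_with_remainder(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < b, for an integer a and positive integer b."""
    if b <= 0:
        raise ValueError("b must be positive")
    # Python's divmod floors the quotient, so for b > 0 the remainder lies in [0, b)
    return divmod(a, b)

def binary_length(n):
    """||N||: the length of the binary representation of a positive integer N."""
    return n.bit_length()

q, r = div_with_remainder(-7, 3)   # -7 = (-3)*3 + 2
```

Note that the remainder is nonnegative even when a is negative, as the proposition requires; some other programming languages truncate toward zero instead and would need an adjustment.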
The greatest common divisor of two integers a, b, written gcd(a, b), is the largest integer c such that c | a and c | b. (We leave gcd(0, 0) undefined.) The notion of greatest common divisor makes sense when either or both of a, b are negative but we will typically have a, b ≥ 1; anyway, gcd(a, b) = gcd(|a|, |b|).
Note that gcd(b, 0) = gcd(0, b) = b; also, if p is prime then gcd(a, p) is either equal to 1 or p. If gcd(a, b) = 1 we say that a and b are relatively prime.
The following is a useful result:
PROPOSITION 8.2 Let a, b be positive integers. Then there exist integers X, Y such that Xa + Yb = gcd(a, b). Furthermore, gcd(a, b) is the smallest positive integer that can be expressed in this way.
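PROOF Consider the set $I \stackrel{\rm def}{=} \{\hat{X}a + \hat{Y}b \mid \hat{X}, \hat{Y} \in \mathbb{Z}\}$. Note that a, b ∈ I, and so I certainly contains some positive integers. Let d be the smallest positive integer in I. We show that d = gcd(a, b); since d can be written as d = Xa + Yb for some X, Y ∈ Z (because d ∈ I), this proves the theorem.
To show this, we must prove that d | a and d | b, and that d is the largest integer with this property. In fact, we can show that d divides every element in I. To see this, take an arbitrary c ∈ I and write c = X′a + Y′b with X′, Y′ ∈ Z. Using division with remainder (Proposition 8.1) we have that c = qd + r with q, r integers and 0 ≤ r < d. Then
$$r = c - qd = X'a + Y'b - q(Xa + Yb) = (X' - qX)a + (Y' - qY)b \in I.$$
If r ≠ 0, this contradicts our choice of d as the smallest positive integer in I (because r < d). So r = 0 and hence d | c.
Since a ∈ I and b ∈ I, the above shows that d | a and d | b, so d is a common divisor of a and b. It remains to show that it is the largest common divisor. Assume there exists an integer d′ > d such that d′ | a and d′ | b. Then, by the observation made earlier, d′ | (Xa + Yb). Since the latter is equal to d, this means d′ | d. But this is impossible if d′ is larger than d. We conclude that d is the largest integer dividing both a and b, and hence d = gcd(a, b).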
Given a and b, the Euclidean algorithm can be used to compute gcd(a, b) in polynomial time. The extended Euclidean algorithm can be used to compute X, Y (as in the above proposition) in polynomial time as well. See Appendix B.1.2 for details.
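One standard iterative formulation of the extended Euclidean algorithm is sketched below. It is meant only as an illustration and need not match the presentation in Appendix B.1.2; the function name is our own.

```python
def extended_gcd(a, b):
    """Return (d, X, Y) with d = gcd(a, b) and X*a + Y*b == d.

    Assumes a, b are non-negative integers, not both zero.
    """
    # Invariants: old_r == old_x*a + old_y*b  and  r == x*a + y*b
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

d, X, Y = extended_gcd(240, 46)   # d = 2 and X*240 + Y*46 == 2
```

The number of loop iterations is linear in the bit length of the inputs, which is what makes the running time polynomial in ∥a∥ and ∥b∥.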
The preceding proposition is very useful in proving additional results about divisibility. We show two examples now.
PROPOSITION 8.3 If c|ab and gcd(a,c) = 1, then c|b. Thus, if p is prime and p|ab then either p|a or p|b.
PROOF Since c|ab we have γc = ab for some integer γ. If gcd(a,c) = 1 then, by the previous proposition, we know there exist integers X, Y such that
1 = Xa + Y c. Multiplying both sides by b, we obtain
b = Xab + Y cb = Xγc + Y cb = c · (Xγ + Y b).
Since (Xγ + Y b) is an integer, it follows that c | b.
The second part of the proposition follows from the fact that if p ∤ a then gcd(a, p) = 1.
PROPOSITION 8.4 If a|N, b|N, and gcd(a,b) = 1, then ab|N.
PROOF Write ac = N, bd = N, and (using Proposition 8.2) 1 = Xa + Yb, where c, d, X, Y are all integers. Multiplying both sides of the last equation by N we obtain
N = XaN + YbN = Xabd + Ybac = ab(Xd + Yc),
showing that ab | N.
8.1.2 Modular Arithmetic
Let a,b,N ∈ Z with N > 1. We use the notation [a mod N] to denote the remainder of a upon division by N. In more detail: by Proposition 8.1 there exist unique q,r with a = qN +r and 0 ≤ r < N, and we define [a mod N] to be equal to this r. Note therefore that 0 ≤ [a mod N] < N. We refer to the process of mapping a to [a mod N] as reduction modulo N.
We say that a and b are congruent modulo N, written a = b mod N, if [a mod N] = [b mod N], i.e., if the remainder when a is divided by N is the same as the remainder when b is divided by N. Note that a = b mod N if and only if N | (a − b). By way of notation, in an expression such as
a = b = c = · · · = z mod N,
the understanding is that every equal sign in this sequence (and not just the last) refers to congruence modulo N.
Note that a = [b mod N] implies a = b mod N, but not vice versa. For example, 36 = 21 mod 15 but 36 ≠ [21 mod 15] = 6. (On the other hand, [a mod N] = [b mod N] if and only if a = b mod N.)
Congruence modulo N is an equivalence relation: i.e., it is reflexive (a = a mod N for all a), symmetric (a = b mod N implies b = a mod N), and transitive (if a = b mod N and b = c mod N, then a = c mod N). Congruence modulo N also obeys the standard rules of arithmetic with respect to addition, subtraction, and multiplication; so, for example, if a = a′ mod N and b = b′ mod N then (a + b) = (a′ + b′) mod N and ab = a′b′ mod N. A consequence is that we can “reduce and then add/multiply” instead of having to “add/multiply and then reduce,” which can often simplify calculations.
Example 8.5
Let us compute [1093028 · 190301 mod 100]. Since 1093028 = 28 mod 100 and 190301 = 1 mod 100, we have
1093028 · 190301 = [1093028 mod 100] · [190301 mod 100] mod 100 = 28 · 1 = 28 mod 100.
The alternate way of calculating the answer (i.e., computing the product 1093028 · 190301 and then reducing the result modulo 100) is less efficient. ♦
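The two ways of carrying out the computation in Example 8.5 can be compared directly. This is a minimal check; the variable names are our own.

```python
a, b, N = 1093028, 190301, 100

# "Multiply and then reduce": forms the full 12-digit product first
direct = (a * b) % N

# "Reduce and then multiply": only the residues 28 and 1 are multiplied
reduced = ((a % N) * (b % N)) % N
```

Both values equal 28. The saving is negligible here, but it matters when the operands are thousands of bits long and many multiplications are chained, since every intermediate value stays below N².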
Congruence modulo N does not (in general) respect division. That is, if a = a′ mod N and b = b′ mod N then it is not necessarily true that a/b = a′/b′ mod N; in fact, the expression “a/b mod N” is not always well-defined. As a specific example that often causes confusion, ab = cb mod N does not necessarily imply that a = c mod N.
Example 8.6
Take N = 24. Then 3 · 2 = 6 = 15 · 2 mod 24, but 3 ≠ 15 mod 24. ♦
In certain cases, however, we can define a meaningful notion of division. If for a given integer b there exists an integer c such that bc = 1 mod N, we say that b is invertible modulo N and call c a (multiplicative) inverse of b modulo N. Clearly, 0 is never invertible. It is also not difficult to show that if c is a multiplicative inverse of b modulo N then so is [c mod N]. Furthermore, if c′ is another multiplicative inverse of b then [c mod N] = [c′ mod N]. When b is invertible we can therefore simply let $b^{-1}$ denote the unique multiplicative inverse of b that lies in the range {1, …, N − 1}.
When b is invertible modulo N, we define division by b modulo N as multiplication by $b^{-1}$ (i.e., we define $[a/b \bmod N] \stackrel{\rm def}{=} [ab^{-1} \bmod N]$). We stress that division by b is only defined when b is invertible. If ab = cb mod N and b is invertible, then we may divide each side of the equation by b (or, really, multiply each side by $b^{-1}$) to obtain
$$(ab) \cdot b^{-1} = (cb) \cdot b^{-1} \bmod N \;\Rightarrow\; a = c \bmod N.$$
We see that in this case, division works “as expected.” Invertible integers modulo N are therefore “nicer” to work with, in some sense.
The natural question is: which integers are invertible modulo a given modulus N? We can fully answer this question using Proposition 8.2:
PROPOSITION 8.7 Let b, N be integers, with b ≥ 1 and N > 1. Then b is invertible modulo N if and only if gcd(b, N ) = 1.
PROOF Assume b is invertible modulo N, and let c denote its inverse. Since bc = 1 mod N, this implies that bc − 1 = γN for some γ ∈ Z. Equivalently, bc − γN = 1. Since, by Proposition 8.2, gcd(b, N) is the smallest positive integer that can be expressed in this way, and there is no positive integer smaller than 1, this implies that gcd(b, N) = 1.
Conversely, if gcd(b,N) = 1 then by Proposition 8.2 there exist integers X, Y such that X b + Y N = 1. Reducing each side of this equation modulo N gives Xb = 1 mod N, and we see that X is a multiplicative inverse of b. (In fact, this gives an efficient algorithm to compute inverses.)
Example 8.8
Let b = 11 and N = 17. Then (−3) · 11 + 2 · 17 = 1, and so 14 = [−3 mod 17] is the inverse of 11. One can verify that 14 · 11 = 1 mod 17. ♦
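The proof of Proposition 8.7 translates into a procedure for computing inverses: run the extended Euclidean algorithm on b and N, and reduce the coefficient of b modulo N. The following sketch (with a function name of our own choosing) tracks only that coefficient.

```python
def inverse_mod(b, N):
    """Return the inverse of b modulo N in {1, ..., N-1}, or None if gcd(b, N) != 1."""
    # Invariants: r0 = x0*b (mod N) and r1 = x1*b (mod N)
    r0, r1 = b % N, N
    x0, x1 = 1, 0
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
    if r0 != 1:
        return None   # gcd(b, N) > 1, so b is not invertible (Proposition 8.7)
    return x0 % N

inv = inverse_mod(11, 17)   # the coefficient found is -3, and [-3 mod 17] = 14
```

For b = 11 and N = 17 this reproduces the value 14 from Example 8.8, and for a non-invertible input such as b = 6, N = 15 it reports that no inverse exists.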
Addition, subtraction, multiplication, and computation of inverses (when they exist) modulo N can all be carried out in polynomial time; see Appendix B.2. Exponentiation (i.e., computing $[a^b \bmod N]$ for b > 0 an integer) can also be computed in polynomial time; see Appendix B.2.3.
8.1.3 Groups
Let G be a set. A binary operation ◦ on G is simply a function ◦(·,·) that takes as input two elements of G. If g, h ∈ G then instead of using the cumbersome notation ◦(g, h), we write g ◦ h.
We now introduce the important notion of a group.
DEFINITION 8.9 A group is a set G along with a binary operation ◦ for which the following conditions hold:
• (Closure:) For all g, h ∈ G, g ◦ h ∈ G.
• (Existence of an identity:) There exists an identity e ∈ G such that for all g ∈ G, e ◦ g = g = g ◦ e.
• (Existence of inverses:) For all g ∈ G there exists an element h ∈ G such that g ◦ h = e = h ◦ g. Such an h is called an inverse of g.
• (Associativity:) For all $g_1, g_2, g_3 \in G$, $(g_1 \circ g_2) \circ g_3 = g_1 \circ (g_2 \circ g_3)$.
When G has a finite number of elements, we say G is finite and let |G| denote the order of the group (that is, the number of elements in G).
A group G with operation ◦ is abelian if the following holds:
• (Commutativity:) For all g, h ∈ G, g ◦ h = h ◦ g.
When the binary operation is understood, we simply call the set G a group.
We will always deal with finite, abelian groups. We will be careful to specify, however, when a result requires these assumptions.
Associativity implies that we do not need to include parentheses when writing long expressions; that is, the notation $g_1 \circ g_2 \circ \cdots \circ g_n$ is unambiguous since it does not matter in what order we evaluate the operation ◦.
One can show that the identity element in a group G is unique, and so we can therefore refer to the identity of a group. One can also show that each element g of a group has a unique inverse. See Exercise 8.1.
If G is a group, a set H ⊆ G is a subgroup of G if H itself forms a group under the same operation associated with G. To check that H is a subgroup, we need to verify closure, existence of identity and inverses, and associativity as per Definition 8.9. (In fact, associativity—as well as commutativity if G is abelian—is inherited automatically from G.) Every group G always has the trivial subgroups G and {1}. We call H a strict subgroup of G if H ̸= G.
In general, we will not use the notation ◦ to denote the group operation. Instead, we will use either additive notation or multiplicative notation depending on the group under discussion. This does not imply that the group operation corresponds to integer addition or multiplication; it is merely useful notation. When using additive notation, the group operation applied to two elements g, h is denoted g + h; the identity is denoted by 0; the inverse of an element g is denoted by −g; and we write h − g in place of h + (−g). When using multiplicative notation, the group operation applied to g, h is denoted by g · h or simply gh; the identity is denoted by 1; the inverse of an element g is denoted by $g^{-1}$; and we sometimes write h/g in place of $hg^{-1}$.
At this point, it may be helpful to see some examples.
Example 8.10
A set may be a group under one operation, but not another. For example, the set of integers Z is an abelian group under addition: the identity is the element 0, and every integer g has inverse −g. On the other hand, it is not a group under multiplication since, for example, the integer 2 does not have a multiplicative inverse in the integers. ♦
Example 8.11
The set of real numbers R is not a group under multiplication, since 0 does not have a multiplicative inverse. The set of nonzero real numbers, however, is an abelian group under multiplication with identity 1. ♦
The following example introduces the group $\mathbb{Z}_N$ that we will use frequently.
Example 8.12
Let N > 1 be an integer. The set {0, …, N − 1} with respect to addition modulo N (i.e., where $a + b \stackrel{\rm def}{=} [a + b \bmod N]$) is an abelian group of order N. Closure is obvious; associativity and commutativity follow from the fact that the integers satisfy these properties; the identity is 0; and, since a + (N − a) = 0 mod N, it follows that the inverse of any element a is [(N − a) mod N]. We denote this group by $\mathbb{Z}_N$. (We will also sometimes use $\mathbb{Z}_N$ to denote the set {0, …, N − 1} without regard to any particular group operation.) ♦
We end this section with an easy lemma that formalizes a “cancelation law” for groups.
LEMMA 8.13 Let G be a group and a, b, c ∈ G. If ac = bc, then a = b. In particular, if ac = c then a is the identity in G.
PROOF We know ac = bc. Multiplying both sides by the unique inverse $c^{-1}$ of c, we obtain a = b. In detail:
$$ac = bc \;\Rightarrow\; (ac)c^{-1} = (bc)c^{-1} \;\Rightarrow\; a(cc^{-1}) = b(cc^{-1}) \;\Rightarrow\; a \cdot 1 = b \cdot 1 \;\Rightarrow\; a = b.$$
Compare the above proof to the discussion (preceding Proposition 8.7) regarding a cancelation law for division modulo N. As indicated by the similarity, the invertible elements modulo N form a group under multiplication modulo N. We will return to this example in more detail shortly.
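For a small modulus, the claim that the invertible elements modulo N form a group can be checked by brute force. The sketch below is an illustration for one modulus rather than a proof; the function names and the choice N = 15 are our own.

```python
from math import gcd

def invertible_elements(N):
    """Elements of {1, ..., N-1} that are invertible modulo N (Proposition 8.7)."""
    return [b for b in range(1, N) if gcd(b, N) == 1]

def is_group_under_mult_mod(elements, N):
    """Brute-force check of closure, identity, and inverses from Definition 8.9."""
    s = set(elements)
    closed = all((a * b) % N in s for a in s for b in s)
    has_identity = 1 in s
    has_inverses = all(any((a * b) % N == 1 for b in s) for a in s)
    # Associativity (and commutativity) are inherited from integer multiplication
    return closed and has_identity and has_inverses

N = 15
units = invertible_elements(N)                                 # [1, 2, 4, 7, 8, 11, 13, 14]
units_form_group = is_group_under_mult_mod(units, N)           # True
all_nonzero_form_group = is_group_under_mult_mod(range(1, N), N)   # False: 3 * 5 = 0 mod 15
```

The second check shows why non-invertible elements must be excluded: the set {1, …, 14} is not even closed under multiplication modulo 15.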
Group Exponentiation
It is often useful to be able to describe the group operation applied m times to a fixed element g, where m is a positive integer. When using additive notation, we express this as m · g or mg; that is,
$$mg = m \cdot g \stackrel{\rm def}{=} \underbrace{g + \cdots + g}_{m \text{ times}}.$$
Note that m is an integer, while g is a group element. So mg does not represent the group operation applied to m and g (indeed, we are working in a group where the group operation is written additively). Thankfully, however, the notation “behaves as it should”; so, for example, if g ∈ G and m, m′ are integers then (mg) + (m′g) = (m + m′)g, m(m′g) = (mm′)g, and 1 · g = g. In an abelian group G with g, h ∈ G, (mg) + (mh) = m(g + h).
When using multiplicative notation, we express application of the group operation m times to an element g by $g^m$. That is,
$$g^m \stackrel{\rm def}{=} \underbrace{g \cdots g}_{m \text{ times}}.$$
The familiar rules of exponentiation hold: $g^m \cdot g^{m'} = g^{m+m'}$, $(g^m)^{m'} = g^{mm'}$, and $g^1 = g$. Also, if G is an abelian group and g, h ∈ G then $g^m \cdot h^m = (gh)^m$.
All these are simply “translations” of the results from the previous paragraph to the setting of groups written multiplicatively rather than additively.
The above notation is extended in the natural way to the case when m is zero or a negative integer. When using additive notation we define $0 \cdot g \stackrel{\rm def}{=} 0$ and $(-m) \cdot g \stackrel{\rm def}{=} m \cdot (-g)$ for m a positive integer. (Note that in the equation ‘0 · g = 0’ the 0 on the left-hand side is the integer 0 while the 0 on the right-hand side is the identity element in the group.) Observe that −g is the inverse of g and, as one would expect, (−m) · g = −(mg). When using multiplicative notation, $g^0 \stackrel{\rm def}{=} 1$ and $g^{-m} \stackrel{\rm def}{=} (g^{-1})^m$. Again, $g^{-1}$ is the inverse of g, and we have $g^{-m} = (g^m)^{-1}$.
Let g ∈ G and b ≥ 0 be an integer. Then the exponentiation $g^b$ can be computed using a polynomial number of underlying group operations in G. Thus, if the group operation can be computed in polynomial time then so can exponentiation. This is discussed in Appendix B.2.3.
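One way to achieve this is repeated squaring, sketched below for an arbitrary group given by its operation and identity. This is a generic illustration that may differ in detail from the algorithm of Appendix B.2.3; the function name and the two example groups are our own choices.

```python
def group_exp(g, b, op, identity):
    """Compute g^b for an integer b >= 0 using O(log b) applications of op."""
    result = identity
    base = g                      # holds g^(2^i) in the i-th iteration
    while b > 0:
        if b & 1:                 # current binary digit of b is 1
            result = op(result, base)
        base = op(base, base)     # squaring (or doubling, in additive notation)
        b >>= 1
    return result

# Multiplicative notation: invertible integers modulo 17
mul_mod_17 = lambda x, y: (x * y) % 17
# Additive notation: the group Z_15, where "exponentiation" means m * g
add_mod_15 = lambda x, y: (x + y) % 15

power = group_exp(3, 100, mul_mod_17, 1)       # 3^100 mod 17
multiple = group_exp(11, 152, add_mod_15, 0)   # 152 * 11 mod 15
```

The loop runs once per bit of b and performs at most two group operations per iteration, so the number of group operations is at most 2∥b∥ rather than b.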
We now know enough to prove the following remarkable result:
THEOREM 8.14 Let G be a finite group with m = |G|, the order of the group. Then for any element g ∈ G, $g^m = 1$.
PROOF We prove the theorem only when G is abelian (although it holds for any finite group). Fix arbitrary g ∈ G, and let $g_1, \ldots, g_m$ be the elements of G. We claim that
$$g_1 \cdot g_2 \cdots g_m = (gg_1) \cdot (gg_2) \cdots (gg_m).$$
To see this, note that $gg_i = gg_j$ implies $g_i = g_j$ by Lemma 8.13. So each of the m elements in parentheses on the right-hand side is distinct. Because there are exactly m elements in G, the m elements being multiplied together on the right-hand side are simply all elements of G in some permuted order. Since G is abelian, the order in which elements are multiplied does not matter, and so the right-hand side is equal to the left-hand side.
Again using the fact that G is abelian, we can “pull out” all occurrences of g and obtain
$$g_1 \cdot g_2 \cdots g_m = (gg_1) \cdot (gg_2) \cdots (gg_m) = g^m \cdot (g_1 \cdot g_2 \cdots g_m).$$
Appealing once again to Lemma 8.13, this implies $g^m = 1$.
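Both the statement of the theorem and the key step of its proof can be checked numerically in a small group. The sketch below uses the invertible integers modulo 15 under multiplication, an example group of our own choosing.

```python
from math import gcd

N = 15
G = [b for b in range(1, N) if gcd(b, N) == 1]   # invertible elements modulo 15
m = len(G)                                        # the group order, here 8

# Theorem 8.14: g^m = 1 for every element g of the group
powers = {g: pow(g, m, N) for g in G}

# Key step of the proof: multiplying every element by a fixed g permutes G
g = 7
permuted = sorted((g * h) % N for h in G)
```

Every entry of `powers` equals 1, and `permuted` lists exactly the elements of G again, which is the reordering the proof relies on.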
An important corollary of the above is that we can work “modulo the group order” in the exponent:
COROLLARY 8.15 Let G be a finite group with m = |G| > 1. Then for any g ∈ G and any integer x, we have g^x = g^{[x mod m]}.
PROOF Say x = qm + r, where q, r are integers and r = [x mod m]. Then
g^x = g^{qm+r} = g^{qm} · g^r = (g^m)^q · g^r = 1^q · g^r = g^r
(using Theorem 8.14), as claimed.
Example 8.16
Written additively, the above corollary says that if g is an element in a group of order m, then x · g = [x mod m] · g. As an example, consider the group Z15 of order m = 15, and take g = 11. The corollary says that
152 · 11 = [152 mod 15] · 11 = 2 · 11 = 11 + 11 = 22 = 7 mod 15.
The above agrees with the fact (cf. Example 8.5) that we can “reduce and then multiply” rather than having to “multiply and then reduce.” ♦
Another corollary that will be extremely useful for cryptographic applications is the following:
COROLLARY 8.17 Let G be a finite group with m = |G| > 1. Let e > 0 be an integer, and define the function f_e : G → G by f_e(g) = g^e. If gcd(e, m) = 1, then f_e is a permutation (i.e., a bijection). Moreover, if d = e^{−1} mod m then f_d is the inverse of f_e. (Note by Proposition 8.7, gcd(e, m) = 1 implies e is invertible modulo m.)
PROOF Since G is finite, the second part of the claim implies the first; thus, we need only show that f_d is the inverse of f_e. This is true because for any g ∈ G we have
f_d(f_e(g)) = f_d(g^e) = (g^e)^d = g^{ed} = g^{[ed mod m]} = g^1 = g,
where the fourth equality follows from Corollary 8.15.
8.1.4 The Group Z∗N
As discussed in Example 8.12, the set ZN = {0, …, N − 1} is a group under addition modulo N. Can we define a group with respect to multiplication modulo N? In doing so, we will have to eliminate those elements in ZN that are not invertible; e.g., we will have to eliminate 0 since it has no multiplicative inverse. Nonzero elements may also fail to be invertible (cf. Proposition 8.7).
Which elements b ∈ {1, . . . , N −1} are invertible modulo N ? Proposition 8.7 says that these are exactly those elements b for which gcd(b, N ) = 1. We have
also seen in Section 8.1.2 that whenever b is invertible, it has an inverse lying
in the range {1, …, N − 1}. This leads us to define, for any N > 1, the set
Z∗N ≝ {b ∈ {1, …, N − 1} | gcd(b, N) = 1};
i.e., Z∗N consists of integers in the set {1, …, N − 1} that are relatively prime to N. The group operation is multiplication modulo N; i.e., ab ≝ [ab mod N].
We claim that Z∗N is an abelian group with respect to this operation. Since 1 is always in Z∗N, the set clearly contains an identity element. The discussion above shows that each element in Z∗N has a multiplicative inverse in the same set. Commutativity and associativity follow from the fact that these properties hold over the integers. To show that closure holds, let a, b ∈ Z∗N; then [ab mod N] has inverse [b^{−1}a^{−1} mod N], which means that gcd([ab mod N], N) = 1 and so ab ∈ Z∗N. Summarizing:
PROPOSITION 8.18 Let N > 1 be an integer. Then Z∗N is an abelian
group under multiplication modulo N.
Define φ(N) ≝ |Z∗N|, the order of the group Z∗N. (φ is called the Euler phi function.) What is the value of φ(N)? First consider the case when N = p is prime. Then all elements in {1, …, p − 1} are relatively prime to p, and so φ(p) = |Z∗p| = p − 1. Next consider the case that N = pq, where p, q are distinct primes. If an integer a ∈ {1, …, N − 1} is not relatively prime to N, then either p | a or q | a (a cannot be divisible by both p and q since this would imply pq | a but a < N = pq). The elements in {1, …, N − 1} divisible by p are exactly the (q − 1) elements p, 2p, 3p, …, (q − 1)p, and the elements divisible by q are exactly the (p − 1) elements q, 2q, …, (p − 1)q. The number of elements remaining (i.e., those that are neither divisible by p nor q) is therefore given by
(N − 1) − (q − 1) − (p − 1) = pq − p − q + 1 = (p − 1)(q − 1).
We have thus proved that φ(N) = (p − 1)(q − 1) when N is the product of two distinct primes p and q.
You are asked to prove the following general result (used only rarely in the rest of the book) in Exercise 8.4:
THEOREM 8.19 Let N = ∏_i p_i^{e_i}, where the {p_i} are distinct primes and e_i ≥ 1. Then φ(N) = ∏_i p_i^{e_i − 1}(p_i − 1).
Example 8.20
Take N = 15 = 5 · 3. Then Z∗15 = {1, 2, 4, 7, 8, 11, 13, 14} and |Z∗15| = 8 = 4 · 2 = φ(15). The inverse of 8 in Z∗15 is 2, since 8 · 2 = 16 = 1 mod 15. ♦
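The two ways of obtaining φ(N) discussed above, counting the elements of Z∗N directly and applying the product formula to a known factorization, can be compared on small inputs. The following Python sketch is illustrative only; the function names `units`, `phi_bruteforce`, and `phi_from_factorization` are hypothetical, and the brute-force count takes time exponential in the length of N.

```python
from math import gcd


def units(N):
    """Elements of Z*_N: integers in {1, ..., N-1} relatively prime to N."""
    return [b for b in range(1, N) if gcd(b, N) == 1]


def phi_bruteforce(N):
    """phi(N) computed directly from the definition as |Z*_N|."""
    return len(units(N))


def phi_from_factorization(factors):
    """phi(N) from the product formula.

    `factors` maps each distinct prime p_i dividing N to its exponent e_i >= 1.
    """
    result = 1
    for p, e in factors.items():
        result *= p ** (e - 1) * (p - 1)
    return result


z15_star = units(15)
phi_15 = phi_from_factorization({3: 1, 5: 1})
```

Note that the formula requires the factorization of N as input, whereas the definition does not.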
We have shown that Z∗N is a group of order φ(N). The following are now easy corollaries of Theorem 8.14 and Corollary 8.17:
COROLLARY 8.21 Take arbitrary integer N > 1 and a ∈ Z∗N. Then
a^{φ(N)} = 1 mod N.
For the specific case that N = p is prime and a ∈ {1, …, p − 1}, we have
a^{p−1} = 1 mod p.
COROLLARY 8.22 Fix N > 1. For integer e > 0 define f_e : Z∗N → Z∗N by f_e(x) = [x^e mod N]. If e is relatively prime to φ(N) then f_e is a permutation. Moreover, if d = e^{−1} mod φ(N) then f_d is the inverse of f_e.
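Corollary 8.22 can be checked exhaustively for a small modulus. The Python sketch below does this for N = 15 and e = 3; these parameters and the helper names `units` and `make_f` are our own illustrative choices. The modular inverse is computed with the three-argument `pow` with exponent −1, which assumes Python 3.8 or later.

```python
from math import gcd


def units(N):
    """Elements of Z*_N listed in increasing order."""
    return [x for x in range(1, N) if gcd(x, N) == 1]


def make_f(e, N):
    """Return the map f_e(x) = [x^e mod N]."""
    return lambda x: pow(x, e, N)


N = 15
group = units(N)
phi_N = len(group)            # order of Z*_15

e = 3
assert gcd(e, phi_N) == 1     # e is relatively prime to phi(N)
d = pow(e, -1, phi_N)         # d = e^{-1} mod phi(N)

f_e = make_f(e, N)
f_d = make_f(d, N)

# f_e permutes Z*_N, and f_d undoes it.
is_permutation = sorted(f_e(x) for x in group) == group
is_inverse = all(f_d(f_e(x)) == x for x in group)
```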
8.1.5 *Isomorphisms and the Chinese Remainder Theorem
Two groups are isomorphic if they have the same underlying structure. From a mathematical point of view, an isomorphism of a group G provides an alternate, but equivalent, way of thinking about G. From a computational perspective, an isomorphism provides a different way to represent elements in G, which can often have a significant impact on algorithmic efficiency.
DEFINITION 8.23 Let G, H be groups with respect to the operations ◦_G, ◦_H, respectively. A function f : G → H is an isomorphism from G to H if:
1. f is a bijection, and
2. For all g_1, g_2 ∈ G we have f(g_1 ◦_G g_2) = f(g_1) ◦_H f(g_2).
If there exists an isomorphism from G to H then we say that these groups are
isomorphic and write G ≃ H.
In essence, an isomorphism from G to H is just a renaming of elements of G as elements of H. Note that if G is finite and G ≃ H, then H must be finite and of the same size as G. Also, if there exists an isomorphism f from G to H then f^{−1} is an isomorphism from H to G. It is possible, however, that f is efficiently computable while f^{−1} is not (or vice versa).
The aim of this section is to use the language of isomorphisms to better understand the group structure of ZN and Z∗N when N = pq is a product of two distinct primes. We first need to introduce the notion of a direct product of groups. Given groups G, H with group operations ◦_G, ◦_H, respectively, we define a new group G × H (the direct product of G and H) as follows. The elements of G × H are ordered pairs (g, h) with g ∈ G and h ∈ H; thus, if G has n elements and H has n′ elements, G × H has n · n′ elements. The group operation ◦ on G × H is applied component-wise; that is:
(g, h) ◦ (g′, h′) ≝ (g ◦_G g′, h ◦_H h′).
We leave it to Exercise 8.8 to verify that G × H is indeed a group. The above notation can be extended to direct products of more than two groups in the natural way, although we will not need this for what follows.
We may now state and prove the Chinese remainder theorem.
THEOREM 8.24 (Chinese remainder theorem) Let N = pq where p, q > 1 are relatively prime. Then
ZN ≃ Zp × Zq and Z∗N ≃ Z∗p × Z∗q.
Moreover, let f be the function mapping elements x ∈ {0, …, N − 1} to pairs (x_p, x_q) with x_p ∈ {0, …, p − 1} and x_q ∈ {0, …, q − 1} defined by
f(x) ≝ ([x mod p], [x mod q]).
Then f is an isomorphism from ZN to Zp × Zq , and the restriction of f to
Z∗N is an isomorphism from Z∗N to Z∗p × Z∗q .
PROOF For any x ∈ ZN the output f(x) is a pair of elements (x_p, x_q) with x_p ∈ Zp and x_q ∈ Zq. We claim that if x ∈ Z∗N, then (x_p, x_q) ∈ Z∗p × Z∗q. Indeed, if x_p ∉ Z∗p then this means that gcd([x mod p], p) ≠ 1. But then gcd(x, p) ≠ 1. This implies gcd(x, N) ≠ 1, contradicting the assumption that x ∈ Z∗N. (An analogous argument holds if x_q ∉ Z∗q.)
We now show that f is an isomorphism from ZN to Zp × Zq. (The proof that it is an isomorphism from Z∗N to Z∗p × Z∗q is similar.) Let us start by proving that f is one-to-one. Say f(x) = (x_p, x_q) = f(x′). Then x = x_p = x′ mod p and x = x_q = x′ mod q. This in turn implies that (x − x′) is divisible by both p and q. Since gcd(p, q) = 1, Proposition 8.4 says that pq = N divides (x − x′). But then x = x′ mod N. For x, x′ ∈ ZN, this means that x = x′ and so f is indeed one-to-one. Since |ZN| = N = p · q = |Zp| · |Zq|, the sizes of ZN and Zp × Zq are the same. This in combination with the fact that f is one-to-one implies that f is bijective.
In the following paragraph, let +_N denote addition modulo N, and let ⊞ denote the group operation in Zp × Zq (i.e., addition modulo p in the first component and addition modulo q in the second component). To conclude the proof that f is an isomorphism from ZN to Zp × Zq, we need to show that for all a, b ∈ ZN it holds that f(a +_N b) = f(a) ⊞ f(b).
To see that this is true, note that
f(a +_N b) = ([(a +_N b) mod p], [(a +_N b) mod q])
= ([(a + b) mod p], [(a + b) mod q])
= ([a mod p], [a mod q]) ⊞ ([b mod p], [b mod q]) = f(a) ⊞ f(b).
(For the second equality, above, we use the fact that [[X mod N] mod p] = [[X mod p] mod p] when p | N; see Exercise 8.9.)
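The two properties established in the proof, that f is a bijection and that it respects the group operation, can be confirmed by exhaustive search for a small modulus. The Python sketch below does so for N = 15 with p = 5 and q = 3; the function names `crt_map`, `add_pairs`, and `mul_pairs` are illustrative and not from the text.

```python
from math import gcd


def crt_map(x, p, q):
    """The map f(x) = ([x mod p], [x mod q])."""
    return (x % p, x % q)


def add_pairs(u, v, p, q):
    """Component-wise addition in Z_p x Z_q."""
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % q)


def mul_pairs(u, v, p, q):
    """Component-wise multiplication in Z*_p x Z*_q."""
    return ((u[0] * v[0]) % p, (u[1] * v[1]) % q)


p, q = 5, 3
N = p * q

# f is one-to-one on Z_N, hence a bijection onto Z_p x Z_q.
is_bijection = len({crt_map(x, p, q) for x in range(N)}) == N

# f(a +_N b) equals f(a) combined with f(b) component-wise.
preserves_addition = all(
    crt_map((a + b) % N, p, q)
    == add_pairs(crt_map(a, p, q), crt_map(b, p, q), p, q)
    for a in range(N) for b in range(N)
)

# The same holds for multiplication when f is restricted to Z*_N.
z_star = [x for x in range(1, N) if gcd(x, N) == 1]
preserves_multiplication = all(
    crt_map((a * b) % N, p, q)
    == mul_pairs(crt_map(a, p, q), crt_map(b, p, q), p, q)
    for a in z_star for b in z_star
)
```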
An extension of the Chinese remainder theorem says that if p_1, p_2, …, p_ℓ are pairwise relatively prime (i.e., gcd(p_i, p_j) = 1 for all i ≠ j) and N ≝ ∏_{i=1}^{ℓ} p_i, then
ZN ≃ Z_{p_1} × ··· × Z_{p_ℓ} and Z∗N ≃ Z∗_{p_1} × ··· × Z∗_{p_ℓ}.
An isomorphism in each case is obtained by a natural extension of the one used in the theorem above.
By way of notation, with N understood and x ∈ {0, 1, …, N − 1} we write x ↔ (x_p, x_q) for x_p = [x mod p] and x_q = [x mod q]. That is, x ↔ (x_p, x_q) if and only if f(x) = (x_p, x_q), where f is as in the theorem above. One way to think about this notation is that it means “x (in ZN) corresponds to (x_p, x_q) (in Zp × Zq).” The same notation is used when dealing with x ∈ Z∗N.
Example 8.25
Take 15 = 5 · 3, and consider Z∗15 = {1, 2, 4, 7, 8, 11, 13, 14}. The Chinese remainder theorem says this group is isomorphic to Z∗5 × Z∗3. We can compute
1 ↔ (1, 1)    2 ↔ (2, 2)    4 ↔ (4, 1)    7 ↔ (2, 1)
8 ↔ (3, 2)    11 ↔ (1, 2)    13 ↔ (3, 1)    14 ↔ (4, 2),
where each pair (a, b) with a ∈ Z∗5 and b ∈ Z∗3 appears exactly once. ♦
Using the Chinese Remainder Theorem
If two groups are isomorphic, then they both serve as representations of the same underlying “algebraic structure.” Nevertheless, the choice of which representation to use can affect the computational efficiency of group operations. We discuss this abstractly, and then in the specific context of ZN and Z∗N.
Let G, H be groups with operations ◦_G, ◦_H, respectively, and say f is an isomorphism from G to H where both f and f^{−1} can be computed efficiently. Then for g_1, g_2 ∈ G we can compute g = g_1 ◦_G g_2 in two ways: either by directly computing the group operation in G, or via the following steps:
1. Compute h_1 = f(g_1) and h_2 = f(g_2);
2. Compute h = h_1 ◦_H h_2 using the group operation in H;
3. Compute g = f^{−1}(h).
The above extends in the natural way when we want to compute multiple group operations in G (e.g., to compute g^x for some integer x). Which method is better depends on the relative efficiency of computing the group operation in each group, as well as the efficiency of computing f and f^{−1}.
We now turn to the specific case of computations modulo N, when N = pq is a product of distinct primes. The Chinese remainder theorem shows that addition, multiplication, or exponentiation (which is just repeated multiplication) modulo N can be “transformed” to analogous operations modulo p and q. Building on Example 8.25, we show some simple examples with N = 15.
Example 8.26
Say we want to compute the product 14 · 13 modulo 15 (i.e., in Z∗15). Example 8.25 gives 14 ↔ (4, 2) and 13 ↔ (3, 1). In Z∗5 × Z∗3, we have
(4, 2) · (3, 1) = ([4 · 3 mod 5], [2 · 1 mod 3]) = (2, 2).
Note (2, 2) ↔ 2, which is the correct answer since 14 · 13 = 2 mod 15. ♦
Example 8.27
Say we want to compute 11^53 mod 15. Example 8.25 gives 11 ↔ (1, 2). Notice that 2 = −1 mod 3 and so
(1, 2)^53 = ([1^53 mod 5], [(−1)^53 mod 3]) = (1, [−1 mod 3]) = (1, 2).
Thus, 11^53 mod 15 = 11. ♦
Example 8.28
Say we want to compute [29^100 mod 35]. We first compute the correspondence 29 ↔ ([29 mod 5], [29 mod 7]) = ([−1 mod 5], 1). Using the Chinese remainder theorem, we have
([−1 mod 5], 1)^100 = ([(−1)^100 mod 5], [1^100 mod 7]) = (1, 1),
and it is immediate that (1, 1) ↔ 1. We conclude that [29^100 mod 35] = 1. ♦
Example 8.29
Say we want to compute [18^25 mod 35]. We have 18 ↔ (3, 4) and so
18^25 mod 35 ↔ (3, 4)^25 = ([3^25 mod 5], [4^25 mod 7]).
Since Z∗5 is a group of order 4, we can “work modulo 4 in the exponent” (cf. Corollary 8.15) and see that
3^25 = 3^{[25 mod 4]} = 3^1 = 3 mod 5.
Similarly,
4^25 = 4^{[25 mod 6]} = 4^1 = 4 mod 7.
Thus, ([3^25 mod 5], [4^25 mod 7]) = (3, 4) ↔ 18 and so [18^25 mod 35] = 18. ♦
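The computations in Examples 8.27 through 8.29 follow one pattern: map x to its pair, exponentiate each component with the exponent reduced modulo the order of the corresponding group, and map the result back. The Python sketch below mirrors that pattern under two stated assumptions: p and q are distinct primes, and x is relatively prime to pq. The name `pow_via_crt` is hypothetical, and the final conversion back to an element modulo N is done here by brute-force search, which is only sensible for tiny N.

```python
def pow_via_crt(x, e, p, q):
    """Compute [x^e mod pq] by working modulo p and modulo q separately.

    Assumes p, q are distinct primes, gcd(x, pq) = 1, and e >= 0.
    """
    N = p * q
    x_p, x_q = x % p, x % q

    # Z*_p has order p - 1 and Z*_q has order q - 1, so the exponent may be
    # reduced modulo the group order in each component (Corollary 8.15).
    y_p = pow(x_p, e % (p - 1), p)
    y_q = pow(x_q, e % (q - 1), q)

    # Brute-force inverse of the map x -> (x mod p, x mod q); tiny N only.
    for y in range(N):
        if (y % p, y % q) == (y_p, y_q):
            return y
    raise ValueError("no preimage found; p and q must be relatively prime")


result_827 = pow_via_crt(11, 53, 5, 3)
result_828 = pow_via_crt(29, 100, 5, 7)
result_829 = pow_via_crt(18, 25, 5, 7)
```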
One thing we have not yet discussed is how to convert back and forth between the representation of an element modulo N and its representation modulo p and q. The conversion can be carried out efficiently provided the factorization of N is known. Assuming p and q are known, it is easy to map an element x modulo N to its corresponding representation modulo p and q: the element x corresponds to ([x mod p], [x mod q]), and both the modular reductions can be carried out efficiently (cf. Appendix B.2).
For the other direction, we make use of the following observation: an element with representation (x_p, x_q) can be written as
(x_p, x_q) = x_p · (1, 0) + x_q · (0, 1).
So, if we can find elements 1_p, 1_q ∈ {0, …, N − 1} such that 1_p ↔ (1, 0) and 1_q ↔ (0, 1), then (appealing to the Chinese remainder theorem) we know that
(x_p, x_q) ↔ [(x_p · 1_p + x_q · 1_q) mod N].
Since p, q are distinct primes, gcd(p, q) = 1. We can use the extended Euclidean algorithm (cf. Appendix B.1.2) to find integers X, Y such that
Xp + Yq = 1.
Note that Yq = 0 mod q and Yq = 1 − Xp = 1 mod p. This means that [Yq mod N] ↔ (1, 0); i.e., [Yq mod N] = 1_p. Similarly, [Xp mod N] = 1_q.
In summary, we can convert an element represented as (x_p, x_q) to its representation modulo N in the following way (assuming p and q are known):
1. Compute X, Y such that Xp + Yq = 1.
2. Set 1_p := [Yq mod N] and 1_q := [Xp mod N].
3. Compute x := [(x_p · 1_p + x_q · 1_q) mod N].
If many such conversions will be performed, then 1_p, 1_q can be computed once-and-for-all in a preprocessing phase.
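The three-step conversion can be sketched in Python as follows. This is an illustrative sketch rather than the procedure of Appendix B.1.2 verbatim: `extended_gcd` is a standard iterative formulation of the extended Euclidean algorithm, and the names `crt_basis` and `from_crt` are our own.

```python
def extended_gcd(a, b):
    """Return (g, X, Y) such that X*a + Y*b == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_x, x = x, old_x - k * x
        old_y, y = y, old_y - k * y
    return old_r, old_x, old_y


def crt_basis(p, q):
    """Return (1_p, 1_q) with 1_p <-> (1, 0) and 1_q <-> (0, 1) modulo N = pq."""
    g, X, Y = extended_gcd(p, q)          # step 1: X*p + Y*q = 1
    if g != 1:
        raise ValueError("p and q must be relatively prime")
    N = p * q
    return (Y * q) % N, (X * p) % N       # step 2


def from_crt(x_p, x_q, p, q):
    """Map the pair (x_p, x_q) back to the corresponding element of Z_N."""
    one_p, one_q = crt_basis(p, q)
    return (x_p * one_p + x_q * one_q) % (p * q)   # step 3


one_p, one_q = crt_basis(5, 7)
x = from_crt(4, 3, 5, 7)
```

When many conversions share the same p and q, the pair returned by `crt_basis` can be computed once and reused, as noted above.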
Example 8.30
Take p = 5, q = 7, and N = 5 · 7 = 35. Say we are given the representation (4, 3) and want to convert this to the corresponding element of Z35. Using the extended Euclidean algorithm, we compute
3 · 5 − 2 · 7 = 1.
Thus, 1_p = [−2 · 7 mod 35] = 21 and 1_q = [3 · 5 mod 35] = 15. (We can check that these are correct: e.g., for 1_p = 21 we can verify that [21 mod 5] = 1 and [21 mod 7] = 0.) Using these values, we can then compute
(4, 3) = 4 · (1, 0) + 3 · (0, 1)
↔ [4 · 1_p + 3 · 1_q mod 35]
= [4 · 21 + 3 · 15 mod 35] = 24.
Since 24 = 4 mod 5 and 24 = 3 mod 7, this is indeed the correct result. ♦
8.2 Primes, Factoring, and RSA
In this section, we show the first examples of number-theoretic problems that are conjectured to be “hard.” We begin with a discussion of one of the oldest problems: integer factorization or just factoring.
Given a composite integer N, the factoring problem is to find integers p, q > 1 such that pq = N. Factoring is a classic example of a hard problem, both because it is so simple to describe and since it has been recognized as a hard computational problem for a long time (even before its use in cryptography). The problem can be solved in exponential time O(√N · polylog(N)) using trial division: that is, by exhaustively checking whether p divides N for p = 2, …, ⌊√N⌋. (This method requires at most √N divisions, each one taking polylog(N) = ∥N∥^c time for some constant c.) This always succeeds because although the largest prime factor of N may be as large as N/2, the smallest prime factor of N can be at most ⌊√N⌋. Although algorithms with better running time are known (see Chapter 9), no polynomial-time algorithm for factoring has been demonstrated despite many years of effort.
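Trial division as described above can be sketched in a few lines of Python. The function name `trial_division` is our own, and the sketch is meant only to make the √N bound concrete; its running time is exponential in the length of N.

```python
from math import isqrt


def trial_division(N):
    """Return (p, q) with p * q == N and p, q > 1, or None if N >= 2 is prime.

    Checks candidate divisors p = 2, ..., floor(sqrt(N)) in increasing order,
    so the first divisor found is the smallest prime factor of N.
    """
    for p in range(2, isqrt(N) + 1):
        if N % p == 0:
            return p, N // p
    return None


factors_35 = trial_division(35)
factors_97 = trial_division(97)
```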
Consider the following experiment for a given algorithm A and parameter n:
The weak factoring experiment w-Factor_A(n):
1. Choose two uniform n-bit integers x_1, x_2.
2. Compute N := x_1 · x_2.
3. A is given N, and outputs x′_1, x′_2 > 1.
4. The output of the experiment is defined to be 1 if x′_1 · x′_2 = N, and 0 otherwise.
We have just said that the factoring problem is believed to be hard. Does this mean that
Pr[w-Factor_A(n) = 1] ≤ negl(n)
for every ppt algorithm A? Not at all. For starters, the number N in the above experiment is even with probability 3/4 (this occurs when either x_1 or x_2 is even); it is, of course, easy for A to factor N in this case. While we can make A’s job more difficult by requiring A to output integers x′_1, x′_2 of length n, it remains the case that x_1 or x_2 (and hence N) might have small prime factors that can still be easily found. For cryptographic applications, we will need to prevent this.
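The claim that N is even with probability 3/4 can be confirmed by enumerating all pairs of n-bit integers for a small n. The Python sketch below does this exactly using fractions; the function name `even_product_fraction` is hypothetical, and the sketch assumes n ≥ 2 so that the n-bit integers (those with high-order bit 1) include both even and odd values.

```python
from fractions import Fraction
from itertools import product


def even_product_fraction(n):
    """Exact fraction of pairs (x1, x2) of n-bit integers with x1 * x2 even.

    An n-bit integer here has its high-order bit set: 2^(n-1) <= x < 2^n.
    Assumes n >= 2.
    """
    values = range(1 << (n - 1), 1 << n)
    even = sum(1 for x1, x2 in product(values, repeat=2) if (x1 * x2) % 2 == 0)
    return Fraction(even, len(values) ** 2)


fraction_n6 = even_product_fraction(6)
```

The product is odd only when both factors are odd, which happens for one quarter of the pairs.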
As this discussion indicates, the “hardest” numbers to factor are those having only large prime factors. This suggests redefining the above experiment so that x1,x2 are random n-bit primes rather than random n-bit integers, and in fact such an experiment will be used when we formally define the factoring assumption in Section 8.2.3. For this experiment to be useful in a cryptographic setting, however, it is necessary to be able to generate random n-bit primes efficiently. This is the topic of the next two sections.
8.2.1 Generating Random Primes
A natural approach to generating a random n-bit prime is to repeatedly choose random n-bit integers until we find one that is prime; we repeat this at most t times or until we are successful. See Algorithm 8.31 for a high-level description of the process.
ALGORITHM 8.31
Generating a random prime – high-level outline
Input: Length n; parameter t
Output: A uniform n-bit prime
for i = 1 to t:
    p′ ← {0, 1}^{n−1}
    p := 1∥p′
    if p is prime return p
return fail
Note that the algorithm forces the output to be an integer of length exactly n (rather than length at most n) by fixing the high-order bit of p to “1.” Our convention throughout this book is that an “integer of length n” means an integer whose binary representation with most significant bit equal to 1 is exactly n bits long.
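The outline of Algorithm 8.31 translates directly into code once a primality test is supplied. In the Python sketch below the test is passed in as a parameter; the helper `is_prime_by_trial_division` is only a stand-in suitable for small n, and both function names are our own. Returning `None` plays the role of the output fail.

```python
import secrets
from math import isqrt


def is_prime_by_trial_division(p):
    """Placeholder primality test; exponential time, for small inputs only."""
    if p < 2:
        return False
    return all(p % d != 0 for d in range(2, isqrt(p) + 1))


def random_prime(n, t, is_prime):
    """Try up to t uniform n-bit candidates; return a prime or None (fail).

    Assumes n >= 2. The high-order bit is forced to 1 so that every
    candidate has length exactly n.
    """
    for _ in range(t):
        p_low = secrets.randbits(n - 1)    # p' <- {0,1}^(n-1)
        p = (1 << (n - 1)) | p_low         # p := 1 || p'
        if is_prime(p):
            return p
    return None


n = 16
candidate = random_prime(n, 3 * n * n, is_prime_by_trial_division)
```

The choice t = 3n^2 in the usage line anticipates the analysis that follows; any other bound on the number of attempts could be passed instead.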
Given a way to determine whether or not a given integer p is prime, the above algorithm outputs a uniform n-bit prime conditioned on the event that it does not output fail. The probability that the algorithm outputs fail depends on t, and for our purposes we will want to set t so as to obtain a failure probability that is negligible in n. To show that Algorithm 8.31 leads to an efficient (i.e., polynomial-time in n) algorithm for generating primes, we need a better understanding of two issues: (1) the probability that a uniform n-bit integer is prime and (2) how to efficiently test whether a given integer p is prime. We discuss these issues briefly now, and defer a more in-depth exploration of the second topic to the following section.
The distribution of primes. The prime number theorem, an important result in mathematics, gives fairly precise bounds on the fraction of integers of a given length that are prime. For our purposes, we need only a weak, one-sided version of that result that we do not prove here:
THEOREM 8.32 (Bertrand’s postulate) For any n > 1, the fraction of n-bit integers that are prime is at least 1/(3n).
Returning to the approach for generating primes described above, this implies that if we set t = 3n^2 then the probability that a prime is not chosen in any of the t iterations of the algorithm is at most
(1 − 1/(3n))^t = ((1 − 1/(3n))^{3n})^n ≤ (e^{−1})^n = e^{−n}
(using Inequality A.2), which is negligible in n. Thus, using poly(n) iterations we obtain an algorithm for which the probability of outputting fail is negligible in n. (Tighter results than Theorem 8.32 are known, and so in practice even fewer iterations are needed.)
Testing primality. The problem of efficiently determining whether a given number is prime has a long history. In the 1970s the first efficient algorithms for testing primality were developed. These algorithms were probabilistic and had the following guarantee: if the input p were a prime number, the algorithm would always output “prime.” On the other hand, if p were composite, then the algorithm would almost always output “composite,” but might output the wrong answer (“prime”) with probability negligible in the length of p. Put differently, if the algorithm outputs “composite” then p is definitely composite, but if the output is “prime” then it is very likely that p is prime but it is also possible that a mistake has occurred (and p is really composite).2
When using a randomized primality test of this sort in Algorithm 8.31 (the prime-generation algorithm shown earlier), the output of the algorithm is a uniform prime of the desired length so long as the algorithm does not output fail and the randomized primality test did not err during the execution of the algorithm. This means that an additional source of error (besides the possibility of outputting fail) is introduced, and the algorithm may now output a composite number by mistake. Since we can ensure that this happens with only negligible probability, this remote possibility is of no practical concern and we can safely ignore it.
2 There also exist probabilistic primality tests that work in the opposite way: they always correctly identify composite numbers but sometimes make a mistake when given a prime as input. We will not consider algorithms of this type.
A deterministic polynomial-time algorithm for testing primality was demonstrated in a breakthrough result in 2002. That algorithm, although running in polynomial time, is slower than the probabilistic tests mentioned above. For this reason, probabilistic primality tests are still used exclusively in practice for generating large prime numbers.
In Section 8.2.2 we describe and analyze one of the most commonly used probabilistic primality tests: the Miller–Rabin algorithm. This algorithm takes two inputs: an integer p and a parameter t (in unary) that determines the error probability. The Miller–Rabin algorithm runs in time polynomial in ∥p∥ and t, and satisfies:
THEOREM 8.33 If p is prime, then the Miller–Rabin test always outputs “prime.” If p is composite, the algorithm outputs “composite” except with probability at most 2^{−t}.
Putting it all together. Given the preceding discussion, we can now describe a polynomial-time prime-generation algorithm that, on input n, outputs an n-bit prime except with probability negligible in n; moreover, conditioned on the output p being prime, p is a uniformly distributed n-bit prime. The full procedure is described in Algorithm 8.34.
ALGORITHM 8.34
Generating a random prime
Input: Length n
Output: A uniform n-bit prime
for i = 1 to 3n^2:
    p′ ← {0, 1}^{n−1}
    p := 1∥p′
    run the Miller–Rabin test on input p and parameter 1^n
    if the output is “prime,” return p
return fail
Generating primes of a particular form. It is sometimes desirable to generate a random n-bit prime p of a particular form, for example, satisfying p = 3 mod 4 or such that p = 2q + 1 where q is also prime (p of the latter type are called strong primes). In this case, appropriate modifications of the prime-generation algorithm shown above can be used. (For example, in order to obtain a prime of the form p = 2q + 1, modify the algorithm to generate a random prime q, compute p := 2q + 1, and then output p if it too is prime.) While these modified algorithms work well in practice, rigorous proofs that they run in polynomial time and fail with only negligible probability are more complex (and, in some cases, rely on unproven number-theoretic conjectures regarding the density of primes of a particular form). A detailed exploration of these issues is beyond the scope of this book, and we will simply assume the existence of appropriate prime-generation algorithms when needed.
8.2.2 *Primality Testing
We now describe the Miller–Rabin primality test and prove Theorem 8.33. (We rely on the material presented in Section 8.1.5.) This material is not used directly in the rest of the book.
The key to the Miller–Rabin algorithm is to find a property that distinguishes primes and composites. Let N denote the input number to be tested. We start with the following observation: if N is prime then |Z∗N| = N − 1, and so for any a ∈ {1, …, N − 1} we have a^{N−1} = 1 mod N by Theorem 8.14. This suggests testing whether N is prime by choosing a uniform element a and checking whether a^{N−1} =? 1 mod N. If a^{N−1} ≠ 1 mod N, then N cannot be prime. Conversely, we might hope that if N is not prime then there is a reasonable chance that we will pick a with a^{N−1} ≠ 1 mod N, and so by repeating this test many times we can determine whether N is prime or not with high confidence. The above approach is shown as Algorithm 8.35. (Recall that exponentiation modulo N and computation of greatest common divisors can be carried out in polynomial time. Choosing a uniform element of {1, …, N − 1} can also be done in polynomial time. See Appendix B.2.)
ALGORITHM 8.35
Primality testing – first attempt
Input: Integer N and parameter 1^t
Output: A decision as to whether N is prime or composite
for i = 1 to t:
    a ← {1, …, N − 1}
    if a^{N−1} ≠ 1 mod N return “composite”
return “prime”
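A direct Python rendering of this first attempt is sketched below. The function name `fermat_test` is our own label for it, and the sketch assumes N ≥ 2. As the discussion that follows makes clear, this test is not reliable for every composite N.

```python
import secrets


def fermat_test(N, t):
    """First-attempt primality test: t rounds of checking a^(N-1) = 1 mod N.

    Returns the string "composite" or "prime". Assumes N >= 2 and t >= 0.
    """
    for _ in range(t):
        a = 1 + secrets.randbelow(N - 1)   # uniform in {1, ..., N-1}
        if pow(a, N - 1, N) != 1:
            return "composite"
    return "prime"


verdict_97 = fermat_test(97, 20)    # 97 is prime, so no round can fail
verdict_15 = fermat_test(15, 64)    # composite is found unless every round is unlucky
```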
If N is prime the algorithm always outputs “prime.” If N is composite, the algorithm outputs “composite” if in any iteration it finds an a ∈ {1, …, N − 1} such that a^{N−1} ≠ 1 mod N. Observe that if a ∉ Z∗N then a^{N−1} ≠ 1 mod N. (If gcd(a, N) ≠ 1 then gcd(a^{N−1}, N) ≠ 1 and so [a^{N−1} mod N] cannot equal 1.) For now, we therefore restrict our attention to a ∈ Z∗N. We refer to any such a with a^{N−1} ≠ 1 mod N as a witness that N is composite, or simply a witness. We might hope that when N is composite there are many witnesses, and thus the algorithm finds such a witness with “high” probability. This intuition is correct provided there is at least one witness. Before proving this, we need two group-theoretic results.
PROPOSITION 8.36 Let G be a finite group, and H ⊆ G. Assume H is nonempty, and for all a, b ∈ H we have ab ∈ H. Then H is a subgroup of G.
PROOF We need to verify that H satisfies all the conditions of Definition 8.9. By assumption, H is closed under the group operation. Associativity in H is inherited automatically from G. Let m = |G| (here is where we use the fact that G is finite), and consider an arbitrary element a ∈ H. Closure of H means that H contains a^{m−1} = a^{−1} as well as a^m = 1. Thus, H contains the inverse of each of its elements, as well as the identity.
LEMMA 8.37 Let H be a strict subgroup of a finite group G (i.e., H ̸= G). Then |H| ≤ |G|/2.
PROOF Let h̄ be an element of G that is not in H; since H ≠ G, we know such an h̄ exists. Consider the set H̄ ≝ {h̄h | h ∈ H}. We show that (1) |H̄| = |H|, and (2) every element of H̄ lies outside of H; i.e., the intersection of H and H̄ is empty. Since both H and H̄ are subsets of G, these imply |G| ≥ |H| + |H̄| = 2|H|, proving the lemma.
For any h_1, h_2 ∈ H, if h̄h_1 = h̄h_2 then, multiplying by h̄^{−1} on each side, we have h_1 = h_2. This shows that every distinct element h ∈ H corresponds to a distinct element h̄h ∈ H̄, proving (1).
Assume toward a contradiction that h̄h ∈ H for some h. This means h̄h = h′ for some h′ ∈ H, and so h̄ = h′h^{−1}. Now, h′h^{−1} ∈ H since H is a subgroup and h′, h^{−1} ∈ H. But this means that h̄ ∈ H, in contradiction to the way h̄ was chosen. This proves (2) and completes the proof of the lemma.
The following theorem will enable us to analyze the algorithm given earlier.
THEOREM 8.38 Fix N. Say there exists a witness that N is composite. Then at least half the elements of Z∗N are witnesses that N is composite.
PROOF Let Bad be the set of elements in Z∗N that are not witnesses; that is, a ∈ Bad means a^{N−1} = 1 mod N. Clearly, 1 ∈ Bad. If a, b ∈ Bad, then (ab)^{N−1} = a^{N−1} · b^{N−1} = 1 · 1 = 1 mod N and hence ab ∈ Bad. By Proposition 8.36, we conclude that Bad is a subgroup of Z∗N. Since (by assumption) there is at least one witness, Bad is a strict subgroup of Z∗N. Lemma 8.37 then shows that |Bad| ≤ |Z∗N|/2, showing that at least half the elements of Z∗N are not in Bad (and hence are witnesses).
Let N be composite. If there exists a witness that N is composite, then there are at least |Z∗N|/2 witnesses. The probability that we find either a witness or an element not in Z∗N in any given iteration of the algorithm is thus at least 1/2, and so the probability that the algorithm does not find a witness in any of the t iterations (and hence the probability that the algorithm mistakenly outputs “prime”) is at most 2^{−t}.
The above, unfortunately, does not give a complete solution since there are infinitely many composite numbers N that do not have any witnesses that they are composite! Such values N are known as Carmichael numbers; a detailed discussion is beyond the scope of this book.
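To make the obstacle concrete, the Python sketch below lists the witnesses in Z∗N for two composite moduli. The value 561 = 3 · 11 · 17 is not taken from the text; it is the smallest Carmichael number and is used here only as an illustration. The function name `fermat_witnesses` is likewise our own.

```python
from math import gcd


def fermat_witnesses(N):
    """All a in Z*_N with a^(N-1) != 1 mod N, i.e., witnesses that N is composite."""
    return [a for a in range(1, N)
            if gcd(a, N) == 1 and pow(a, N - 1, N) != 1]


witnesses_15 = fermat_witnesses(15)     # an ordinary composite
witnesses_561 = fermat_witnesses(561)   # a Carmichael number

size_z15_star = sum(1 for a in range(1, 15) if gcd(a, 15) == 1)
```

For N = 15 a witness exists, so by Theorem 8.38 at least half of Z∗15 consists of witnesses; for N = 561 the list is empty, and the first-attempt test can detect compositeness only by happening to choose an a outside Z∗561.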
Happily, a refinement of the above test can be shown to work for all N. Let N − 1 = 2^r u, where u is odd and r ≥ 1. (It is easy to compute r and u given N. Also, restricting to r ≥ 1 means that N is odd, but testing primality is easy when N is even!) The algorithm shown previously tests only whether a^{N−1} = a^{2^r u} = 1 mod N. A more refined algorithm looks at the sequence of r + 1 values a^u, a^{2u}, …, a^{2^r u} (all modulo N). Each term in this sequence is the square of the preceding term; thus, if some value is equal to ±1 then all subsequent values will be equal to 1.
Say that a ∈ Z∗N is a strong witness that N is composite (or simply a strong witness) if (1) a^u ≠ ±1 mod N and (2) a^{2^i u} ≠ −1 mod N for all i ∈ {1, …, r − 1}. Note that when an element a is not a strong witness then the sequence (a^u, a^{2u}, …, a^{2^r u}) (all taken modulo N) takes one of the following forms:
(±1, 1, …, 1) or (⋆, …, ⋆, −1, 1, …, 1),
where ⋆ is an arbitrary term. If a is not a strong witness then we have a^{2^{r−1}u} = ±1 mod N and
a^{N−1} = a^{2^r u} = (a^{2^{r−1}u})^2 = 1 mod N,
and so a is not a witness that N is composite, either. Put differently, if a is a witness then it is also a strong witness and so there can only possibly be more strong witnesses than witnesses.
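The definition of a strong witness can be turned into a short check. The Python sketch below assumes N is odd with N ≥ 3 and a ∈ Z∗N; the function names `decompose` and `is_strong_witness` are our own, and the modulus 561 used in the example is the same illustrative Carmichael number as before, not a value from the text. It is only the per-element check, not the full Miller–Rabin test.

```python
from math import gcd


def decompose(N):
    """Write N - 1 = 2^r * u with u odd; return (r, u). Assumes N odd, N >= 3."""
    r, u = 0, N - 1
    while u % 2 == 0:
        r += 1
        u //= 2
    return r, u


def is_strong_witness(a, N):
    """True if a is a strong witness that N is composite (a in Z*_N, N odd)."""
    r, u = decompose(N)
    x = pow(a, u, N)
    if x == 1 or x == N - 1:          # condition (1) fails: a^u = +-1 mod N
        return False
    for _ in range(1, r):             # i = 1, ..., r - 1
        x = pow(x, 2, N)              # x = a^(2^i u) mod N
        if x == N - 1:                # condition (2) fails: some term is -1
            return False
    return True


# 561 has no (ordinary) witnesses in Z*_561, yet 2 is a strong witness.
two_is_strong_for_561 = is_strong_witness(2, 561)

z561_star = [a for a in range(1, 561) if gcd(a, 561) == 1]
strong_count_561 = sum(1 for a in z561_star if is_strong_witness(a, 561))
```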
We first show that if N is prime then there does not exist a strong witness that N is composite. In doing so, we rely on the following easy lemma (which is a special case of Proposition 13.16 proved subsequently in Chapter 13):
LEMMA 8.39 Say x ∈ Z∗N is a square root of 1 modulo N if x^2 = 1 mod N. If N is an odd prime then the only square roots of 1 modulo N are [±1 mod N].
PROOF Say x^2 = 1 mod N with x ∈ {1, …, N − 1}. Then 0 = x^2 − 1 = (x + 1)(x − 1) mod N, implying that N | (x + 1) or N | (x − 1) by Proposition 8.3. This can only possibly occur if x = [±1 mod N].
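The lemma, and the way it fails for composite moduli, can be seen by listing square roots of 1 for small N. The Python sketch below does this by exhaustive search; the function name `square_roots_of_one` is hypothetical.

```python
def square_roots_of_one(N):
    """All x in {1, ..., N-1} with x^2 = 1 mod N, in increasing order."""
    return [x for x in range(1, N) if (x * x) % N == 1]


roots_mod_7 = square_roots_of_one(7)     # odd prime: only 1 and -1 = 6
roots_mod_15 = square_roots_of_one(15)   # composite: extra roots appear
```

For the odd prime 7 only [±1 mod 7] appear, whereas modulo 15 there are four square roots of 1, two of which are not ±1.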
Let N be an odd prime and fix arbitrary a ∈ Z∗N. Let i ≥ 0 be the minimum value for which a^{2^i u} = 1 mod N; since a^{2^r u} = a^{N−1} = 1 mod N we know that some such i ≤ r exists. If i = 0 then a^u = 1 mod N and a is not a strong witness. Otherwise,
(a^{2^{i−1}u})^2 = a^{2^i u} = 1 mod N
and a^{2^{i−1}u} is a square root of 1. If N is an odd prime, the only square roots of 1 are ±1; by choice of i, however, a^{2^{i−1}u} ≠ 1 mod N. So a^{2^{i−1}u} = −1 mod N, and a is not a strong witness. We conclude that when N is an odd prime there is no strong witness that N is composite.
A composite integer N is a prime power if N = p^r for some prime p and integer r ≥ 1. We now show that every odd, composite N that is not a prime power has many strong witnesses.
THEOREM 8.40 Let N be an odd number that is not a prime power. Then at least half the elements of Z∗N are strong witnesses that N is composite.
PROOF Let Bad ⊆ Z∗N denote the set of elements that are not strong witnesses. We define a set Bad′ and show that: (1) Bad is a subset of Bad′, and (2) Bad′ is a strict subgroup of Z∗N . This suffices because by combining (2) and Lemma 8.37 we have that |Bad′| ≤ |Z∗N|/2. Furthermore, by (1) it holds that Bad ⊆ Bad′, and so |Bad| ≤ |Bad′| ≤ |Z∗N |/2 as in Theorem 8.38. Thus, at least half the elements of Z∗N are strong witnesses. (We stress that we do not claim that Bad is a subgroup of Z∗N .)
Note first that −1 ∈ Bad since $(-1)^u = -1 \bmod N$ (recall u is odd). Let $i \in \{0, \ldots, r-1\}$ be the largest integer for which there exists an a ∈ Bad with $a^{2^i u} = -1 \bmod N$; alternatively, i is the largest integer for which there exists an a ∈ Bad with
$$(a^u, a^{2u}, \ldots, a^{2^r u}) = (\underbrace{\star, \ldots, \star, -1}_{i+1 \text{ terms}}, 1, \ldots, 1).$$
Since −1 ∈ Bad and $(-1)^{2^0 u} = -1 \bmod N$, some such i exists. Fix i as above, and define
$$\mathsf{Bad}' \stackrel{\text{def}}{=} \{a \mid a^{2^i u} = \pm 1 \bmod N\}.$$
We now prove what we claimed above.
CLAIM 8.41 Bad ⊆ Bad′.
Let a ∈ Bad. Then either $a^u = 1 \bmod N$ or $a^{2^j u} = -1 \bmod N$ for some $j \in \{0, \ldots, r-1\}$. In the first case, $a^{2^i u} = (a^u)^{2^i} = 1 \bmod N$ and so a ∈ Bad′. In the second case, we have j ≤ i by choice of i. If j = i then clearly a ∈ Bad′. If j < i then $a^{2^i u} = (a^{2^j u})^{2^{i-j}} = 1 \bmod N$ and a ∈ Bad′. Since a was arbitrary, this shows Bad ⊆ Bad′.
CLAIM 8.42 Bad′ is a subgroup of Z∗N .
Clearly 1 ∈ Bad′. Furthermore, if a, b ∈ Bad′ then
$$(ab)^{2^i u} = a^{2^i u}\, b^{2^i u} = (\pm 1)(\pm 1) = \pm 1 \bmod N$$
and so ab ∈ Bad′. By Lemma 8.36, Bad′ is a subgroup.
CLAIM 8.43 Bad′ is a strict subgroup of Z∗N .
If N is an odd, composite integer that is not a prime power, then N can be written as $N = N_1 N_2$ with $N_1, N_2 > 1$ odd and $\gcd(N_1, N_2) = 1$. Appealing to the Chinese remainder theorem, let $a \leftrightarrow (a_1, a_2)$ denote the representation of $a \in \mathbb{Z}_N^*$ as an element of $\mathbb{Z}_{N_1}^* \times \mathbb{Z}_{N_2}^*$; that is, $a_1 = [a \bmod N_1]$ and $a_2 = [a \bmod N_2]$. Take a ∈ Bad′ such that $a^{2^i u} = -1 \bmod N$ (such an a must exist by the way we defined i), and say $a \leftrightarrow (a_1, a_2)$. Since $-1 \leftrightarrow (-1, -1)$ we have
$$(a_1, a_2)^{2^i u} = (a_1^{2^i u}, a_2^{2^i u}) = (-1, -1),$$
and so
$$a_1^{2^i u} = -1 \bmod N_1 \quad \text{and} \quad a_2^{2^i u} = -1 \bmod N_2.$$
Consider the element $b \in \mathbb{Z}_N^*$ with $b \leftrightarrow (a_1, 1)$. Then
$$b^{2^i u} \leftrightarrow (a_1, 1)^{2^i u} = ([a_1^{2^i u} \bmod N_1], 1) = (-1, 1) \not\leftrightarrow \pm 1.$$
That is, $b^{2^i u} \ne \pm 1 \bmod N$ and so we have found an element b ∉ Bad′. This proves that Bad′ is a strict subgroup of $\mathbb{Z}_N^*$ and so, by Lemma 8.37, the size of Bad′ (and thus the size of Bad) is at most half the size of $\mathbb{Z}_N^*$.
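The bound of Theorem 8.40 can be checked by brute force for small moduli. The following is a minimal sketch, assuming the definition of a strong witness used in the proof above ($a^u \ne \pm 1 \bmod N$ and $a^{2^i u} \ne -1 \bmod N$ for all $i \in \{1, \ldots, r-1\}$); the function names are illustrative and not part of the text.

```python
from math import gcd

def decompose(n):
    """Write n - 1 = 2^r * u with u odd and return (r, u)."""
    r, u = 0, n - 1
    while u % 2 == 0:
        r, u = r + 1, u // 2
    return r, u

def is_strong_witness(a, n):
    """True if a is a strong witness that the odd integer n is composite."""
    r, u = decompose(n)
    x = pow(a, u, n)                 # x = a^u mod n
    if x == 1 or x == n - 1:
        return False
    for _ in range(1, r):            # x runs over a^(2^i u) for i = 1, ..., r-1
        x = pow(x, 2, n)
        if x == n - 1:
            return False
    return True

def witness_count(n):
    """Return (number of strong witnesses in Z_n^*, size of Z_n^*)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    strong = [a for a in units if is_strong_witness(a, n)]
    return len(strong), len(units)

strong, total = witness_count(15)
print(strong, total)                 # 6 strong witnesses among the 8 elements of Z_15^*
```

For N = 15 only 1 and 14 fail to be strong witnesses, so the fraction is well above one half; for a prime such as 13 the count is zero, in line with the argument following Lemma 8.39.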
An integer N is a perfect power if $N = \hat{N}^e$ for integers $\hat{N}$ and e ≥ 2 (here it is not required for $\hat{N}$ to be prime, although of course any prime power is also a perfect power). Algorithm 8.44 gives the Miller–Rabin primality test. Exercises 8.12 and 8.13 ask you to show that testing whether N is a perfect power, and testing whether a particular a is a strong witness, can be done in polynomial time. Given these results, the algorithm clearly runs in time polynomial in ∥N∥ and t. We can now complete the proof of Theorem 8.33:
PROOF If N is an odd prime, there are no strong witnesses and so the Miller–Rabin algorithm always outputs “prime.” If N is even or a prime power, the algorithm always outputs “composite.” The interesting case is when N is an odd, composite integer that is not a prime power. Consider any iteration of the inner loop. Note first that if $a \notin \mathbb{Z}_N^*$ then $a^u \ne \pm 1 \bmod N$ and $a^{2^i u} \ne -1 \bmod N$ for $i \in \{1, \ldots, r-1\}$. The probability of finding either a strong witness or an element not in $\mathbb{Z}_N^*$ is at least 1/2 (invoking Theorem 8.40). Thus, the probability that the algorithm never outputs “composite” in any of the t iterations is at most $2^{-t}$.

ALGORITHM 8.44
The Miller–Rabin primality test
Input: Integer N > 2 and parameter $1^t$
Output: A decision as to whether N is prime or composite
if N is even, return “composite”
if N is a perfect power, return “composite”
compute r ≥ 1 and u odd such that $N - 1 = 2^r u$
for j = 1 to t:
    a ← {1, . . . , N − 1}
    if $a^u \ne \pm 1 \bmod N$ and $a^{2^i u} \ne -1 \bmod N$ for $i \in \{1, \ldots, r-1\}$
        return “composite”
return “prime”
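Algorithm 8.44 translates almost line for line into code. The sketch below is one possible rendering, not a hardened implementation: the perfect-power test uses a simple binary search for integer roots (the text leaves this step to Exercise 8.12), the default t = 40 is an arbitrary choice, and Python's `random` module stands in for a source of uniform randomness.

```python
import random

def is_perfect_power(n):
    """True if n = m**e for some integers m >= 2 and e >= 2."""
    for e in range(2, n.bit_length() + 1):
        lo, hi = 2, 1 << (n.bit_length() // e + 1)
        while lo <= hi:                      # binary search for an integer e-th root
            mid = (lo + hi) // 2
            power = mid ** e
            if power == n:
                return True
            if power < n:
                lo = mid + 1
            else:
                hi = mid - 1
    return False

def miller_rabin(n, t=40):
    """Return "prime" or "composite" for an integer n > 2."""
    if n % 2 == 0:
        return "composite"
    if is_perfect_power(n):
        return "composite"
    r, u = 0, n - 1
    while u % 2 == 0:                        # n - 1 = 2^r * u with u odd
        r, u = r + 1, u // 2
    for _ in range(t):
        a = random.randrange(1, n)           # uniform in {1, ..., n-1}
        x = pow(a, u, n)
        if x == 1 or x == n - 1:
            continue                         # a^u = +-1 mod n: nothing learned
        for _ in range(1, r):
            x = pow(x, 2, n)
            if x == n - 1:
                break                        # some a^(2^i u) = -1 mod n
        else:
            return "composite"               # a is a strong witness or a non-unit
    return "prime"

print(miller_rabin(97), miller_rabin(561))
```

An answer of "composite" is always correct; an answer of "prime" is wrong with probability at most $2^{-t}$, as shown in the proof above.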
8.2.3 The Factoring Assumption
Let GenModulus be a polynomial-time algorithm that, on input $1^n$, outputs (N, p, q) where N = pq, and p and q are n-bit primes except with probability negligible in n. (The natural way to do this is to generate two uniform n-bit primes, as discussed previously, and then multiply them to obtain N.) Then consider the following experiment for a given algorithm $\mathcal{A}$ and parameter n:
The factoring experiment $\mathsf{Factor}_{\mathcal{A},\mathsf{GenModulus}}(n)$:
1. Run GenModulus($1^n$) to obtain (N, p, q).
2. $\mathcal{A}$ is given N, and outputs p′, q′ > 1.
3. The output of the experiment is defined to be 1 if p′ · q′ = N, and 0 otherwise.
Note that if the output of the experiment is 1 then {p′, q′} = {p, q}, unless p or q are composite (which happens with only negligible probability).
We now formally define the factoring assumption:
DEFINITION 8.45 Factoring is hard relative to GenModulus if for all probabilistic polynomial-time algorithms A there exists a negligible function negl such that
$$\Pr[\mathsf{Factor}_{\mathcal{A},\mathsf{GenModulus}}(n) = 1] \le \mathsf{negl}(n).$$
The factoring assumption is the assumption that there exists a GenModulus relative to which factoring is hard.
8.2.4 The RSA Assumption
The factoring problem has been studied for hundreds of years without an efficient algorithm being found. Although the factoring assumption does give a one-way function (see Section 8.4.1), it unfortunately does not directly yield practical cryptosystems. (In Section 13.5.2, however, we show how to construct efficient cryptosystems based on a problem whose hardness is equivalent to that of factoring.) This has motivated a search for other problems whose difficulty is related to the hardness of factoring. The best known of these is a problem introduced in 1978 by Rivest, Shamir, and Adleman and now called the RSA problem in their honor.
Given a modulus N and an integer e > 2 relatively prime to φ(N), Corollary 8.22 shows that exponentiation to the eth power modulo N is a permutation. We can therefore define $[y^{1/e} \bmod N]$ (for any $y \in \mathbb{Z}_N^*$) as the unique element of $\mathbb{Z}_N^*$ which yields y when raised to the eth power modulo N; that is, $x = y^{1/e} \bmod N$ if and only if $x^e = y \bmod N$. The RSA problem, informally, is to compute $[y^{1/e} \bmod N]$ for a modulus N of unknown factorization.
Formally, let GenRSA be a probabilistic polynomial-time algorithm that, on input $1^n$, outputs a modulus N that is the product of two n-bit primes, as well as integers e, d > 0 with gcd(e, φ(N)) = 1 and ed = 1 mod φ(N). (Such a d exists since e is invertible modulo φ(N). The purpose of d will become clear later.) The algorithm may fail with probability negligible in n. Consider the following experiment for a given algorithm $\mathcal{A}$ and parameter n:
The RSA experiment $\mathsf{RSA\text{-}inv}_{\mathcal{A},\mathsf{GenRSA}}(n)$:
1. Run GenRSA($1^n$) to obtain (N, e, d).
2. Choose a uniform $y \in \mathbb{Z}_N^*$.
3. $\mathcal{A}$ is given N, e, y, and outputs $x \in \mathbb{Z}_N^*$.
4. The output of the experiment is defined to be 1 if $x^e = y \bmod N$, and 0 otherwise.
DEFINITION 8.46 The RSA problem is hard relative to GenRSA if for all probabilistic polynomial-time algorithms $\mathcal{A}$ there exists a negligible function negl such that $\Pr[\mathsf{RSA\text{-}inv}_{\mathcal{A},\mathsf{GenRSA}}(n) = 1] \le \mathsf{negl}(n)$.
The RSA assumption is that there exists a GenRSA algorithm relative to which the RSA problem is hard. A suitable GenRSA algorithm can be constructed from any algorithm GenModulus that generates a composite modulus along with its factorization. A high-level outline is provided as Algorithm 8.47, where the only thing left unspecified is how exactly e is chosen. In fact, the RSA problem is believed to be hard for any e that is relatively prime to φ(N). We discuss some typical choices of e below.
ALGORITHM 8.47
GenRSA – high-level outline
Input: Security parameter $1^n$
Output: N, e, d as described in the text
(N, p, q) ← GenModulus($1^n$)
φ(N) := (p − 1)(q − 1)
choose e > 1 such that gcd(e, φ(N)) = 1
compute $d := [e^{-1} \bmod \phi(N)]$
return N, e, d
Example 8.48
Say GenModulus outputs (N, p, q) = (143, 11, 13). Then φ(N) = 120. Next, we need to choose an e that is relatively prime to φ(N); say we take e = 7. The next step is to compute d such that $d = [e^{-1} \bmod \phi(N)]$. This can be done as shown in Appendix B.2.2 to obtain d = 103. (One can check that 7 · 103 = 721 = 1 mod 120.) Our GenRSA algorithm thus outputs (143, 7, 103).
As an example of the RSA problem relative to these parameters, take y = 64 and so the problem is to compute the 7th root of 64 modulo 143 without knowledge of d or the factorization of N. ♦
Computing eth roots modulo N becomes easy if d, φ(N), or the factorization of N is known. (As we show in the next section, any of these can be used to efficiently compute the others.) This follows from Corollary 8.22, which shows that $[y^d \bmod N]$ is the eth root of y modulo N. This asymmetry (namely, that the RSA problem appears to be hard when d or the factorization of N is unknown, but becomes easy when d is known) serves as the basis for applications of the RSA problem to public-key cryptography.
Example 8.49
Continuing the previous example, we can compute the 7th root of 64 modulo 143 using the value d = 103; the answer is $25 = 64^d = 64^{103} \bmod 143$. We can verify that this is the correct solution since $25^e = 25^7 = 64 \bmod 143$. ♦
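The numbers in Examples 8.48 and 8.49 are small enough to check directly. The sketch below follows the steps of Algorithm 8.47 after GenModulus has already returned (N, p, q); the helper name `gen_rsa_from_primes` is hypothetical, and the modular inverse is computed with Python's built-in three-argument `pow` (negative exponents require Python 3.8 or later) rather than the algorithm of Appendix B.2.2.

```python
from math import gcd

def gen_rsa_from_primes(p, q, e):
    """Hypothetical helper: steps of Algorithm 8.47 once p, q, and e are fixed."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(N)")
    d = pow(e, -1, phi)          # d = e^{-1} mod phi(N)
    return n, e, d

n, e, d = gen_rsa_from_primes(11, 13, 7)
print(n, e, d)                   # 143 7 103

y = 64
x = pow(y, d, n)                 # e-th root of y, computed using d
print(x, pow(x, e, n))           # 25 64
```

Without d (or the factors 11 and 13), the only obvious way to find the 7th root of 64 is to search over candidates x, which is feasible here only because N is tiny.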
On the choice of e. There does not appear to be any difference in the hardness of the RSA problem for different exponents e and, as such, different methods have been suggested for selecting it. One popular choice is to set e = 3, since then computing eth powers modulo N requires only two multiplications (see Appendix B.2.3). If e is to be set equal to 3, then p and q must be chosen with $p, q \ne 1 \bmod 3$ so that gcd(e, φ(N)) = 1. For similar reasons, another popular choice is $e = 2^{16} + 1 = 65537$, a prime number with low Hamming weight (in Appendix B.2.3, we explain why such exponents are preferable). As compared to choosing e = 3, this makes exponentiation slightly more expensive but reduces the constraints on p and q, and avoids some “low-exponent attacks” (described at the end of Section 11.5.1) that can result from poorly implemented cryptosystems based on the RSA problem.
Note that choosing d small (that is, changing GenRSA to choose small d and then compute $e := [d^{-1} \bmod \phi(N)]$) is a bad idea. If d lies in a very small range then a brute-force search for d can be carried out (and, as noted, once d is known the RSA problem can be solved easily). Even if d is chosen so that $d \approx N^{1/4}$, and so brute-force attacks are ruled out, there are known algorithms that can be used to recover d from N and e in this case.
8.2.5 *Relating the RSA and Factoring Assumptions
Say GenRSA is constructed as in Algorithm 8.47. If N can be factored, then we can compute φ(N) and use this to compute $d := [e^{-1} \bmod \phi(N)]$ for any given e (using Algorithm B.11). So for the RSA problem to be hard relative to GenRSA, the factoring problem must be hard relative to GenModulus. Put differently, the RSA problem cannot be more difficult than factoring; hardness of factoring (relative to GenModulus) can only potentially be a weaker assumption than hardness of the RSA problem (relative to GenRSA).
What about the other direction? That is, is hardness of the RSA problem implied by hardness of factoring? This remains an open question. The best we can show is that computing d from N and e is as hard as factoring.
THEOREM 8.50 There is a probabilistic polynomial-time algorithm that, given as input a composite integer N and integers e, d with ed = 1 mod φ(N ), outputs a factor of N except with probability negligible in ∥N∥.
PROOF The theorem holds for any N, but for simplicity—and because it is the case most relevant to cryptography—we focus here on the case where N is a product of two distinct (odd) primes. We rely on Proposition 8.36 and Lemma 8.37 as well as the following facts (which follow from more general results proved in Sections 13.4.2 and 13.5.2):
• If N is a product of two distinct, odd primes, then 1 has exactly four square roots modulo N. Two of these are the “trivial” square roots ±1, and two of these are “nontrivial” square roots.
• Any nontrivial square root of 1 can be used to (efficiently) compute a factor of N. This is by virtue of the fact that $y^2 = 1 \bmod N$ implies
$$0 = y^2 - 1 = (y-1)(y+1) \bmod N,$$
and so $N \mid (y-1)(y+1)$. However, $N \nmid (y-1)$ and $N \nmid (y+1)$ because $y \ne \pm 1 \bmod N$. So it must be the case that gcd(y − 1, N) is equal to one of the prime factors of N.
Let k = ed − 1 and note that φ(N) | k. Using Corollary 8.21, we have $x^k = 1 \bmod N$ for all $x \in \mathbb{Z}_N^*$. Let $k = 2^r u$ for u an odd integer; note that r ≥ 1 since φ(N) (and hence k) is even. Our strategy for factoring N will be to repeatedly choose a uniform $x \in \mathbb{Z}_N^*$ and compute the sequence
$$x^u, x^{2u}, \ldots, x^{2^r u},$$
all modulo N. Each term in this sequence is the square of the preceding term and, as we have just noted, the final term in the sequence is 1. Take the largest i (if any) for which $y \stackrel{\text{def}}{=} [x^{2^i u} \bmod N] \ne 1$. By our choice of i, we have $y^2 = 1 \bmod N$. If $y \ne -1 \bmod N$ we have found a nontrivial square root of 1, and can then factor N as discussed above.
All the above steps can be done in polynomial time, and so the only question is to determine the probability, over choice of x, that y is a nontrivial square root of 1. Let $i \in \{0, \ldots, r-1\}$ be the largest value of i for which there exists an $x \in \mathbb{Z}_N^*$ such that $x^{2^i u} \ne 1 \bmod N$. (Since u is odd, $(-1)^u = -1 \ne 1 \bmod N$, and so the definition is not vacuous.) Then for all $x \in \mathbb{Z}_N^*$, we have $x^{2^{i+1} u} = 1 \bmod N$ and so $[x^{2^i u} \bmod N]$ is a square root of 1. Define
$$\mathsf{Bad} \stackrel{\text{def}}{=} \{x \mid x^{2^i u} = \pm 1 \bmod N\}$$
and observe that if our algorithm chooses x ∉ Bad then it finds a nontrivial square root of 1. We show that Bad is a strict subgroup of $\mathbb{Z}_N^*$; by Lemma 8.37, this implies that $|\mathsf{Bad}| \le |\mathbb{Z}_N^*|/2$. This means that x ∉ Bad (and the algorithm finds a nontrivial square root of 1) with probability at least 1/2 in each iteration. Using sufficiently many iterations gives the result of the theorem.
We now prove that Bad is a strict subgroup of $\mathbb{Z}_N^*$. First note that Bad is not empty, since 1 ∈ Bad. Furthermore, if x, x′ ∈ Bad then
$$(xx')^{2^i u} = x^{2^i u} (x')^{2^i u} = (\pm 1) \cdot (\pm 1) = \pm 1 \bmod N,$$
and so xx′ ∈ Bad and Bad is a subgroup. To see that Bad is a strict subgroup, let $x \in \mathbb{Z}_N^*$ be such that $x^{2^i u} \ne 1 \bmod N$ (such an x must exist by our definition of i). If $x^{2^i u} \ne -1 \bmod N$, then x ∉ Bad and we are done. Otherwise, let N = pq with p, q prime, and let $x \leftrightarrow (x_p, x_q)$ be the Chinese remaindering representation of x. Since $x^{2^i u} = -1 \bmod N$, we know that
$$(x_p, x_q)^{2^i u} = (x_p^{2^i u}, x_q^{2^i u}) = (-1, -1) \leftrightarrow -1.$$
But then $(x_p, 1)$ (or rather, the element corresponding to it) is not in Bad since
$$(x_p, 1)^{2^i u} = ([x_p^{2^i u} \bmod p], 1) = (-1, 1) \not\leftrightarrow \pm 1.$$
This completes the proof.
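The strategy in the proof of Theorem 8.50 can be sketched in a few lines. This is an illustrative rendering under stated assumptions, not the book's algorithm verbatim: x is drawn from {2, …, N − 2} rather than from $\mathbb{Z}_N^*$ (a non-unit immediately reveals a factor through a gcd), the iteration bound `max_tries` is an arbitrary choice, and N is assumed to be a product of two distinct odd primes.

```python
import random
from math import gcd

def factor_from_exponents(n, e, d, max_tries=128):
    """Try to find a nontrivial factor of n given e, d with e*d = 1 mod phi(n)."""
    k = e * d - 1
    r, u = 0, k
    while u % 2 == 0:                    # k = 2^r * u with u odd
        r, u = r + 1, u // 2
    for _ in range(max_tries):
        x = random.randrange(2, n - 1)
        g = gcd(x, n)
        if g != 1:
            return g                     # x is not a unit: it shares a factor with n
        y = pow(x, u, n)
        for _ in range(r):
            if y == 1:
                break                    # every later term is 1; this x is useless
            z = pow(y, 2, n)
            if z == 1 and y != n - 1:
                return gcd(y - 1, n)     # y is a nontrivial square root of 1
            y = z
    return None                          # all tries failed (probability about 2^-max_tries)

print(factor_from_exponents(143, 7, 103))   # prints 11 or 13
```

Each iteration succeeds with probability at least 1/2 by the argument above, so the expected number of iterations is at most two.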
Assuming factoring is hard, the above result rules out the possibility of efficiently solving the RSA problem by first computing d from N and e. However, it does not rule out the possibility that there might be some completely different way of attacking the RSA problem that does not involve (or imply) factoring N. Thus, based on our current knowledge, the RSA assumption is stronger than the factoring assumption; that is, it may be that the RSA problem can be solved in polynomial time even though factoring cannot. Nevertheless, when GenRSA is constructed based on GenModulus as in Algorithm 8.47, the prevailing conjecture is that the RSA problem is hard relative to GenRSA whenever factoring is hard relative to GenModulus.
8.3 Cryptographic Assumptions in Cyclic Groups
In this section we introduce a class of cryptographic hardness assumptions in cyclic groups. We begin with a general discussion of cyclic groups, followed by abstract definitions of the relevant assumptions. We then look at two concrete and widely used examples of cyclic groups in which these assumptions are believed to hold.
8.3.1 Cyclic Groups and Generators
Let G be a finite group of order m. For arbitrary g ∈ G, consider the set
$$\langle g \rangle \stackrel{\text{def}}{=} \{g^0, g^1, \ldots\}.$$
(We warn the reader that if G is an infinite group, ⟨g⟩ is defined differently.) By Theorem 8.14, we have $g^m = 1$. Let i ≤ m be the smallest positive integer for which $g^i = 1$. Then the above sequence repeats after i terms (i.e., $g^i = g^0$, $g^{i+1} = g^1$, etc.), and so
$$\langle g \rangle = \{g^0, \ldots, g^{i-1}\}.$$
We see that ⟨g⟩ contains at most i elements. In fact, it contains exactly i elements since if $g^j = g^k$ with 0 ≤ j
gives an important additional class of cyclic groups; a proof is outside the scope of this book, but can be found in any standard abstract algebra text.
THEOREM 8.56 If p is prime then Z∗p is a cyclic group of order p − 1.
For p > 3 prime, Z∗p does not have prime order and so the above does not follow from the preceding corollary.
Example 8.57
Consider the (additive) group Z15. As we have noted, Z15 is cyclic and the element 1 is a generator since 15 · 1 = 0 mod 15 and i · 1 = i ̸= 0 mod 15 for any 0 < i < 15 (recall that in this group the identity is 0).
Z15 has other generators. For example, ⟨2⟩ = {0,2,4, ..., 14,1,3, ..., 13} and so 2 is also a generator.
Not every element generates Z15. For example, the element 3 has order 5 since 5 · 3 = 0 mod 15, and so 3 does not generate Z15. The subgroup ⟨3⟩ consists of the 5 elements {0, 3, 6, 9, 12}, and this is indeed a subgroup under addition modulo 15. The element 10 has order 3 since 3 · 10 = 0 mod 15, and the subgroup ⟨10⟩ consists of the 3 elements {0, 5, 10}. The orders of the subgroups (i.e., 5 and 3) divide |Z15| = 15 as required by Proposition 8.54. ♦
Example 8.58
Consider the (multiplicative) group Z∗15 of order (5 − 1)(3 − 1) = 8. We have ⟨2⟩ = {1, 2, 4, 8}, and so the order of 2 is 4. As required by Proposition 8.54, 4 divides 8. ♦
Example 8.59
Consider the (additive) group Zp of prime order p. We know this group is cyclic, but Corollary 8.55 tells us more: namely, every element except 0 is a generator. Indeed, for any h ∈ {1,...,p − 1} and integer i > 0 we have ih = 0 mod p if and only if p | ih. But then Proposition 8.3 says that either p | h or p | i. The former cannot occur (since h < p), and the smallest positive integer for which the latter can occur is i = p. We have thus shown that every nonzero element h has order p (and so generates Zp), in accordance with Corollary 8.55. ♦
Example 8.60
Consider the (multiplicative) group Z∗7, which is cyclic by Theorem 8.56. We have ⟨2⟩ = {1, 2, 4}, and so 2 is not a generator. However,
⟨3⟩ = {1,3,2,6,4,5} = Z∗7,
and so 3 is a generator of Z∗7. ♦
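The subgroups listed in Examples 8.57, 8.58, and 8.60 can be reproduced by repeatedly applying the group operation until the identity recurs. The helper below is an illustrative sketch; its name and interface (an operation passed in as a function, plus the identity element) are assumptions made for this example.

```python
def generated_subgroup(g, op, identity):
    """Return <g> = [g^0, g^1, ...] in a finite group, in order of generation."""
    elems, x = [identity], g
    while x != identity:
        elems.append(x)
        x = op(x, g)
    return elems

add15 = lambda a, b: (a + b) % 15    # the additive group Z_15
mul15 = lambda a, b: (a * b) % 15    # the multiplicative group Z_15^*
mul7 = lambda a, b: (a * b) % 7      # the multiplicative group Z_7^*

print(generated_subgroup(3, add15, 0))    # [0, 3, 6, 9, 12]
print(generated_subgroup(10, add15, 0))   # [0, 10, 5]
print(generated_subgroup(2, mul15, 1))    # [1, 2, 4, 8]
print(generated_subgroup(2, mul7, 1))     # [1, 2, 4]
print(generated_subgroup(3, mul7, 1))     # [1, 3, 2, 6, 4, 5]
```

The length of each list is the order of the element, and in every case it divides the order of the ambient group (15, 8, and 6 respectively).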
The following example relies on the material of Section 8.1.5.
Example 8.61
Let G be a cyclic group of order n, and let g be a generator of G. Then the mapping $f : \mathbb{Z}_n \to G$ given by $f(a) = g^a$ is an isomorphism between $\mathbb{Z}_n$ and G. Indeed, for $a, a' \in \mathbb{Z}_n$ we have
$$f(a + a') = g^{[a + a' \bmod n]} = g^{a + a'} = g^a \cdot g^{a'} = f(a) \cdot f(a').$$
Bijectivity of f can be proved using the fact that n is the order of g. ♦
The previous example shows that all cyclic groups of the same order are isomorphic and thus the same from an algebraic point of view. We stress that this is not true in a computational sense, and in particular an isomorphism $f^{-1} : G \to \mathbb{Z}_n$ (which we know must exist) need not be efficiently computable. This point should become clearer from the discussion in the sections below as well as Chapter 9.
8.3.2 The Discrete-Logarithm/Diffie–Hellman Assumptions
We now introduce several computational problems that can be defined for any class of cyclic groups. We will keep the discussion in this section abstract, and consider specific examples of groups in which these problems are believed to be hard in Sections 8.3.3 and 8.3.4.
We let $\mathcal{G}$ denote a generic, polynomial-time, group-generation algorithm. This is an algorithm that, on input $1^n$, outputs a description of a cyclic group G, its order q (with ∥q∥ = n), and a generator g ∈ G. The description of a cyclic group specifies how elements of the group are represented as bit-strings; we assume that each group element is represented by a unique bit-string. We require that there are efficient algorithms (namely, algorithms running in time polynomial in n) for computing the group operation in G, as well as for testing whether a given bit-string represents an element of G. Efficient computation of the group operation implies efficient algorithms for exponentiation in G (see Appendix B.2.3) and for sampling a uniform element h ∈ G (simply choose uniform $x \in \mathbb{Z}_q$ and set $h := g^x$).
As discussed at the end of the previous section, although all cyclic groups of a given order are isomorphic, the representation of the group determines the computational complexity of mathematical operations in that group.
If G is a cyclic group of order q with generator g, then $\{g^0, g^1, \ldots, g^{q-1}\}$ is all of G. Equivalently, for every h ∈ G there is a unique $x \in \mathbb{Z}_q$ such that $g^x = h$. When the underlying group G is understood from the context, we call this x the discrete logarithm of h with respect to g and write $x = \log_g h$. (Logarithms in this case are called “discrete” since they take values in a finite range, as opposed to “standard” logarithms from calculus whose values range over the infinite set of real numbers.) Note that if $g^{x'} = h$ for some arbitrary integer x′, then $[x' \bmod q] = \log_g h$.
Discrete logarithms obey many of the same rules as “standard” logarithms. For example, $\log_g 1 = 0$ (where 1 is the identity of G); for any integer r, we have $\log_g h^r = [r \cdot \log_g h \bmod q]$; and $\log_g(h_1 h_2) = [(\log_g h_1 + \log_g h_2) \bmod q]$.
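These rules are easy to check numerically in a small group. The sketch below uses Z∗7 with generator 3 (so q = 6) and a brute-force `dlog` helper; the helper is purely illustrative, and its running time is proportional to q, i.e., exponential in ∥q∥.

```python
def dlog(g, h, p, q):
    """Brute-force log_g(h) in Z_p^*, where g has order q. Toy sizes only."""
    acc = 1
    for x in range(q):
        if acc == h:
            return x
        acc = (acc * g) % p
    raise ValueError("h is not in the subgroup generated by g")

p, g, q = 7, 3, 6
h1, h2, r = 2, 6, 4

print(dlog(g, 1, p, q))                                    # 0
print(dlog(g, h1, p, q), dlog(g, h2, p, q))                # 2 3
print(dlog(g, (h1 * h2) % p, p, q))                        # 5 = (2 + 3) mod 6
print(dlog(g, pow(h2, r, p), p, q))                        # 0 = (4 * 3) mod 6
```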
The discrete-logarithm problem in a cyclic group G with generator g is to compute $\log_g h$ for a uniform element h ∈ G. Consider the following experiment for a group-generation algorithm $\mathcal{G}$, algorithm $\mathcal{A}$, and parameter n:
The discrete-logarithm experiment $\mathsf{DLog}_{\mathcal{A},\mathcal{G}}(n)$:
1. Run $\mathcal{G}(1^n)$ to obtain (G, q, g), where G is a cyclic group of order q (with ∥q∥ = n), and g is a generator of G.
2. Choose a uniform h ∈ G.
3. $\mathcal{A}$ is given G, q, g, h, and outputs $x \in \mathbb{Z}_q$.
4. The output of the experiment is defined to be 1 if $g^x = h$, and 0 otherwise.
DEFINITION 8.62 We say that the discrete-logarithm problem is hard relative to $\mathcal{G}$ if for all probabilistic polynomial-time algorithms $\mathcal{A}$ there exists a negligible function negl such that $\Pr[\mathsf{DLog}_{\mathcal{A},\mathcal{G}}(n) = 1] \le \mathsf{negl}(n)$.
The discrete-logarithm assumption is simply the assumption that there exists a $\mathcal{G}$ for which the discrete-logarithm problem is hard. The following two sections discuss some candidate group-generation algorithms $\mathcal{G}$ for which this is believed to be the case.
The Diffie–Hellman problems. The so-called Diffie–Hellman problems are related, but not known to be equivalent, to the problem of computing discrete logarithms. There are two important variants: the computational Diffie–Hellman (CDH) problem and the decisional Diffie–Hellman (DDH) problem.
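Fix a cyclic group G and a generator g ∈ G. Given elements $h_1, h_2 \in G$, define $\mathsf{DH}_g(h_1, h_2) \stackrel{\text{def}}{=} g^{\log_g h_1 \cdot \log_g h_2}$. That is, if $h_1 = g^{x_1}$ and $h_2 = g^{x_2}$ then
$$\mathsf{DH}_g(h_1, h_2) = g^{x_1 \cdot x_2} = h_1^{x_2} = h_2^{x_1}.$$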
The CDH problem is to compute DHg(h1, h2) for uniform h1 and h2. Hardness of this problem can be formalized by the natural experiment; we leave the details as an exercise.
If the discrete-logarithm problem relative to some $\mathcal{G}$ is easy, then the CDH problem is, too: given $h_1$ and $h_2$, first compute $x_1 := \log_g h_1$ and then output the answer $h_2^{x_1}$. In contrast, it is not clear whether hardness of the discrete-logarithm problem implies that the CDH problem is hard as well.
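The reduction from CDH to discrete logarithms described above is a two-line computation once a discrete-logarithm solver is available. In the sketch below the "solver" is brute force over the toy group Z∗7 with generator 3; the names `dlog` and `solve_cdh` are illustrative, and nothing here is efficient for cryptographic sizes.

```python
def dlog(g, h, p, q):
    """Brute-force discrete logarithm in Z_p^* for a generator g of order q."""
    acc = 1
    for x in range(q):
        if acc == h:
            return x
        acc = (acc * g) % p
    raise ValueError("h is not in the subgroup generated by g")

def solve_cdh(g, h1, h2, p, q):
    """Compute DH_g(h1, h2) by first recovering x1 = log_g(h1)."""
    x1 = dlog(g, h1, p, q)
    return pow(h2, x1, p)        # h2^{x1} = g^{x1 * x2}

p, g, q = 7, 3, 6
x1, x2 = 2, 5
h1, h2 = pow(g, x1, p), pow(g, x2, p)

print(solve_cdh(g, h1, h2, p, q))     # 4
print(pow(g, x1 * x2, p))             # 4, the same value computed from the exponents
```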
The DDH problem, roughly speaking, is to distinguish DHg(h1,h2) from a uniform group element when h1,h2 are uniform. That is, given uniform h1,h2 and a third group element h′, the problem is to decide whether h′ =
DHg(h1,h2) or whether h′ was chosen uniformly from G. Formally:
DEFINITION 8.63 We say that the DDH problem is hard relative to $\mathcal{G}$ if for all probabilistic polynomial-time algorithms $\mathcal{A}$ there is a negligible function negl such that
$$\Big| \Pr[\mathcal{A}(G, q, g, g^x, g^y, g^z) = 1] - \Pr[\mathcal{A}(G, q, g, g^x, g^y, g^{xy}) = 1] \Big| \le \mathsf{negl}(n),$$
where in each case the probabilities are taken over the experiment in which $\mathcal{G}(1^n)$ outputs (G, q, g), and then uniform $x, y, z \in \mathbb{Z}_q$ are chosen. (Note that when z is uniform in $\mathbb{Z}_q$, then $g^z$ is uniformly distributed in G.)
We have already seen that if the discrete-logarithm problem is easy relative to some G, then the CDH problem is too. Similarly, if the CDH problem is easy relative to G then so is the DDH problem; you are asked to show this in Exercise 8.15. The converse, however, does not appear to be true, and there are examples of groups in which the discrete-logarithm and CDH problems are believed to be hard even though the DDH problem is easy; see Exercise 13.15.
Using Prime-Order Groups
There are various (classes of) cyclic groups in which the discrete-logarithm and Diffie–Hellman problems are believed to be hard. There is a preference, however, for cyclic groups of prime order, for reasons we now explain.
One reason for preferring groups of prime order is because, in a certain sense, the discrete-logarithm problem is hardest in such groups. This is a consequence of the Pohlig–Hellman algorithm, described in Chapter 9, which shows that the discrete-logarithm problem in a group of order q becomes easier if q has (small) prime factors. This does not necessarily mean that the discrete-logarithm problem is easy in groups of nonprime order; it merely means that the problem becomes easier.
Related to the above is the fact that the DDH problem is easy if the group order q has small prime factors. We refer to Exercise 13.15 for one example of this phenomenon.
A second motivation for using prime-order groups is because finding a generator in such groups is trivial. This follows from Corollary 8.55, which says that every element of a prime-order group (except the identity) is a generator. In contrast, efficiently finding a generator of an arbitrary cyclic group requires the factorization of the group order to be known (see Appendix B.3).
Proofs of security for some cryptographic constructions require computing multiplicative inverses of certain exponents (we will see an example in Section 8.4.2). When the group order is prime, any nonzero exponent will be invertible, making this computation possible.
A final reason for working with prime-order groups applies in situations when the decisional Diffie–Hellman problem should be hard. Fixing a group G with generator g, the DDH problem boils down to distinguishing between
tuples of the form (h1, h2, DHg(h1, h2)) for uniform h1, h2, and tuples of the form (h1, h2, y), for uniform h1, h2, y. A necessary condition for the DDH problem to be hard is that DHg(h1,h2) by itself should be indistinguishable from a uniform group element. One can show that DHg(h1,h2) is “close” to uniform (in a sense we do not define here) when the group order q is prime, something that is not true otherwise.
8.3.3 Working in (Subgroups of) Z∗p
Groups of the form Z∗p, for p prime, give one class of cyclic groups in which the discrete-logarithm problem is believed to be hard. Concretely, let $\mathcal{G}$ be an algorithm that, on input $1^n$, chooses a uniform n-bit prime p, and outputs p and the group order q = p − 1 along with a generator g of Z∗p. (Section 8.2.1 discusses efficient algorithms for choosing a random prime, and Appendix B.3 shows how to efficiently find a generator of Z∗p given the factorization of p − 1.) The representation of Z∗p here is the trivial one where elements are represented as integers between 1 and p − 1. It is conjectured that the discrete-logarithm problem is hard relative to $\mathcal{G}$ of this sort.
The cyclic group Z∗p (for p > 3 prime), however, does not have prime order. (The preference for groups of prime order was discussed in the previous section.) More problematic, the decisional Diffie–Hellman problem is, in general, not hard in such groups (see Exercise 13.15), and they are therefore unacceptable for the cryptographic applications based on the DDH assumption that we will explore in later chapters.
These issues can be addressed by using a prime-order subgroup of Z∗p. Let p = rq + 1 where both p and q are prime. We prove that Z∗p has a subgroup G of order q given by the set of rth residues modulo p, i.e., the set of elements $\{[h^r \bmod p] \mid h \in \mathbb{Z}_p^*\}$ that are equal to the rth power of some $h \in \mathbb{Z}_p^*$.
THEOREM 8.64 Let p = rq + 1 with p, q prime. Then
$$G \stackrel{\text{def}}{=} \{[h^r \bmod p] \mid h \in \mathbb{Z}_p^*\}$$
is a subgroup of Z∗p of order q.
PROOF The proof that G is a subgroup is straightforward and is omitted. We prove that G has order q by showing that the function $f_r : \mathbb{Z}_p^* \to G$ defined by $f_r(g) = [g^r \bmod p]$ is an r-to-1 function. (Since $|\mathbb{Z}_p^*| = p - 1$, this shows that |G| = (p − 1)/r = q.) To see this, let g be a generator of $\mathbb{Z}_p^*$ so that $g^0, \ldots, g^{p-2}$ are all the elements of $\mathbb{Z}_p^*$. By Proposition 8.53 we have $g^{ir} = g^{jr}$ if and only if ir = jr mod (p − 1) or, equivalently, p − 1 | (i − j)r. Since p − 1 = rq, this is equivalent to q | (i − j). For any fixed $j \in \{0, \ldots, p-2\}$, this means that the set of values $i \in \{0, \ldots, p-2\}$ for which $g^{ir} = g^{jr}$ is exactly the set of r distinct values
$$\{j,\ j + q,\ j + 2q,\ \ldots,\ j + (r-1)q\},$$
all reduced modulo p − 1. (Note that j + rq = j mod (p − 1).) This proves that $f_r$ is an r-to-1 function.
Besides showing existence of an appropriate subgroup, the theorem also implies that it is easy to generate a uniform element of G and to test whether a given element of Z∗p lies in G. Specifically, choosing a uniform element of G can be done by choosing uniform $h \in \mathbb{Z}_p^*$ and computing $[h^r \bmod p]$. Since G has prime order, every element in G except the identity is a generator of G. Finally, it is possible to determine whether any $h \in \mathbb{Z}_p^*$ is also in G by checking whether $h^q \stackrel{?}{=} 1 \bmod p$. To see that this works, let $h = g^i$ for g a generator of $\mathbb{Z}_p^*$ and $i \in \{0, \ldots, p-2\}$. Then
$$h^q = 1 \bmod p \iff g^{iq} = 1 \bmod p \iff iq = 0 \bmod (p-1) \iff rq \mid iq \iff r \mid i,$$
using Proposition 8.53. So $h = g^i = g^{cr} = (g^c)^r$ for some c, and h ∈ G.
Algorithm 8.65 encapsulates the above discussion. In the algorithm, we let n denote the length of q, the order of the group, and let l denote the length of p, the modulus being used. The relationship between these parameters is discussed below.
Choosing l. Let n = ∥q∥ and l = ∥p∥. Two types of algorithms are known for computing discrete logarithms in order-q subgroups of Z∗p (see Section 9.2): those that run in time $O(\sqrt{q}) = O(2^{n/2})$ and those that run in time $2^{O((\log p)^{1/3} \cdot (\log\log p)^{2/3})} = 2^{O(l^{1/3} \cdot (\log l)^{2/3})}$. Fixing some desired security parameter n, the parameter l should be chosen so as to balance these times. (If l is any smaller, security is reduced; if l is any larger, operations in G will be less efficient without any gain in security.) See also Section 9.3.
ALGORITHM 8.65
A group-generation algorithm $\mathcal{G}$
Input: Security parameter $1^n$, parameter l = l(n)
Output: Cyclic group G, its (prime) order q, and a generator g
generate a uniform n-bit prime q
generate an l-bit prime p such that q | (p − 1)
    // we omit the details of how this is done
choose a uniform $h \in \mathbb{Z}_p^*$ with h ≠ 1
set $g := [h^{(p-1)/q} \bmod p]$
return p, q, g // G is the order-q subgroup of Z∗p
In practice, standardized values (e.g., recommended by NIST) for p, q, and a generator g are used, and there is no need to generate parameters of one’s own.
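For experimentation with toy parameters, Algorithm 8.65 can be sketched as follows. Several choices here are assumptions made for illustration and are not specified in the text: primality is tested by trial division (adequate only for very small sizes), the l-bit prime p is found by trying p = rq + 1 for random r in the appropriate range, and the sketch resamples h whenever $h^{(p-1)/q} = 1$ so that the returned g is guaranteed to have order exactly q.

```python
import random

def is_prime(n):
    """Trial division; suitable only for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def random_prime(bits):
    """Uniform prime with exactly `bits` bits."""
    while True:
        c = random.randrange(1 << (bits - 1), 1 << bits)
        if is_prime(c):
            return c

def group_gen(n, l):
    """Toy version of Algorithm 8.65: return (p, q, g) with ||q|| = n and ||p|| = l."""
    q = random_prime(n)
    a = (1 << (l - 1)) - 1
    r_min = (a + q - 1) // q             # smallest r with r*q + 1 >= 2^(l-1)
    r_max = ((1 << l) - 2) // q          # largest r with r*q + 1 <= 2^l - 1
    while True:                          # assumed search strategy for p
        r = random.randrange(r_min, r_max + 1)
        p = r * q + 1
        if is_prime(p):
            break
    while True:
        h = random.randrange(2, p)       # uniform h in Z_p^* with h != 1
        g = pow(h, (p - 1) // q, p)
        if g != 1:                       # resample in the unlikely event g = 1
            return p, q, g

p, q, g = group_gen(8, 16)
print(p, q, g, pow(g, q, p))             # the last value is always 1
```

Since q is prime and g ≠ 1 satisfies $g^q = 1 \bmod p$, the element g has order exactly q and therefore generates the order-q subgroup.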
Example 8.66
Consider the group Z∗11 of order 10. Let us try to find a generator of this group. Consider trying 2:
Powers of 2:   2^0  2^1  2^2  2^3  2^4  2^5  2^6  2^7  2^8  2^9
Values:        1    2    4    8    5    10   9    7    3    6
(All values above are computed modulo 11.) We got lucky the first time—the number 2 is a generator! Let’s try 3:
Powers of 3:   3^0  3^1  3^2  3^3  3^4  3^5  3^6  3^7  3^8  3^9
Values:        1    3    9    5    4    1    3    9    5    4
We see that 3 is not a generator of the entire group. Rather, it generates a subgroup G = {1, 3, 4, 5, 9} of order 5. Now, let’s see what happens with 10:
Powers of 10:  10^0  10^1  10^2  10^3  10^4  10^5  10^6  10^7  10^8  10^9
Values:        1     10    1     10    1     10    1     10    1     10
In this case we generate a subgroup of order 2.
For cryptographic purposes we want to work in a prime-order group. Since 11 = 2 · 5 + 1 we can apply Theorem 8.64 with q = 5 and r = 2, or with q = 2 and r = 5. In the first case, the theorem tells us that the squares of all the elements of Z∗11 should give a subgroup of order 5. This can be easily verified:
Element:  1  2  3  4  5  6  7  8  9  10
Square:   1  4  9  5  3  3  5  9  4  1
We have seen above that 3 is a generator of this subgroup. (In fact, since the subgroup has prime order, every element of the subgroup besides 1 is a generator of the subgroup.) Taking q = 2 and r = 5, Theorem 8.64 tells us that taking 5th powers will give a subgroup of order 2. One can check that this gives the order-2 subgroup generated by 10 that we encountered earlier. ♦
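The computations in Example 8.66 can be reproduced with a few lines of code. The sketch below computes the set of rth residues from Theorem 8.64 and applies the membership test $h^q = 1 \bmod p$ discussed after the theorem; the function names are illustrative.

```python
def rth_residues(p, r):
    """The set {h^r mod p : h in Z_p^*}, sorted."""
    return sorted({pow(h, r, p) for h in range(1, p)})

def in_subgroup(h, p, q):
    """Membership test for the order-q subgroup of Z_p^*: check h^q = 1 mod p."""
    return pow(h, q, p) == 1

p = 11
print(rth_residues(p, 2))                                   # [1, 3, 4, 5, 9]
print(rth_residues(p, 5))                                   # [1, 10]
print([h for h in range(1, p) if in_subgroup(h, p, 5)])     # [1, 3, 4, 5, 9]
```

The membership test agrees with the explicit list of squares, and each 2-to-1 or 5-to-1 map collapses the ten elements of Z∗11 onto a subgroup of order 5 or 2 respectively.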
Subgroups of finite fields. The discrete-logarithm problem is also believed to be hard in the multiplicative group of a finite field of large characteristic when the polynomial representation is used. (Appendix A.5 provides a brief background on finite fields.) Recall that for any prime p and integer k ≥ 1 there is a (unique) field F_{p^k} of order p^k; the multiplicative group F_{p^k}^* of that field is a cyclic group of order p^k − 1 (cf. Theorem A.21). If q is a large prime factor of p^k − 1, then Theorem 8.64 shows that F_{p^k}^* has a cyclic subgroup of order q. (The only property of Z_p^* we used in the proof of that theorem was that Z_p^* is cyclic.) This offers another choice of prime-order groups in which the discrete-logarithm and Diffie–Hellman problems are believed to be hard. Our treatment of Z_p^* in this section corresponds to the special case k = 1.
8.3.4 Elliptic Curves
The groups we have concentrated on thus far have all been based directly on modular arithmetic. Another class of groups important for cryptography is given by groups consisting of points on elliptic curves. Such groups are especially interesting from a cryptographic perspective since, in contrast to Z_p^* or the multiplicative group of a finite field, there are currently no known sub-exponential time algorithms for solving the discrete-logarithm problem in elliptic-curve groups when chosen appropriately. (See Section 9.3 for further discussion.) For cryptosystems based on the discrete-logarithm or Diffie–Hellman assumptions, this means that implementations based on elliptic-curve groups will be more efficient than implementations based on prime-order subgroups of Z_p^* at any given level of security. In this section we provide only a brief introduction to this area. A deeper understanding of the issues discussed here requires more sophisticated mathematics than we are willing to assume on the part of the reader. Those interested in further exploring this topic are advised to consult the references at the end of this chapter.
Let p ≥ 5 be a prime.³ Consider an equation E in the variables x and y of the form:
y^2 = x^3 + Ax + B mod p, (8.1)
where A, B ∈ Zp are constants with 4A^3 + 27B^2 ≠ 0 mod p. (This ensures that the equation x^3 + Ax + B = 0 mod p has no repeated roots.) Let E(Zp) denote the set of pairs (x, y) ∈ Zp × Zp satisfying the above equation along with a special value O whose purpose we will discuss shortly; that is,
E(Zp) ≝ {(x, y) | x, y ∈ Zp and y^2 = x^3 + Ax + B mod p} ∪ {O}.
The elements of E(Zp) are called the points on the elliptic curve E defined by Equation (8.1), and O is called the point at infinity.
³ The theory can be adapted to deal with the case of p = 2 or 3 but this introduces additional complications. Elliptic curves can, in fact, be defined over arbitrary finite or infinite fields (cf. Section A.5), and our discussion largely carries over to fields of characteristic not equal to 2 or 3. Binary curves (i.e., curves over fields of characteristic 2) are particularly important in cryptography, but we will not discuss them here.
Example 8.67
An element y ∈ Z_p^* is a quadratic residue modulo p if there is an x ∈ Z_p^* such that x^2 = y mod p; in that case, we say x is a square root of y. For p > 2 prime, half the elements in Z_p^* are quadratic residues, and every quadratic residue has exactly two square roots. (See Section 13.4.1.)
Let f(x) ≝ x^3 + 3x + 3 and consider the curve E : y^2 = f(x) mod 7. Each value of x for which f(x) is a quadratic residue modulo 7 yields two points on the curve; values of x for which f(x) is a quadratic non-residue are not on the curve; values of x for which f(x) = 0 mod 7 give one point on the curve. This allows us to determine the points on the curve:
• f(0) = 3 mod 7, a quadratic non-residue modulo 7.
• f(1) = 0 mod 7, so we obtain the point (1,0) ∈ E(Z7).
• f(2) = 3 mod 7, a quadratic non-residue modulo 7.
• f(3) = 4 mod 7, a quadratic residue modulo 7 with square roots 2 and 5. This yields the points (3, 2), (3, 5) ∈ E(Z7).
• f(4) = 2 mod 7, a quadratic residue modulo 7 with square roots 3 and 4. This yields the points (4, 3), (4, 4) ∈ E(Z7).
• f(5) = 3 mod 7, a quadratic non-residue modulo 7.
• f(6) = 6 mod 7, a quadratic non-residue modulo 7.
Including the point at infinity, there are 6 points in E(Z7). ♦
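For a curve this small, the list of points can also be found by brute force. The sketch below (the function name curve_points is our own) simply tests every pair (x, y) against the curve equation; this is feasible only for toy values of p.

```python
def curve_points(A: int, B: int, p: int) -> list[tuple[int, int]]:
    """All affine points (x, y) with y^2 = x^3 + Ax + B mod p, by brute force."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x**3 + A * x + B)) % p == 0]

points = curve_points(3, 3, 7)   # the curve y^2 = x^3 + 3x + 3 mod 7
print(points)                    # affine points only
print(len(points) + 1)           # add one for the point at infinity O
```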
A useful way to think about E(Zp) is to look at the graph of Equation (8.1) over the reals (i.e., the equation y^2 = x^3 + Ax + B without reduction modulo p) as in Figure 8.2. This figure does not correspond exactly to E(Zp) because, for example, E(Zp) has a finite number of points (Zp is, after all, a finite set) while there are an infinite number of solutions to the same equation if we allow x and y to range over all real numbers. Nevertheless, the picture provides useful intuition. In such a figure, one can think of the “point at infinity” O as sitting at the top of the y-axis and lying on every vertical line.
It can be shown that every line intersecting E(Zp) intersects it in exactly 3 points, where (1) a point P is counted twice if the line is tangent to the curve at P, and (2) the point at infinity is also counted when the line is vertical. This fact is used to define a binary operation, called “addition” and denoted by +, on points of E(Zp) in the following way:
• The point O is defined to be an (additive) identity; that is, for all P ∈ E(Zp) we define P + O = O + P = P.
• For two points P1, P2 ≠ O on E, we evaluate their sum P1 + P2 by drawing the line through P1, P2 (if P1 = P2 then draw the line tangent to the curve at P1) and finding the third point of intersection P3 of this line with E(Zp); the third point of intersection may be P3 = O if the line is vertical. If P3 = (x, y) ≠ O then we define P1 + P2 ≝ (x, −y). (Graphically, this corresponds to reflecting P3 in the x-axis.) If P3 = O then P1 + P2 ≝ O.
If P = (x, y) ≠ O is a point of E(Zp), then −P ≝ (x, −y) (which is clearly also a point of E(Zp)) is the unique inverse of P. Indeed, the line through (x, y) and (x, −y) is vertical, and so the addition rule implies that P + (−P) = O. (If y = 0 then P = (x, y) = (x, −y) = −P but then the tangent line at P will be vertical and so P + (−P) = O here as well.) Of course, −O = O.
FIGURE 8.2: An elliptic curve over the reals.
It is straightforward, but tedious, to work out the addition law concretely. Let P1 = (x1, y1) and P2 = (x2, y2) be two points in E(Zp), with P1, P2 ≠ O and E as in Equation (8.1). To keep matters simple, suppose x1 ≠ x2 (dealing with the case x1 = x2 is still straightforward but even more tedious). The slope of the line through these points is
m ≝ [(y2 − y1)/(x2 − x1) mod p];
our assumption that x1 ≠ x2 means that the inverse of (x2 − x1) modulo p exists. The line passing through P1 and P2 has the equation
y = m · (x − x1) + y1 mod p. (8.2)
To find the third point of intersection of this line with E, substitute the above into the equation for E to obtain
(m · (x − x1) + y1)^2 = x^3 + Ax + B mod p.
The values of x that satisfy this equation are x1, x2, and
x3 ≝ [m^2 − x1 − x2 mod p].
The first two solutions correspond to the original points P1 and P2, while the third is the x-coordinate of the third point of intersection P3. Plugging x3 into Equation (8.2) we find that the y-coordinate corresponding to x3 is y3 = [m · (x3 − x1) + y1 mod p]. To obtain the desired answer P1 + P2, we flip the sign of y to obtain:
(x1, y1) + (x2, y2) = ([m^2 − x1 − x2 mod p], [m · (x1 − x3) − y1 mod p]).
We summarize and extend this in the following proposition.
PROPOSITION 8.68 Let p ≥ 5 be prime and let E be the elliptic curve given by y^2 = x^3 + Ax + B mod p where 4A^3 + 27B^2 ≠ 0 mod p. Let P1, P2 ≠ O be points on E, with P1 = (x1, y1) and P2 = (x2, y2).
1. If x1 ≠ x2, then P1 + P2 = (x3, y3) with
x3 = [m^2 − x1 − x2 mod p] and y3 = [m · (x1 − x3) − y1 mod p],
where m = [(y2 − y1)/(x2 − x1) mod p].
2. If x1 = x2 but y1 ≠ y2 then P1 = −P2 and so P1 + P2 = O.
3. If P1 = P2 and y1 = 0 then P1 + P2 = 2P1 = O.
4. If P1 = P2 and y1 ≠ 0 then P1 + P2 = 2P1 = (x3, y3) with
x3 = [m^2 − 2·x1 mod p] and y3 = [m · (x1 − x3) − y1 mod p],
where m = [(3·(x1)^2 + A)/(2·y1) mod p].
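The four cases of the proposition translate directly into code. The following is a minimal Python sketch, assuming that the point at infinity O is represented by None (a choice made here, not by the text) and that both inputs lie on the curve; the function name ec_add is our own.

```python
O = None  # assumed representation of the point at infinity

def ec_add(P1, P2, A: int, p: int):
    """Add two points of E(Z_p): y^2 = x^3 + Ax + B mod p, with p >= 5 prime."""
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 != x2:                                   # case 1: distinct x-coordinates
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    elif (y1 + y2) % p == 0:                       # cases 2 and 3: P1 = -P2
        return O
    else:                                          # case 4: doubling, y1 != 0
        m = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

# Points on y^2 = x^3 + 3x + 3 mod 7.
print(ec_add((1, 0), (4, 3), 3, 7))
print(ec_add((4, 3), (4, 3), 3, 7))
```

Note that the constant B is not needed for addition; it only determines which pairs (x, y) are valid inputs. The three-argument pow with exponent −1 requires Python 3.8 or later.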
Somewhat amazingly, the set of points E(Zp) along with the addition rule defined above form an abelian group, called the elliptic-curve group of E. Commutativity follows from the way addition is defined, O acts as the identity, and we have already seen that each point on E(Zp) has an inverse in E(Zp). The difficult property to verify is associativity, which the disbelieving reader can check through tedious calculation. A more illuminating proof that does not involve explicit calculation relies on algebraic geometry.
Example 8.69
Consider the curve from Example 8.67. We show associativity for three specific points. Let P1 = (1, 0), P2 = Q2 = (4, 3). When computing P1 + P2 we get m = [(3 − 0) · (4 − 1)^{-1} mod 7] = 1 and [1^2 − 1 − 4 mod 7] = 3. Thus,
P3 ≝ P1 + P2 = (3, [1 · (1 − 3) − 0 mod 7]) = (3, 5);
note that this is indeed a point on E(Z7). If we then compute P3 + Q2 we get m = [(3 − 5) · (4 − 3)^{-1} mod 7] = 5 and [5^2 − 3 − 4 mod 7] = 4. Thus,
(P1 + P2) + Q2 = P3 + Q2 = (4, [5 · (3 − 4) − 5 mod 7]) = (4, 4).
If we compute P2 + Q2 = 2P2 we obtain m = [(3 · 4^2 + 3) · (2 · 3)^{-1} mod 7] = 5 and [5^2 − 2 · 4 mod 7] = 3. Thus,
P3′ ≝ P2 + Q2 = (3, [5 · (4 − 3) − 3 mod 7]) = (3, 2).
If we then compute P1 + P3′ we find m = [2 · (3 − 1)^{-1} mod 7] = 1 and [1^2 − 1 − 3 mod 7] = 4. So
P1 + (P2 + Q2) = P1 + P3′ = (4, [1 · (1 − 4) − 0 mod 7]) = (4, 4),
and P1 + (P2 + Q2) = (P1 + P2) + Q2. ♦
Recall that when a group is written additively, “exponentiation” corresponds to repeated addition. Thus, if we fix some point P in an elliptic-curve group, the discrete-logarithm problem becomes (informally) the problem of computing the integer x from xP, while the decisional Diffie–Hellman problem becomes (informally) the problem of distinguishing tuples of the form (aP, bP, abP) from those of the form (aP, bP, cP). These problems are believed to be hard in elliptic-curve groups (or subgroups thereof) of large prime order, subject to a few technical conditions we will mention in passing below.
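To make "repeated addition" concrete, the sketch below computes xP by the standard double-and-add method, the additive analogue of square-and-multiply. It is an illustration only: the names ec_add and scalar_mult are our own, the point at infinity is represented by None, and the addition rule is the one derived earlier in this section.

```python
O = None  # assumed representation of the point at infinity

def ec_add(P1, P2, A, p):
    """Affine addition on y^2 = x^3 + Ax + B mod p."""
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 != x2:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    elif (y1 + y2) % p == 0:
        return O
    else:
        m = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def scalar_mult(k: int, P, A: int, p: int):
    """Compute kP for k >= 0 using O(log k) point additions."""
    R = O
    while k > 0:
        if k & 1:                 # current low-order bit of k is set
            R = ec_add(R, P, A, p)
        P = ec_add(P, P, A, p)    # double
        k >>= 1
    return R

# Multiples of P = (4, 3) on y^2 = x^3 + 3x + 3 mod 7.
for k in range(1, 7):
    print(k, scalar_mult(k, (4, 3), 3, 7))
```

Computing xP from x is efficient; the discrete-logarithm problem asks for the reverse direction, which on a toy curve like this one can be solved by listing all multiples but is believed infeasible for suitably chosen large groups.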
If we want an elliptic-curve group (or subgroup) of large prime order, the first question we must address is: how large are elliptic-curve groups? As noted in Example 8.67, the equation y^2 = f(x) mod p has two solutions whenever f(x) is a quadratic residue, and one solution when f(x) = 0. Since half the elements in Z_p^* are quadratic residues, we heuristically expect to find 2 · (p − 1)/2 + 1 = p points on the curve. Including the point at infinity, this means there should be about p + 1 points in an elliptic-curve group over Zp. The Hasse bound says this heuristic estimate is accurate, in the sense that every elliptic-curve group has “almost” this many points.
THEOREM 8.70 (Hasse bound) Let p be prime, and let E be an elliptic curve over Zp. Then p + 1 − 2√p ≤ |E(Zp)| ≤ p + 1 + 2√p.
This bound implies that it is always easy to find a point on a given elliptic curve y^2 = f(x) mod p: simply choose uniform x ∈ Zp, check whether f(x) is 0 or a quadratic residue, and, if so, let y be a square root of f(x). (Algorithms for deciding quadratic residuosity and computing square roots modulo a prime are discussed in Chapter 13.) Since points on the elliptic curve are plentiful, we will not have to try very many values of x before finding a point.
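The sampling procedure just described can be sketched as follows. To keep the example short we assume p = 3 mod 4, in which case a square root of a quadratic residue a is given by a^{(p+1)/4} mod p; this restriction and the function name random_point are assumptions of the sketch, not part of the text.

```python
import secrets

def random_point(A: int, B: int, p: int):
    """Sample x until f(x) is 0 or a quadratic residue; assumes p % 4 == 3."""
    assert p % 4 == 3, "this sketch uses the square-root shortcut for p = 3 mod 4"
    while True:
        x = secrets.randbelow(p)
        fx = (x**3 + A * x + B) % p
        y = pow(fx, (p + 1) // 4, p)   # candidate square root of f(x)
        if (y * y) % p == fx:          # holds iff f(x) is 0 or a residue
            return (x, y)

print(random_point(3, 3, 7))           # some point on y^2 = x^3 + 3x + 3 mod 7
```

For general primes p, a square-root algorithm such as those discussed in Chapter 13 would be used in place of the shortcut.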
The Hasse bound only gives a range for the size of an elliptic-curve group. For a fixed prime p, however, the order of a random elliptic curve over Zp (namely, a curve defined by Equation (8.1) in which A, B are chosen uniformly in Zp subject to the constraint 4A^3 + 27B^2 ≠ 0 mod p) is heuristically found to be “close” to uniformly distributed in the Hasse interval. There also exist efficient algorithms for counting the number of points on an elliptic curve, although their description and analysis are well beyond the scope of this book. This suggests an approach to elliptic-curve parameter generation as in Algorithm 8.71.
ALGORITHM 8.71
Elliptic-curve group-generation algorithm G
Input: Security parameter 1^n
Output: Cyclic group G, its (prime) order q, and a generator g
generate a uniform n-bit prime p
until q is an n-bit prime do:
    choose A, B ← Zp with 4A^3 + 27B^2 ≠ 0 mod p, defining elliptic curve E as in Equation (8.1)
    let q be the number of points on E(Zp)
choose g ∈ E(Zp) \ {O}
return (A, B, p), q, g // G is the elliptic-curve group of E
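A toy version of Algorithm 8.71 can be written in a few lines if we are willing to count points by brute force. The sketch below does so using Euler's criterion; this takes time exponential in the bit-length of p and merely stands in for the efficient point-counting algorithms mentioned above. All function names are hypothetical, and no attempt is made to reject the weak curve classes discussed next.

```python
import secrets
from math import isqrt

def is_prime(m: int) -> bool:
    """Trial division; fine for toy sizes only."""
    return m >= 2 and all(m % i for i in range(2, isqrt(m) + 1))

def f(x, A, B, p):
    return (x**3 + A * x + B) % p

def count_points(A, B, p):
    """|E(Z_p)| by brute force: one point if f(x) = 0, two if f(x) is a residue."""
    total = 1                                  # the point at infinity
    for x in range(p):
        fx = f(x, A, B, p)
        if fx == 0:
            total += 1
        elif pow(fx, (p - 1) // 2, p) == 1:    # Euler's criterion
            total += 2
    return total

def ec_group_gen(n: int):
    while True:                                # uniform n-bit prime p >= 5
        p = secrets.randbits(n) | (1 << (n - 1))
        if p >= 5 and is_prime(p):
            break
    while True:                                # until q is an n-bit prime
        A, B = secrets.randbelow(p), secrets.randbelow(p)
        if (4 * A**3 + 27 * B**2) % p == 0:
            continue
        q = count_points(A, B, p)
        if q.bit_length() == n and is_prime(q):
            break
    while True:                                # any point other than O
        x, y = secrets.randbelow(p), secrets.randbelow(p)
        if (y * y - f(x, A, B, p)) % p == 0:
            return (A, B, p), q, (x, y)

print(ec_group_gen(8))
```

Because the group order q is prime, every point other than O generates the whole group, which is why any such point can serve as g.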
Certain classes of curves are considered cryptographically weak and should be avoided. These include elliptic-curve groups over Zp whose order is equal to p (anomalous curves) or p + 1 (supersingular curves), or whose order divides p^k − 1 for “small” k. A full discussion is beyond the scope of this book. In practice, standardized curves (such as those recommended by NIST) are used, and generating a curve of one’s own is not advised.
Efficiency Considerations
We conclude this section with a very brief discussion of some standard efficiency improvements when using elliptic curves.
Point compression. A useful observation is that the number of bits needed to represent a point on an elliptic curve can be reduced almost by half. To see this, note that for any point (x, y) on an elliptic curve E : y^2 = f(x) mod p there are at most two points on the curve that have x-coordinate x: namely, (x, y) and (x, −y). (It is possible that y = 0 in which case these are the same point.) Thus, we can specify any point P = (x, y) by its x-coordinate and a bit b that distinguishes between the (at most) two possibilities for the value of its y-coordinate. One convenient way to do this is to set b = 0 if y < p/2 and b = 1 otherwise. Given x and b we can recover P by computing the two solutions y1, y2 of the equation y^2 = f(x) mod p; since y1 = −y2 mod p, and so y1 = p − y2, exactly one of y1, y2 will be less than p/2.
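Point compression and decompression might look as follows in Python. As an assumption of this sketch (not of the text), we again take p = 3 mod 4 so that square roots can be computed by a single exponentiation; the function names are our own.

```python
def compress(P, p: int):
    """Map (x, y) to (x, b) with b = 0 if y < p/2 and b = 1 otherwise."""
    x, y = P
    return (x, 0 if 2 * y < p else 1)

def decompress(x: int, b: int, A: int, B: int, p: int):
    """Recover (x, y) from (x, b) on y^2 = x^3 + Ax + B mod p; assumes p % 4 == 3."""
    assert p % 4 == 3, "square-root shortcut used here needs p = 3 mod 4"
    fx = (x**3 + A * x + B) % p
    y = pow(fx, (p + 1) // 4, p)
    if (y * y) % p != fx:
        raise ValueError("x is not the x-coordinate of a point on the curve")
    if (0 if 2 * y < p else 1) != b:   # wrong root: switch to the other one
        y = (p - y) % p
    return (x, y)

# Round trip over all affine points of y^2 = x^3 + 3x + 3 mod 7.
for P in [(1, 0), (3, 2), (3, 5), (4, 3), (4, 4)]:
    x, b = compress(P, 7)
    print(P, "->", (x, b), "->", decompress(x, b, 3, 3, 7))
```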
Projective coordinates. Representing elliptic-curve points as we have been doing until now, in which a point P on an elliptic curve is described by a pair of field elements (x, y), is called using affine coordinates. There are alternate ways to represent points, using projective coordinates, that can offer efficiency improvements. While these alternate representations can be motivated mathematically, we treat them simply as useful computational aids.
Points in projective coordinates are represented using three elements of Zp. An interesting feature is that a point has multiple representations. When using standard projective coordinates, a point P ≠ O with representation (x, y) in affine coordinates is represented by any tuple (X, Y, Z) ∈ Z_p^3 for which X/Z = x mod p and Y/Z = y mod p. The point O is represented by any tuple (0, Y, 0) with Y ≠ 0, and these are the only points (X, Y, Z) with Z = 0. We can easily translate between coordinate systems: (x, y) in affine coordinates can be mapped to (x, y, 1) in projective coordinates, and (X, Y, Z) (with Z ≠ 0) in projective coordinates is mapped to the representation ([X/Z mod p], [Y/Z mod p]) in affine coordinates.
The main advantage of using projective coordinates is that we can add points without having to compute inverses modulo p. (Adding points in affine coordinates requires computing inverses; see Proposition 8.68.) We accomplish this by exploiting the fact that points have multiple representations. To see this, let us work out the addition law for two points P1 = (X1, Y1, Z1) and P2 = (X2, Y2, Z2) with P1, P2 ≠ O (so Z1, Z2 ≠ 0) and P1 ≠ ±P2 (so X1/Z1 ≠ X2/Z2 mod p). (If either P1 or P2 is equal to O, addition is trivial. The case of P1 = ±P2 can be handled as well, but we omit details here.) We can express P1 and P2 as (X1/Z1, Y1/Z1) and (X2/Z2, Y2/Z2) in affine coordinates, so
P3 ≝ P1 + P2 = (m^2 − X1/Z1 − X2/Z2, m · (X1/Z1 − m^2 + X1/Z1 + X2/Z2) − Y1/Z1, 1),
where
m = (Y2/Z2 − Y1/Z1)(X2/Z2 − X1/Z1)^{-1} = (Y2Z1 − Y1Z2)(X2Z1 − X1Z2)^{-1}
and all computations are done modulo p. Note we are using projective coordinates to represent P3, setting Z3 = 1 above. But using projective coordinates means we are not limited to Z3 = 1. Multiplying each coordinate by Z1Z2(X2Z1 − X1Z2)^3 ≠ 0 mod p, we find that P3 can also be represented as
P3 = (vw, u(v^2 X1Z2 − w) − v^3 Y1Z2, Z1Z2 v^3) (8.3)
where
u = Y2Z1 − Y1Z2, v = X2Z1 − X1Z2,
w = u^2 Z1Z2 − v^3 − 2v^2 X1Z2. (8.4)
The point to notice is that the computations in Equations (8.3) and (8.4) can be carried out without having to perform any modular inversions.
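Equations (8.3) and (8.4) can be transcribed directly. The sketch below (function names are our own) adds two points in projective coordinates using only multiplications, additions, and subtractions modulo p, and converts back to affine coordinates at the end with a single inversion. It handles only the case treated in the text: P1, P2 ≠ O and P1 ≠ ±P2.

```python
def proj_add(P1, P2, p: int):
    """Projective addition per Equations (8.3)/(8.4); requires P1, P2 != O, P1 != +-P2."""
    X1, Y1, Z1 = P1
    X2, Y2, Z2 = P2
    u = (Y2 * Z1 - Y1 * Z2) % p
    v = (X2 * Z1 - X1 * Z2) % p
    w = (u * u * Z1 * Z2 - v**3 - 2 * v * v * X1 * Z2) % p
    X3 = (v * w) % p
    Y3 = (u * (v * v * X1 * Z2 - w) - v**3 * Y1 * Z2) % p
    Z3 = (Z1 * Z2 * v**3) % p
    return (X3, Y3, Z3)

def to_affine(P, p: int):
    """Map (X, Y, Z) with Z != 0 to ([X/Z mod p], [Y/Z mod p])."""
    X, Y, Z = P
    z_inv = pow(Z, -1, p)          # the only modular inversion
    return (X * z_inv % p, Y * z_inv % p)

# (1, 0) + (4, 3) on y^2 = x^3 + 3x + 3 mod 7, from two representations of (1, 0).
print(to_affine(proj_add((1, 0, 1), (4, 3, 1), 7), 7))
print(to_affine(proj_add((2, 0, 2), (4, 3, 1), 7), 7))
```

Both calls describe the same sum even though the intermediate projective triples differ, which illustrates the non-uniqueness of projective representations.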
Precisely because points have multiple representations in projective coordinates, some subtleties can arise when projective coordinates are used. (We have explicitly assumed until now that group elements have unique representations as bit-strings.) Specifically, a point expressed in projective coordinates may reveal some information about how that point was obtained, which may depend on some secret information. To address this, as well as for reasons of efficiency, affine coordinates should be used for transmitting and storing points, with projective coordinates used only as an intermediate representation during the course of a computation (with points converted to/from projective coordinates at the beginning/end of the computation).
8.4 *Cryptographic Applications
We have spent a fair bit of time discussing number theory and group theory, and introducing computational hardness assumptions that are widely believed to hold. Applications of these assumptions will occupy us for the rest of the book, but we provide some brief examples here.
8.4.1 One-Way Functions and Permutations
One-way functions are the minimal cryptographic primitive, and they are both necessary and sufficient for private-key encryption and message authentication codes. A more complete discussion of the role of one-way functions in cryptography appears in Chapter 7; here we only provide a definition of one-way functions and demonstrate that their existence follows from the number-theoretic hardness assumptions we have seen in this chapter.
Informally, a function f is one-way if it is easy to compute but hard to invert. The following experiment and definition, a restatement of Definition 7.1, formalize this.
The inverting experiment Invert_{A,f}(n):
1. Choose uniform x ∈ {0, 1}^n and compute y := f(x).
2. A is given 1^n and y as input, and outputs x′.
3. The output of the experiment is 1 if and only if f(x′) = y.
DEFINITION 8.72 A function f : {0,1}^* → {0,1}^* is one-way if the following two conditions hold:
1. (Easy to compute:) There is a polynomial-time algorithm that on input x outputs f(x).
2. (Hard to invert:) For all ppt algorithms A there is a negligible function negl such that Pr[Invert_{A,f}(n) = 1] ≤ negl(n).
We now show formally that the factoring assumption implies the existence of a one-way function. Let Gen be a polynomial-time algorithm that, on input 1^n, outputs (N, p, q) where N = pq and p and q are n-bit primes except with probability negligible in n. (We use Gen rather than GenModulus here purely for notational convenience.) Since Gen runs in polynomial time, there is a polynomial upper bound on the number of random bits the algorithm uses. For simplicity, and in order to get the main ideas across, we assume Gen always uses at most n random bits on input 1^n. In Algorithm 8.73 we define a function f_Gen that uses its input as the random bits for running Gen. Thus, f_Gen is a deterministic function as required.
ALGORITHM 8.73
Algorithm computing f_Gen
Input: String x of length n
Output: Integer N
compute (N, p, q) := Gen(1^n; x)
// i.e., run Gen(1^n) using x as the random tape
return N
If the factoring problem is hard relative to Gen then, intuitively, f_Gen is a one-way function. Certainly f_Gen is easy to compute. As for the hardness of inverting this function, note that the following distributions are identical:
1. The modulus N output by f_Gen(x), when x ∈ {0, 1}^n is chosen uniformly.
2. The modulus N output by (the randomized algorithm) Gen(1^n).
If moduli N generated according to the second distribution are hard to factor, then the same holds for moduli N generated according to the first distribution. Moreover, given any preimage x′ of N with respect to f_Gen (i.e., an x′ for which f_Gen(x′) = N; note that we do not require x′ = x), it is easy to recover a factor of N by running Gen(1^n; x′) to obtain (N, p, q) and outputting the factors p and q. Thus, finding a preimage of N with respect to f_Gen is as hard as factoring N. One can easily turn this into a formal proof of the following:
THEOREM 8.74 If the factoring problem is hard relative to Gen, then f_Gen is a one-way function.
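The idea of running Gen on an explicit random tape can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the toy gen below samples small primes by trial division, and it models "using x as the random tape" by seeding Python's (non-cryptographic) random.Random with x, which is only a stand-in for reading random bits from x directly.

```python
import random
from math import isqrt

def is_prime(m: int) -> bool:
    """Trial division; fine for toy sizes only."""
    return m >= 2 and all(m % i for i in range(2, isqrt(m) + 1))

def gen(n: int, tape: int):
    """Toy Gen(1^n; tape): output (N, p, q), deterministic once the tape is fixed."""
    rng = random.Random(tape)            # stand-in for the random tape
    def n_bit_prime():
        while True:
            c = rng.getrandbits(n) | (1 << (n - 1)) | 1
            if is_prime(c):
                return c
    p, q = n_bit_prime(), n_bit_prime()
    return p * q, p, q

def f_gen(n: int, x: int) -> int:
    """f_Gen(x): run Gen with random tape x and output only the modulus N."""
    N, _, _ = gen(n, x)
    return N

N = f_gen(8, 12345)
# Anyone holding a preimage can re-run Gen on it and read off the factors of N.
N_again, p, q = gen(8, 12345)
print(N, p, q)
```

The second call makes the argument in the text concrete: inverting f_Gen yields a tape, and re-running Gen on that tape reveals p and q.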
One-Way Permutations
We can also use number-theoretic assumptions to construct a family of one-way permutations. We begin with a restatement of Definitions 7.2 and 7.3, specialized to the case of permutations:
DEFINITION 8.75 A triple Π = (Gen, Samp, f) of probabilistic polynomial-time algorithms is a family of permutations if the following hold:
1. The parameter-generation algorithm Gen, on input 1^n, outputs parameters I with |I| ≥ n. Each value of I defines a set D_I that constitutes the domain and range of a permutation (i.e., bijection) f_I : D_I → D_I.
2. The sampling algorithm Samp, on input I, outputs a uniformly distributed element of D_I.
3. The deterministic evaluation algorithm f, on input I and x ∈ D_I, outputs an element y ∈ D_I. We write this as y := f_I(x).
Given a family of functions Π, consider the following experiment for any algorithm A and parameter n:
The inverting experiment Invert_{A,Π}(n):
1. Gen(1^n) is run to obtain I, and then Samp(I) is run to choose a uniform x ∈ D_I. Finally, y := f_I(x) is computed.
2. A is given I and y as input, and outputs x′.
3. The output of the experiment is 1 if and only if f_I(x′) = y.
DEFINITION 8.76 The family of permutations Π = (Gen, Samp, f) is one-way if for all probabilistic polynomial-time algorithms A there exists a negligible function negl such that
Pr[Invert_{A,Π}(n) = 1] ≤ negl(n).
Given GenRSA as in Section 8.2.4, Construction 8.77 defines a family of permutations. It is immediate that if the RSA problem is hard relative to GenRSA then this family is one-way. It can similarly be shown that hardness of the discrete-logarithm problem in Z∗p, with p prime, implies the existence of a one-way family of permutations; see Section 7.1.2.
CONSTRUCTION 8.77
Let GenRSA be as before. Define a family of permutations as follows:
• Gen: on input 1^n, run GenRSA(1^n) to obtain (N, e, d) and output I = ⟨N, e⟩. Set D_I = Z_N^*.
• Samp: on input I = ⟨N, e⟩, choose a uniform element of Z_N^*.
• f: on input I = ⟨N, e⟩ and x ∈ Z_N^*, output [x^e mod N].
A family of permutations based on the RSA problem.
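The three algorithms of this construction can be sketched with toy numbers. The function gen_toy below is a hypothetical stand-in for GenRSA that uses the fixed small primes 61 and 53 with e = 17; it is insecure and serves only to check that f_I permutes Z_N^*.

```python
import secrets
from math import gcd

def gen_toy():
    """Stand-in for GenRSA with fixed toy primes (illustration only, insecure)."""
    p, q, e = 61, 53, 17
    N, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)             # e*d = 1 mod phi(N)
    return (N, e), d                # I = <N, e>; d is not part of I

def samp(I):
    """Uniform element of Z_N^* by rejection sampling."""
    N, _ = I
    while True:
        x = 1 + secrets.randbelow(N - 1)
        if gcd(x, N) == 1:
            return x

def f(I, x):
    N, e = I
    return pow(x, e, N)

I, d = gen_toy()
N, e = I
domain = [x for x in range(1, N) if gcd(x, N) == 1]
image = sorted(f(I, x) for x in domain)
print(image == domain)              # f_I maps Z_N^* onto itself
x = samp(I)
print(pow(f(I, x), d, N) == x)      # d inverts f_I
```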
8.4.2 Constructing Collision-Resistant Hash Functions
Collision-resistant hash functions were introduced in Section 5.1. Although we have discussed constructions of collision-resistant hash functions used in practice in Section 6.3, we have not yet seen constructions that can be rigorously based on simpler assumptions. We show here a construction based on the discrete-logarithm assumption in prime-order groups. (A construction based on the RSA problem is described in Exercise 8.20.) Although these constructions are less efficient than the hash functions used in practice, they are important since they illustrate the feasibility of achieving collision resistance based on standard and well-studied number-theoretic assumptions.
Let G be a polynomial-time algorithm that, on input 1^n, outputs a (description of a) cyclic group G, its order q (with ∥q∥ = n), and a generator g. Here we also require that q is prime except possibly with negligible probability. A fixed-length hash function based on G is given in Construction 8.78.
CONSTRUCTION 8.78
Let G be as described in the text. Define a fixed-length hash function (Gen, H) as follows:
• Gen: on input 1^n, run G(1^n) to obtain (G, q, g) and then select a uniform h ∈ G. Output s := ⟨G, q, g, h⟩ as the key.
• H: given a key s = ⟨G, q, g, h⟩ and input (x1, x2) ∈ Zq × Zq, output H_s(x1, x2) := g^{x1} h^{x2} ∈ G.
A fixed-length hash function.
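A toy instance of this construction is sketched below, using the order-11 subgroup of Z_23^* generated by 2 as the group (our choice of parameters, far too small for any security). The key is modeled as a tuple (p, q, g, h), and H_bits, a hypothetical helper, takes its input as a bit-string of length 2(n − 1) split into two halves, where n is the bit-length of q.

```python
import secrets

def gen_key(p: int, q: int, g: int):
    """Given group parameters, pick uniform h in the order-q subgroup <g> of Z_p^*."""
    h = pow(g, secrets.randbelow(q), p)
    return (p, q, g, h)

def H(s, x1: int, x2: int) -> int:
    """H_s(x1, x2) = g^x1 * h^x2 in the group."""
    p, q, g, h = s
    return pow(g, x1, p) * pow(h, x2, p) % p

def H_bits(s, x: str) -> int:
    """Hash a bit-string of length 2(n-1), where n is the bit-length of q."""
    p, q, g, h = s
    n = q.bit_length()
    assert len(x) == 2 * (n - 1) and set(x) <= {"0", "1"}
    x1, x2 = int(x[:n - 1], 2), int(x[n - 1:], 2)   # both are less than q
    return H(s, x1, x2)

s = gen_key(23, 11, 2)
print(H_bits(s, "101011"))

# With the fixed key h = 2^7 mod 23 = 13, a collision is easy to exhibit by hand:
fixed = (23, 11, 2, 13)
print(H_bits(fixed, "000000"), H_bits(fixed, "100001"))
```

In this toy group the input has 6 bits while an element of Z_23^* fits in 5 bits, so the function compresses. Collisions are trivial to find here because q is tiny; collision resistance relies on the discrete-logarithm problem being hard in the group.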
Note that Gen and H can be computed in polynomial time. Before continuing with an analysis of the construction, we make some technical remarks:
• For a given s = ⟨G, q, g, h⟩ with n = ∥q∥, the function H_s is described as taking elements of Zq × Zq as input. However, H_s can be viewed as taking bit-strings of length 2 · (n − 1) as input if we parse an input x ∈ {0, 1}^{2(n−1)} as two strings x1, x2, each of length n − 1, and then view x1, x2 as elements of Zq in the natural way.
• The output of H_s is similarly specified as being an element of G, but we can view this as a bit-string if we fix some representation of G. To satisfy the requirements of Definition 5.2 (which requires the output length to be fixed as a function of n) we can pad the output as needed.
• Given the above, the construction only compresses its input for groups G in which elements of G can be represented using fewer than 2n − 2 bits. This holds both for the groups output by Algorithm 8.65 (assuming n ≪ l, which is usually the case), as well as for elliptic-curve groups when point compression is used. A generalization of Construction 8.78 can be used to obtain compression from any G for which the discrete-logarithm problem is hard, regardless of the number of bits required to represent group elements; see Exercise 8.21.
THEOREM 8.79 If the discrete-logarithm problem is hard relative to G, then Construction 8.78 is a fixed-length collision-resistant hash function (subject to the discussion regarding compression, above).
PROOF Let Π = (Gen, H) be as in Construction 8.78, and let A be a probabilistic polynomial-time algorithm with
ε(n) ≝ Pr[Hash-coll_{A,Π}(n) = 1]
(cf. Definition 5.2). We show how A can be used by an algorithm A′ to solve the discrete-logarithm problem with success probability ε(n):
Algorithm A′:
The algorithm is given G, q, g, h as input.
1. Let s := ⟨G, q, g, h⟩. Run A(s) and obtain output x and x′.
2. If x ≠ x′ and H_s(x) = H_s(x′) then:
(a) If h = 1 return 0.
(b) Otherwise (h ≠ 1), parse x as (x1, x2) and parse x′ as (x′1, x′2), where x1, x2, x′1, x′2 ∈ Zq, and return the result [(x1 − x′1) · (x′2 − x2)^{-1} mod q].
Clearly, A′ runs in polynomial time. Furthermore, the input s given to A when run as a subroutine by A′ is distributed exactly as in experiment Hash-coll_{A,Π} for the same value of the security parameter n. (The input to A′ is generated by running G(1^n) to obtain G, q, g and then choosing h ∈ G uniformly at random. This is exactly how s is generated by Gen(1^n).) So, with probability exactly ε(n) there is a collision; i.e., x ≠ x′ and H_s(x) = H_s(x′).
We claim that whenever there is a collision, A′ returns the correct answer log_g h. If h = 1 then this is clearly true (since log_g h = 0 in this case). Assuming h ≠ 1, the existence of a collision means that
H_s(x1, x2) = H_s(x′1, x′2) ⇒ g^{x1} h^{x2} = g^{x′1} h^{x′2}
⇒ g^{x1 − x′1} = h^{x′2 − x2}. (8.5)
Note that x′2 − x2 ≠ 0 mod q; otherwise, the right-hand side of Equation (8.5) would equal 1 and so (since g has order q) we would have x1 = x′1 mod q, but then x = x′ and we would not have a collision. Since q is prime, the inverse ∆ ≝ [(x′2 − x2)^{-1} mod q] exists. Raising each side of Equation (8.5) to this power gives:
g^{(x1 − x′1)·∆} = (h^{x′2 − x2})^∆ = h^1 = h,
and so the output returned by A′ is
log_g h = [(x1 − x′1) · ∆ mod q] = [(x1 − x′1) · (x′2 − x2)^{-1} mod q].
We see that A′ correctly solves the discrete-logarithm problem with probability exactly ε(n). Since, by assumption, the discrete-logarithm problem is hard relative to G, we conclude that ε(n) is negligible.
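The reduction in this proof can be run end to end on a toy group. In the sketch below, the key is the tuple (p, q, g, h) = (23, 11, 2, 13), describing the order-11 subgroup of Z_23^* (parameters chosen by us for illustration), and find_collision is a brute-force stand-in for the adversary A, feasible only because q is tiny. The function dlog_from_collision plays the role of A′.

```python
def H(s, x1: int, x2: int) -> int:
    """H_s(x1, x2) = g^x1 * h^x2 mod p."""
    p, q, g, h = s
    return pow(g, x1, p) * pow(h, x2, p) % p

def find_collision(s):
    """Brute-force stand-in for A: return x != x' with H_s(x) = H_s(x')."""
    p, q, g, h = s
    seen = {}
    for x1 in range(q):
        for x2 in range(q):
            y = H(s, x1, x2)
            if y in seen:
                return seen[y], (x1, x2)
            seen[y] = (x1, x2)

def dlog_from_collision(s, x, x_prime) -> int:
    """Algorithm A': compute log_g h from a collision x != x' under H_s."""
    p, q, g, h = s
    if h == 1:
        return 0
    (x1, x2), (x1p, x2p) = x, x_prime
    delta = pow((x2p - x2) % q, -1, q)     # exists because q is prime
    return (x1 - x1p) * delta % q

s = (23, 11, 2, 13)                        # here h = 2^7 mod 23
x, x_prime = find_collision(s)
a = dlog_from_collision(s, x, x_prime)
print(x, x_prime, a, pow(2, a, 23) == 13)
```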
Using Exercise 8.21 in combination with the Merkle–Damgård transform (see Section 5.2) we obtain:
THEOREM 8.80 If the discrete-logarithm problem is hard, then collision-resistant hash functions exist.
References and Additional Reading
The book by Childs [44] has excellent coverage of the group theory discussed in this chapter (and more), in greater depth but at a similar level of exposition. Shoup [159] gives a more advanced, yet still accessible, treatment of much of this material also, with special focus on algorithmic aspects. (Our statement of Bertrand’s postulate is taken from [159, Theorem 5.8].) Relatively gentle introductions to abstract algebra and group theory that go well beyond what we have space for here are available in the books by Fraleigh [67] and Herstein [89]; the interested reader will have no trouble finding more advanced algebra texts if they are so inclined.
The first efficient primality test was by Solovay and Strassen [164]. The Miller–Rabin test is due to Miller [127] and Rabin [146]. A deterministic primality test was discovered by Agrawal et al. [5]. See Dietzfelbinger [57] for a comprehensive survey of this area.
The RSA problem was publicly introduced by Rivest, Shamir, and Adleman [148], although it was revealed in 1997 that Ellis, Cocks, and Williamson, three members of GCHQ (a British intelligence agency), had explored similar ideas, without fully recognizing their importance, in a classified setting several years earlier. The discrete-logarithm and Diffie–Hellman problems were first considered, at least implicitly, by Diffie and Hellman [58].
Most treatments of elliptic curves require advanced mathematical background on the part of the reader. The book by Silverman and Tate [160] is perhaps an exception. As with many books on the subject written for mathematicians, however, that book has little coverage of elliptic curves over finite fields, which is the case most relevant to cryptography. The text by Washington [176], although a bit more advanced, deals heavily (but not exclusively) with the finite-field case. Implementation issues related to elliptic-curve cryptography are covered by Hankerson et al. [83]. Recommended parameters for elliptic curves, as well as subgroups modulo a prime, are given by NIST [132].
The construction of a collision-resistant hash function based on the discrete-logarithm problem is due to [43], and an earlier construction based on the hardness of factoring is given in [81] (see also Exercise 8.20).
Exercises
8.1 Let G be an abelian group. Prove that there is a unique identity in G, and that every element g ∈ G has a unique inverse.
8.2 Show that Proposition 8.36 does not necessarily hold when G is infinite.
Hint: Consider the set {1} ∪ {2, 4, 6, 8, ...} ⊂ R.
8.3 Let G be a finite group, and g ∈ G. Show that ⟨g⟩ is a subgroup of G. Is the set {g^0, g^1, ...} necessarily a subgroup of G when G is infinite?
8.4 This question concerns the Euler phi function.
(a) Let p be prime and e ≥ 1 an integer. Show that φ(p^e) = p^{e−1}(p − 1).
(b) Let p, q be relatively prime. Show that φ(pq) = φ(p) · φ(q). (You may use the Chinese remainder theorem.)
(c) Prove Theorem 8.19.
8.5 Compute the final two (decimal) digits of 3^{1000} (by hand).
Hint: The answer is [3^{1000} mod 100].
8.6 Compute [101^{4,800,000,002} mod 35] (by hand).
8.7 Compute [46^{51} mod 55] (by hand) using the Chinese remainder theorem.
8.8 Prove that if G, H are groups, then G × H is a group.
8.9 Let p, N be integers with p | N. Prove that for any integer X,
[[X mod N] mod p] = [X mod p].
Show that, in contrast, [[X mod p ] mod N ] need not equal [X mod N ].
8.10 Corollary 8.21 shows that if N = pq and ed = 1 mod φ(N) then for all x ∈ Z_N^* we have (x^e)^d = x mod N. Show that this holds for all x ∈ {0, ..., N − 1}.
Hint: Use the Chinese remainder theorem.
Number Theory and Cryptographic Hardness Assumptions 339
8.11 Complete the details of the proof of the Chinese remainder theorem, showing that $\mathbb{Z}_N^*$ is isomorphic to $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$.
8.12 This exercise develops an efficient algorithm for testing whether an integer is a perfect power.
(a) Show that if $N = \hat{N}^e$ for some integers $\hat{N}, e > 1$ then e ≤ ∥N∥.
(b) Given N and e with 2 ≤ e ≤ ∥N∥+1, show how to determine in poly(∥N∥) time whether there exists an integer $\hat{N}$ with $\hat{N}^e = N$.
Hint: Use binary search.
(c) Given N, show how to test in poly(∥N∥) time whether N is a perfect power.
8.13 Given N and $a \in \mathbb{Z}_N^*$, show how to test in polynomial time whether a is a strong witness that N is composite.
8.14 Fix N, e with gcd(e, φ(N)) = 1, and assume there is an adversary A running in time t for which
\[ \Pr[A([x^e \bmod N]) = x] = 0.01, \]
where the probability is taken over uniform choice of $x \in \mathbb{Z}_N^*$. Show that it is possible to construct an adversary A′ for which
\[ \Pr[A'([x^e \bmod N]) = x] = 0.99 \]
for all x. The running time t′ of A′ should be polynomial in t and ∥N∥.
Hint: Use the fact that $y^{1/e} \cdot r = (y \cdot r^e)^{1/e} \bmod N$.
8.15 Formally define the CDH assumption. Prove that hardness of the CDH problem relative to G implies hardness of the discrete-logarithm problem relative to G, and that hardness of the DDH problem relative to G implies hardness of the CDH problem relative to G.
8.16 Determine the points on the elliptic curve $E : y^2 = x^3 + 2x + 1$ over $\mathbb{Z}_{11}$. How many points are on this curve?
8.17 Consider the elliptic-curve group from Example 8.67. (See also Example 8.69.) Compute (1, 0) + (4, 3) + (4, 3) in this group by first converting to projective coordinates and then using Equations (8.3) and (8.4).
8.18 Prove the fourth statement in Proposition 8.68.
8.19 Can the following problem be solved in polynomial time? Given a prime p, a value $x \in \mathbb{Z}_{p-1}^*$, and $y := [g^x \bmod p]$ (where g is a uniform value in $\mathbb{Z}_p^*$), find g, i.e., compute $y^{1/x} \bmod p$. If your answer is “yes,” give a polynomial-time algorithm. If your answer is “no,” show a reduction to one of the assumptions introduced in this chapter.
8.20 Let GenRSA be as in Section 8.2.4. Prove that if the RSA problem is hard relative to GenRSA then Construction 8.81 is a fixed-length collision-resistant hash function.
CONSTRUCTION 8.81
Define (Gen, H) as follows:
• Gen: on input $1^n$, run GenRSA($1^n$) to obtain N, e, d, and select $y \leftarrow \mathbb{Z}_N^*$. The key is s := ⟨N, e, y⟩.
• H: if s = ⟨N, e, y⟩, then $H^s$ maps inputs in $\{0,1\}^{3n}$ to outputs in $\mathbb{Z}_N^*$. Let $f_0^s(x) \stackrel{\text{def}}{=} [x^e \bmod N]$ and $f_1^s(x) \stackrel{\text{def}}{=} [y \cdot x^e \bmod N]$. For a 3n-bit long string $x = x_1 \cdots x_{3n}$, define
\[ H^s(x) \stackrel{\text{def}}{=} f^s_{x_1}\bigl(f^s_{x_2}\bigl(\cdots(1)\cdots\bigr)\bigr). \]
8.21 Consider the following generalization of Construction 8.78:
CONSTRUCTION 8.82
Define a fixed-length hash function (Gen,H) as follows:
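(a) Gen: on input $1^n$, run G($1^n$) to obtain $(G, q, h_1)$ and then select $h_2, \ldots, h_t \leftarrow G$. Output $s := \langle G, q, (h_1, \ldots, h_t) \rangle$ as the key.
(b) H: given a key $s = \langle G, q, (h_1, \ldots, h_t) \rangle$ and input $(x_1, \ldots, x_t)$ with $x_i \in \mathbb{Z}_q$, output $H^s(x_1, \ldots, x_t) := \prod_i h_i^{x_i}$.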
(a) Prove that if the discrete-logarithm problem is hard relative to G and q is prime, then for any t = poly(n) this construction is a fixed-length collision-resistant hash function.
(b) Discuss how this construction can be used to obtain compression regardless of the number of bits needed to represent elements of G (as long as it is polynomial in n).
Chapter 9
*Algorithms for Factoring and Computing Discrete Logarithms
In the last chapter, we introduced several number-theoretic problems (most prominently, factoring the product of two large primes and computing discrete logarithms in certain groups) that are widely believed to be hard. As defined there, this means there are presumed to be no polynomial-time algorithms for these problems. This asymptotic notion of hardness, however, tells us little about how to set the security parameter (sometimes called the key length, although the terms are not interchangeable) to achieve some desired, concrete level of security in practice. A proper understanding of this issue is extremely important for the real-world deployment of cryptosystems based on these problems. Setting the security parameter too low means a cryptosystem may be vulnerable to attacks more efficient than anticipated; being overly conservative and setting the security parameter too high will give good security, but at the expense of efficiency for the honest users. The relative difficulty of different number-theoretic problems can also play a role in determining which problems to use as the basis for building cryptosystems in the first place.
The fundamental issue, of course, is that a brute-force search may not be the best algorithm for solving a given problem; thus, using key length n does not, in general, give security against attackers running for $2^n$ time. This is in contrast to the private-key setting where the best attacks on existing block ciphers have roughly the complexity of brute-force search (ignoring pre-computation). As a consequence, the key lengths used in the public-key setting tend to be significantly larger than those used in the private-key setting.
To gain a better appreciation of this point, we explore in this chapter several nonpolynomial-time algorithms for factoring and computing discrete logarithms that are far better than brute-force search. The goal is merely to give a taste of existing algorithms for these problems, as well as to provide some basic guidance for setting parameters in practice. Our focus is on the high-level ideas, and we consciously do not address many important implementation-level details that would be critical to deal with if these algorithms were to be used in practice. We also concentrate exclusively on classical algorithms, and refer the reader elsewhere for a discussion of known quantum polynomial-time(!) algorithms for factoring and computing discrete logarithms. (This was another conscious decision, both because we did not want to assume a background in quantum mechanics on the part of the reader, and because quantum computers seem unlikely in the near future.)
The reader may also notice that we only describe algorithms for factoring and computing discrete logarithms, and not algorithms for, say, solving the RSA or decisional Diffie–Hellman problems. Our choice is justified by the facts that the best known algorithms for solving RSA require factoring the modulus, and (in the groups discussed in Sections 8.3.3 and 8.3.4) the best known approaches for solving the decisional Diffie–Hellman problem involve computing discrete logarithms.
9.1 Algorithms for Factoring
Throughout, we assume that N = pq is a product of two distinct primes with p < q. We will be most interested in the case when p and q each has the same (known) length n, and so n = Θ(log N ).
We will frequently use the Chinese remainder theorem along with the notation developed in Section 8.1.5. The Chinese remainder theorem states that
\[ \mathbb{Z}_N \simeq \mathbb{Z}_p \times \mathbb{Z}_q \quad\text{and}\quad \mathbb{Z}_N^* \simeq \mathbb{Z}_p^* \times \mathbb{Z}_q^*, \]
with isomorphism given by $f(x) \stackrel{\text{def}}{=} ([x \bmod p], [x \bmod q])$. The fact that f is an isomorphism means, in particular, that it gives a bijection between elements $x \in \mathbb{Z}_N$ and pairs $(x_p, x_q) \in \mathbb{Z}_p \times \mathbb{Z}_q$. We write $x \leftrightarrow (x_p, x_q)$, with $x_p = [x \bmod p]$ and $x_q = [x \bmod q]$, to denote this bijection.
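The bijection can be checked concretely on a toy modulus. The following is a minimal sketch, not part of the original text: the helper names `to_crt` and `from_crt` and the primes 5 and 7 are illustrative assumptions, and the inverse map uses Python's built-in modular inverse `pow(p, -1, q)` (available from Python 3.8).

```python
def to_crt(x, p, q):
    """Map x in Z_N to the pair (x_p, x_q) = (x mod p, x mod q)."""
    return (x % p, x % q)

def from_crt(xp, xq, p, q):
    """Inverse map: the unique x in Z_N with x = xp mod p and x = xq mod q."""
    # Write x = xp + p*t and solve for t modulo q.
    t = ((xq - xp) * pow(p, -1, q)) % q
    return xp + p * t

p, q = 5, 7          # toy primes, chosen only for illustration
N = p * q
pairs = {to_crt(x, p, q) for x in range(N)}
round_trip_ok = all(from_crt(*to_crt(x, p, q), p, q) == x for x in range(N))
```

Every element of Z_N maps to a distinct pair, and `from_crt` undoes `to_crt`, which is exactly the bijection x ↔ (x_p, x_q) used throughout this section.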
Recall from Section 8.2 that trial division (a trivial, brute-force factoring method) finds a factor of a given number N in time $O(N^{1/2} \cdot \mathrm{polylog}(N))$. (This is an exponential-time algorithm, since the size of the input is the length of the binary representation of N, i.e., ∥N∥ = O(log N).¹) We cover three factoring algorithms with better performance:
• Pollard’s p−1 method is effective if p−1 has only “small” prime factors.
• Pollard’s rho method applies to arbitrary N. (As such, it is called a general-purpose factoring algorithm.) Its running time for N of the form discussed at the beginning of this section is $O(N^{1/4} \cdot \mathrm{polylog}(N))$. Note this is still exponential in n, the length of N.
• The quadratic sieve algorithm is a general-purpose factoring algorithm that runs in time sub-exponential in the length of N. We give a high-level overview of how this algorithm works, but the details are somewhat complex and are beyond the scope of this book.
¹Thus, a running time of $N^{O(1)} = 2^{O(\|N\|)}$ is exponential, a running time of $2^{o(\log N)} = 2^{o(\|N\|)}$ is sub-exponential, and a running time of $\log^{O(1)} N = \|N\|^{O(1)}$ is polynomial.
The fastest known general-purpose factoring algorithm is the general number field sieve. Heuristically, this algorithm factors its input N in expected time $2^{O((\log N)^{1/3} \cdot (\log\log N)^{2/3})}$, which is sub-exponential in the length of N.
9.1.1 Pollard’s p − 1 Algorithm
If N = pq and p − 1 has only “small” prime factors, Pollard’s p − 1 algorithm can be used to efficiently factor N. The basic idea is simple. Let B be an integer for which $(p-1) \mid B$ and $(q-1) \nmid B$; we defer to below the details of how such a B is found. Say $B = \gamma \cdot (p-1)$ for some integer γ. Choose a uniform $x \in \mathbb{Z}_N^*$ and compute $y := [x^B - 1 \bmod N]$. (Note that y can be computed using the efficient exponentiation algorithm from Appendix B.2.3.) Since $1 \leftrightarrow (1, 1)$, we have
\[
\begin{aligned}
y = [x^B - 1 \bmod N] &\leftrightarrow (x_p, x_q)^B - (1, 1) \\
&= (x_p^B - 1 \bmod p,\; x_q^B - 1 \bmod q) \\
&= ((x_p^{p-1})^{\gamma} - 1 \bmod p,\; x_q^B - 1 \bmod q) \\
&= (0,\; [x_q^B - 1 \bmod q])
\end{aligned}
\]
using Theorem 8.14 and the fact that the order of $\mathbb{Z}_p^*$ is p − 1. We show below that, with high probability, $x_q^B \neq 1 \bmod q$. Assuming this is the case, we have obtained an integer y for which
\[ y = 0 \bmod p \quad\text{but}\quad y \neq 0 \bmod q; \]
that is, $p \mid y$ but $q \nmid y$. This, in turn, implies that gcd(y, N) = p. Thus, a simple gcd computation (which can be done efficiently as described in Appendix B.1.2) yields a prime factor of N.
ALGORITHM 9.1
Pollard’s p − 1 algorithm for factoring
Input: Integer N
Output: A non-trivial factor of N
x ← Z*_N
y := [x^B − 1 mod N]    // B is as in the text
p := gcd(y, N)
if p ∉ {1, N} return p
Let us first argue that the algorithm works (with high probability). Assume $(p-1) \mid B$ but $(q-1) \nmid B$. In that case, as long as $x_q \stackrel{\text{def}}{=} [x \bmod q]$ is a generator of $\mathbb{Z}_q^*$, we have $x_q^B \neq 1 \bmod q$. (This follows from Proposition 8.52.) It remains to analyze the probability that $x_q$ is a generator. Here we rely on
some results proved in Appendix B.3.1. Since q is prime, $\mathbb{Z}_q^*$ is a cyclic group of order q − 1 that has exactly φ(q − 1) generators (cf. Theorem B.16). If x is chosen uniformly from $\mathbb{Z}_N^*$, then $x_q$ is uniformly distributed in $\mathbb{Z}_q^*$. (This is a consequence of the fact that the Chinese remainder theorem gives a bijection between $\mathbb{Z}_N^*$ and $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$.) Thus, the probability that $x_q$ is a generator is $\frac{\phi(q-1)}{q-1} = \Omega(1/\log q) = \Omega(1/n)$ (cf. Theorem B.15). Multiple values of x can be chosen to boost the probability of success.
We are left with the problem of finding B such that $(p-1) \mid B$ but $(q-1) \nmid B$. One possibility is to choose $B = \prod_{i=1}^{k} p_i^{\lfloor n/\log p_i \rfloor}$ for some k, where $p_i$ denotes the ith prime (i.e., $p_1 = 2, p_2 = 3, p_3 = 5, \ldots$) and n is the length of p. (Note that $p_i^{\lfloor n/\log p_i \rfloor}$ is the largest power of $p_i$ that can possibly divide p − 1.) If p − 1 can be written as $\prod_{i=1}^{k} p_i^{e_i}$ with $e_i \geq 0$ (that is, if the largest prime factor of p − 1 is at most $p_k$), we will have $(p-1) \mid B$. In contrast, if q − 1 has any prime factor larger than $p_k$, then $(q-1) \nmid B$.
Choosing a larger value for k increases B and so increases the running time of the algorithm (which performs a modular exponentiation to the power B). A larger value of k also makes it more likely that (p − 1) | B, but at the same time makes it less likely that (q − 1)̸ | B. It is, of course, possible to run the algorithm repeatedly using multiple choices for k.
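The choice of B and the gcd step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the function names, the retry count, and the toy primes p = 211 (with p − 1 = 2·3·5·7) and q = 227 (with q − 1 = 2·113) are chosen here for demonstration and do not come from the text.

```python
import math
import random

def first_primes(k):
    """Return the first k primes by trial division (adequate for small k)."""
    primes, c = [], 2
    while len(primes) < k:
        if all(c % r for r in primes):
            primes.append(c)
        c += 1
    return primes

def choose_B(n, k):
    """B = product over the first k primes p_i of p_i^floor(n / log2 p_i)."""
    B = 1
    for r in first_primes(k):
        e = 0
        while r ** (e + 1) <= 2 ** n:   # largest e with r^e <= 2^n
            e += 1
        B *= r ** e
    return B

def pollard_p_minus_1(N, B, tries=20):
    """Return a non-trivial factor of N, or None if every try fails."""
    for _ in range(tries):
        x = random.randrange(2, N)
        d = math.gcd(x, N)
        if d != 1:
            return d                    # x was not in Z_N^*: already a factor
        d = math.gcd(pow(x, B, N) - 1, N)
        if d not in (1, N):
            return d
    return None

p, q = 211, 227                         # toy 8-bit primes (assumed example)
B = choose_B(n=8, k=4)                  # uses the primes 2, 3, 5, 7
factor = pollard_p_minus_1(p * q, B)
```

Here (p − 1) divides B while (q − 1) does not, so the gcd isolates p unless the sampled x happens to satisfy x = ±1 mod q, in which case another x is tried.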
Pollard’s p − 1 algorithm is thwarted if both p − 1 and q − 1 have any large prime factors. (More precisely, the algorithm still works but only for B so large that the algorithm becomes impractical.) If p and q are uniform n-bit primes, then it is unlikely that either p − 1 or q − 1 will have only small prime factors. Nevertheless, when generating a modulus N = pq for cryptographic applications, p and q are sometimes chosen to be strong primes. (Recall that p is a strong prime if (p − 1)/2 is also prime.) Selecting p and q in this way is markedly less efficient than simply choosing p and q as arbitrary (random) primes. Because better factoring algorithms are available anyway (as we will see below), and due to the observation above, the current consensus is that the added computational cost of generating p and q as strong primes does not yield any appreciable security gains.
9.1.2 Pollard’s Rho Algorithm
Pollard’s rho algorithm can be used to factor an arbitrary integer N = pq; in that sense, it is a general-purpose factoring algorithm. Heuristically, the algorithm factors N with constant probability in $O(N^{1/4} \cdot \mathrm{polylog}(N))$ time; this is still exponential, but is a vast improvement over trial division.
The core idea of the approach is to find distinct values $x, x' \in \mathbb{Z}_N^*$ that are equivalent modulo p (i.e., for which x = x′ mod p); call such a pair good. Note that for a good pair x, x′ it holds that gcd(x − x′, N) = p (since x ≠ x′ mod N), so computing the gcd gives a non-trivial factor of N.
How can we find a good pair? Say we choose values $x^{(1)}, \ldots, x^{(k)}$ uniformly from $\mathbb{Z}_N^*$, where $k = 2^{n/2} = O(\sqrt{p})$. Viewing these in their Chinese-remaindering representation as $(x_p^{(1)}, x_q^{(1)}), \ldots, (x_p^{(k)}, x_q^{(k)})$, we have that each $x_p^{(i)} \stackrel{\text{def}}{=} [x^{(i)} \bmod p]$ is uniform in $\mathbb{Z}_p^*$. (This follows from bijectivity between $\mathbb{Z}_N^*$ and $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$.) Thus, using the birthday bound of Lemma A.16, we see that with high probability there exist distinct i, j with $x_p^{(i)} = x_p^{(j)}$ or, equivalently, $x^{(i)} = x^{(j)} \bmod p$. Moreover, Lemma A.15 shows that $x^{(i)} \neq x^{(j)}$ except with negligible probability. Thus, with high probability we obtain a good pair $x^{(i)}, x^{(j)}$ that can be used to find a non-trivial factor of N, as discussed earlier.
We can generate $k = O(\sqrt{p})$ uniform elements of $\mathbb{Z}_N^*$ in $O(\sqrt{p}) = O(N^{1/4})$ time. Testing all pairs of elements in order to identify a good pair, however, would require $\binom{k}{2} = O(k^2) = O(p) = O(N^{1/2})$ time! (Note that since p is unknown we cannot simply compute $x_p^{(1)}, \ldots, x_p^{(k)}$ explicitly and then sort the $x_p^{(i)}$ to find a good pair. Instead, for all distinct pairs i, j we must compute $\gcd(x^{(i)} - x^{(j)}, N)$ to see whether this gives a non-trivial factor of N.) Without further optimizations, this will be no better than trial division.
Pollard’s idea was to use a technique we have seen in Section 5.4.2 in the context of small-space birthday attacks. Specifically, we compute the sequence $x^{(1)}, x^{(2)}, \ldots$ by letting each value be a function of the one before it; i.e., we fix some function $F : \mathbb{Z}_N^* \to \mathbb{Z}_N^*$, choose a uniform $x^{(0)} \in \mathbb{Z}_N^*$, and then set $x^{(i)} := F(x^{(i-1)})$ for i = 1, …, k. We require F to have the property that if x = x′ mod p, then F(x) = F(x′) mod p; this ensures that once equivalence modulo p occurs, it persists. (A standard choice is $F(x) = [x^2 + 1 \bmod N]$, but any polynomial F will have this property.) If we model F as a random function (which works heuristically), then with high probability there is a good pair in the first k elements of this sequence. Proceeding roughly as in Algorithm 5.9 from Section 5.4.2, we can detect a good pair (if there is one) using only O(k) gcd computations; see Algorithm 9.2. In addition to improving the running time, Pollard’s idea also drastically reduces the amount of memory needed.
ALGORITHM 9.2
Pollard’s rho algorithm for factoring
Input: Integer N, a product of two n-bit primes
Output: A non-trivial factor of N
x^(0) ← Z*_N, x′ := x := x^(0)
for i = 1 to 2^(n/2):
    x := F(x)
    x′ := F(F(x′))
    p := gcd(x − x′, N)
    if p ∉ {1, N} return p and stop
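As an illustration, the rho method with the standard choice F(x) = [x² + 1 mod N] might be sketched as follows. The restart wrapper `factor`, its default of 50 restarts, and the toy modulus 503 · 751 are assumptions made for this sketch; a single run can fail, so the wrapper simply retries from a fresh starting point.

```python
import math
import random

def pollard_rho(N, steps):
    """One run with F(x) = x^2 + 1 mod N; returns a factor of N or None."""
    def F(x):
        return (x * x + 1) % N

    x = x_prime = random.randrange(1, N)    # starting point x^(0)
    for _ in range(steps):
        x = F(x)                            # advances one step per iteration
        x_prime = F(F(x_prime))             # advances two steps per iteration
        p = math.gcd(x - x_prime, N)
        if p not in (1, N):
            return p
    return None

def factor(N, n, restarts=50):
    """Retry with fresh starting points; n is the bit length of each prime."""
    for _ in range(restarts):
        p = pollard_rho(N, 1 << (n // 2))   # 2^(n/2) iterations per run
        if p is not None:
            return p
    return None

N = 503 * 751                               # toy modulus (assumed example)
found = factor(N, n=10)
```

Only the two running values x and x′ are stored, which reflects the constant-memory property noted above.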
9.1.3 The Quadratic Sieve Algorithm
Pollard’s rho algorithm is better than trial division, but still runs in exponential time. The quadratic sieve algorithm runs in sub-exponential time. It was the fastest known factoring algorithm until the early 1990s and remains the factoring algorithm of choice for numbers up to about 300 bits long. We describe the general principles of the algorithm but caution the reader that several important details are omitted.
An element $z \in \mathbb{Z}_N^*$ is a quadratic residue modulo N if there is an $x \in \mathbb{Z}_N^*$ such that $x^2 = z \bmod N$; in this case, we say that x is a square root of z. The following observations serve as our starting point:
• If N is a product of two distinct, odd primes, then every quadratic residue modulo N has exactly four square roots. (See Section 13.4.2.)
• Given x, y with $x^2 = y^2 \bmod N$ and $x \neq \pm y \bmod N$, it is possible to compute a nontrivial factor of N in polynomial time. This is by virtue of the fact that $x^2 = y^2 \bmod N$ implies
\[ 0 = x^2 - y^2 = (x - y)(x + y) \bmod N, \]
and so $N \mid (x - y)(x + y)$. However, $N \nmid (x - y)$ and $N \nmid (x + y)$ because $x \neq \pm y \bmod N$. So it must be the case that gcd(x − y, N) is equal to one of the prime factors of N. (See also Lemma 13.35.)
The quadratic sieve algorithm tries to generate x, y with $x^2 = y^2 \bmod N$ and $x \neq \pm y \bmod N$. A naive way of doing this (which forms the basis of an older factoring algorithm due to Fermat) is to choose an $x \in \mathbb{Z}_N^*$, compute $q := [x^2 \bmod N]$, and then check whether q is a square over the integers (i.e., without reduction modulo N). If so, then $q = y^2$ for some integer y and so $x^2 = y^2 \bmod N$. Unfortunately, the probability that $[x^2 \bmod N]$ is a square is so low that this process must be repeated exponentially many times.
A significant improvement is obtained by generating a sequence of values $q_1 := [x_1^2 \bmod N], \ldots$ and identifying a subset of those values whose product is a square over the integers. In the quadratic sieve algorithm this is accomplished using the following two steps:
Step 1. Fix some bound B. Say an integer is B-smooth if all its prime factors are less than or equal to B. In the first phase of the algorithm, we search for integers of the form $q_i = [x_i^2 \bmod N]$ that are B-smooth and factor them. (Although factoring is hard, finding and factoring B-smooth numbers is feasible when B is small enough.) These $\{x_i\}$ are chosen by trying $x = \sqrt{N} + 1, \sqrt{N} + 2, \ldots$; this ensures a nontrivial reduction modulo N (since $x > \sqrt{N}$) and has the advantage that $q \stackrel{\text{def}}{=} [x^2 \bmod N] = x^2 - N$ is “small” so that q is more likely to be B-smooth.
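The search in Step 1 can be sketched with plain trial division over the primes up to B. This is a simplified illustration, not the sieving procedure that gives the algorithm its name: the function names are assumptions of this sketch, the starting point uses the integer square root ⌊√N⌋ + 1, and the identity [x² mod N] = x² − N is only valid while x² < 2N.

```python
import math

def factor_if_smooth(q, primes):
    """Return the exponent vector of q over `primes`, or None if q is not smooth."""
    exps = []
    for r in primes:
        e = 0
        while q % r == 0:
            q //= r
            e += 1
        exps.append(e)
    return exps if q == 1 else None

def smooth_relations(N, primes, count):
    """Collect `count` triples (x, q, exponents) with q = x^2 - N smooth."""
    relations = []
    x = math.isqrt(N) + 1
    while len(relations) < count:
        q = x * x - N               # equals [x^2 mod N] while x^2 < 2N
        exps = factor_if_smooth(q, primes)
        if exps is not None:
            relations.append((x, q, exps))
        x += 1
    return relations

N = 377753                          # the modulus used in Example 9.3
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # all primes up to B = 29
relations = smooth_relations(N, primes, count=3)
```

Each returned exponent vector is one row of the system in Equation (9.1); a real implementation would replace the per-candidate trial division with sieving.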
Let $\{p_1, \ldots, p_k\}$ be the set of prime numbers less than or equal to B. Once we have found and factored the B-smooth $\{q_i\}$ as described above, we have a set of equations of the form:
\[
\begin{aligned}
q_1 = [x_1^2 \bmod N] &= \prod_{i=1}^{k} p_i^{e_{1,i}} \\
&\;\;\vdots \\
q_l = [x_l^2 \bmod N] &= \prod_{i=1}^{k} p_i^{e_{l,i}}.
\end{aligned}
\qquad (9.1)
\]
(Note that the above equations are over the integers.)
Step 2. We next want to find some subset of the $\{q_i\}$ whose product is a square. If we multiply some subset S of the $\{q_i\}$, we see that the result
\[ z = \prod_{j \in S} q_j = \prod_{i=1}^{k} p_i^{\sum_{j \in S} e_{j,i}} \]
is a square if and only if the exponent of each prime $p_i$ is even. This suggests that we care about the exponents $\{e_{j,i}\}$ in Equation (9.1) only modulo 2; moreover, we can use linear algebra to find a subset of the $\{q_i\}$ whose “exponent vectors” sum to the 0-vector modulo 2.
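In more detail: if we reduce the exponents in Equation (9.1) modulo 2, we obtain the 0/1-matrix Γ given by
\[
\begin{pmatrix}
\gamma_{1,1} & \gamma_{1,2} & \cdots & \gamma_{1,k} \\
\vdots & \vdots & \ddots & \vdots \\
\gamma_{l,1} & \gamma_{l,2} & \cdots & \gamma_{l,k}
\end{pmatrix}
\stackrel{\text{def}}{=}
\begin{pmatrix}
[e_{1,1} \bmod 2] & [e_{1,2} \bmod 2] & \cdots & [e_{1,k} \bmod 2] \\
\vdots & \vdots & \ddots & \vdots \\
[e_{l,1} \bmod 2] & [e_{l,2} \bmod 2] & \cdots & [e_{l,k} \bmod 2]
\end{pmatrix}.
\]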
If l = k + 1, then Γ has more rows than columns and there must be some nonempty subset S of the rows that sum to the 0-vector modulo 2. Such a subset can be found efficiently using linear algebra. Then:
\[ z \stackrel{\text{def}}{=} \prod_{j \in S} q_j = \prod_{i=1}^{k} p_i^{\sum_{j \in S} e_{j,i}} = \Bigl( \prod_{i=1}^{k} p_i^{(\sum_{j \in S} e_{j,i})/2} \Bigr)^2, \]
using the fact that all the $\sum_{j \in S} e_{j,i}$ are even. Since
\[ z = \prod_{j \in S} q_j = \prod_{j \in S} x_j^2 = \Bigl( \prod_{j \in S} x_j \Bigr)^2 \bmod N, \]
we have obtained two square roots (modulo N) of z. Although there is no guarantee that these square roots will enable factorization of N (for reasons discussed at the beginning of this section), heuristically they do with constant probability. By taking l > k + 1 we can obtain multiple subsets S with the desired property and try to factor N using each possibility.
Example 9.3
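Take N = 377753. We have $6647 = [620^2 \bmod N]$, and we can factor 6647 (over the integers, without any modular reduction) as
\[ 620^2 \bmod N = 6647 = 17^2 \cdot 23. \]
Similarly,
\[
\begin{aligned}
621^2 \bmod N &= 2^4 \cdot 17 \cdot 29 \\
645^2 \bmod N &= 2^7 \cdot 13 \cdot 23 \\
655^2 \bmod N &= 2^3 \cdot 13 \cdot 17 \cdot 29.
\end{aligned}
\]
Letting our subset S include all four of the above equations, we see that
\[
\begin{aligned}
& 620^2 \cdot 621^2 \cdot 645^2 \cdot 655^2 = 2^{14} \cdot 13^2 \cdot 17^4 \cdot 23^2 \cdot 29^2 \bmod N \\
\Rightarrow\; & [620 \cdot 621 \cdot 645 \cdot 655 \bmod N]^2 = [2^7 \cdot 13 \cdot 17^2 \cdot 23 \cdot 29 \bmod N]^2 \bmod N \\
\Rightarrow\; & 127194^2 = 45335^2 \bmod N,
\end{aligned}
\]
with 127194 ≠ ±45335 mod N. Computing gcd(127194 − 45335, 377753) = 751 yields a non-trivial factor of N. ♦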
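The linear-algebra step can be replayed on the four relations of this example. The sketch below is one simple way to find a dependency modulo 2 (Gaussian elimination on bitmasks); the function names and the encoding of rows as Python integers are choices made here, not something prescribed by the text.

```python
import math

def find_dependency(vectors):
    """Indices of a nonempty subset of 0/1 rows summing to 0 mod 2, or None."""
    basis = {}                              # pivot bit -> (row bits, subset mask)
    for idx, vec in enumerate(vectors):
        row = sum((b & 1) << i for i, b in enumerate(vec))
        combo = 1 << idx                    # tracks which input rows were combined
        while row:
            pivot = row.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (row, combo)
                break
            brow, bcombo = basis[pivot]
            row ^= brow
            combo ^= bcombo
        if row == 0:
            return [i for i in range(len(vectors)) if (combo >> i) & 1]
    return None

def square_roots_from_relations(N, primes, xs, exps):
    """Return (S, x, y) with x^2 = y^2 mod N built from the subset S."""
    S = find_dependency([[e % 2 for e in row] for row in exps])
    if S is None:
        return None
    x = 1
    for j in S:
        x = x * xs[j] % N
    y = 1
    for i, r in enumerate(primes):
        total = sum(exps[j][i] for j in S)  # even, by choice of S
        y = y * pow(r, total // 2, N) % N
    return S, x, y

N = 377753
primes = [2, 13, 17, 23, 29]
xs = [620, 621, 645, 655]
exps = [[0, 0, 2, 1, 0], [4, 0, 1, 0, 1], [7, 1, 0, 1, 0], [3, 1, 1, 0, 1]]
S, x, y = square_roots_from_relations(N, primes, xs, exps)
factor = math.gcd(x - y, N)
```

On this input the only dependency uses all four rows, and the resulting values of x, y and the gcd agree with the numbers worked out in the example.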
Running time. Choosing a larger value of B makes it more likely that a uniform value $q = [x^2 \bmod N]$ is B-smooth; on the other hand, it means we have to work harder to identify and factor B-smooth numbers, and we will have to find more of them (since we require l > k, where k is the number of primes less than or equal to B). It also means that the matrix Γ will be larger, and so the linear-algebraic step will be slower. Choosing the optimal value of B gives an algorithm that (heuristically, at least) factors N in time $2^{O(\sqrt{\log N \cdot \log\log N})}$. (In fact, the constant term in the exponent can be determined quite precisely.) The important point for our purposes is that this is sub-exponential in the length of N.
9.2 Algorithms for Computing Discrete Logarithms
Let G be a group of known order q. An instance of the discrete-logarithm problem in G specifies a base g ∈ G (which need not be a generator of G) and an element h ∈ ⟨g⟩, the subgroup generated by g; the goal is to find x such that $g^x = h$. (See Section 8.3.2.) The solution x is called the discrete logarithm of h with respect to g. A trivial brute-force search for x can be done in time |⟨g⟩| ≤ q (by simply trying all possible values), and so we will only be interested in algorithms whose running time is better than this.
Algorithms for solving the discrete-logarithm problem fall into two categories: those that are generic and apply to arbitrary groups, and those that are tailored to work for some specific class of groups. We begin by discussing three generic algorithms:
• When the group order q is not prime and the factorization of q is known, or easy to determine, the Pohlig–Hellman algorithm reduces the problem of finding discrete logarithms in G to that of finding discrete logarithms in prime-order subgroups of G. Roughly speaking, the effect is that the complexity of solving the discrete logarithm in a group of order q is no greater than the complexity of solving the discrete logarithm in a group of order q′, where q′ is the largest prime dividing q. This explains the preference for using prime-order groups (cf. Section 8.3.2).
• The baby-step/giant-step method, due to Shanks, computes the discrete logarithm in a group of order q in time O(√q · polylog(q)) and storing O(√q) group elements.
• Pollard’s rho algorithm also enables computation of discrete logarithms in time O(√q · polylog(q)), but using constant memory. It can be viewed as exploiting the connection between the discrete-logarithm problem and collision-resistant hashing that we have seen in Section 8.4.2. We describe a different algorithm that more clearly illustrates this connection.
It can be shown that the time complexity of the latter two algorithms is optimal as far as generic algorithms are concerned. Thus, to have any hope of doing better we must look at algorithms for specific groups that exploit the representation of group elements in those groups, where by “representation” we mean the way group elements are encoded as bit-strings.
This point bears some discussion. From a mathematical point of view, any two cyclic groups of the same order are isomorphic, meaning that the groups are identical up to a “renaming” of the group elements. From a computational/algorithmic point of view, however, this “renaming” can have a significant impact. For example, consider the cyclic group $\mathbb{Z}_q$ of integers {0, …, q − 1} under addition modulo q. Computing discrete logarithms in this group is trivial. Say we are given $g, h \in \mathbb{Z}_q$ with g a generator (so g ≠ 0), and we want to find x such that $x \cdot g = h \bmod q$. Since g ≠ 0 we have gcd(g, q) = 1 and so g has a multiplicative inverse $g^{-1}$ modulo q. Moreover, $g^{-1}$ can be computed efficiently, as described in Appendix B.2.2. But then $x = h \cdot g^{-1} \bmod q$ is the desired solution. Note that, formally, x here denotes an integer and not a group element; after all, the group operation is addition, not multiplication. Nevertheless, in solving the discrete-logarithm problem in $\mathbb{Z}_q$ we can make use of the fact that another operation (namely, multiplication) can be defined on the elements of that group.
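A short sketch makes the point concrete: in the additive group the "discrete logarithm" is a single modular division. The function name and the toy values q = 101, g = 7 are assumptions of this sketch; `pow(g, -1, q)` is Python's built-in modular inverse (Python 3.8+).

```python
def dlog_additive(g, h, q):
    """Solve x * g = h mod q in the additive group Z_q (needs gcd(g, q) = 1)."""
    return (h * pow(g, -1, q)) % q

q, g = 101, 7                 # toy prime modulus and generator (assumed values)
secret = 58
h = (secret * g) % q          # the "group exponentiation" is just multiplication
recovered = dlog_additive(g, h, q)
```

The cost is one extended-gcd computation, in contrast with the roughly √q group operations needed by the generic algorithms discussed in this section.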
Turning to groups with cryptographic significance, we focus our attention on (subgroups of) $\mathbb{Z}_p^*$ for p prime. (See Section 8.3.3.) We give a high-level overview of the index calculus algorithm for solving the discrete-logarithm problem in such groups in sub-exponential time. Currently, the best known algorithm for this class of groups is the number field sieve,² which heuristically runs in time $2^{O((\log p)^{1/3} \cdot (\log\log p)^{2/3})}$. Sub-exponential algorithms for computing discrete logarithms in multiplicative subgroups of finite fields of large characteristic are also known. In early 2013, significant advances were made in algorithms for computing discrete logarithms in multiplicative subgroups of finite fields of small characteristic; it seems prudent to avoid using such groups for cryptographic applications.
Importantly, no sub-exponential algorithms are known for computing discrete logarithms in certain elliptic-curve groups. This means that for a given security level, we can use groups of smaller order when working in elliptic-curve groups as compared to, say, working in $\mathbb{Z}_p^*$, resulting in cryptographic schemes with better asymptotic efficiency.
9.2.1 The Pohlig–Hellman Algorithm
The Pohlig–Hellman algorithm can be used to speed up the computation of discrete logarithms in a group G when any non-trivial factors of the group order q are known. Recall that the order of an element g, which we denote here by ord(g), is the smallest positive integer i for which $g^i = 1$. We will need the following lemma:
LEMMA 9.4 Let ord(g) = q, and say p | q. Then $\mathrm{ord}(g^p) = q/p$.
PROOF Since $(g^p)^{q/p} = g^q = 1$, the order of $g^p$ is at most q/p. Let i > 0 be such that $(g^p)^i = 1$. Then $g^{pi} = 1$ and, since q is the order of g, we must have pi ≥ q or equivalently i ≥ q/p. The order of $g^p$ is thus exactly q/p.
We will also use a generalization of the Chinese remainder theorem: if $q = \prod_{i=1}^{k} q_i$ and $\gcd(q_i, q_j) = 1$ for all i ≠ j then
\[ \mathbb{Z}_q \simeq \mathbb{Z}_{q_1} \times \cdots \times \mathbb{Z}_{q_k} \quad\text{and}\quad \mathbb{Z}_q^* \simeq \mathbb{Z}_{q_1}^* \times \cdots \times \mathbb{Z}_{q_k}^*. \]
(This can be proved by induction on k, using the basic Chinese remainder theorem for k = 2.) Moreover, by an extension of the algorithm in Section 8.1.5 it is possible to convert efficiently between the representation of an element as an element of $\mathbb{Z}_q$ and its representation as an element of $\mathbb{Z}_{q_1} \times \cdots \times \mathbb{Z}_{q_k}$.
²It is no accident that the algorithm’s name and its running time are similar to those of the general number field sieve for factoring, since the algorithms share many of the same underlying steps.
We now describe the Pohlig–Hellman algorithm. We are given a generator g and an element h and wish to find an x such that $g^x = h$. Say a factorization $q = \prod_{i=1}^{k} q_i$ is known with the $\{q_i\}$ pairwise relatively prime. (This need not be the complete prime factorization of q.) We know that
\[ \bigl(g^{q/q_i}\bigr)^x = (g^x)^{q/q_i} = h^{q/q_i} \quad\text{for } i = 1, \ldots, k. \qquad (9.2) \]
Letting $g_i \stackrel{\text{def}}{=} g^{q/q_i}$ and $h_i \stackrel{\text{def}}{=} h^{q/q_i}$, we thus have k instances of a discrete-logarithm problem in k smaller groups. Specifically, each problem $g_i^x = h_i$ is in a subgroup of size $\mathrm{ord}(g_i) = q_i$ (by Lemma 9.4). (Note that each such problem only determines $[x \bmod q_i]$; this follows from Proposition 8.53.)
We can solve each of the k resulting instances using any algorithm for solving the discrete-logarithm problem. Solving these instances gives a set of answers $\{x_i\}_{i=1}^{k}$, with $x_i \in \mathbb{Z}_{q_i}$, for which $g_i^{x_i} = h_i = g_i^x$. Proposition 8.53 implies that $x = x_i \bmod q_i$ for all i. By the generalized Chinese remainder theorem discussed earlier, the constraints
\[
\begin{aligned}
x &= x_1 \bmod q_1 \\
&\;\;\vdots \\
x &= x_k \bmod q_k
\end{aligned}
\]
uniquely determine x modulo q, and the desired solution x can be efficiently reconstructed from the $\{x_i\}$.
Example 9.5
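Consider the problem of computing discrete logarithms in $\mathbb{Z}_{31}^*$, a group of order q = 30 = 5 · 3 · 2. Say g = 3 and $h = 26 = g^x$ with x unknown. We have:
\[
\begin{aligned}
(g^{30/5})^x = h^{30/5} &\Rightarrow (3^6)^x = 26^6 \Rightarrow 16^x = 1 \\
(g^{30/3})^x = h^{30/3} &\Rightarrow (3^{10})^x = 26^{10} \Rightarrow 25^x = 5 \\
(g^{30/2})^x = h^{30/2} &\Rightarrow (3^{15})^x = 26^{15} \Rightarrow 30^x = 30.
\end{aligned}
\]
(All the above equations are modulo 31.) We have ord(16) = 5, ord(25) = 3, and ord(30) = 2. Solving each equation, we obtain
\[ x = 0 \bmod 5, \quad x = 2 \bmod 3, \quad\text{and}\quad x = 1 \bmod 2, \]
and so x = 5 mod 30. Indeed, $3^5 = 26 \bmod 31$. ♦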
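The computation in this example can be reproduced with a short sketch of the Pohlig–Hellman reduction in $\mathbb{Z}_p^*$. The helper names are assumptions of this sketch, and the small subgroup problems are solved by brute force purely for brevity; any discrete-logarithm algorithm could be substituted for that step.

```python
def dlog_bruteforce(g, h, p, order):
    """Find x in [0, order) with g^x = h mod p by exhaustive search."""
    acc = 1
    for x in range(order):
        if acc == h:
            return x
        acc = acc * g % p
    raise ValueError("h is not in the subgroup generated by g")

def crt(residues, moduli):
    """Combine x = r_i mod q_i for pairwise coprime q_i into x mod prod(q_i)."""
    x, m = 0, 1
    for r_i, q_i in zip(residues, moduli):
        # Lift x (known mod m) so that it also equals r_i mod q_i.
        t = ((r_i - x) * pow(m, -1, q_i)) % q_i
        x += m * t
        m *= q_i
    return x % m

def pohlig_hellman(g, h, p, q, factors):
    """Discrete log of h to base g in Z_p^*, where ord(g) = q = prod(factors)."""
    residues = []
    for q_i in factors:
        g_i = pow(g, q // q_i, p)       # element of order q_i (Lemma 9.4)
        h_i = pow(h, q // q_i, p)
        residues.append(dlog_bruteforce(g_i, h_i, p, q_i))
    return crt(residues, factors)

x = pohlig_hellman(g=3, h=26, p=31, q=30, factors=[5, 3, 2])
```

The intermediate residues are 0 mod 5, 2 mod 3 and 1 mod 2, matching the three equations solved above.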
If q has (known) prime factorization $q = \prod_{i=1}^{k} p_i^{e_i}$ then, by using the Pohlig–Hellman algorithm, the time to compute discrete logarithms in a group of order q is dominated by the computation of a discrete logarithm in a subgroup of size $\max_i \{p_i^{e_i}\}$. This can be further reduced to computation of a discrete logarithm in a subgroup of size $\max_i \{p_i\}$; see Exercise 9.5.
9.2.2 The Baby-Step/Giant-Step Algorithm
The baby-step/giant-step algorithm computes discrete logarithms in a group of order q using $O(\sqrt{q})$ group operations. The idea is simple. Given a generator g ∈ G, we can imagine the powers of g as forming a cycle
\[ 1 = g^0, g^1, g^2, \ldots, g^{q-2}, g^{q-1}, g^q = 1. \]
We know that h must lie somewhere in this cycle. Computing all the points in this cycle would take Ω(q) time. Instead, we “mark off” the cycle at intervals of size $t \stackrel{\text{def}}{=} \lfloor \sqrt{q} \rfloor$; more precisely, we compute and store the $\lfloor q/t \rfloor + 1 = O(\sqrt{q})$ elements
\[ g^0, g^t, g^{2t}, \ldots, g^{\lfloor q/t \rfloor \cdot t}. \]
(These are the “giant steps.”) Note that the gap between any consecutive “marks” is at most t. Furthermore, we know that $h = g^x$ lies in one of these gaps. Thus, if we next take “baby steps” and compute the t elements
\[ h \cdot g^1, \ldots, h \cdot g^t, \]
each of which corresponds to a “shift” of h, we know that one of these values will be equal to one of the points we have marked off. Say we find $h \cdot g^i = g^{k \cdot t}$. We can then easily compute $\log_g h := [(kt - i) \bmod q]$. Pseudocode for this algorithm follows.
ALGORITHM 9.6
The baby-step/giant-step algorithm
Input: Elements g, h ∈ G; the order q of G
Output: log_g h
t := ⌊√q⌋
for i = 0 to ⌊q/t⌋:
    compute g_i := g^(i·t)
sort the pairs (i, g_i) by their second component
for i = 1 to t:
    compute h_i := h · g^i
    if h_i = g_k for some k, return [kt − i mod q]
The algorithm requires $O(\sqrt{q})$ exponentiations/multiplications in G. (In fact, other than the first value $g_1 = g^t$, each value $g_i$ can be computed using a single multiplication as $g_i := g_{i-1} \cdot g_1$. Similarly, each $h_i$ can be computed as $h_i := h_{i-1} \cdot g$.) Sorting the $O(\sqrt{q})$ pairs $\{(i, g_i)\}$ takes time $O(\sqrt{q} \cdot \log q)$, and we can then use binary search to check if each $h_i$ is equal to some $g_k$ in time O(log q). The overall algorithm thus runs in time $O(\sqrt{q} \cdot \mathrm{polylog}(q))$.
Example 9.7
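We show an application of the algorithm in the cyclic group $\mathbb{Z}_{29}^*$ of order q = 29 − 1 = 28. Take g = 2 and h = 17. We set t = 5 and compute:
\[ 2^0 = 1, \quad 2^5 = 3, \quad 2^{10} = 9, \quad 2^{15} = 27, \quad 2^{20} = 23, \quad 2^{25} = 11. \]
(It should be understood that all operations are in $\mathbb{Z}_{29}^*$.) Then compute:
\[ 17 \cdot 2^1 = 5, \quad 17 \cdot 2^2 = 10, \quad 17 \cdot 2^3 = 20, \quad 17 \cdot 2^4 = 11, \]
and notice that $17 \cdot 2^4 = 11 = 2^{25}$. We thus have $\log_2 17 = 25 - 4 = 21$. ♦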
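The algorithm translates almost line for line into code for a subgroup of $\mathbb{Z}_p^*$. One deliberate departure in this sketch, flagged as an assumption: instead of sorting the pairs and using binary search, the giant steps are kept in a Python dictionary, which gives expected constant-time lookups. The function name and parameter order are also choices made here.

```python
import math

def baby_step_giant_step(g, h, p, q):
    """Return log_g h in Z_p^*, where g has order q; None if no match is found."""
    t = math.isqrt(q)                   # t = floor(sqrt(q))
    giant = {}                          # maps g^(k*t) -> k
    step = pow(g, t, p)
    acc = 1
    for k in range(q // t + 1):
        giant.setdefault(acc, k)
        acc = acc * step % p            # one multiplication per giant step
    shifted = h
    for i in range(1, t + 1):
        shifted = shifted * g % p       # h * g^i, one multiplication per baby step
        if shifted in giant:
            return (giant[shifted] * t - i) % q
    return None

x = baby_step_giant_step(g=2, h=17, p=29, q=28)
```

Run on the parameters of the example above, the match occurs at the fourth baby step against the giant step $2^{25}$, exactly as in the hand computation.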
9.2.3 Discrete Logarithms from Collisions
A drawback of the baby-step/giant-step algorithm is that it uses a large amount of memory, as it requires storage of $O(\sqrt{q})$ points. We can obtain an algorithm that uses constant memory (and has the same asymptotic running time) by exploiting the connection between the discrete-logarithm problem and collision-resistant hashing shown in Section 8.4.2, and recalling the small-space birthday attack for finding collisions from Section 5.4.2.
We describe the high-level idea. Fix a base g ∈ G and some element h ∈ ⟨g⟩. Recall from the results of Section 8.4.2 that if we define the hash function $H_{g,h} : \mathbb{Z}_q \times \mathbb{Z}_q \to G$ by $H_{g,h}(x_1, x_2) = g^{x_1} h^{x_2}$, then finding a collision in $H_{g,h}$ implies the ability to compute $\log_g h$. (See the proof of Theorem 8.79.) We have thus reduced the problem of computing $\log_g h$ to that of finding a collision in a hash function, something we know how to do in time $O(\sqrt{|G|}) = O(\sqrt{q})$ using a birthday attack! Moreover, a small-space birthday attack will give a collision in the same time and constant space.
It only remains to address a few technical details. One is that the small-space birthday attack described in Section 5.4.2 assumes that the range of the hash function is a subset of its domain; that is not the case here, and in fact (depending on the representation being used for elements of $\mathbb{G}$) it could even be that $H_{g,h}$ is not compressing. A second issue is that the analysis in Section 5.4.2 treated the hash function as a random function, whereas $H_{g,h}$ has a significant amount of algebraic structure.
Pollard’s rho algorithm provides one way to deal with the above issues. We describe a different algorithm that can be viewed as a more direct implementation of the above ideas. (In practice, Pollard’s algorithm would be more efficient, although both algorithms use only $O(\sqrt{q})$ group operations.) Let $F : \mathbb{G} \to \mathbb{Z}_q \times \mathbb{Z}_q$ denote a cryptographic hash function obtained by, e.g., a suitable modification of SHA-1. Define $H : \mathbb{G} \to \mathbb{G}$ by $H(k) \stackrel{\rm def}{=} H_{g,h}(F(k))$. We can use Algorithm 5.9, with natural modifications, to find a collision in $H$ using $O(\sqrt{|\mathbb{G}|}) = O(\sqrt{q})$ evaluations of $H$ in expectation (and constant memory). With overwhelming probability, this yields a collision in $H_{g,h}$. You are asked to flesh out the details in Exercise 9.6.
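One possible way to realize this idea is sketched below for a prime-order subgroup of $\mathbb{Z}_p^*$. It is an illustration under several assumptions that are not from the text: $F$ is built from SHA-256 with a per-attempt salt, the collision search is Floyd's cycle-finding method (in the spirit of the small-space birthday attack), and the function name, the retry loop, and the toy parameters are all hypothetical.

```python
import hashlib

def dlog_by_collision(g, h, p, q, max_attempts=64):
    """Return x in Z_q with g^x = h (mod p); g has prime order q and h is in <g>."""

    def F(k, salt):
        # Hypothetical F : G -> Z_q x Z_q derived from SHA-256 (an assumption).
        digest = hashlib.sha256(f"{salt}:{k}".encode()).digest()
        return (int.from_bytes(digest[:16], "big") % q,
                int.from_bytes(digest[16:], "big") % q)

    def H(k, salt):
        # H(k) = H_{g,h}(F(k)) = g^{x1} * h^{x2}
        x1, x2 = F(k, salt)
        return pow(g, x1, p) * pow(h, x2, p) % p

    for salt in range(max_attempts):
        start = pow(g, salt + 1, p)
        # Phase 1: advance a slow and a fast pointer until they meet.
        slow, fast = H(start, salt), H(H(start, salt), salt)
        while slow != fast:
            slow, fast = H(slow, salt), H(H(fast, salt), salt)
        # Phase 2: walk in lockstep from the start and from the meeting point.
        a, b = start, slow
        if a == b:
            continue                   # start lies on the cycle: no collision
        while H(a, salt) != H(b, salt):
            a, b = H(a, salt), H(b, salt)
        # Now a != b but H(a) = H(b), i.e. g^{x1} h^{x2} = g^{y1} h^{y2}.
        (x1, x2), (y1, y2) = F(a, salt), F(b, salt)
        d = (y2 - x2) % q
        if d == 0:
            continue                   # F(a) = F(b): no information, retry
        x = (x1 - y1) * pow(d, -1, q) % q
        if pow(g, x, p) == h % p:
            return x
    raise RuntimeError("no usable collision found")

# Toy group (far too small for real use): p = 2q + 1, g = 4 has order q = 509.
p, q, g = 1019, 509, 4
print(dlog_by_collision(g, pow(g, 123, p), p, q))
```

Only a constant number of group elements are held at any time, in contrast to the table kept by baby-step/giant-step. The modular inverse via `pow(d, -1, q)` requires Python 3.8 or later.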
It is interesting to observe here that a security proof based on the hardness of the discrete-logarithm problem (namely, that it implies a collision-resistant hash function) leads to a better algorithm for solving that same problem! A little reflection should convince us that this is not surprising: a proof by reduction demonstrates that an attack on some construction (in this case, finding collisions in the hash function) directly yields an attack on the underlying assumption (here, the hardness of the discrete-logarithm problem), which is exactly the property the above algorithm exploits.
9.2.4 The Index Calculus Algorithm
We conclude with a brief look at the (non-generic) index calculus algorithm for computing discrete logarithms in the cyclic group $\mathbb{Z}_p^*$ (for $p$ prime). In contrast to the preceding (generic) algorithms, this approach has sub-exponential running time. The algorithm bears some resemblance to the quadratic sieve algorithm introduced in Section 9.1.3, and we assume readers are familiar with the discussion there. As in that case, we discuss the main ideas of the index calculus method but leave a detailed analysis outside the scope of our treatment. Also, some simplifications are introduced to clarify the presentation.
The index calculus method uses a two-step process. Importantly, the first step requires knowledge only of the modulus p and the base g and so it can be run as a preprocessing step before h—the value whose discrete logarithm we wish to compute—is known. For the same reason, it suffices to run the first step only once in order to solve multiple instances of the discrete-logarithm problem (as long as all those instances share the same p and g).
Step 1. Fix some bound $B$, and let $\{p_1, \ldots, p_k\}$ be the set of prime numbers less than or equal to $B$. In this step, we find $\ell \geq k$ distinct values $x_1, \ldots, x_\ell \in \mathbb{Z}_{p-1}$ for which $g_i \stackrel{\rm def}{=} [g^{x_i} \bmod p]$ is $B$-smooth. This is done by simply choosing uniform $\{x_i\}$ until suitable values are found. Factoring the resulting $B$-smooth numbers, we have the $\ell$ equations:
\[
\begin{aligned}
g^{x_1} &= \prod_{i=1}^{k} p_i^{e_{1,i}} \bmod p \\
&\;\;\vdots \\
g^{x_\ell} &= \prod_{i=1}^{k} p_i^{e_{\ell,i}} \bmod p.
\end{aligned}
\]
Taking discrete logarithms, we can transform these into the linear equations
\[
\begin{aligned}
x_1 &= \sum_{i=1}^{k} e_{1,i} \cdot \log_g p_i \bmod (p-1) \\
&\;\;\vdots \\
x_\ell &= \sum_{i=1}^{k} e_{\ell,i} \cdot \log_g p_i \bmod (p-1).
\end{aligned}
\tag{9.3}
\]
Note that the $\{x_i\}$ and the $\{e_{i,j}\}$ are known, while the $\{\log_g p_i\}$ are unknown.
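The relation-collection step can be sketched as follows. This is a minimal illustration that assumes trial division is good enough for testing smoothness (reasonable only for toy sizes); the helper names `primes_up_to`, `smooth_exponents`, and `collect_relations` are hypothetical, and the sketch skips $x = 0$ because it only yields the trivial relation.

```python
import random
from math import isqrt, prod

def primes_up_to(B):
    """The factor base {p_1, ..., p_k}: all primes <= B (naive test)."""
    return [n for n in range(2, B + 1)
            if all(n % d for d in range(2, isqrt(n) + 1))]

def smooth_exponents(value, factor_base):
    """Exponent vector of value over factor_base, or None if not B-smooth."""
    exps = []
    for prime in factor_base:
        e = 0
        while value % prime == 0:
            value //= prime
            e += 1
        exps.append(e)
    return exps if value == 1 else None

def collect_relations(g, p, B, count, rng):
    """Find `count` distinct x with [g^x mod p] B-smooth; map x -> exponents."""
    factor_base = primes_up_to(B)
    relations = {}
    while len(relations) < count:
        x = rng.randrange(1, p - 1)            # uniform nonzero exponent
        exps = smooth_exponents(pow(g, x, p), factor_base)
        if exps is not None:
            relations[x] = exps
    return factor_base, relations

factor_base, relations = collect_relations(3, 101, 13, 6, random.Random(1))
for x, exps in relations.items():
    # Each relation says g^x = prod p_i^{e_i} mod p.
    assert pow(3, x, 101) == prod(q ** e for q, e in zip(factor_base, exps))
```

Each entry of `relations` corresponds to one row of the linear system (9.3), with the exponent vector as coefficients and `x` as the right-hand side modulo $p - 1$.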
Step 2. Now we are given an element $h$ and want to compute $\log_g h$. Here, we find a value $x \in \mathbb{Z}_{p-1}$ for which $[g^x \cdot h \bmod p]$ is $B$-smooth. (Once again, this is done simply by choosing $x$ uniformly.) Say
\[
g^x \cdot h = \prod_{i=1}^{k} p_i^{e_i} \bmod p \;\;\Rightarrow\;\; x + \log_g h = \sum_{i=1}^{k} e_i \cdot \log_g p_i \bmod (p-1), \tag{9.4}
\]
where $x$ and the $\{e_i\}$ are known. Combined with Equations (9.3)–(9.4), we have $\ell + 1 \geq k + 1$ linear equations in the $k + 1$ unknowns $\{\log_g p_i\}_{i=1}^{k}$ and $\log_g h$. Using linear-algebraic³ methods (and assuming the system of equations is not under-defined), we can solve for each of the unknowns and in particular obtain the desired solution $\log_g h$.
³Technically, things are slightly more complicated since the linear equations are all modulo $p - 1$, which is not prime. Nevertheless, there exist techniques for dealing with this.
Example 9.8
Let $p = 101$, $g = 3$, and $h = 87$. We have $[3^{10} \bmod 101] = 65 = 5 \cdot 13$. Similarly, $[3^{12} \bmod 101] = 80 = 2^4 \cdot 5$ and $[3^{14} \bmod 101] = 13$. We thus have the linear equations
\[
\begin{aligned}
10 &= \log_3 5 + \log_3 13 \bmod 100 \\
12 &= 4 \cdot \log_3 2 + \log_3 5 \bmod 100 \\
14 &= \log_3 13 \bmod 100.
\end{aligned}
\]
We also have $3^5 \cdot 87 = 32 = 2^5 \bmod 101$, or
\[
5 + \log_3 87 = 5 \cdot \log_3 2 \bmod 100. \tag{9.5}
\]
Adding the second and third equations and subtracting the first, we derive $4 \cdot \log_3 2 = 16 \bmod 100$. This doesn’t determine $\log_3 2$ uniquely (since 4 is not invertible modulo 100), but it does tell us that $\log_3 2 = 4$, $29$, $54$, or $79$ (cf. Exercise 9.3). Trying all possibilities gives $\log_3 2 = 29$. Plugging this into Equation (9.5) gives $\log_3 87 = 40$. ♦
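The arithmetic of Example 9.8 can be checked mechanically. The short script below follows the example step by step; the variable names are illustrative only, and the brute-force search over candidates is practical only because the modulus is tiny.

```python
p, g, h = 101, 3, 87
n = p - 1                        # exponents live modulo p - 1 = 100

# The smooth relations used in the example.
assert pow(g, 10, p) == 5 * 13
assert pow(g, 12, p) == 2**4 * 5
assert pow(g, 14, p) == 13
assert pow(g, 5, p) * h % p == 2**5

log13 = 14                       # third equation
log5 = (10 - log13) % n          # first equation: 10 = log5 + log13
rhs = (12 - log5) % n            # second equation: 4 * log2 = rhs (mod 100)

# 4 is not invertible modulo 100, so several candidates remain.
candidates = [x for x in range(n) if (4 * x - rhs) % n == 0]
log2 = next(x for x in candidates if pow(g, x, p) == 2)

log_h = (5 * log2 - 5) % n       # Equation (9.5)
assert pow(g, log_h, p) == h
print(candidates, log2, log_h)   # [4, 29, 54, 79] 29 40
```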
Running time. Choosing a larger value of $B$ makes it more likely that a uniform value in $\mathbb{Z}_p^*$ is $B$-smooth; however, it means we will have to work harder to identify and factor $B$-smooth numbers, and we will have to find more of them. Because the system of equations will be larger, solving the system will take longer. Choosing the optimal value of $B$ gives an algorithm that (heuristically, at least) computes discrete logarithms in $\mathbb{Z}_p^*$ in time $2^{O(\sqrt{\log p \cdot \log \log p})}$. (In fact, the constant term in the exponent can be determined quite precisely.) The important point for our purposes is that this is sub-exponential in the length of $p$.
9.3 Recommended Key Lengths
Knowledge of the best available algorithms for solving various cryptographic problems is essential for determining an appropriate key length to achieve some desired level of security. The following table summarizes the key lengths currently recommended by the US National Institute of Standards and Technology⁴ (NIST) [13]:

                       Discrete Logarithm
RSA               Order-q Subgroup      Elliptic-Curve     Effective
Modulus Length    of Z*_p               Group Order q      Key Length
2048              p: 2048, q: 224       224                112
3072              p: 3072, q: 256       256                128
7680              p: 7680, q: 384       384                192
15360             p: 15360, q: 512      512                256
All figures in the table are measured in bits. The “effective key length” is a value $n$ such that the best known algorithm for solving a problem takes time roughly $2^n$; i.e., the computational difficulty of solving a problem is approximately equivalent to that of performing a brute-force search against a symmetric-key scheme with an $n$-bit key, or the time to find collisions in a hash function with a $2n$-bit output length. NIST deems a 112-bit effective key length acceptable for security until the year 2030, but recommends 128-bit or higher key lengths for applications where security is required beyond then.
⁴Other groups have made their own recommendations; see http://keylength.com.
Given what we have learned in this chapter, it is instructive to look more closely at some of the numbers in the table. The first thing we may notice is that elliptic-curve groups can be used to realize any given level of security with smaller parameters than for RSA or subgroups of $\mathbb{Z}_p^*$. This is simply because no sub-exponential algorithms are known for solving the discrete-logarithm problem in such groups (when chosen appropriately). Achieving $n$-bit security, however, requires an elliptic-curve group whose order $q$ is $2n$ bits long. This is a consequence of the generic algorithms we have seen in this chapter, which solve the discrete-logarithm problem (in any group) in time $O(\sqrt{q}) \approx 2^n$.
Turning to the case of $\mathbb{Z}_p^*$ we see that here, too, a $2n$-bit value of $q$ is needed for $n$-bit security. The length of $p$, however, must be significantly larger, because the number field sieve can be used to compute discrete logarithms in $\mathbb{Z}_p^*$ in time sub-exponential in the length of $p$. (That is, $p$ and $q$ are chosen such that the running time of the number field sieve, which depends on the length of $p$, and the running time of a generic algorithm, which depends on the length of $q$, will be approximately equal.) The practical ramifications of this are that, for any desired security level, elliptic-curve cryptosystems can use significantly smaller parameters, with asymptotically faster group operations for the honest parties, than cryptosystems based on subgroups of $\mathbb{Z}_p^*$.
References and Additional Reading
Pollard’s p−1 algorithm was published in 1974 [139], and his rho method for factoring was described the following year [140]. The quadratic sieve algorithm is due to Pomerance [142], based on earlier ideas of Dixon [60].
The baby-step/giant-step algorithm is due to Shanks [153]. The Pohlig–Hellman algorithm was published in 1978 [138], as was Pollard’s rho algorithm for computing discrete logarithms [141]. The index calculus algorithm as we have described it is by Adleman [4].
The texts by Wagstaff [173], Shoup [159], Crandall and Pomerance [51], Joux [96], and Galbraith [69] provide further information on algorithms for factoring and computing discrete logarithms, including descriptions of the (general) number field sieve. Very recently, an improved algorithm for the discrete-logarithm problem in finite fields of small characteristic was announced [11].
Lower bounds on the running time of generic algorithms for computing discrete logarithms, which asymptotically match the running times of the algorithms described here, were given by Nechaev [133] and Shoup [155].
Lenstra and Verheul [113] provide a comprehensive discussion of how known algorithms for factoring and computing discrete logarithms affect the choice of cryptographic parameters in practice.
Exercises
9.1 In order to speed up the key-generation algorithm for RSA, it has been suggested to generate a prime by generating many small random primes, multiplying them together, and adding one (of course, then checking that the result is prime). Ignoring the question of the probability that such a value really is prime, what do you think of this method?
9.2 In an execution of Algorithm 9.2, define $x^{(i)} \stackrel{\rm def}{=} F^{(i)}(x^{(0)})$. Show that if, in a given execution of the algorithm, there exist $i, j \leq 2^{n/2}$ such that $x^{(i)} \neq x^{(j)}$ but $x^{(i)} = x^{(j)} \bmod p$, then that execution of the algorithm outputs $p$. (The analysis is a little different from the analysis of Algorithm 5.9, since the algorithms, and their goals, are slightly different.)
9.3 (a) Show that if $ab = c \bmod N$ and $\gcd(b, N) = d$, then:
    i. $d \mid c$;
    ii. $a \cdot (b/d) = (c/d) \bmod (N/d)$; and
    iii. $\gcd(b/d, N/d) = 1$.
(b) Describe how to use the above to compute discrete logarithms in $\mathbb{Z}_N$ efficiently even when the base $g$ is not a generator of $\mathbb{Z}_N$.
9.4 Here we show how to solve the discrete-logarithm problem in a cyclic group of order $q = p^e$ in time $O(\mathrm{polylog}(q) \cdot \sqrt{p})$. Given as input a generator $g$ of order $q = p^e$ and a value $h$, we want to compute $x = \log_g h$.
(a) Show how to compute $[x \bmod p]$ in time $O(\mathrm{polylog}(q) \cdot \sqrt{p})$.
Hint: Solve the equation
\[
\left(g^{p^{e-1}}\right)^{x_0} = h^{p^{e-1}}
\]
and use the same ideas as in the Pohlig–Hellman algorithm.
(b) Say $x = x_0 + x_1 \cdot p + \cdots + x_{e-1} \cdot p^{e-1}$ with $0 \leq x_i < p$. In the previous step we determined $x_0$. Show how to compute in $\mathrm{polylog}(q)$ time a value $h_1$ such that $(g^p)^{x_1 + x_2 \cdot p + \cdots + x_{e-1} \cdot p^{e-2}} = h_1$.
(c) Use recursion to obtain the claimed running time for the original problem. (Note that $e = O(\log q)$.)
9.5 Let $q$ have prime factorization $q = \prod_{i=1}^{k} p_i^{e_i}$. Using the result from the previous problem, show a modification of the Pohlig–Hellman algorithm that solves the discrete-logarithm problem in a group of order $q$ in time
\[
O\Big(\mathrm{polylog}(q) \cdot \sum_{i=1}^{k} e_i \sqrt{p_i}\Big) = O\big(\mathrm{polylog}(q) \cdot \max_i \{\sqrt{p_i}\}\big).
\]
9.6 Give pseudocode for the small-space algorithm for computing discrete logarithms described in Section 9.2.3, and give a heuristic analysis of the probability with which it succeeds.
Chapter 10
Key Management and the Public-Key Revolution
10.1 Key Distribution and Key Management
In Chapters 1–7 we have seen how private-key cryptography can be used to ensure secrecy and integrity for two parties communicating over an insecure channel, assuming the two parties are in possession of a shared, secret key. The question we have deferred since Chapter 1, however, is:
How can the parties share a secret key in the first place?
Clearly, the key cannot simply be sent over the public communication channel because an eavesdropping adversary would then be able to observe it en route. Some other mechanism must be used instead.
In some situations, the parties may have access to a secure channel that they can use to reliably share a secret key. One common example is when the two parties are co-located at some time, at which point they can share a key. Alternatively, the parties might be able to use a trusted courier service as a secure channel. We stress that the fact that the parties can share a key—and thus must have access to a secure channel at some point—does not make private-key cryptography useless: in the first example, the parties have a secure channel at one point in time but not indefinitely; in the second example, utilizing the secure channel might be slower and more costly than communicating over an insecure channel.
The above approaches have been used to share keys in government, diplomatic, and military contexts. For example, the “red phone” connecting Moscow and Washington in the 1960s was encrypted using a one-time pad, with keys shared by couriers who flew from one country to the other carrying briefcases full of print-outs. Such approaches can also be used in corporations, e.g., to set up a shared key between a central database and a new employee on his/her first day of work. (We return to this example in the next section.)
Relying on a secure channel to distribute keys, however, does not work well in many other situations. For example, consider a large, multinational corporation in which every pair of employees might need the ability to communicate securely, with their communication protected from other employees as well.
It will be inconvenient, to say the least, for each pair of employees to meet so they can securely share a key; for employees working in different cities, this may even be impossible. Even if the current set of employees could somehow share keys with each other, it would be impractical for them to share keys with new employees who join after this initial sharing is done.
Assuming these N employees are somehow able to securely share keys with each other, another significant drawback is that each employee will have to manage and store N − 1 secret keys (one for each other employee). In fact, this may significantly under-count the number of keys stored by each user, because employees may also need keys to communicate securely with remote resources such as databases, servers, printers, and so on. The proliferation of so many secret keys is a significant logistical problem. Moreover, all these keys must be stored securely. The more keys there are, the harder it is to protect them, and the higher the chance of some keys being stolen by an attacker. Computer systems are often infected by viruses, worms, and other forms of malicious software which can steal secret keys and send them quietly over the network to an attacker. Thus, storing keys on employees’ personal computers is not always a safe solution.
To be clear, potential compromise of secret keys is always a concern, irrespective of the number of keys each party holds. When only a few keys need to be stored, however, there are good solutions available for dealing with this threat. A typical solution today is to store keys on secure hardware such as a smartcard. A smartcard can carry out cryptographic computations using the stored secret keys, ensuring that these keys never make their way onto users’ personal computers. If designed properly, the smartcard can be much more resilient to attack than a personal computer (for example, it typically cannot be infected by malware) and so offers a good means of protecting users’ secret keys. Unfortunately, smartcards are typically quite limited in memory and so cannot store hundreds (or thousands) of keys.
The concerns outlined above can all be addressed, in principle even if not in practice, in “closed” organizations consisting of a well-defined population of users, all of whom are willing to follow the same policies for distributing and storing keys. They break down, however, in “open systems” where users have transient interactions, cannot arrange a physical meeting, and may not even be aware of each other’s existence until the time they want to communicate. This is, in fact, a more common situation than one might at first realize: consider using encryption to send credit-card information to an Internet merchant from whom you have not previously purchased anything, or sending email to someone whom you have never met in person. In such cases, private-key cryptography alone simply does not provide a solution, and we must look further for adequate solutions.
To summarize, there are at least three distinct problems related to the use of private-key cryptography. The first is that of key distribution; the second is that of storing and managing large numbers of secret keys; the third is the inapplicability of private-key cryptography to open systems.
10.2 A Partial Solution: Key-Distribution Centers
One way to address some of the concerns from the previous section is to use a key-distribution center (KDC) to establish shared keys. Consider again the case of a large corporation where all pairs of employees must be able to communicate securely. In such a setting, we can leverage the fact that all employees may trust some entity—say, the system administrator—at least with respect to the security of work-related information. This trusted entity can then act as a KDC and help all the employees share pairwise keys.
When a new employee joins, the KDC can share a key with that employee (in person, in a secure location) as part of that employee’s first day of work. At the same time, the KDC could also distribute shared keys between that employee and all existing employees. That is, when the $i$th employee joins, the KDC could (in addition to sharing a key between itself and this new employee) generate $i - 1$ keys $k_1, \ldots, k_{i-1}$, give these keys to the new employee, and then send key $k_j$ to the $j$th existing employee by encrypting it using the key that employee already shares with the KDC. Following this, the new employee shares a key with every other employee (as well as with the KDC).
A better approach, which avoids requiring employees to store and manage multiple keys, is to utilize the KDC in an online fashion to generate keys “on demand” whenever two employees wish to communicate securely. As before, the KDC will share a (different) key with each employee, something that can be done securely on an employee’s first day of work. Say the KDC shares key $k_A$ with employee Alice, and $k_B$ with employee Bob. At some later time, when Alice wishes to communicate securely with Bob, she can simply send the message “I, Alice, want to talk to Bob” to the KDC. (If desired, this message can be authenticated using the key shared by Alice and the KDC.) The KDC then chooses a new, random key $k$ (called a session key) and sends this key to Alice encrypted using $k_A$, and to Bob encrypted using $k_B$. (This protocol is too simplistic to be used in practice; see further discussion below.) Once Alice and Bob both recover this session key, they can use it to communicate securely. When they are done with their conversation, they can (and should) erase the session key because they can always contact the KDC again if they wish to communicate at some later time.
Consider the advantages of this approach:
1. Each employee needs to store only one long-term secret key (namely, the one they share with the KDC). Employees still need to manage and store session keys, but these are short-term keys that are erased once a communication session concludes.
The KDC needs to store many long-term keys. However, the KDC can be kept in a secure location and be given the highest possible protection against network attacks.
2. When an employee joins the organization, all that must be done is to set up a key between this employee and the KDC. No other employees need to update the set of keys they hold.
Thus, KDCs can alleviate two of the problems we have seen with regard to private-key cryptography: they can simplify key distribution (since only one new key must be shared when an employee joins, and it is reasonable to assume a secure channel between the KDC and that employee on their first day of work), and can reduce the complexity of key storage (since each employee only needs to store a single key). KDCs go a long way toward making private-key cryptography practical in large organizations where there is a single entity who is trusted by everyone.
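To make the storage comparison concrete, the following small sketch counts long-term keys with and without a KDC for an organization of $N$ employees. The function names are hypothetical; the pairwise total $N(N-1)/2$ is simply the number of employee pairs.

```python
def pairwise_key_counts(n):
    """Every pair of the n employees shares its own key (no KDC)."""
    return {"total": n * (n - 1) // 2, "per_employee": n - 1}

def kdc_key_counts(n):
    """Each employee shares a single long-term key with the KDC."""
    return {"total": n, "per_employee": 1, "stored_by_kdc": n}

print(pairwise_key_counts(1000))   # {'total': 499500, 'per_employee': 999}
print(kdc_key_counts(1000))        # {'total': 1000, 'per_employee': 1, 'stored_by_kdc': 1000}
```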
There are, however, some drawbacks to relying on KDCs:
1. A successful attack on the KDC will result in a complete break of the system: an attacker can compromise all keys and subsequently eavesdrop on all network traffic. This makes the KDC a high-value target. Note that even if the KDC is well-protected against external attacks, there is always the possibility of insider attacks by employees who have access to the KDC (for example, the IT manager).
2. The KDC is a single point of failure: if the KDC is down, secure communication is temporarily impossible. If employees are constantly contacting the KDC and asking for session keys to be established, the load on the KDC can be very high, thereby increasing the chances that it may fail or be slow to respond.
A simple solution to the second problem is to replicate the KDC. This works (and is done in practice), but also means that there are now more points of attack on the system. Adding more KDCs also makes it more difficult to add new employees, since updates must be securely propagated to every KDC.
Protocols for key distribution using a KDC. There are a number of protocols in the literature for secure key distribution using a KDC. We mention in particular the Needham–Schroeder protocol, which forms the core of Kerberos, an important and widely used service for performing authentication and supporting secure communication. (Kerberos is used in many universities and corporations, and is the default mechanism for supporting secure networked authentication and communication in Windows and many UNIX systems.) We only highlight one feature of this protocol. When Alice contacts the KDC and asks to communicate with Bob, the KDC does not send the encrypted session key to both Alice and Bob as we have described earlier. Instead, the KDC sends to Alice the session key encrypted under Alice’s key in addition to the session key encrypted under Bob’s key. Alice then forwards the second ciphertext to Bob as in Figure 10.1. The second ciphertext is sometimes called a ticket, and can be viewed as a credential that allows Alice to talk to Bob (and allows Bob to be assured that he is talking to Alice). Indeed, although we have not stressed this point in our discussion, a KDC-based approach can provide a useful means of performing authentication as well. Note also that Alice and Bob need not both be users; Alice might be a user and Bob a resource such as a server, a remote disk, or a printer.
FIGURE 10.1: A general template for key-distribution protocols.
The protocol was designed in this way to reduce the load on the KDC. In the protocol as described, the KDC does not need to initiate a second connection to Bob, and need not worry whether Bob is on-line when Alice initiates the protocol. Moreover, if Alice retains the ticket (and her copy of the session key), then she can re-initiate secure communication with Bob by simply re-sending the ticket to Bob, without the involvement of the KDC at all. (In practice, tickets expire and eventually need to be renewed. But a session could be re-established within some acceptable time period.)
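The message flow of this template can be sketched as below. This is a toy model rather than Needham–Schroeder or Kerberos: the class `KDC`, the helpers `enc`/`dec`, and the use of HMAC-SHA256 as a stand-in pseudorandom function are all assumptions made for illustration, and the encryption provides no integrity, identities, or ticket expiry.

```python
import hashlib
import hmac
import secrets

def prf(key, nonce):
    # HMAC-SHA256 used as a stand-in PRF (an assumption of this sketch).
    return hmac.new(key, nonce, hashlib.sha256).digest()

def enc(key, msg):
    """Toy encryption (r, PRF_key(r) XOR msg) for messages of at most 32 bytes."""
    r = secrets.token_bytes(16)
    return r, bytes(m ^ s for m, s in zip(msg, prf(key, r)))

def dec(key, ciphertext):
    r, body = ciphertext
    return bytes(c ^ s for c, s in zip(body, prf(key, r)))

class KDC:
    def __init__(self):
        self.long_term = {}                # name -> key shared with the KDC

    def enroll(self, name):
        """Done once over a secure channel, e.g. on the first day of work."""
        self.long_term[name] = secrets.token_bytes(32)
        return self.long_term[name]

    def request(self, initiator, responder):
        """Return the session key encrypted for the initiator, plus a ticket."""
        session_key = secrets.token_bytes(32)
        return (enc(self.long_term[initiator], session_key),
                enc(self.long_term[responder], session_key))

kdc = KDC()
k_alice, k_bob = kdc.enroll("Alice"), kdc.enroll("Bob")

for_alice, ticket = kdc.request("Alice", "Bob")   # the KDC talks only to Alice
session_at_alice = dec(k_alice, for_alice)
session_at_bob = dec(k_bob, ticket)               # Alice forwards the ticket to Bob
assert session_at_alice == session_at_bob
```

Because the KDC hands both ciphertexts to Alice, it never needs to open a connection to Bob, which is the load-reduction point made above.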
We conclude by noting that in practice the key that Alice shares with the KDC might be a short, easy-to-memorize password. In this case, many additional security problems arise that must be dealt with. We have also been implicitly assuming an attacker who only passively eavesdrops, rather than one who might actively try to interfere with the protocol. We refer the interested reader to the references at the end of this chapter for more information about how such issues can be addressed.
10.3 Key Exchange and the Diffie–Hellman Protocol
KDCs and protocols like Kerberos are commonly used in practice. But these approaches to the key-distribution problem still require, at some point, a private and authenticated channel that can be used to share keys. (In particular, we assumed the existence of such a channel between the KDC and the employees on their first day.) Thus, they still cannot solve the problem
of key distribution in open systems like the Internet, where there may be no private channel available between two users who wish to communicate.
To achieve private communication without ever communicating over a private channel, a radically different approach is needed. In 1976, Whitfield Diffie and Martin Hellman published a paper with the innocent-looking title “New Directions in Cryptography.” In that work they observed that there is often asymmetry in the world; in particular, there are certain actions that can be easily performed but not easily reversed. For example, padlocks can be locked without a key (i.e., easily), but then cannot be reopened. More strikingly, it is easy to shatter a glass vase but extremely difficult to put it back together again. Algorithmically (and more germane for our purposes), it is easy to multiply two large primes but difficult to recover those primes from their product. (This is exactly the factoring problem discussed in previous chapters.) Diffie and Hellman realized that such phenomena could be used to derive interactive protocols for secure key exchange that allow two parties to share a secret key, via communication over a public channel, by having the parties perform operations that they can reverse but that an eavesdropper cannot.
The existence of secure key-exchange protocols is quite amazing. It means that you and a friend could agree on a secret by simply shouting across a room (and performing some local computation); the secret would be unknown to anyone else, even if they had listened to everything that was said. Indeed, until 1976 it was generally believed that secure communication could not be achieved without first sharing some secret information using a private channel.
The influence of Diffie and Hellman’s paper was enormous. In addition to introducing a fundamentally new way of looking at cryptography, it was one of the first steps toward moving cryptography out of the private domain and into the public one. We quote the first two paragraphs of their paper:
We stand today on the brink of a revolution in cryptography. The development of cheap digital hardware has freed it from the design limitations of mechanical computing and brought the cost of high grade cryptographic devices down to where they can be used in such commercial applications as remote cash dispensers and computer terminals.
In turn, such applications create a need for new types of cryptographic systems which minimize the necessity of secure key distribution channels. . . . At the same time, theoretical developments in information theory and computer science show promise of providing provably secure cryptosystems, changing this ancient art into a science.
Diffie and Hellman were not exaggerating, and the revolution they spoke of was due in great part to their work.
In this section we present the Diffie–Hellman key-exchange protocol. We prove its security against eavesdropping adversaries or, equivalently, under
the assumption that the parties communicate over a public but authenticated channel (so an attacker cannot interfere with their communication). Security against an eavesdropping adversary is a relatively weak guarantee, and in practice key-exchange protocols must satisfy stronger notions of security that are beyond our present scope. (Moreover, we are interested here in the setting where the communicating parties have no prior shared information, in which case there is nothing that can be done to prevent an adversary from impersonating one of the parties. We return to this point later.)
The setting and definition of security. We consider a setting with two parties (traditionally called Alice and Bob) who run a probabilistic protocol $\Pi$ in order to generate a shared, secret key; $\Pi$ can be viewed as the set of instructions for Alice and Bob in the protocol. Alice and Bob begin by holding the security parameter $1^n$; they then run $\Pi$ using (independent) random bits. At the end of the protocol, Alice and Bob output keys $k_A, k_B \in \{0,1\}^n$, respectively. The basic correctness requirement is that $k_A = k_B$. Since we will only deal with protocols that satisfy this requirement, we will speak simply of the key $k = k_A = k_B$ generated by an honest execution of $\Pi$.
We now turn to defining security. Intuitively, a key-exchange protocol is secure if the key output by Alice and Bob is completely unguessable by an eavesdropping adversary. This is formally defined by requiring that an adversary who has eavesdropped on an execution of the protocol should be unable to distinguish the key $k$ generated by that execution (and now shared by Alice and Bob) from a uniform key of length $n$. This is much stronger than simply requiring that the adversary be unable to compute $k$ exactly, and this stronger notion is necessary if the parties will subsequently use $k$ for some cryptographic application (e.g., as a key for a private-key encryption scheme).
Formalizing the above, let Π be a key-exchange protocol, A an adversary, and n the security parameter. We have the following experiment:
The key-exchange experiment $\mathsf{KE}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n)$:
1. Two parties holding $1^n$ execute protocol $\Pi$. This results in a transcript trans containing all the messages sent by the parties, and a key $k$ output by each of the parties.
2. A uniform bit $b \in \{0,1\}$ is chosen. If $b = 0$ set $\hat{k} := k$, and if $b = 1$ then choose $\hat{k} \in \{0,1\}^n$ uniformly at random.
3. $\mathcal{A}$ is given trans and $\hat{k}$, and outputs a bit $b'$.
4. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise. (In case $\mathsf{KE}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1$, we say that $\mathcal{A}$ succeeds.)
$\mathcal{A}$ is given trans to capture the fact that $\mathcal{A}$ eavesdrops on the entire execution of the protocol and thus sees all messages exchanged by the parties. In the real world, $\mathcal{A}$ would not be given any key; in the experiment the adversary is given $\hat{k}$ only as a means of defining what it means for $\mathcal{A}$ to “break” the security of $\Pi$. That is, the adversary succeeds in “breaking” $\Pi$ if it can correctly determine whether the key $\hat{k}$ is the real key corresponding to the given execution of the protocol, or whether $\hat{k}$ is a uniform key that is independent of the transcript. As expected, we say $\Pi$ is secure if the adversary succeeds with probability that is at most negligibly greater than 1/2. That is:
DEFINITION 10.1 A key-exchange protocol Π is secure in the presence of an eavesdropper if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
$$\Pr\left[\mathsf{KE}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1\right] \le \frac{1}{2} + \mathsf{negl}(n).$$
The aim of a key-exchange protocol is almost always to generate a shared key k that will be used by the parties for some further cryptographic purpose, e.g., to encrypt and authenticate their subsequent communication using, say, an authenticated encryption scheme. Intuitively, using a shared key generated by a secure key-exchange protocol should be “as good as” using a key shared over a private channel. It is possible to prove this formally; see Exercise 10.1.
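The experiment and Definition 10.1 can be made concrete with a short simulation. The following is a minimal sketch, not part of the formal treatment: keys are modeled as n-bit integers, and the names `ke_eav_experiment`, `leaky_protocol`, and `matching_adversary` are hypothetical. The toy protocol deliberately sends the key in the clear, so the adversary shown succeeds essentially always, which is exactly the kind of behavior the definition rules out.

```python
import secrets

def ke_eav_experiment(protocol, adversary, n):
    """One run of the eavesdropping experiment; returns 1 if the adversary succeeds."""
    transcript, key = protocol(n)            # step 1: the parties run the protocol
    b = secrets.randbits(1)                  # step 2: uniform bit b
    k_hat = key if b == 0 else secrets.randbits(n)
    b_guess = adversary(transcript, k_hat)   # step 3: adversary sees trans and k_hat
    return 1 if b_guess == b else 0          # step 4: success iff b' = b

def leaky_protocol(n):
    """Hypothetical insecure protocol: the key itself is sent as the only message."""
    key = secrets.randbits(n)
    return [key], key

def matching_adversary(transcript, k_hat):
    """Guess b = 0 exactly when the candidate key appears in the transcript."""
    return 0 if k_hat in transcript else 1

def estimate_success(protocol, adversary, n, trials):
    """Empirical success frequency over independent runs of the experiment."""
    wins = sum(ke_eav_experiment(protocol, adversary, n) for _ in range(trials))
    return wins / trials
```

For a secure protocol, no efficient adversary should achieve an estimate noticeably above 1/2; here the estimate is 1 except with probability about 2^{-n} per run, since a uniform n-bit string almost never equals the real key.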
The Diffie–Hellman key-exchange protocol. We now describe the key-exchange protocol that appeared in the original paper by Diffie and Hellman (although they were less formal than we will be here). Let $\mathcal{G}$ be a probabilistic polynomial-time algorithm that, on input $1^n$, outputs a description of a cyclic group $\mathbb{G}$, its order q (with $\|q\| = n$), and a generator $g \in \mathbb{G}$. (See Section 8.3.2.) The Diffie–Hellman key-exchange protocol is described formally as Construction 10.2 and illustrated in Figure 10.2.
CONSTRUCTION 10.2
• Common input: The security parameter $1^n$
• The protocol:
1. Alice runs $\mathcal{G}(1^n)$ to obtain $(\mathbb{G}, q, g)$.
2. Alice chooses a uniform $x \in \mathbb{Z}_q$, and computes $h_A := g^x$.
3. Alice sends $(\mathbb{G}, q, g, h_A)$ to Bob.
4. Bob receives $(\mathbb{G}, q, g, h_A)$. He chooses a uniform $y \in \mathbb{Z}_q$, and computes $h_B := g^y$. Bob sends $h_B$ to Alice and outputs the key $k_B := h_A^y$.
5. Alice receives $h_B$ and outputs the key $k_A := h_B^x$.
The Diffie–Hellman key-exchange protocol.
FIGURE 10.2: The Diffie–Hellman key-exchange protocol.
In our description, we have assumed that Alice generates $(\mathbb{G}, q, g)$ and sends these parameters to Bob as part of her first message. In practice, these parameters are standardized and are fixed and known to both parties before the protocol begins. In that case Alice need only send $h_A$, and Bob need not wait to receive Alice’s message before computing and sending $h_B$.
It is not hard to see that the protocol is correct: Bob computes the key
$$k_B = h_A^y = (g^x)^y = g^{xy}$$
and Alice computes the key
$$k_A = h_B^x = (g^y)^x = g^{xy},$$
and so $k_A = k_B$. (The observant reader will note that the shared key is a group element, not a bit-string. We will return to this point later.)
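The steps of Construction 10.2 and the correctness argument can be traced in a few lines of code. This is a sketch under stated assumptions: the group is the order-1019 subgroup of $\mathbb{Z}_{2039}^*$ generated by 4, chosen here purely for illustration and far too small to offer any security, and the parameters are fixed in advance rather than produced by $\mathcal{G}(1^n)$. The function names are hypothetical.

```python
import secrets

# Toy parameters (an assumption for illustration only): P = 2*Q + 1 with P and Q
# prime, and G generates the subgroup of order Q in Z_P^*.
P, Q, G = 2039, 1019, 4

def alice_first_message():
    """Alice picks uniform x in Z_q and computes h_A := g^x."""
    x = secrets.randbelow(Q)
    return x, pow(G, x, P)

def bob_respond(h_a):
    """Bob picks uniform y in Z_q, computes h_B := g^y and k_B := h_A^y."""
    y = secrets.randbelow(Q)
    return pow(G, y, P), pow(h_a, y, P)

def alice_finish(x, h_b):
    """Alice computes k_A := h_B^x."""
    return pow(h_b, x, P)

def run_diffie_hellman():
    """Run one execution; return the eavesdropper's view and both output keys."""
    x, h_a = alice_first_message()
    h_b, k_b = bob_respond(h_a)
    k_a = alice_finish(x, h_b)
    transcript = (P, Q, G, h_a, h_b)
    return transcript, k_a, k_b
```

Every execution yields `k_a == k_b`, mirroring the calculation above. Note that the transcript contains only the group parameters, `h_a`, and `h_b`; the exponents x and y never leave the parties.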
Diffie and Hellman did not prove security of their protocol; indeed, the appropriate notions (both the definitional framework as well as the idea of formulating precise assumptions) were not yet in place. Let us see what sort of assumption will be needed in order for the protocol to be secure. A first observation, made by Diffie and Hellman, is that a minimal requirement for security here is that the discrete-logarithm problem be hard relative to $\mathcal{G}$. If not, then an adversary given the transcript (which, in particular, includes $h_A$) can compute the secret value of one of the parties (i.e., x) and then easily compute the shared key using that value. So, hardness of the discrete-logarithm problem is necessary for the protocol to be secure. It is not, however, sufficient, as it is possible that there are other ways of computing the key $k_A = k_B$ without explicitly computing x or y. The computational Diffie–Hellman assumption, which would only guarantee that the key $g^{xy}$ is hard to compute in its entirety from the transcript, does not suffice either. What is required by Definition 10.1 is that the shared key $g^{xy}$ should be indistinguishable from uniform for any adversary given g, $g^x$, and $g^y$. This is exactly the decisional Diffie–Hellman assumption introduced in Section 8.3.2.
As we will see, a proof of security for the protocol follows almost immediately from the decisional Diffie–Hellman assumption. This should not be surprising, as the Diffie–Hellman assumptions were introduced (well after Diffie and Hellman published their paper) as a way of abstracting the properties underlying the (conjectured) security of the Diffie–Hellman protocol. Given this, it is fair to ask whether anything is gained by defining and proving security here. By this point in the book, hopefully you are convinced the answer is yes. Precisely defining secure key exchange forces us to think about exactly what security properties we require; specifying a precise assumption (namely, the decisional Diffie–Hellman assumption) means we can study this assumption independently of any particular application and, once we are convinced of its plausibility, construct other protocols based on it; finally, proving security shows that the assumption does, indeed, suffice for the protocol to meet our desired notion of security.
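In our proof of security, we use a modified version of Definition 10.1 in which it is required that the shared key be indistinguishable from a uniform element of $\mathbb{G}$ rather than from a uniform n-bit string. This discrepancy will need to be addressed before the protocol can be used in practice (after all, group elements are not typically useful as cryptographic keys, and the representation of a uniform group element will not, in general, be a uniform bit-string), and we briefly discuss one standard way to do so following the proof. For now, we let $\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n)$ denote a modified experiment where if $b = 1$ the adversary is given $\hat{k}$ chosen uniformly from $\mathbb{G}$ instead of a uniform n-bit string.

THEOREM 10.3 If the decisional Diffie–Hellman problem is hard relative to $\mathcal{G}$, then the Diffie–Hellman key-exchange protocol Π is secure in the presence of an eavesdropper (with respect to the modified experiment $\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}$).

PROOF Let A be a ppt adversary. Since Pr[b = 0] = Pr[b = 1] = 1/2, we have
$$\Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1\right] = \frac{1}{2} \cdot \Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1 \mid b = 0\right] + \frac{1}{2} \cdot \Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1 \mid b = 1\right].$$
In experiment $\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n)$ the adversary A receives $(\mathbb{G}, q, g, h_A, h_B, \hat{k})$, where $(\mathbb{G}, q, g, h_A, h_B)$ represents the transcript of the protocol execution, and $\hat{k}$ is either the actual key computed by the parties (if $b = 0$) or a uniform group element (if $b = 1$). Distinguishing between these two cases is exactly equivalent to solving the decisional Diffie–Hellman problem. That is,
$$\begin{aligned}
\Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1\right]
&= \frac{1}{2} \cdot \Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1 \mid b = 0\right] + \frac{1}{2} \cdot \Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1 \mid b = 1\right] \\
&= \frac{1}{2} \cdot \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^{xy}) = 0\right] + \frac{1}{2} \cdot \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^z) = 1\right] \\
&= \frac{1}{2} \cdot \left(1 - \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^{xy}) = 1\right]\right) + \frac{1}{2} \cdot \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^z) = 1\right] \\
&= \frac{1}{2} + \frac{1}{2} \cdot \left(\Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^z) = 1\right] - \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^{xy}) = 1\right]\right) \\
&\le \frac{1}{2} + \frac{1}{2} \cdot \left|\Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^z) = 1\right] - \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^{xy}) = 1\right]\right|,
\end{aligned}$$
where the probabilities are all taken over $(\mathbb{G}, q, g)$ output by $\mathcal{G}(1^n)$, and uniform choice of $x, y, z \in \mathbb{Z}_q$. (Note that since g is a generator, $g^z$ is a uniform element of $\mathbb{G}$ when z is uniformly distributed in $\mathbb{Z}_q$.) If the decisional Diffie–Hellman problem is hard relative to $\mathcal{G}$, that exactly means that there is a negligible function negl for which
$$\left|\Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^z) = 1\right] - \Pr\left[\mathcal{A}(\mathbb{G}, q, g, g^x, g^y, g^{xy}) = 1\right]\right| \le \mathsf{negl}(n).$$
We conclude that
$$\Pr\left[\widehat{\mathsf{KE}}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1\right] \le \frac{1}{2} + \frac{1}{2} \cdot \mathsf{negl}(n),$$
completing the proof.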
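The identity at the heart of the proof, namely that the adversary's success probability equals 1/2 plus half of its decisional Diffie–Hellman distinguishing gap, can be checked exactly in a group small enough to enumerate. The sketch below assumes the order-11 subgroup of $\mathbb{Z}_{23}^*$ generated by 4 (an illustrative choice only) and a hypothetical adversary that solves discrete logarithms by brute force, which is feasible only because the group is tiny.

```python
from fractions import Fraction
from itertools import product

P, Q, G = 23, 11, 4   # toy order-11 subgroup of Z_23^* (assumed for illustration)

def dlog(h):
    """Brute-force discrete logarithm base G; feasible only in a tiny group."""
    return next(e for e in range(Q) if pow(G, e, P) == h)

def adversary(h_a, h_b, k_hat):
    """Output 0 ('real key') if and only if k_hat equals g^{xy}."""
    return 0 if pow(h_b, dlog(h_a), P) == k_hat else 1

def prob_outputs_one(real):
    """Exact probability that the adversary outputs 1, over uniform x, y, z in Z_q."""
    hits = total = 0
    for x, y, z in product(range(Q), repeat=3):
        k_hat = pow(G, x * y if real else z, P)
        hits += adversary(pow(G, x, P), pow(G, y, P), k_hat)
        total += 1
    return Fraction(hits, total)

p_real = prob_outputs_one(True)    # Pr[A(G, q, g, g^x, g^y, g^{xy}) = 1]
p_rand = prob_outputs_one(False)   # Pr[A(G, q, g, g^x, g^y, g^z) = 1]

# Success probability in the modified experiment, conditioning on b as in the proof.
success = Fraction(1, 2) * (1 - p_real) + Fraction(1, 2) * p_rand
```

Here the distinguishing gap is 10/11, so the success probability is 21/22, far above 1/2. The theorem says that this cannot happen for an efficient adversary in a group where the decisional Diffie–Hellman problem is hard.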
Uniform group elements vs. uniform bit-strings. The previous theorem shows that the key output by Alice and Bob in the Diffie–Hellman protocol is indistinguishable (for a polynomial-time eavesdropper) from a uniform group element. In order to use the key for subsequent cryptographic applications, as well as to meet Definition 10.1, the key output by the parties should instead be indistinguishable from a uniform bit-string of the appropriate length. The Diffie–Hellman protocol can be modified to achieve this by having the parties apply an appropriate key-derivation function (cf. Section 5.6.4) to the shared group element $g^{xy}$ they each compute.
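As a rough illustration of this last step, the sketch below hashes a fixed-width encoding of the shared group element and truncates the digest. Using plain SHA-256 as the key-derivation function is an assumption made here for brevity, not a recommendation from the text; the function name `derive_key` and the big-endian encoding are likewise hypothetical choices.

```python
import hashlib

def derive_key(group_element: int, p: int, n_bytes: int = 16) -> bytes:
    """Map an element of Z_p^* to an n_bytes-long key by hashing its encoding."""
    if not 0 < n_bytes <= hashlib.sha256().digest_size:
        raise ValueError("n_bytes must be between 1 and 32 for SHA-256")
    width = (p.bit_length() + 7) // 8            # fixed-width, unambiguous encoding
    encoded = group_element.to_bytes(width, "big")
    return hashlib.sha256(encoded).digest()[:n_bytes]
```

Both parties would apply the same function to their copy of $g^{xy}$, so they still agree on the resulting bit-string.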
Active adversaries. So far we have considered only an eavesdropping adversary. Although eavesdropping attacks are by far the most common (as they are the easiest to carry out), they are by no means the only possible attack. Active attacks, in which the adversary sends messages of its own to one or both of the parties, are also a concern, and any protocol used in practice must be resilient to such attacks as well. When considering active attacks, it is useful to distinguish, informally, between impersonation attacks where the adversary impersonates one party while interacting with the other party, and man-in-the-middle attacks where both honest parties are executing the protocol and the adversary is intercepting and modifying messages being sent from one party to the other. We will not formally define security against either class of attacks, as such definitions are rather involved and cannot be achieved without the parties sharing some information in advance. Nevertheless, it is worth remarking that the Diffie–Hellman protocol is completely insecure against man-in-the-middle attacks. In fact, a man-in-the-middle adversary can act in such a way that Alice and Bob terminate the protocol with different keys $k_A$ and $k_B$ that are both known to the adversary, yet neither Alice nor Bob can detect that any attack was carried out. We leave the details of this attack as an exercise.
Diffie–Hellman key exchange in practice. The Diffie–Hellman protocol in its basic form is typically not used in practice due to its insecurity against man-in-the-middle attacks, as discussed above. This does not detract in any way from its importance. The Diffie–Hellman protocol served as the first demonstration that asymmetric techniques (and number-theoretic problems) could be used to alleviate the problems of key distribution in cryptography. Furthermore, the Diffie–Hellman protocol is at the core of standardized key-exchange protocols that are resilient to man-in-the-middle attacks and are in wide use today. One notable example is TLS; see Section 12.8.
10.4 The Public-Key Revolution
In addition to key exchange, Diffie and Hellman also introduced in their ground-breaking work the notion of public-key (or asymmetric) cryptography. In the public-key setting (in contrast to the private-key setting we have studied in Chapters 1–7), a party who wishes to communicate securely generates a pair of keys: a public key that is widely disseminated, and a private key that it keeps secret. (The fact that there are now two different keys is what makes the scheme asymmetric.) Having generated these keys, a party can use them to ensure secrecy for messages it receives using a public-key encryption scheme, or integrity for messages it sends using a digital signature scheme. (See Figure 10.3.) We provide a brief taste of these primitives here, and discuss them in extensive detail in Chapters 11 and 12, respectively.
In a public-key encryption scheme, the public key generated by some party serves as an encryption key; anyone who knows that public key can use it to encrypt messages and generate corresponding ciphertexts. The private key serves as a decryption key and is used by the party who knows it to recover the original message from any ciphertext generated using the matching public key. Furthermore (and it is amazing that something like this exists!), the secrecy of encrypted messages is preserved even against an adversary who knows the encryption key (but not the decryption key). In other words, the (public) encryption key is of no use for an attacker trying to decrypt ciphertexts encrypted using that key. To enable secret communication, then, a receiver can simply send her public key to a potential sender (without having to worry about an eavesdropping adversary who observes her public key), or publicize her public key on her webpage or in some central database. A public-key encryption scheme thus enables private communication without relying on a private channel for key distribution.¹

¹ For now, however, we do assume an authenticated channel that allows the sender to obtain a legitimate copy of the receiver’s public key. In Section 12.7 we show how public-key cryptography can be used to reduce this assumption as well.

             Private-Key Setting             Public-Key Setting
Secrecy      Private-key encryption          Public-key encryption
Integrity    Message authentication codes    Digital signature schemes

FIGURE 10.3: Cryptographic primitives in the private-key and the public-key settings.

A digital signature scheme is a public-key analogue of message authentication codes (MACs). Here, the private key serves as an “authentication key” (more typically called a signing key) that enables the party who knows this key to generate “authentication tags” (i.e., signatures) for messages it sends. The public key acts as a verification key, allowing anyone who knows it to verify signatures issued by the sender. As with MACs, a digital signature scheme can be used to prevent undetected tampering of a message. The fact that verification can be done by anyone who knows the public key of the sender, however, turns out to have far-reaching ramifications. Specifically, it makes it possible to take a document that was signed by Alice and present it to a third party (say, a judge) as proof that Alice indeed signed the document. This property is called non-repudiation and has extensive applications in electronic commerce. For example, it is possible to digitally sign contracts, send signed electronic purchase orders or promises of payments, and so on. Digital signatures are also used for the secure distribution of public keys as part of a public-key infrastructure, as discussed in more detail in Section 12.7.

In their paper, Diffie and Hellman set forth the notion of public-key cryptography but did not give any candidate constructions. A year later, Ron Rivest, Adi Shamir, and Len Adleman proposed the RSA problem and presented the first public-key encryption and digital signature schemes based on the hardness of this problem. Variants of their schemes are among the most widely used cryptographic primitives today. In 1985, Taher El Gamal presented an encryption scheme that is essentially a slight twist on the Diffie–Hellman key-exchange protocol, and is now also widely used. Thus, although Diffie and Hellman did not succeed in constructing a (non-interactive) public-key encryption scheme, they came very close.
We close by summarizing how public-key cryptography addresses the limitations of the private-key setting discussed in Section 10.1:
1. Public-key encryption allows key distribution to be done over public (but authenticated) channels. This can simplify the distribution and updating of key material.
2. Public-key cryptography reduces the need for users to store many secret keys. Consider again the setting of a large corporation where each pair of employees needs the ability to communicate securely. Using public-key cryptography, it suffices for each employee to store just a single private key (their own) and the public keys of all other employees. Importantly, these latter keys do not need to be stored in secret; they could even be stored in some central (public) repository.
3. Finally, public-key cryptography is (more) suitable for open environments where parties who have never previously interacted want the ability to communicate securely. As one commonplace example, an Internet merchant can post their public key on-line; a user making a purchase can then obtain the merchant’s public key, as needed, when they need to encrypt their credit card information.
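The storage savings in the second point can be quantified with simple counting. The sketch below assumes, as in the corporate example, that every pair of N employees must be able to communicate securely and that in the private-key setting each pair shares its own key; the function names are hypothetical.

```python
def secret_keys_private_setting(n_users):
    """Secret keys when every pair of users shares a distinct key.

    Returns (keys stored secretly per user, distinct keys in the whole system).
    """
    per_user = n_users - 1
    total = n_users * (n_users - 1) // 2
    return per_user, total

def secret_keys_public_setting(n_users):
    """Secret keys when each user holds only their own private key.

    The other users' public keys need not be stored secretly at all.
    """
    return 1, n_users
```

For 1000 employees this gives 999 secret keys per employee (499500 in total) in the private-key setting, versus a single secret key per employee in the public-key setting.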
The invention of public-key encryption was a revolution in cryptography. It is no coincidence that until the late 1970s and early 1980s, encryption and cryptography in general belonged to the domain of intelligence and military organizations, and only with the advent of public-key techniques did the use of cryptography spread to the masses.
References and Additional Reading
We have only briefly discussed the problems of key distribution and key management. For more information, we recommend looking at textbooks on network security. Kaufman et al. [102] provide a good treatment of protocols for secure key distribution, what they aim to achieve, and how they work.
We have not made any attempt to capture the full history of the development of public-key cryptography. Others besides Diffie and Hellman were working on similar ideas in the 1970s. One researcher in particular doing similar and independent work was Ralph Merkle, considered by many to be a co-inventor of public-key cryptography (although he published after Diffie and Hellman). We also mention Michael Rabin, who developed constructions of signature schemes and public-key encryption schemes based on the hardness of factoring about one year after the work of Rivest, Shamir, and Adleman [148].
We highly recommend reading the original paper by Diffie and Hellman [58], and refer the reader to the book by Levy [114] for more on the political and historical aspects of the public-key revolution.
Interestingly, aspects of public-key cryptography were discovered in the intelligence community before being published in the open scientific literature. In the early 1970s, James Ellis, Clifford Cocks, and Malcolm Williamson of the British intelligence agency GCHQ invented the notion of public-key cryptography, a variant of RSA encryption, and a variant of the Diffie–Hellman key-exchange protocol. Their work was not declassified until 1997. Although the underlying mathematics of public-key cryptography may have been discovered before 1976, it is fair to say that the widespread ramifications of this new technology were not appreciated until Diffie and Hellman came along.
Exercises
10.1 Let Π be a key-exchange protocol, and (Enc, Dec) be a private-key encryption scheme. Consider the following interactive protocol Π′ for encrypting a message: first, the sender and receiver run Π to generate a shared key k. Next, the sender computes $c \leftarrow \mathsf{Enc}_k(m)$ and sends c to the other party, who decrypts and recovers m using k.
(a) Formulate a definition of indistinguishable encryptions in the presence of an eavesdropper (cf. Definition 3.8) appropriate for this interactive setting.
(b) Prove that if Π is secure in the presence of an eavesdropper and (Enc, Dec) has indistinguishable encryptions in the presence of an eavesdropper, then Π′ satisfies your definition.
10.2 Show that, for either of the groups considered in Sections 8.3.3 or 8.3.4, a uniform group element (expressed using the natural representation) is easily distinguishable from a uniform bit-string of the same length.
10.3 Describe a man-in-the-middle attack on the Diffie–Hellman protocol where the adversary shares a key kA with Alice and a (different) key kB with Bob, and Alice and Bob cannot detect that anything is wrong.
10.4 Consider the following key-exchange protocol:
(a) Alice chooses uniform $k, r \in \{0,1\}^n$, and sends $s := k \oplus r$ to Bob.
(b) Bob chooses uniform $t \in \{0,1\}^n$, and sends $u := s \oplus t$ to Alice.
(c) Alice computes $w := u \oplus r$ and sends w to Bob.
(d) Alice outputs k and Bob outputs $w \oplus t$.
Show that Alice and Bob output the same key. Analyze the security of the scheme (i.e., either prove its security or show a concrete attack).
Chapter 11 Public-Key Encryption
11.1 Public-Key Encryption – An Overview
The introduction of public-key encryption marked a revolution in cryptography. Until that time, cryptographers had relied exclusively on shared, secret keys to achieve private communication. Public-key techniques, in contrast, enable parties to communicate privately without having agreed on any secret information in advance. As we have already noted, it is quite amazing and counterintuitive that this is possible: it means that two people on opposite sides of a room who can only communicate by shouting to each other, and have no initial secret, can talk in such a way that no one else in the room learns anything about what they are saying!
In the setting of private-key encryption, two parties agree on a secret key that can be used, by either party, for both encryption and decryption. Public-key encryption is asymmetric in both these respects. One party (the receiver) generates a pair of keys (pk, sk), called the public key and the private key, respectively. The public key is used by a sender to encrypt a message; the receiver uses the private key to decrypt the resulting ciphertext.
Since the goal is to avoid the need for two parties to meet in advance to agree on any information, how does the sender learn pk? At an abstract level, there are two ways this can occur. Call the receiver Alice and the sender Bob. In the first approach, when Alice learns that Bob wants to communicate with her, she can at that point generate (pk, sk) (assuming she hasn’t done so already) and then send pk to Bob in the clear; Bob can then use pk to encrypt his message. We emphasize that the channel between Alice and Bob may be public, but is assumed to be authenticated, meaning that the adversary cannot modify the public key sent by Alice to Bob (and, in particular, cannot replace it with its own key). See Section 12.7 for a discussion of how public keys can be distributed over unauthenticated channels.
An alternative approach is for Alice to generate her keys (pk, sk) in advance, independently of any particular sender. (In fact, at the time of key generation Alice need not even be aware that Bob wants to talk to her, or even that Bob exists.) Alice can widely disseminate her public key pk by, say, publishing it on her webpage, putting it on her business cards, or placing it in a public directory. Now, anyone who wishes to communicate privately with Alice can look up her public key and proceed as above. Note that multiple senders can communicate multiple times with Alice using the same public key pk for encrypting all their communication.
Note that pk is inherently public, and can thus be learned easily by an attacker, in either of the above scenarios. In the first case, an adversary eavesdropping on the communication between Alice and Bob obtains pk directly; in the second case, an adversary could just as well look up Alice’s public key on its own. We see that the security of public-key encryption cannot rely on secrecy of pk, but must instead rely on secrecy of sk. It is therefore crucial that Alice not reveal her private key to anyone, including the sender Bob.
Comparison to Private-Key Encryption
Perhaps the most obvious difference between private- and public-key encryption is that the former assumes complete secrecy of all cryptographic keys, whereas the latter requires secrecy for only the private key sk. Although this may seem like a minor distinction, the ramifications are huge: in the private-key setting the communicating parties must somehow be able to share the secret key without allowing any third party to learn it, whereas in the public-key setting the public key can be sent from one party to the other over a public channel without compromising security. For parties shouting across a room or, more realistically, communicating over a public network like a phone line or the Internet, public-key encryption is the only option.
Another important distinction is that private-key encryption schemes use the same key for both encryption and decryption, whereas public-key encryption schemes use different keys for each operation. That is, public-key encryption is inherently asymmetric. This asymmetry in the public-key setting means that the roles of sender and receiver are not interchangeable as they are in the private-key setting: a single key-pair allows communication in one direction only. (Bidirectional communication can be achieved in a number of ways; the point is that a single invocation of a public-key encryption scheme forces a distinction between one user who acts as a receiver and other users who act as senders.) In addition, a single instance of a public-key encryption scheme enables multiple senders to communicate privately with a single receiver, in contrast to the private-key case where a secret key shared between two parties enables private communication only between those two parties.
Summarizing and elaborating the preceding discussion, we see that public-key encryption has the following advantages relative to private-key encryption:
• Public-key encryption addresses (to some extent) the key-distribution problem, since communicating parties do not need to secretly share a key in advance of their communication. Two parties can communicate secretly even if all communication between them is monitored.
• When a single receiver is communicating with N senders (e.g., an on-line merchant processing credit card orders from multiple purchasers), it is much more convenient for the receiver to store a single private key sk rather than to share, store, and manage N different secret keys (i.e., one for each sender). In fact, when using public-key encryption the number and identities of potential senders need not be known at the time of key generation. This allows enormous flexibility in “open systems.”
The fact that public-key encryption schemes allow anyone to act as a sender can be a drawback when a receiver only wants to receive messages from one specific individual. In that case, an authenticated (private-key) encryption scheme would be a better choice than public-key encryption.
The main disadvantage of public-key encryption is that it is roughly 2 to 3 orders of magnitude slower than private-key encryption.¹ It can be a challenge to implement public-key encryption in severely resource-constrained devices like smartcards or radio-frequency identification (RFID) tags. Even when a desktop computer is performing cryptographic operations, carrying out thousands of such operations per second (as in the case of an on-line merchant processing credit card transactions) may be prohibitive. Thus, when private-key encryption is an option (i.e., if two parties can securely share a key in advance), then it typically should be used.
In fact, as we will see in Section 11.3, private-key encryption is used in the public-key setting to improve efficiency for the (public-key) encryption of long messages. A thorough understanding of private-key encryption is therefore crucial to appreciate how public-key encryption is implemented in practice.
Secure Distribution of Public Keys
In our entire discussion thus far, we have implicitly assumed that the adversary is passive; that is, the adversary only eavesdrops on communication between the sender and receiver but does not actively interfere with the communication. If the adversary has the ability to tamper with all communication between the honest parties, and these honest parties share no keys in advance, then privacy simply cannot be achieved. For example, if a receiver Alice sends her public key pk to Bob but the adversary replaces it with a key pk′ of its own (for which it knows the matching private key sk′), then even though Bob encrypts his message using pk′ the adversary will easily be able to recover the message (using sk′). A similar attack works if an adversary is able to change the value of Alice’s public key that is stored in some public directory, or if the adversary can tamper with the public key as it is transmitted from the public directory to Bob. If Alice and Bob do not share any information in advance, and are not willing to rely on some mutually trusted third party, there is nothing Alice or Bob can do to prevent active attacks of this sort, or even to tell that such an attack is taking place.²

¹ It is difficult to give an exact comparison since the relative efficiency depends on the exact schemes under consideration as well as various implementation details.
Importantly, our treatment of public-key encryption in this chapter assumes that senders are able to obtain a legitimate copy of the receiver’s public key. (This will be implicit in the security definitions we provide.) That is, we assume secure key distribution. This assumption is made not because active attacks of the type discussed above are of no concern; in fact, they represent a serious threat that must be dealt with in any real-world system that uses public-key encryption. Rather, this assumption is made because there exist other mechanisms for preventing active attacks (see, for example, Section 12.7), and it is therefore convenient (and useful) to decouple the study of secure public-key encryption from the study of secure public-key distribution.
11.2 Definitions
We begin by defining the syntax of public-key encryption. The definition is very similar to Definition 3.7, with the exception that instead of working with just one key, we now have distinct encryption and decryption keys.
DEFINITION 11.1 A public-key encryption scheme is a triple of probabilistic polynomial-time algorithms (Gen, Enc, Dec) such that:
1. The key-generation algorithm Gen takes as input the security parameter $1^n$ and outputs a pair of keys (pk, sk). We refer to the first of these as the public key and the second as the private key. We assume for convenience that pk and sk each has length at least n, and that n can be determined from pk, sk.
2. The encryption algorithm Enc takes as input a public key pk and a message m from some message space (that may depend on pk). It outputs a ciphertext c, and we write this as $c \leftarrow \mathsf{Enc}_{pk}(m)$. (Looking ahead, Enc will need to be probabilistic to achieve meaningful security.)
3. The deterministic decryption algorithm Dec takes as input a private key sk and a ciphertext c, and outputs a message m or a special symbol ⊥ denoting failure. We write this as $m := \mathsf{Dec}_{sk}(c)$.
It is required that, except possibly with negligible probability over (pk, sk) output by $\mathsf{Gen}(1^n)$, we have $\mathsf{Dec}_{sk}(\mathsf{Enc}_{pk}(m)) = m$ for any (legal) message m.
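To make the syntax of Definition 11.1 concrete, here is a minimal sketch of a triple (Gen, Enc, Dec) in the style of the El Gamal scheme mentioned in Chapter 10. It is an illustration under stated assumptions, not a construction taken from this section: the group parameters are fixed and tiny (the order-1019 subgroup of $\mathbb{Z}_{2039}^*$), so the security parameter plays no role and the scheme offers no security. The message space is the subgroup itself, an example of a message space that depends on pk, and `None` stands in for the failure symbol ⊥.

```python
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumption): order-Q subgroup of Z_P^*

def gen():
    """Key generation: returns (pk, sk) with pk = (p, q, g, g^x) and sk = (p, q, x)."""
    x = secrets.randbelow(Q)
    return (P, Q, G, pow(G, x, P)), (P, Q, x)

def enc(pk, m):
    """Probabilistic encryption of a subgroup element m: c = (g^y, h^y * m)."""
    p, q, g, h = pk
    if pow(m, q, p) != 1:
        raise ValueError("message outside the message space")
    y = secrets.randbelow(q)                  # fresh randomness for every ciphertext
    return pow(g, y, p), (pow(h, y, p) * m) % p

def dec(sk, c):
    """Deterministic decryption: m = c2 * c1^(-x); None plays the role of the symbol ⊥."""
    p, q, x = sk
    c1, c2 = c
    if not (0 < c1 < p and 0 < c2 < p):
        return None
    return (c2 * pow(c1, q - x, p)) % p       # c1^(q-x) = c1^(-x) since c1 has order dividing q
```

Decrypting an honestly generated ciphertext always returns the original message here, so this toy instance has no decryption error at all.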
The important difference from the private-key setting is that the key-generation algorithm Gen now outputs two keys instead of one. The public key pk is used for encryption, while the private key sk is used for decryption. Reiterating our earlier discussion, pk is assumed to be widely distributed so that anyone can encrypt messages for the party who generated this key, but sk must be kept private by the receiver in order for security to possibly hold.

² In our “shouting-across-a-room” scenario, Alice and Bob can detect when an adversary interferes with the communication. But this is only because: (1) the adversary cannot prevent Alice’s messages from reaching Bob, and (2) Alice and Bob “share” in advance information (e.g., the sound of their voices) that allows them to “authenticate” their communication.
We allow for a negligible probability of decryption error and, indeed, some of the schemes we present will have a negligible error probability (e.g., if a prime needs to be chosen but with negligible probability a composite is obtained instead). Despite this, we will generally ignore the issue from here on.
For practical usage of public-key encryption, we will want the message space to be $\{0,1\}^n$ or $\{0,1\}^*$ (and, in particular, to be independent of the public key). Although we will sometimes describe encryption schemes using some message space M that does not contain all bit-strings of some fixed length (and that may also depend on the public key), we will in such cases also specify how to encode bit-strings as elements of M. This encoding must be both efficiently computable and efficiently reversible, so the receiver can recover the bit-string that was encrypted.
11.2.1 Security against Chosen-Plaintext Attacks
We initiate our treatment of security by introducing the “natural” counterpart of Definition 3.8 in the public-key setting. Since extensive motivation for this definition (as well as the others we will see) has already been given in Chapter 3, the discussion here will be relatively brief and will focus primarily on the differences between the private-key and the public-key settings.
Given a public-key encryption scheme Π = (Gen, Enc, Dec) and an adversary A, consider the following experiment:
The eavesdropping indistinguishability experiment $\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n)$:
1. Gen(1^n) is run to obtain keys (pk, sk).
2. Adversary A is given pk, and outputs a pair of equal-length messages m0, m1 in the message space.
3. A uniform bit b ∈ {0,1} is chosen, and then a ciphertext c ← Encpk(mb) is computed and given to A. We call c the challenge ciphertext.
4. A outputs a bit b′. The output of the experiment is 1 if b′ = b, and 0 otherwise. If b′ = b we say that A succeeds.
DEFINITION 11.2 A public-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable encryptions in the presence of an eavesdropper if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
$$\Pr[\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
The main difference between the above definition and Definition 3.8 is that here A is given the public key pk. Furthermore, we allow A to choose its messages m0 and m1 based on this public key. This is essential when defining security of public-key encryption since, as discussed previously, we assume that the adversary knows the public key of the recipient.
The seemingly “minor” modification of giving the adversary A the public key pk being used to encrypt the message has a tremendous impact: it effectively gives A access to an encryption oracle for free. (The concept of an encryption oracle is explained in Section 3.4.2.) This is true because the adversary, given pk, can encrypt any message m on its own by simply computing Encpk(m). (As always, A is assumed to know the algorithm Enc.) The upshot is that Definition 11.2 is equivalent to CPA-security (i.e., security against chosen-plaintext attacks), defined in a manner analogous to Definition 3.22, with the only difference being that the attacker is given the public key in the corresponding experiment. We thus have:
PROPOSITION 11.3 If a public-key encryption scheme has indistinguishable encryptions in the presence of an eavesdropper, it is CPA-secure.
This is in contrast to the private-key setting, where there exist schemes that have indistinguishable encryptions in the presence of an eavesdropper but are insecure under a chosen-plaintext attack (see Proposition 3.20). Further differences from the private-key setting that follow almost immediately as consequences of the above are discussed next.
Impossibility of perfectly secret public-key encryption. Perfectly secret public-key encryption could be defined analogously to Definition 2.3 by conditioning on the entire view of an eavesdropper (i.e., including the public key). Equivalently, it could be defined by extending Definition 11.2 to require that for all adversaries A (not only efficient ones), we have:
$$\Pr[\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A},\Pi}(n) = 1] = \frac{1}{2}.$$
In contrast to the private-key setting, however, perfectly secret public-key encryption is impossible, regardless of how long the keys are or how small the message space is. In fact, an unbounded adversary given pk and a ciphertext c computed via c ← Encpk(m) can determine m with probability 1. A proof of this is left as Exercise 11.1.
Insecurity of deterministic public-key encryption. As noted in the context of private-key encryption, no deterministic encryption scheme can be CPA-secure. The same is true here:
THEOREM 11.4 No deterministic public-key encryption scheme is CPA-secure.
Because Theorem 11.4 is so important, it merits a bit more discussion. The theorem is not an “artifact” of our security definition, or an indication that our definition is too strong. Deterministic public-key encryption schemes are vulnerable to practical attacks in realistic scenarios and should never be used. The reason is that a deterministic scheme not only allows the adversary to determine when the same message is sent twice (as in the private-key setting), but also allows the adversary to recover the message, with probability 1, if the set of possible messages being encrypted is small. For example, consider a professor encrypting students’ grades. Here, an eavesdropper knows that each student’s grade must be one of {A, B, C, D, F }. If the professor uses a deterministic public-key encryption scheme, an eavesdropper can quickly determine any student’s actual grade by encrypting all possible grades and comparing the result to the given ciphertext.
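The grade-recovery attack just described can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses a toy deterministic scheme ("plain RSA" with tiny, insecure parameters N = 3233 and e = 17) and encodes each grade as the character code of its letter. The names `enc` and `recover_grade` are hypothetical helpers chosen for this example.

```python
# Toy deterministic public-key scheme: "plain RSA" with tiny parameters.
# These values are assumptions for illustration and offer no security.
N, E = 3233, 17           # public key pk = (N, e), with N = 61 * 53
D = 2753                  # private key; never used by the eavesdropper


def enc(pk, m):
    """Deterministic encryption: the same (pk, m) always yields the same c."""
    n, e = pk
    return pow(m, e, n)


def recover_grade(pk, c, candidates="ABCDF"):
    """Eavesdropper: encrypt every possible grade and compare with c."""
    for grade in candidates:
        if enc(pk, ord(grade)) == c:
            return grade
    return None


pk = (N, E)
c = enc(pk, ord("B"))             # the professor encrypts a student's grade
print(recover_grade(pk, c))       # the eavesdropper needs only pk and c
```

The attack never touches the private key: it relies only on the fact that encryption is deterministic and that the set of candidate messages is small.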
Although the above theorem seems deceptively simple, for a long time many real-world systems were designed using deterministic public-key encryption. When public-key encryption was introduced, it is fair to say that the importance of probabilistic encryption was not yet fully realized. The seminal work of Goldwasser and Micali, in which (something equivalent to) Definition 11.2 was proposed and Theorem 11.4 was stated, marked a turning point in the field of cryptography. The importance of pinning down one’s intuition in a formal definition and looking at things the right way for the first time (even if seemingly simple in retrospect) should not be underestimated.
11.2.2 Multiple Encryptions
As in Chapter 3, it is important to understand the effect of using the same key (in this case, the same public key) for encrypting multiple messages. We could formulate security in such a setting by having an adversary output two lists of plaintexts, as in Definition 3.19. For the reasons discussed in Section 3.4.2, however, we choose instead to use a definition in which the attacker is given access to a “left-or-right” oracle LRpk,b that, on input a pair of equal-length messages m0 , m1 , computes the ciphertext c ← Encpk (mb ) and returns c. The attacker is allowed to query this oracle as many times as it likes, and the definition therefore models security when multiple (unknown) messages are encrypted using the same public key.
Formally, consider the following experiment defined for a public-key encryption scheme Π = (Gen, Enc, Dec) and adversary A:
The LR-oracle experiment $\mathsf{PubK}^{\mathsf{LR\text{-}cpa}}_{\mathcal{A},\Pi}(n)$:
1. Gen(1^n) is run to obtain keys (pk, sk).
2. A uniform bit b ∈ {0, 1} is chosen.
3. The adversary A is given input pk and oracle access to LRpk,b (·, ·).
4. The adversary A outputs a bit b′.
5. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise. If $\mathsf{PubK}^{\mathsf{LR\text{-}cpa}}_{\mathcal{A},\Pi}(n) = 1$, we say that A succeeds.
DEFINITION 11.5 A public-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable multiple encryptions if for all probabilistic polynomial-time adversaries A there exists a negligible function negl such that:
$$\Pr[\mathsf{PubK}^{\mathsf{LR\text{-}cpa}}_{\mathcal{A},\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
We will show that any CPA-secure scheme automatically has indistinguishable multiple encryptions; that is, in the public-key setting, security for encryption of a single message implies security for encryption of multiple messages. This means we can prove security of some scheme with respect to Definition 11.2, which is simpler and easier to work with, and conclude that the scheme satisfies Definition 11.5, a seemingly stronger definition that more accurately models real-world usage of public-key encryption. A proof of the following theorem is given below.
THEOREM 11.6 If public-key encryption scheme Π is CPA-secure, then it also has indistinguishable multiple encryptions.
An analogous result in the private-key setting was stated, but not proved, as Theorem 3.24.
Encrypting arbitrary-length messages. An immediate consequence of Theorem 11.6 is that a CPA-secure public-key encryption scheme for fixed-length messages implies a public-key encryption scheme for arbitrary-length messages satisfying the same notion of security. We illustrate this in the extreme case when the original scheme encrypts only 1-bit messages. Say Π = (Gen, Enc, Dec) is an encryption scheme for single-bit messages. We can construct a new scheme Π′ = (Gen, Enc′, Dec′) that has message space {0, 1}^* by defining Enc′ as follows:
$$\mathsf{Enc}'_{pk}(m) = \mathsf{Enc}_{pk}(m_1), \ldots, \mathsf{Enc}_{pk}(m_l), \tag{11.1}$$
where m = m1 · · · ml. (The decryption algorithm Dec′ is constructed in the obvious way.) We have:
CLAIM 11.7 Let Π and Π′ be as above. If Π is CPA-secure, then so is Π′.
The claim follows from Theorem 11.6, since we can view encryption of the message m = m1 · · · ml using Π′ as encryption of the l messages (m1, ..., ml) using scheme Π.
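As a minimal sketch of the bit-by-bit construction in Equation (11.1), the code below wraps a single-bit scheme Π into a scheme Π′ for bit-strings of any length. The underlying single-bit scheme is an assumption made only so that the example runs: a toy El Gamal-style scheme that encodes the bit b as the group element G^b, with parameters (`P`, `G`) that are far too weak for real use. All function names are hypothetical.

```python
import secrets

# Toy parameters (assumptions for illustration; not a secure choice).
P = 2**127 - 1    # a prime modulus
G = 3             # fixed base


def gen():
    """Return (pk, sk) for the toy single-bit scheme."""
    x = secrets.randbelow(P - 2) + 1
    return pow(G, x, P), x


def enc_bit(pk, b):
    """Randomized encryption of one bit b, encoded as the element G^b."""
    y = secrets.randbelow(P - 2) + 1
    return pow(G, y, P), (pow(pk, y, P) * pow(G, b, P)) % P


def dec_bit(sk, c):
    """Recover the bit by dividing out c1^sk and checking for G^0 = 1."""
    c1, c2 = c
    m = (c2 * pow(c1, P - 1 - sk, P)) % P    # c2 / c1^sk mod P
    return 0 if m == 1 else 1


def enc_prime(pk, bits):
    """Enc'_pk(m) = Enc_pk(m_1), ..., Enc_pk(m_l), as in Equation (11.1)."""
    return [enc_bit(pk, b) for b in bits]


def dec_prime(sk, ciphertexts):
    """Dec' decrypts each component independently."""
    return [dec_bit(sk, c) for c in ciphertexts]


pk, sk = gen()
message = [1, 0, 1, 1, 0]
assert dec_prime(sk, enc_prime(pk, message)) == message
```

Each bit is encrypted with fresh, independent randomness, which is what lets the multiple-encryption argument apply to the list of component ciphertexts.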
A note on terminology. We have introduced three definitions of security for public-key encryption schemes (indistinguishable encryptions in the presence of an eavesdropper, CPA-security, and indistinguishable multiple encryptions) that are all equivalent. Following the usual convention in the cryptographic literature, we will simply use the term “CPA-security” to refer to schemes meeting these notions of security.
*Proof of Theorem 11.6
The proof of Theorem 11.6 is rather involved. We therefore provide some intuition before turning to the details. For this intuitive discussion we assume for simplicity that A makes only two calls to the LR oracle in experiment $\mathsf{PubK}^{\mathsf{LR\text{-}cpa}}_{\mathcal{A},\Pi}(n)$. (In the full proof, the number of calls is arbitrary.)
Fix an arbitrary ppt adversary A and a CPA-secure public-key encryption scheme Π, and consider an experiment $\mathsf{PubK}^{\mathsf{LR\text{-}cpa2}}_{\mathcal{A},\Pi}(n)$ where A can make only two queries to the LR oracle. Denote the queries made by A to the oracle by (m1,0, m1,1) and (m2,0, m2,1); note that the second pair of messages may depend on the first ciphertext obtained by A from the oracle. In the experiment, A receives either a pair of ciphertexts (Encpk(m1,0), Encpk(m2,0)) (if b = 0), or a pair of ciphertexts (Encpk(m1,1), Encpk(m2,1)) (if b = 1). We write A(pk, Encpk(m1,0), Encpk(m2,0)) to denote the output of A in the first case, and analogously for the second.
Let C⃗0 denote the distribution of ciphertext pairs in the first case, and C⃗1 the distribution of ciphertext pairs in the second case. To show that Definition 11.5 holds (for $\mathsf{PubK}^{\mathsf{LR\text{-}cpa2}}_{\mathcal{A},\Pi}$), we need to prove that A cannot distinguish between being given a pair of ciphertexts distributed according to C⃗0, or a pair of ciphertexts distributed according to C⃗1. That is, we need to prove that there is a negligible function negl such that
$$\left| \Pr[\mathcal{A}(pk, \mathsf{Enc}_{pk}(m_{1,0}), \mathsf{Enc}_{pk}(m_{2,0})) = 1] - \Pr[\mathcal{A}(pk, \mathsf{Enc}_{pk}(m_{1,1}), \mathsf{Enc}_{pk}(m_{2,1})) = 1] \right| \le \mathsf{negl}(n). \tag{11.2}$$
(This is equivalent to Definition 11.5 for the same reason that Definition 3.9 is equivalent to Definition 3.8.) To prove this, we will show that
1. CPA-security of Π implies that A cannot distinguish between the case when it is given a pair of ciphertexts distributed according to C⃗0, or a pair of ciphertexts (Encpk(m1,0), Encpk(m2,1)), which corresponds to encrypting the first message in A’s first oracle query and the second message in A’s second oracle query. (Although this cannot occur in $\mathsf{PubK}^{\mathsf{LR\text{-}cpa2}}_{\mathcal{A},\Pi}(n)$, we can still ask what A’s behavior would be if given such a ciphertext pair.) Let C⃗01 denote the distribution of ciphertext pairs in this latter case.
2. Similarly, CPA-security of Π implies that A cannot distinguish between the case when it is given a pair of ciphertexts distributed according to C⃗01, or a pair of ciphertexts distributed according to C⃗1.
The above says that A cannot distinguish between distributions C⃗0 and C⃗01, nor between distributions C⃗01 and C⃗1. We conclude (using simple algebra) that A cannot distinguish between distributions C⃗0 and C⃗1.
The crux of the proof, then, is showing that A cannot distinguish between being given a pair of ciphertexts distributed according to C⃗0, or a pair of ciphertexts distributed according to C⃗01. (The other case follows similarly.) That is, we want to show that there is a negligible function negl for which
$$\left| \Pr[\mathcal{A}(pk, \mathsf{Enc}_{pk}(m_{1,0}), \mathsf{Enc}_{pk}(m_{2,0})) = 1] - \Pr[\mathcal{A}(pk, \mathsf{Enc}_{pk}(m_{1,0}), \mathsf{Enc}_{pk}(m_{2,1})) = 1] \right| \le \mathsf{negl}(n). \tag{11.3}$$
Note that the only difference between the input of the adversary A in each case is in the second element. Intuitively, indistinguishability follows from the single-message case since A can generate Encpk(m1,0) by itself. Formally, consider the following ppt adversary A′ running in experiment $\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A}',\Pi}(n)$:
Adversary A′:
1. On input pk, adversary A′ runs A(pk) as a subroutine.
2. When A makes its first query (m1,0,m1,1) to the LR oracle, A′ computes c1 ← Encpk(m1,0) and returns c1 to A as the response from the oracle.
3. When A makes its second query (m2,0, m2,1) to the LR oracle, A′ outputs (m2,0, m2,1) and receives back a challenge ciphertext c2. This is returned to A as the response from the LR oracle.
4. A′ outputs the bit b′ output by A.
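Looking at experiment $\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A}',\Pi}(n)$, we see that when b = 0 then the challenge ciphertext c2 is computed as Encpk(m2,0). Thus,
$$\Pr[\mathcal{A}'(\mathsf{Enc}_{pk}(m_{2,0})) = 1] = \Pr[\mathcal{A}(\mathsf{Enc}_{pk}(m_{1,0}), \mathsf{Enc}_{pk}(m_{2,0})) = 1]. \tag{11.4}$$
(We suppress explicit mention of pk to save space.) In contrast, when b = 1 in experiment $\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A}',\Pi}(n)$, then c2 is computed as Encpk(m2,1) and so
$$\Pr[\mathcal{A}'(\mathsf{Enc}_{pk}(m_{2,1})) = 1] = \Pr[\mathcal{A}(\mathsf{Enc}_{pk}(m_{1,0}), \mathsf{Enc}_{pk}(m_{2,1})) = 1]. \tag{11.5}$$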
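CPA-security of Π implies that there is a negligible function negl such that
$$\left| \Pr[\mathcal{A}'(\mathsf{Enc}_{pk}(m_{2,0})) = 1] - \Pr[\mathcal{A}'(\mathsf{Enc}_{pk}(m_{2,1})) = 1] \right| \le \mathsf{negl}(n).$$
This, together with Equations (11.4) and (11.5), yields Equation (11.3). In almost exactly the same way, we can prove that:
$$\left| \Pr[\mathcal{A}(pk, \mathsf{Enc}_{pk}(m_{1,0}), \mathsf{Enc}_{pk}(m_{2,1})) = 1] - \Pr[\mathcal{A}(pk, \mathsf{Enc}_{pk}(m_{1,1}), \mathsf{Enc}_{pk}(m_{2,1})) = 1] \right| \le \mathsf{negl}(n). \tag{11.6}$$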
Equation (11.2) follows by combining Equations (11.3) and (11.6).
The main complication that arises in the general case is that the number of queries to the LR oracle is no longer fixed but may instead be an arbitrary polynomial of n. In the formal proof this is handled using a hybrid argument. (Hybrid arguments were used also in Chapter 7.)
PROOF (of Theorem 11.6) Let Π be a CPA-secure public-key encryption scheme and A an arbitrary ppt adversary in experiment $\mathsf{PubK}^{\mathsf{LR\text{-}cpa}}_{\mathcal{A},\Pi}(n)$. Let t = t(n) be a polynomial upper bound on the number of queries made by A to the LR oracle, and assume without loss of generality that A always queries the oracle exactly this many times. For a given public key pk and 0 ≤ i ≤ t, let $\mathsf{LR}^{i}_{pk}$ denote the oracle that on input (m0, m1) returns Encpk(m0) for the first i queries it receives, and returns Encpk(m1) for the next t − i queries it receives. (That is, for the first i queries the first message in the input pair is encrypted, and in response to the remaining queries the second message in the input pair is encrypted.) We stress that each encryption is computed using uniform, independent randomness. Using this notation, we have
$$\Pr[\mathsf{PubK}^{\mathsf{LR\text{-}cpa}}_{\mathcal{A},\Pi}(n) = 1] = \frac{1}{2} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{t}_{pk}}(pk) = 0] + \frac{1}{2} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{0}_{pk}}(pk) = 1]$$
because, from the point of view of A (who makes exactly t queries), oracle $\mathsf{LR}^{t}_{pk}$ is equivalent to $\mathsf{LR}_{pk,0}$, and oracle $\mathsf{LR}^{0}_{pk}$ is equivalent to $\mathsf{LR}_{pk,1}$. To prove that Π satisfies Definition 11.5, we will show that for any ppt A there is a negligible function negl′ such that
$$\left| \Pr[\mathcal{A}^{\mathsf{LR}^{t}_{pk}}(pk) = 1] - \Pr[\mathcal{A}^{\mathsf{LR}^{0}_{pk}}(pk) = 1] \right| \le \mathsf{negl}'(n). \tag{11.7}$$
(As before, this is equivalent to Definition 11.5 for the same reason that Definition 3.9 is equivalent to Definition 3.8.)
Consider the following ppt adversary A′ that eavesdrops on the encryption of a single message:
Adversary A′:
1. A′, given pk, chooses a uniform index i ← {1,...,t}.
2. A′ runs A(pk), answering its jth oracle query (mj,0,mj,1) as follows:
(a) For j < i, adversary A′ computes cj ← Encpk(mj,0) and returns cj to A as the response from its oracle.
(b) For j = i, adversary A′ outputs (mj,0, mj,1) and receives back a challenge ciphertext cj. This is returned to A as the response from its oracle.
(c) For j > i, adversary A′ computes cj ← Encpk(mj,1) and returns cj to A as the response from its oracle.
3. A′ outputs the bit b′ that is output by A.
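Consider experiment $\mathsf{PubK}^{\mathsf{eav}}_{\mathcal{A}',\Pi}(n)$. Fixing some choice of i = i∗, note that if $c_{i^*}$ is an encryption of $m_{i^*,0}$ then the interaction of A with its oracle is identical to an interaction with oracle $\mathsf{LR}^{i^*}_{pk}$. Thus,
$$\Pr[\mathcal{A}' \text{ outputs } 1 \mid b = 0] = \sum_{i^*=1}^{t} \Pr[i = i^*] \cdot \Pr[\mathcal{A}' \text{ outputs } 1 \mid b = 0 \wedge i = i^*] = \sum_{i^*=1}^{t} \frac{1}{t} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{i^*}_{pk}}(pk) = 1].$$
On the other hand, if $c_{i^*}$ is an encryption of $m_{i^*,1}$ then the interaction of A with its oracle is identical to an interaction with oracle $\mathsf{LR}^{i^*-1}_{pk}$, and so
$$\Pr[\mathcal{A}' \text{ outputs } 1 \mid b = 1] = \sum_{i^*=1}^{t} \Pr[i = i^*] \cdot \Pr[\mathcal{A}' \text{ outputs } 1 \mid b = 1 \wedge i = i^*] = \sum_{i^*=1}^{t} \frac{1}{t} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{i^*-1}_{pk}}(pk) = 1] = \sum_{i^*=0}^{t-1} \frac{1}{t} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{i^*}_{pk}}(pk) = 1],$$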
where the third equality is obtained just by shifting the indices of summation. Since A′ runs in polynomial time, the assumption that Π is CPA-secure
means that there exists a negligible function negl such that
$$\left| \Pr[\mathcal{A}' \text{ outputs } 1 \mid b = 0] - \Pr[\mathcal{A}' \text{ outputs } 1 \mid b = 1] \right| \le \mathsf{negl}(n).$$
But this means that
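$$\mathsf{negl}(n) \ge \left| \sum_{i^*=1}^{t} \frac{1}{t} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{i^*}_{pk}}(pk) = 1] - \sum_{i^*=0}^{t-1} \frac{1}{t} \cdot \Pr[\mathcal{A}^{\mathsf{LR}^{i^*}_{pk}}(pk) = 1] \right| = \frac{1}{t} \cdot \left| \Pr[\mathcal{A}^{\mathsf{LR}^{t}_{pk}}(pk) = 1] - \Pr[\mathcal{A}^{\mathsf{LR}^{0}_{pk}}(pk) = 1] \right|,$$
since all but one of the terms in each summation cancel. We conclude that
$$\left| \Pr[\mathcal{A}^{\mathsf{LR}^{t}_{pk}}(pk) = 1] - \Pr[\mathcal{A}^{\mathsf{LR}^{0}_{pk}}(pk) = 1] \right| \le t(n) \cdot \mathsf{negl}(n).$$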
Because t is polynomial, the function t · negl(n) is negligible. Since A was an arbitrary ppt adversary, this shows that Equation (11.7) holds and so completes the proof that Π has indistinguishable multiple encryptions.
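To see the cancellation concretely, it may help to write out the case t = 3. Here $P_i$ is shorthand, introduced only for this illustration, for $\Pr[\mathcal{A}^{\mathsf{LR}^{i}_{pk}}(pk) = 1]$:

```latex
\begin{align*}
\frac{1}{3}\,(P_1 + P_2 + P_3) \;-\; \frac{1}{3}\,(P_0 + P_1 + P_2)
  &= \frac{1}{3}\,(P_3 - P_0), \\
\text{so that}\quad |P_3 - P_0| &\le 3 \cdot \mathsf{negl}(n).
\end{align*}
```

The middle terms $P_1$ and $P_2$ appear in both sums and cancel, leaving only the two extreme hybrids; the factor 3 here plays the role of t(n) in the general bound.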
11.2.3 Security against Chosen-Ciphertext Attacks
Chosen-ciphertext attacks, in which an adversary is able to obtain the decryption of arbitrary ciphertexts of its choice (with one technical restriction described below), are a concern in the public-key setting just as they are in the private-key setting. In fact, they are arguably more of a concern in the public-key setting since there a receiver expects to receive ciphertexts from multiple senders who are possibly unknown in advance, whereas a receiver in the private-key setting intends to communicate only with a single, known sender using any particular secret key.
Assume an eavesdropper A observes a ciphertext c sent by a sender S to a receiver R. Broadly speaking, in the public-key setting there are two classes of chosen-ciphertext attacks:
• A might send a modified ciphertext c′ to R on behalf of S. (For example, in the context of encrypted e-mail, A might construct an encrypted e-mail c′ and forge the “From” field so that it appears the e-mail originated from S.) In this case, although it is unlikely that A would be able to obtain the entire decryption m′ of c′, it might be possible for A to infer some information about m′ based on the subsequent behavior of R. Based on this information, A might be able to learn something about the original message m.
• A might send a modified ciphertext c′ to R in its own name. In this case, A might obtain the entire decryption m′ of c′ if R responds directly to A. Even if A learns nothing about m′, this modified message may have a known relation to the original message m that can be exploited by A; see the third scenario below for an example.
The second class of attacks is specific to the context of public-key encryption, and has no analogue in the private-key setting.
It is not hard to identify a number of realistic scenarios illustrating the above types of attacks:
Scenario 1. Say a user S logs in to her bank account by sending to her bank an encryption of her password pw concatenated with a timestamp. Assume further that there are two types of error messages the bank sends: it returns “password incorrect” if the encrypted password does not match the stored password of S, and “timestamp incorrect” if the password is correct but the timestamp is not.
If an adversary obtains a ciphertext c sent by S to the bank, the adversary can now mount a chosen-ciphertext attack by sending ciphertexts c′ to the bank on behalf of S and observing the error messages that result. (This is similar to the padding-oracle attack that we saw in Section 3.7.2.) In some cases, this information may be enough to allow the adversary to determine the user’s entire password.
Scenario 2. Say S sends an encrypted e-mail c to R, and this e-mail is observed by A. If A sends, in its own name, an encrypted e-mail c′ to R, then R might reply to this e-mail and quote the decrypted text m′ corresponding to c′. In this case, R is essentially acting as a decryption oracle for A and might potentially decrypt any ciphertext that A sends it.
Scenario 3. An issue that is closely related to that of chosen-ciphertext security is potential malleability of ciphertexts. Since a formal definition is quite involved, we do not provide one here but instead only give the intuitive idea. A scheme is malleable if it has the following property: given an encryption c of some unknown message m, it is possible to come up with a ciphertext c′ that is an encryption of a message m′ that is related in some known way to m. For example, perhaps given an encryption of m, it is possible to construct an encryption of 2m. (We will see natural examples of CPA-secure schemes with this and similar properties later; see Section 13.2.3.)
Now imagine that R is running an auction, where two parties S and A submit their bids by encrypting them using the public key of R. If a malleable encryption scheme is used, it may be possible for an adversary A to always place the highest bid (without bidding the maximum) by carrying out the following attack: wait until S sends a ciphertext c corresponding to its bid m (that is unknown to A); then send a ciphertext c′ corresponding to the bid m′ = 2m. Note that m (and m′, for that matter) remain unknown to A until R announces the results, and so the possibility of such an attack does not contradict the fact that the encryption scheme is CPA-secure. CCA-secure schemes, on the other hand, can be shown to be non-malleable, meaning they are not vulnerable to such attacks.
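The auction attack can be illustrated with a small sketch. The scheme used below is an assumption chosen for illustration, not one the text has defined at this point: a toy El Gamal-style scheme over the integers modulo a prime, with parameters (`P`, `G`) that are not secure. The function `double_bid` is a hypothetical name for the adversary's step.

```python
import secrets

# Toy parameters (assumptions for illustration; not a secure choice).
P = 2**127 - 1    # a prime modulus
G = 3             # fixed base


def gen():
    """Return (pk, sk) for the toy scheme."""
    x = secrets.randbelow(P - 2) + 1
    return pow(G, x, P), x


def enc(pk, m):
    """Randomized encryption of m in {1, ..., P-1}: (G^y, pk^y * m)."""
    y = secrets.randbelow(P - 2) + 1
    return pow(G, y, P), (pow(pk, y, P) * m) % P


def dec(sk, c):
    """Divide the second component by c1^sk to recover m."""
    c1, c2 = c
    return (c2 * pow(c1, P - 1 - sk, P)) % P


def double_bid(c):
    """Adversary A: map an encryption of m to an encryption of 2m.

    Uses neither the private key nor any knowledge of m.
    """
    c1, c2 = c
    return c1, (2 * c2) % P


pk, sk = gen()                 # auctioneer R
c = enc(pk, 1500)              # honest bidder S submits a bid of 1500
c_prime = double_bid(c)        # A submits a related ciphertext
assert dec(sk, c_prime) == 2 * 1500
```

Note that c′ differs from c, so the restriction against querying the challenge ciphertext itself does nothing to stop this kind of manipulation; ruling it out is what non-malleability (and CCA-security) is meant to capture.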
The definition. Security against chosen-ciphertext attacks is defined by suitable modification of the analogous definition from the private-key setting (Definition 3.33). Given a public-key encryption scheme Π and an adversary A, consider the following experiment:
The CCA indistinguishability experiment $\mathsf{PubK}^{\mathsf{cca}}_{\mathcal{A},\Pi}(n)$:
1. Gen(1^n) is run to obtain keys (pk, sk).
2. The adversary A is given pk and access to a decryption oracle Dec_sk(·). It outputs a pair of messages m0, m1 of the same length. (These messages must be in the message space associated with pk.)
3. A uniform bit b ∈ {0,1} is chosen, and then a ciphertext c ← Encpk(mb) is computed and given to A.
4. A continues to interact with the decryption oracle, but may not request a decryption of c itself. Finally, A outputs a bit b′.
5. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise.
DEFINITION 11.8 A public-key encryption scheme Π = (Gen, Enc, Dec) has indistinguishable encryptions under a chosen-ciphertext attack (or is CCA-secure) if for all probabilistic polynomial-time adversaries A there exists a negligible function negl such that
$$\Pr[\mathsf{PubK}^{\mathsf{cca}}_{\mathcal{A},\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
The natural analogue of Theorem 11.6 holds for CCA-security as well. That is, if a scheme has indistinguishable encryptions under a chosen-ciphertext attack then it has indistinguishable multiple encryptions under a chosen-ciphertext attack, where this is defined appropriately. Interestingly, however, the analogue of Claim 11.7 does not hold for CCA-security.
As in Definition 3.33, we must prevent the attacker from submitting the challenge ciphertext c to the decryption oracle for the definition to be achievable. But this restriction does not make the definition meaningless and, in particular, for each of the three motivating scenarios given earlier one can argue that setting c′ = c is of no benefit to the attacker:
• In the first scenario involving password-based login, the attacker learns nothing about S’s password by replaying c to the bank since in this case it already knows that no error message will be generated.
• In the second scenario involving encrypted email, sending c′ = c to the receiver would likely make the receiver suspicious and so it would refuse to respond at all.
• In the final scenario involving an auction, R could easily detect cheating if the adversary’s encrypted bid is identical to the other party’s encrypted bid. Anyway, in that case all the attacker achieves by replaying c is that it submits the same bid as the honest party.
11.3 Hybrid Encryption and the KEM/DEM Paradigm
Claim 11.7 shows that any CPA-secure public-key encryption scheme for l′-bit messages can be used to obtain a CPA-secure public-key encryption scheme for messages of arbitrary length. Encrypting an l-bit message using this approach requires $\gamma \stackrel{\mathrm{def}}{=} \lceil l/l' \rceil$ invocations of the original encryption scheme, meaning that both the computation and the ciphertext length are increased by a multiplicative factor of γ relative to the underlying scheme.
It is possible to do better by using private-key encryption in tandem with public-key encryption. This improves efficiency because private-key encryption is significantly faster than public-key encryption, and improves bandwidth because private-key schemes have lower ciphertext expansion. The resulting combination is called hybrid encryption and is used extensively in
practice. The basic idea is to use public-key encryption to obtain a shared key k, and then encrypt the message m using a private-key encryption scheme and key k. The receiver uses its long-term (asymmetric) private key to derive k, and then uses private-key decryption (with key k) to recover the original message. We stress that although private-key encryption is used as a component, this is a full-fledged public-key encryption scheme by virtue of the fact that the sender and receiver do not share any secret key in advance.
FIGURE 11.1: Hybrid encryption. Enc denotes a public-key encryption scheme, while Enc′ is a private-key encryption scheme.
In a direct implementation of this idea (see Figure 11.1), the sender would share k by (1) choosing a uniform value k and then (2) encrypting k using a public-key encryption scheme. A more direct approach is to use a public-key primitive called a key-encapsulation mechanism (KEM) to accomplish both of these “in one shot.” This is advantageous both from a conceptual point of view and in terms of efficiency, as we will see later.
A KEM has three algorithms similar in spirit to those of a public-key encryption scheme. As before, the key-generation algorithm Gen is used to generate a pair of public and private keys. In place of encryption, we now have an encapsulation algorithm Encaps that takes only a public key as input (and no message), and outputs a ciphertext c along with a key k. A corresponding decapsulation algorithm Decaps is run by the receiver to recover k from the ciphertext c using the private key. Formally:
DEFINITION 11.9 A key-encapsulation mechanism (KEM) is a tuple of probabilistic polynomial-time algorithms (Gen, Encaps, Decaps) such that:
1. The key-generation algorithm Gen takes as input the security parameter 1^n and outputs a public-/private-key pair (pk, sk). We assume pk and sk each has length at least n, and that n can be determined from pk.
2. The encapsulation algorithm Encaps takes as input a public key pk and the security parameter 1^n. It outputs a ciphertext c and a key k ∈ {0, 1}^{l(n)} where l is the key length. We write this as (c, k) ← Encaps_pk(1^n).
3. The deterministic decapsulation algorithm Decaps takes as input a private key sk and a ciphertext c, and outputs a key k or a special symbol ⊥ denoting failure. We write this as k := Decaps_sk(c).
It is required that with all but negligible probability over (sk, pk) output by Gen(1^n), if Encaps_pk(1^n) outputs (c, k) then Decaps_sk(c) outputs k.
In the definition we assume for simplicity that Encaps always outputs (a ciphertext c and) a key of some fixed length l(n). One could also consider a more general definition in which Encaps takes 1^l as an additional input and outputs a key of length l.
Any public-key encryption scheme trivially gives a KEM by choosing a random key k and encrypting it. As we will see, however, dedicated constructions of KEMs can be more efficient.
FIGURE 11.2: Hybrid encryption using the KEM/DEM approach.
Using a KEM (with key length n), we can implement hybrid encryption as in Figure 11.2. The sender runs Encapspk(1n) to obtain c along with a key k; it then uses a private-key encryption scheme to encrypt its message m, using k as the key. In this context, the private-key encryption scheme is called a data-encapsulation mechanism (DEM) for obvious reasons. The ciphertext sent to the receiver includes both c and the ciphertext c′ from the private-key scheme. Construction 11.10 gives a formal specification.
What is the efficiency of the resulting hybrid encryption scheme Πhy? For some fixed value of n, let α denote the cost of encapsulating an n-bit key using Encaps, and let β denote the cost (per bit of plaintext) of encryption using Enc′. Assume |m| > n, which is the interesting case. Then the cost, per bit of plaintext, of encrypting a message m using Πhy is
$$\frac{\alpha + \beta \cdot |m|}{|m|} = \frac{\alpha}{|m|} + \beta, \tag{11.8}$$
which approaches β for sufficiently long m. In the limit of very long messages, then, the cost per bit incurred by the public-key encryption scheme Πhy is the
CONSTRUCTION 11.10
Let Π = (Gen,Encaps,Decaps) be a KEM with key length n, and let Π′ = (Gen′, Enc′, Dec′) be a private-key encryption scheme. Construct a public-key encryption scheme Πhy = (Genhy,Enchy,Dechy) as follows:
• Genhy: on input 1^n run Gen(1^n) and use the public and private keys (pk, sk) that are output.
• Enchy: on input a public key pk and a message m ∈ {0, 1}^* do:
1. Compute (c, k) ← Encaps_pk(1^n).
2. Compute c′ ← Enc′_k(m).
3. Output the ciphertext ⟨c, c′⟩.
• Dechy: on input a private key sk and a ciphertext ⟨c,c′⟩ do:
1. Compute k := Decapssk(c).
2. Output the message m := Dec′k(c′).
Hybrid encryption using the KEM/DEM paradigm.
same as the cost per bit of the private-key scheme Π′. Hybrid encryption thus allows us to achieve the functionality of public-key encryption at the efficiency of private-key encryption, at least for sufficiently long messages.
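The data flow of Construction 11.10 can be sketched in code. Everything concrete below is an assumption made so that the example runs, and none of it is the construction analyzed in the text: the KEM is a toy Diffie-Hellman-style mechanism with a hashed key and insecure parameters (`P`, `G`), and the DEM is a one-time XOR with a SHA-256-derived keystream, in the spirit of stream-cipher encryption. The function names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration; not a secure choice).
P = 2**127 - 1    # a prime modulus
G = 3             # fixed base
KEY_BYTES = 16    # key length n = 128 bits


def gen():
    """Gen: return (pk, sk)."""
    x = secrets.randbelow(P - 2) + 1
    return pow(G, x, P), x


def _derive(shared):
    """Hash a shared group element down to an n-bit key."""
    return hashlib.sha256(str(shared).encode()).digest()[:KEY_BYTES]


def encaps(pk):
    """Encaps: takes only pk and returns (c, k); no message is involved."""
    y = secrets.randbelow(P - 2) + 1
    return pow(G, y, P), _derive(pow(pk, y, P))


def decaps(sk, c):
    """Decaps: deterministic; recomputes the same key k from c."""
    return _derive(pow(c, sk, P))


def dem(k, data):
    """One-time DEM: XOR with a keystream; serves as both Enc' and Dec'."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(k + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def enc_hy(pk, m):
    c, k = encaps(pk)           # 1. (c, k) <- Encaps_pk(1^n)
    return c, dem(k, m)         # 2-3. c' <- Enc'_k(m); output <c, c'>


def dec_hy(sk, ciphertext):
    c, c_prime = ciphertext
    k = decaps(sk, c)           # 1. k := Decaps_sk(c)
    return dem(k, c_prime)      # 2. m := Dec'_k(c')


pk, sk = gen()
msg = b"a long message is encrypted at private-key cost"
assert dec_hy(sk, enc_hy(pk, msg)) == msg
```

In this sketch the only public-key operations are the exponentiations inside `encaps` and `decaps`; the message itself is processed solely by the private-key component, and a fresh key k is generated for every call to `enc_hy`.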
A similar calculation can be used to measure the effect of hybrid encryption on the ciphertext length. For some fixed value of n, let L denote the length of the ciphertext output by Encaps, and say the private-key encryption of a message m using Enc′ results in a ciphertext of length n + |m| (this can be achieved using one of the modes of encryption discussed in Section 3.6; actually, even ciphertext length |m| is possible since, as we will see, Π′ need not be CPA-secure). Then the total length of a ciphertext in scheme Πhy is
L + n + |m|. (11.9)
In contrast, when using block-by-block encryption as in Equation (11.1), and assuming that public-key encryption of an n-bit message using Enc results in a ciphertext of length L, encryption of a message m would result in a ciphertext of length L · ⌈|m|/n⌉. The ciphertext length reported in Equation (11.9) is a significant improvement for sufficiently long m.
We can use some rough estimates to get a sense for what the above results mean in practice. (We stress that these numbers are only meant to give the reader a feel for the improvement; realistic values would depend on a variety of factors.) A typical value for the length of the key k might be n = 128. Furthermore, a “native” public-key encryption scheme might yield 256-bit ciphertexts when encrypting 128-bit messages; assume a KEM has ciphertexts of the same length when encapsulating a 128-bit key. Letting α, as before, denote the computational cost of public-key encryption/encapsulation of a 128-bit key, we see that block-by-block encryption as in Equation (11.1) would
encrypt a 1 MB (= 106-bit) message with computational cost α · ⌈106/128⌉ ≈ 7800·α and the ciphertext would be 2 MB long. Compare this to the efficiency of hybrid encryption. Letting β, as before, denote the per-bit computational cost of private-key encryption, a reasonable approximation is β ≈ α/105. Using Equation (11.8), we see that the overall computational cost for hybrid encryption for a 1 Mb message is
α+106· α =11·α, 105
and the ciphertext would be only slightly longer than 1 MB. Thus, hybrid en- cryption improves the computational efficiency in this case by a factor of 700, and the ciphertext length by a factor of 2.
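The arithmetic above can be reproduced with a short script. This is a minimal sketch, assuming the illustrative values from the text (n = 128, 256-bit public-key ciphertexts, β = α/10^5); costs are measured in units of β so that all quantities stay integers, and the names `ALPHA`, `BETA`, `block_by_block`, and `hybrid` are ours.

```python
import math

# Costs are measured in units of beta (the per-bit private-key cost),
# so alpha = 10**5 * beta as in the rough estimate from the text.
BETA = 1
ALPHA = 10**5 * BETA
N = 128            # key length n, in bits
L = 256            # assumed length of one public-key ciphertext / encapsulation, in bits
MSG_BITS = 10**6   # a 1 Mb message


def block_by_block(msg_bits):
    """Cost and ciphertext length when every n-bit block is public-key encrypted."""
    blocks = math.ceil(msg_bits / N)
    return ALPHA * blocks, L * blocks


def hybrid(msg_bits):
    """Cost alpha + beta*|m| and ciphertext length L + n + |m| (Equation (11.9))."""
    return ALPHA + BETA * msg_bits, L + N + msg_bits


block_cost, block_len = block_by_block(MSG_BITS)
hy_cost, hy_len = hybrid(MSG_BITS)

print(block_cost // ALPHA)    # 7813 public-key operations
print(hy_cost / ALPHA)        # 11.0
print(block_cost / hy_cost)   # speed-up factor, roughly 710
print(block_len / hy_len)     # ciphertext shrinks by a factor of roughly 2
```

The exact block count is 7813 rather than the rounded 7800 used in the text, which is why the computed speed-up is slightly above 700.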
It remains to analyze the security of Πhy. This, of course, depends on the security of its underlying components Π and Π′. In the following sections we define notions of CPA-security and CCA-security for KEMs, and show:
• If Π is a CPA-secure KEM and the private-key scheme Π′ has indistinguishable encryptions in the presence of an eavesdropper, then Πhy is a CPA-secure public-key encryption scheme. Notice that it suffices for Π′ to satisfy a weaker definition of security (which, recall, does not imply CPA-security in the private-key setting) in order for the hybrid scheme Πhy to be CPA-secure. Intuitively, the reason is that a fresh, uniform key k is chosen each time a new message is encrypted. Since each key k is used only once, indistinguishability of a single encryption of Π′ suffices for security of the hybrid scheme Πhy. This means that basic private-key encryption using a pseudorandom generator (or stream cipher), as in Construction 3.17, suffices.
• If Π is a CCA-secure KEM and Π′ is a CCA-secure private-key encryption scheme, then Πhy is a CCA-secure public-key encryption scheme.
11.3.1 CPA-Security
For simplicity, we assume in this and the next section a KEM with key length n. We define a notion of CPA-security for KEMs by analogy with Definition 11.2. As there, the adversary here eavesdrops on a single ciphertext c. Definition 11.2 requires that the attacker is unable to distinguish whether c is an encryption of some message m0 or some other message m1. With a KEM there is no message, and we require instead that the encapsulated key k is indistinguishable from a uniform key that is independent of the ciphertext c.
Let Π = (Gen, Encaps, Decaps) be a KEM and A an arbitrary adversary.
The CPA indistinguishability experiment $\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n)$:
1. $\mathsf{Gen}(1^n)$ is run to obtain keys $(pk, sk)$. Then $\mathsf{Encaps}_{pk}(1^n)$ is run to generate $(c, k)$ with $k \in \{0,1\}^n$.
2. A uniform bit $b \in \{0,1\}$ is chosen. If $b = 0$ set $\hat{k} := k$. If $b = 1$ then choose a uniform $\hat{k} \in \{0,1\}^n$.
3. Give $(pk, c, \hat{k})$ to $A$, who outputs a bit $b'$. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise.
In the experiment, A is given the ciphertext c and either the actual key k corresponding to c, or an independent, uniform key. The KEM is CPA-secure if no efficient adversary can distinguish between these possibilities.
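The experiment can be sketched as a small test harness. The code below is an illustration under our own assumptions, not part of the formal definition: keys are byte strings (so the length parameter is in bytes rather than bits), and `broken_gen`, `broken_encaps`, and `leak_adversary` form a deliberately insecure, hypothetical toy KEM whose ciphertext reveals the key, included only to show what a successful attack looks like.

```python
import secrets


def kem_cpa_experiment(gen, encaps, adversary, n_bytes):
    """Run the KEM CPA experiment once; return 1 iff the adversary guesses b."""
    pk, _sk = gen(n_bytes)
    c, k = encaps(pk, n_bytes)
    b = secrets.randbits(1)
    # b = 0: hand over the real key; b = 1: hand over an independent uniform key.
    k_hat = k if b == 0 else secrets.token_bytes(n_bytes)
    return 1 if adversary(pk, c, k_hat) == b else 0


# Hypothetical, deliberately broken KEM: the "ciphertext" is the key itself.
def broken_gen(n_bytes):
    return b"public", b"secret"


def broken_encaps(pk, n_bytes):
    k = secrets.token_bytes(n_bytes)
    return k, k


def leak_adversary(pk, c, k_hat):
    # Output 0 ("real key") exactly when the candidate key matches the leak.
    return 0 if k_hat == c else 1


wins = sum(kem_cpa_experiment(broken_gen, broken_encaps, leak_adversary, 16)
           for _ in range(200))
print(wins / 200)  # far above 1/2, so this toy KEM is not CPA-secure
```

For a CPA-secure KEM, no efficient adversary plugged into `kem_cpa_experiment` should win noticeably more often than half the time.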
DEFINITION 11.11 A key-encapsulation mechanism Π is CPA-secure if for all probabilistic polynomial-time adversaries A there exists a negligible function negl such that
$$\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
In the remainder of this section we prove the following theorem:
THEOREM 11.12 If Π is a CPA-secure KEM and Π′ is a private-key encryption scheme that has indistinguishable encryptions in the presence of an eavesdropper, then Πhy as in Construction 11.10 is a CPA-secure public-key encryption scheme.
Before proving the theorem formally, we give some intuition. Let the notation “$X \stackrel{c}{\equiv} Y$” denote the event that no polynomial-time adversary can distinguish between two distributions X and Y. (This concept is treated more formally in Section 7.8, although we do not rely on that section here.)
For example, let $\mathsf{Encaps}^{(1)}_{pk}(1^n)$ (resp., $\mathsf{Encaps}^{(2)}_{pk}(1^n)$) denote the ciphertext (resp., key) output by Encaps. The fact that Π is CPA-secure means that
$$\left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Encaps}^{(2)}_{pk}(1^n)\right) \stackrel{c}{\equiv} \left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ k'\right),$$
where pk is generated by $\mathsf{Gen}(1^n)$ and $k'$ is chosen independently and uniformly from $\{0,1\}^n$. Similarly, the fact that Π′ has indistinguishable encryptions in the presence of an eavesdropper means that for any $m_0, m_1$ output by A we have $\mathsf{Enc}'_k(m_0) \stackrel{c}{\equiv} \mathsf{Enc}'_k(m_1)$ if k is chosen uniformly at random.
In order to prove CPA-security of $\Pi^{\mathsf{hy}}$ we need to show that
$$\left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_k(m_0)\right) \stackrel{c}{\equiv} \left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_k(m_1)\right) \tag{11.10}$$
for $m_0, m_1$ output by a ppt adversary A. (Equation (11.10) suffices to show that $\Pi^{\mathsf{hy}}$ has indistinguishable encryptions in the presence of an eavesdropper, and by Proposition 11.3 this implies that $\Pi^{\mathsf{hy}}$ is CPA-secure.)
FIGURE 11.3: High-level structure of the proof of Theorem 11.12 (the arrows represent indistinguishability). [Diagram: the two distributions of Equation (11.10), which use the encapsulated key k, are related “by transitivity”; each is linked “by security of Π” to the corresponding distribution in which k is replaced by an independent, uniform key k′; and those two distributions are linked to each other “by security of Π′”.]
The proof proceeds in three steps. (See Figure 11.3.) First we prove that
$$\left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_k(m_0)\right) \stackrel{c}{\equiv} \left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_{k'}(m_0)\right), \tag{11.11}$$
where on the left k is output by $\mathsf{Encaps}^{(2)}_{pk}(1^n)$, and on the right $k'$ is an independent, uniform key. This follows via a fairly straightforward reduction, since CPA-security of Π means exactly that $\mathsf{Encaps}^{(2)}_{pk}(1^n)$ cannot be distinguished from a uniform key $k'$ even given pk and $\mathsf{Encaps}^{(1)}_{pk}(1^n)$.
Next, we prove that
$$\left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_{k'}(m_0)\right) \stackrel{c}{\equiv} \left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_{k'}(m_1)\right). \tag{11.12}$$
Here the difference is between encrypting $m_0$ or $m_1$ using Π′ and a uniform, independent key $k'$. Equation (11.12) thus follows using the fact that Π′ has indistinguishable encryptions in the presence of an eavesdropper.
Exactly as in the case of Equation (11.11), we can also show that
$$\left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_k(m_1)\right) \stackrel{c}{\equiv} \left(pk,\ \mathsf{Encaps}^{(1)}_{pk}(1^n),\ \mathsf{Enc}'_{k'}(m_1)\right), \tag{11.13}$$
by relying again on the CPA-security of Π. Equations (11.11)–(11.13) imply, by transitivity, the desired result of Equation (11.10). (Transitivity will be implicit in the proof we give below.)
We now present the full proof.
PROOF (of Theorem 11.12) We prove that $\Pi^{\mathsf{hy}}$ has indistinguishable encryptions in the presence of an eavesdropper; by Proposition 11.3, this implies it is CPA-secure.
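Fix an arbitrary ppt adversary $A^{\mathsf{hy}}$, and consider experiment $\mathsf{PubK}^{\mathsf{eav}}_{A^{\mathsf{hy}},\Pi^{\mathsf{hy}}}(n)$. Our goal is to prove that there is a negligible function negl such that
$$\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A^{\mathsf{hy}},\Pi^{\mathsf{hy}}}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
By definition of the experiment, we have
$$\begin{aligned}
\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A^{\mathsf{hy}},\Pi^{\mathsf{hy}}}(n) = 1]
&= \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_k(m_0)) = 0] \\
&\quad + \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_k(m_1)) = 1],
\end{aligned} \tag{11.14}$$
where in each case k equals $\mathsf{Encaps}^{(2)}_{pk}(1^n)$. Consider the following ppt adversary $A_1$ attacking Π.
Adversary $A_1$: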
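1. $A_1$ is given $(pk, c, \hat{k})$.
2. $A_1$ runs $A^{\mathsf{hy}}(pk)$ to obtain two messages $m_0, m_1$. Then $A_1$ computes $c' \leftarrow \mathsf{Enc}'_{\hat{k}}(m_0)$, gives ciphertext $\langle c, c' \rangle$ to $A^{\mathsf{hy}}$, and outputs the bit $b'$ that $A^{\mathsf{hy}}$ outputs.
Consider the behavior of $A_1$ when attacking Π in experiment $\mathsf{KEM}^{\mathsf{cpa}}_{A_1,\Pi}(n)$. When $b = 0$ in that experiment, then $A_1$ is given $(pk, c, \hat{k})$ where c and $\hat{k}$ were both output by $\mathsf{Encaps}_{pk}(1^n)$. This means that $A^{\mathsf{hy}}$ is given a ciphertext of the form $\langle c, c' \rangle = \langle c, \mathsf{Enc}'_k(m_0) \rangle$, where k is the key encapsulated by c. So,
$$\Pr[A_1 \text{ outputs } 0 \mid b = 0] = \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_k(m_0)) = 0].$$
On the other hand, when $b = 1$ in experiment $\mathsf{KEM}^{\mathsf{cpa}}_{A_1,\Pi}(n)$ then $A_1$ is given $(pk, c, \hat{k})$ with $\hat{k}$ uniform and independent of c. If we denote such a key by $k'$, this means $A^{\mathsf{hy}}$ is given a ciphertext of the form $\langle c, \mathsf{Enc}'_{k'}(m_0) \rangle$, and
$$\Pr[A_1 \text{ outputs } 1 \mid b = 1] = \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_0)) = 1].$$
Since Π is a CPA-secure KEM, there is a negligible function $\mathsf{negl}_1$ such that
$$\begin{aligned}
\frac{1}{2} + \mathsf{negl}_1(n) &\ge \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A_1,\Pi}(n) = 1] \\
&= \frac{1}{2} \cdot \Pr[A_1 \text{ outputs } 0 \mid b = 0] + \frac{1}{2} \cdot \Pr[A_1 \text{ outputs } 1 \mid b = 1] \\
&= \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_k(m_0)) = 0] \\
&\quad + \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_0)) = 1],
\end{aligned} \tag{11.15}$$
where k is equal to $\mathsf{Encaps}^{(2)}_{pk}(1^n)$ and $k'$ is a uniform and independent key.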
Next, consider the following ppt adversary A′ that eavesdrops on a message encrypted using the private-key scheme Π′.
Adversary A′:
1. $A'(1^n)$ runs $\mathsf{Gen}(1^n)$ on its own to generate keys $(pk, sk)$. It also computes $c \leftarrow \mathsf{Encaps}^{(1)}_{pk}(1^n)$.
2. A′ runs Ahy(pk) to obtain two messages m0,m1. These are output by A′, and it is given in return a ciphertext c′.
3. A′ gives the ciphertext ⟨c,c′⟩ to Ahy, and outputs the bit b′ that Ahy outputs.
When $b = 0$ in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A',\Pi'}(n)$, adversary $A'$ is given a ciphertext $c'$ which is an encryption of $m_0$ using a key $k'$ that is uniform and independent of anything else. So $A^{\mathsf{hy}}$ is given a ciphertext of the form $\langle c, \mathsf{Enc}'_{k'}(m_0) \rangle$ where $k'$ is uniform and independent of c, and
$$\Pr[A' \text{ outputs } 0 \mid b = 0] = \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_0)) = 0].$$
On the other hand, when $b = 1$ in experiment $\mathsf{PrivK}^{\mathsf{eav}}_{A',\Pi'}(n)$, then $A'$ is given an encryption of $m_1$ using a uniform, independent key $k'$. This means $A^{\mathsf{hy}}$ is given a ciphertext of the form $\langle c, \mathsf{Enc}'_{k'}(m_1) \rangle$ and so
$$\Pr[A' \text{ outputs } 1 \mid b = 1] = \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_1)) = 1].$$
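Since Π′ has indistinguishable encryptions in the presence of an eavesdropper, there is a negligible function $\mathsf{negl}'$ such that
$$\begin{aligned}
\frac{1}{2} + \mathsf{negl}'(n) &\ge \Pr[\mathsf{PrivK}^{\mathsf{eav}}_{A',\Pi'}(n) = 1] \\
&= \frac{1}{2} \cdot \Pr[A' \text{ outputs } 0 \mid b = 0] + \frac{1}{2} \cdot \Pr[A' \text{ outputs } 1 \mid b = 1] \\
&= \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_0)) = 0] \\
&\quad + \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_1)) = 1].
\end{aligned} \tag{11.16}$$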
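Proceeding exactly as we did to prove Equation (11.15), we can show that there is a negligible function $\mathsf{negl}_2$ such that
$$\begin{aligned}
\frac{1}{2} + \mathsf{negl}_2(n) &\ge \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A_2,\Pi}(n) = 1] \\
&= \frac{1}{2} \cdot \Pr[A_2 \text{ outputs } 0 \mid b = 0] + \frac{1}{2} \cdot \Pr[A_2 \text{ outputs } 1 \mid b = 1] \\
&= \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_k(m_1)) = 1] \\
&\quad + \frac{1}{2} \cdot \Pr[A^{\mathsf{hy}}(pk, \mathsf{Encaps}^{(1)}_{pk}(1^n), \mathsf{Enc}'_{k'}(m_1)) = 0].
\end{aligned} \tag{11.17}$$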
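Summing Equations (11.15)–(11.17) and using the fact that the sum of three negligible functions is negligible, we see there exists a negligible function negl such that
$$\begin{aligned}
\frac{3}{2} + \mathsf{negl}(n) \ge \frac{1}{2} \cdot \Big( &\Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_k(m_0)) = 0] + \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_0)) = 1] \\
&+ \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_0)) = 0] + \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_1)) = 1] \\
&+ \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_k(m_1)) = 1] + \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_1)) = 0] \Big),
\end{aligned}$$
where $c = \mathsf{Encaps}^{(1)}_{pk}(1^n)$ in all the above. Note that
$$\Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_0)) = 1] + \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_0)) = 0] = 1,$$
since the probabilities of complementary events always sum to 1. Similarly,
$$\Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_1)) = 1] + \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_{k'}(m_1)) = 0] = 1.$$
Therefore,
$$\begin{aligned}
\frac{1}{2} + \mathsf{negl}(n) &\ge \frac{1}{2} \cdot \Big( \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_k(m_0)) = 0] + \Pr[A^{\mathsf{hy}}(pk, c, \mathsf{Enc}'_k(m_1)) = 1] \Big) \\
&= \Pr[\mathsf{PubK}^{\mathsf{eav}}_{A^{\mathsf{hy}},\Pi^{\mathsf{hy}}}(n) = 1]
\end{aligned}$$
(using Equation (11.14) for the last equality), proving the theorem.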
11.3.2 CCA-Security
If the private-key encryption scheme Π′ is not itself secure against chosen-ciphertext attacks, then (regardless of the KEM used) neither is the resulting hybrid encryption scheme Πhy. As a simple, illustrative example, say we take Construction 3.17 as our private-key encryption scheme. Then, leaving the KEM unspecified, encryption of a message m by Πhy is done by computing $(c, k) \leftarrow \mathsf{Encaps}_{pk}(1^n)$ and then outputting the ciphertext
⟨c, G(k) ⊕ m⟩,
where G is a pseudorandom generator. Given a ciphertext ⟨c, c′⟩, an attacker can simply flip the last bit of c′ to obtain a modified ciphertext that is a valid encryption of m with its last bit flipped.
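The bit-flipping attack is easy to see in code. The sketch below covers only the private-key component c′ = G(k) ⊕ m, since the KEM ciphertext c is left untouched by the attacker. Using SHAKE-256 as the pseudorandom generator G is our assumption for illustration, and the function names are hypothetical.

```python
import hashlib
import secrets


def prg(k: bytes, out_len: int) -> bytes:
    # Stand-in for the pseudorandom generator G (assumption: SHAKE-256 as a PRG).
    return hashlib.shake_256(k).digest(out_len)


def enc_prime(k: bytes, m: bytes) -> bytes:
    """Encrypt by XORing the message with the pad G(k)."""
    return bytes(a ^ b for a, b in zip(prg(k, len(m)), m))


dec_prime = enc_prime  # XORing with the same pad inverts the encryption

k = secrets.token_bytes(16)          # the key encapsulated by the KEM
m = b"pay 100"
c_prime = enc_prime(k, m)

# The attacker flips the last bit of c' without knowing k or m.
tampered = c_prime[:-1] + bytes([c_prime[-1] ^ 0x01])

print(dec_prime(k, tampered))        # b'pay 101': m with its last bit flipped
```

Because the modified ciphertext decrypts to a related message, a decryption oracle immediately reveals information about m, which is why Π′ itself must be CCA-secure.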
The natural way to fix this is to use a CCA-secure private-key encryption scheme. But this is clearly not enough if the KEM is susceptible to chosen-ciphertext attacks. Since we have not yet defined this notion, we do so now.
As in Definition 11.11, we require that an adversary given a ciphertext c cannot distinguish the key k encapsulated by that ciphertext from a uniform and independent key k′. Now, however, we additionally allow the attacker to request decapsulation of ciphertexts of its choice (as long as they are different from the challenge ciphertext).
Formally, let Π = (Gen, Encaps, Decaps) be a KEM with key length n and A an adversary, and consider the following experiment:
The CCA indistinguishability experiment $\mathsf{KEM}^{\mathsf{cca}}_{A,\Pi}(n)$:
1. $\mathsf{Gen}(1^n)$ is run to obtain keys $(pk, sk)$. Then $\mathsf{Encaps}_{pk}(1^n)$ is run to generate $(c, k)$ with $k \in \{0,1\}^n$.
2. A uniform bit $b \in \{0,1\}$ is chosen. If $b = 0$ set $\hat{k} := k$. If $b = 1$ then choose a uniform $\hat{k} \in \{0,1\}^n$.
3. $A$ is given $(pk, c, \hat{k})$ and access to an oracle $\mathsf{Decaps}_{sk}(\cdot)$, but may not request decapsulation of c itself.
4. $A$ outputs a bit $b'$. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise.
DEFINITION 11.13 A key-encapsulation mechanism Π is CCA-secure if for all probabilistic polynomial-time adversaries A there is a negligible function negl such that
$$\Pr[\mathsf{KEM}^{\mathsf{cca}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
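To see what the decapsulation oracle adds, the experiment can be sketched with a toy KEM that fails the definition. Everything concrete here is our assumption for illustration: the tiny group (quadratic residues modulo 167, of order 83, generated by 4), the hypothetical KEM that outputs the group element $h^y$ directly as the key without any key derivation, and the fact that keys are group elements rather than n-bit strings.

```python
import secrets

P, Q, G = 167, 83, 4   # toy group: squares mod 167, prime order 83, generator 4


def gen():
    x = secrets.randbelow(Q)
    return pow(G, x, P), x                 # (pk, sk) = (h, x)


def encaps(h):
    y = secrets.randbelow(Q)
    return pow(G, y, P), pow(h, y, P)      # (c, k), with k = h^y used unhashed


def decaps(x, c):
    return pow(c, x, P)


def kem_cca_experiment(adversary):
    """Run the KEM CCA experiment once; return 1 iff the adversary guesses b."""
    h, x = gen()
    c, k = encaps(h)
    b = secrets.randbits(1)
    k_hat = k if b == 0 else pow(G, secrets.randbelow(Q), P)  # uniform element

    def oracle(c_query):
        if c_query == c:
            raise ValueError("decapsulating the challenge ciphertext is not allowed")
        return decaps(x, c_query)

    return 1 if adversary(h, c, k_hat, oracle) == b else 0


def mauling_adversary(h, c, k_hat, oracle):
    # (c * g)^x = c^x * h, so the real key equals oracle(c * g) / h.
    # pow(h, -1, P) computes a modular inverse (Python 3.8+).
    k = oracle(c * G % P) * pow(h, -1, P) % P
    return 0 if k == k_hat else 1


wins = sum(kem_cca_experiment(mauling_adversary) for _ in range(400))
print(wins / 400)   # close to 1, far above 1/2
```

The adversary never queries the challenge ciphertext itself, yet a single query on a related ciphertext recovers the encapsulated key.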
Fortunately, we can show that using a CCA-secure KEM in combination with a CCA-secure private-key encryption scheme results in a public-key encryption scheme secure against chosen-ciphertext attacks.
THEOREM 11.14 If Π is a CCA-secure KEM and Π′ is a CCA-secure private-key encryption scheme, then Πhy as in Construction 11.10 is a CCA-secure public-key encryption scheme.
A proof is obtained by suitable modification of the proof of Theorem 11.12.
11.4 CDH/DDH-Based Encryption
So far we have discussed public-key encryption abstractly, but have not yet seen any concrete examples of public-key encryption schemes (or KEMs). Here we explore some constructions based on the Diffie–Hellman problems. (The Diffie–Hellman problems are introduced in Section 8.3.2.)
11.4.1 El Gamal Encryption
In 1985, Taher El Gamal observed that the Diffie–Hellman key-exchange protocol (cf. Section 10.3) could be adapted to give a public-key encryption scheme. Recall that in the Diffie–Hellman protocol, Alice sends a message to Bob and then Bob responds with a message to Alice; based on these messages, Alice and Bob can derive a shared value k which is indistinguishable (to an eavesdropper) from a uniform element of some group G. We could imagine Bob using that shared value to encrypt a message m ∈ G by simply sending k · m to Alice; Alice can clearly recover m using her knowledge of k, and we will argue below that an eavesdropper learns nothing about m.
In the El Gamal encryption scheme we simply change our perspective on the above interaction. We view Alice’s initial message as her public key, and Bob’s reply (both his initial response and k · m) as a ciphertext. CPA-security based on the decisional Diffie–Hellman (DDH) assumption follows fairly easily from security of the Diffie–Hellman key-exchange protocol (Theorem 10.3).
In our formal treatment, we begin by stating and proving a simple lemma that underlies the El Gamal encryption scheme. Let G be a finite group, and let m ∈ G be an arbitrary element. The lemma states that multiplying m by a uniform group element k yields a uniformly distributed group element k′. Importantly, the distribution of k′ is independent of m; this means that k′ contains no information about m.
LEMMA 11.15 Let G be a finite group, and let m ∈ G be arbitrary. Then choosing uniform k ∈ G and setting k′ := k · m gives the same distribution for k′ as choosing uniform k′ ∈ G. Put differently, for any $\hat{g} \in G$ we have
$$\Pr[k \cdot m = \hat{g}] = 1/|G|,$$
where the probability is taken over uniform choice of k ∈ G.
PROOF Let $\hat{g} \in G$ be arbitrary. Then
$$\Pr[k \cdot m = \hat{g}] = \Pr[k = \hat{g} \cdot m^{-1}].$$
Since k is uniform, the probability that k is equal to the fixed element $\hat{g} \cdot m^{-1}$ is exactly 1/|G|.
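The lemma can be checked exhaustively on a small group. The sketch below assumes, purely for illustration, the group of quadratic residues modulo 167 (order 83, generated by 4); for every fixed m it confirms that k · m hits each group element exactly once as k ranges over the group.

```python
from collections import Counter

P, Q, G = 167, 83, 4   # toy group: squares mod 167, prime order 83, generator 4
GROUP = sorted({pow(G, i, P) for i in range(Q)})


def product_counts(m):
    """For k ranging over the whole group, count how often each value k*m occurs."""
    return Counter(k * m % P for k in GROUP)


# For every m, multiplication by m is a bijection on the group, so a uniform k
# yields a uniform k*m: each element occurs with probability exactly 1/|G|.
uniform_for_all_m = all(
    sorted(product_counts(m)) == GROUP and set(product_counts(m).values()) == {1}
    for m in GROUP
)
print(len(GROUP), uniform_for_all_m)   # 83 True
```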
The above lemma suggests a way to construct a perfectly secret private-key encryption scheme with message space G. The sender and receiver share as their secret key a uniform element k ∈ G. To encrypt the message m ∈ G, the sender computes the ciphertext k′ := k · m. The receiver can recover the message from the ciphertext k′ by computing m := k′/k. Perfect secrecy follows immediately from the lemma above. In fact, we have already seen this scheme in a different guise—the one-time pad encryption scheme is an
instantiation of this approach, with the underlying group being the set of strings of some fixed length under the operation of bit-wise XOR.
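The correspondence with the one-time pad can be made concrete. In this minimal sketch (function names are ours), the group is the set of byte strings of a fixed length under XOR; "multiplying" by k and "dividing" by k are the same operation because every element is its own inverse.

```python
import secrets


def group_op(a: bytes, b: bytes) -> bytes:
    """The group operation: bit-wise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(k: bytes, m: bytes) -> bytes:
    return group_op(k, m)      # k' := k · m


def decrypt(k: bytes, c: bytes) -> bytes:
    return group_op(c, k)      # m := k' / k, and k is its own inverse under XOR


m = b"attack at dawn"
k = secrets.token_bytes(len(m))   # uniform group element, used only once
print(decrypt(k, encrypt(k, m)) == m)   # True
```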
We can adapt the above ideas to the public-key setting by providing the parties with a way to generate a shared, “random-looking” value k by interacting over a public channel. This should sound familiar since it is exactly what the Diffie–Hellman protocol provides. We proceed with the details.
As in Section 8.3.2, let $\mathcal{G}$ be a polynomial-time algorithm that takes as input $1^n$ and (except possibly with negligible probability) outputs a description of a cyclic group G, its order q (with ∥q∥ = n), and a generator g. The El Gamal encryption scheme is described in Construction 11.16.
CONSTRUCTION 11.16
Let $\mathcal{G}$ be as in the text. Define a public-key encryption scheme as follows:
• Gen: on input $1^n$ run $\mathcal{G}(1^n)$ to obtain (G, q, g). Then choose a uniform $x \in \mathbb{Z}_q$ and compute $h := g^x$. The public key is ⟨G, q, g, h⟩ and the private key is ⟨G, q, g, x⟩. The message space is G.
• Enc: on input a public key pk = ⟨G, q, g, h⟩ and a message m ∈ G, choose a uniform $y \in \mathbb{Z}_q$ and output the ciphertext
$$\langle g^y,\ h^y \cdot m \rangle.$$
• Dec: on input a private key sk = ⟨G, q, g, x⟩ and a ciphertext $\langle c_1, c_2 \rangle$, output
$$\hat{m} := c_2 / c_1^x.$$
The El Gamal encryption scheme.
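To see that decryption succeeds, let $\langle c_1, c_2 \rangle = \langle g^y, h^y \cdot m \rangle$ with $h = g^x$. Then
$$\hat{m} = \frac{c_2}{c_1^x} = \frac{h^y \cdot m}{(g^y)^x} = \frac{(g^x)^y \cdot m}{g^{xy}} = \frac{g^{xy} \cdot m}{g^{xy}} = m.$$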
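The three algorithms translate almost line by line into code. The following is a toy sketch only: instead of running a group-generation algorithm it hard-codes a tiny group (quadratic residues modulo 167, order 83, generator 4), which is our assumption and offers no security. Division by $c_1^x$ is carried out by multiplying with a modular inverse.

```python
import secrets

P, Q, G = 167, 83, 4   # toy group: squares mod 167, prime order 83, generator 4


def gen():
    """Gen: choose uniform x in Z_q and publish h = g^x."""
    x = secrets.randbelow(Q)
    h = pow(G, x, P)
    return (P, Q, G, h), (P, Q, G, x)      # (public key, private key)


def enc(pk, m):
    """Enc: choose uniform y in Z_q and output <g^y, h^y * m>."""
    p, q, g, h = pk
    y = secrets.randbelow(q)
    return pow(g, y, p), pow(h, y, p) * m % p


def dec(sk, c):
    """Dec: output c2 / c1^x (pow(..., -1, p) needs Python 3.8+)."""
    p, q, g, x = sk
    c1, c2 = c
    return c2 * pow(pow(c1, x, p), -1, p) % p


pk, sk = gen()
m = 65                                     # 65 = 30^2 mod 167 lies in the group
print(dec(sk, enc(pk, m)) == m)            # True
```

Encryption is randomized, so encrypting the same message twice will usually give different ciphertexts, while decryption always returns the original group element.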
Example 11.17
Let q = 83 and p = 2q + 1 = 167, and let G denote the group of quadratic residues (i.e., squares) modulo p. (Since p and q are prime, G is a subgroup of $\mathbb{Z}^*_p$ with order q. See Section 8.3.3.) Since the order of G is prime, any element of G except 1 is a generator; take $g = 2^2 = 4 \bmod 167$. Say the receiver chooses secret key $37 \in \mathbb{Z}_{83}$ and so the public key is
$$pk = \langle p, q, g, h \rangle = \langle 167, 83, 4, [4^{37} \bmod 167] \rangle = \langle 167, 83, 4, 76 \rangle,$$
where we use p to represent G (it is assumed that the receiver knows that the group is the set of quadratic residues modulo p).
Say a sender encrypts the message m = 65 ∈ G (note $65 = 30^2 \bmod 167$ and so 65 is an element in the subgroup). If y = 71, the ciphertext is
$$\langle [4^{71} \bmod 167],\ [76^{71} \cdot 65 \bmod 167] \rangle = \langle 132, 44 \rangle.$$
To decrypt, the receiver first computes $124 = [132^{37} \bmod 167]$; then, since $66 = [124^{-1} \bmod 167]$, the receiver recovers $m = 65 = [44 \cdot 66 \bmod 167]$. ♦
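The numbers in this example can be verified directly with modular exponentiation. The variable names below are ours; the values are exactly those of Example 11.17.

```python
p, q, g = 167, 83, 4
x, y, m = 37, 71, 65          # receiver's secret key, sender's randomness, message

h = pow(g, x, p)              # public-key component
c1 = pow(g, y, p)             # first ciphertext component
c2 = pow(h, y, p) * m % p     # second ciphertext component

shared = pow(c1, x, p)        # c1^x, computed by the receiver
inverse = pow(shared, -1, p)  # modular inverse (Python 3.8+)
recovered = c2 * inverse % p

print(h, (c1, c2))                  # 76 (132, 44)
print(shared, inverse, recovered)   # 124 66 65
print(pow(30, 2, p))                # 65, confirming that m is a square mod p
```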
We now prove security of the scheme. (The reader may want to compare the proof of the following to the proofs of Theorems 3.18 and 10.3.)
THEOREM 11.18 If the DDH problem is hard relative to $\mathcal{G}$, then the El Gamal encryption scheme is CPA-secure.
PROOF Let Π denote the El Gamal encryption scheme. We prove that Π has indistinguishable encryptions in the presence of an eavesdropper; by Proposition 11.3, this implies it is CPA-secure.
Let A be a probabilistic polynomial-time adversary. We want to show that there is a negligible function negl such that
$$\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
Consider the modified “encryption scheme” $\widetilde{\Pi}$ where Gen is the same as in Π, but encryption of a message m with respect to the public key ⟨G, q, g, h⟩ is done by choosing uniform $y, z \in \mathbb{Z}_q$ and outputting the ciphertext
$$\langle g^y,\ g^z \cdot m \rangle.$$
Although $\widetilde{\Pi}$ is not actually an encryption scheme (as there is no way for the receiver to decrypt), the experiment $\mathsf{PubK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n)$ is still well-defined since that experiment depends only on the key-generation and encryption algorithms.
Lemma 11.15 and the discussion that immediately follows it imply that the second component of the ciphertext in scheme $\widetilde{\Pi}$ is a uniformly distributed group element and, in particular, is independent of the message m being encrypted. (Remember that $g^z$ is a uniform element of G when z is chosen uniformly from $\mathbb{Z}_q$.) The first component of the ciphertext is trivially independent of m. Taken together, this means that the entire ciphertext contains no information about m. It follows that
$$\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n) = 1] = \frac{1}{2}.$$
Now consider the following ppt algorithm D that attempts to solve the DDH problem relative to $\mathcal{G}$. Recall that D receives $(G, q, g, h_1, h_2, h_3)$ where $h_1 = g^x$, $h_2 = g^y$, and $h_3$ is either $g^{xy}$ or $g^z$ (for uniform x, y, z); the goal of D is to determine which is the case.
Algorithm D:
The algorithm is given $(G, q, g, h_1, h_2, h_3)$ as input.
• Set $pk = \langle G, q, g, h_1 \rangle$ and run A(pk) to obtain two messages $m_0, m_1 \in G$.
• Choose a uniform bit b, and set $c_1 := h_2$ and $c_2 := h_3 \cdot m_b$.
• Give the ciphertext $\langle c_1, c_2 \rangle$ to A and obtain an output bit b′. If b′ = b, output 1; otherwise, output 0.
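The reduction can be sketched as a function that wraps an arbitrary El Gamal adversary. The interface (`choose_messages`, `guess`), the toy group, and the `BruteForceAdversary` class are our own illustrative assumptions; the brute-force adversary succeeds only because the group has 83 elements.

```python
import secrets

P, Q, G = 167, 83, 4   # toy group: squares mod 167, prime order 83, generator 4


def ddh_distinguisher(h1, h2, h3, adversary):
    """Algorithm D: output 1 iff the El Gamal adversary guesses the hidden bit."""
    pk = (P, Q, G, h1)
    m0, m1 = adversary.choose_messages(pk)
    b = secrets.randbits(1)
    c1, c2 = h2, h3 * (m0, m1)[b] % P
    return 1 if adversary.guess(pk, (c1, c2)) == b else 0


class BruteForceAdversary:
    """Hypothetical adversary that breaks El Gamal in the toy group by brute force."""

    def choose_messages(self, pk):
        self.m0, self.m1 = G, pow(G, 2, P)
        return self.m0, self.m1

    def guess(self, pk, c):
        p, q, g, h = pk
        x = next(i for i in range(q) if pow(g, i, p) == h)   # discrete log of h
        m = c[1] * pow(pow(c[0], x, p), -1, p) % p           # decrypt with x
        return 0 if m == self.m0 else 1


x, y, z = (secrets.randbelow(Q) for _ in range(3))
dh_tuple = (pow(G, x, P), pow(G, y, P), pow(G, x * y, P))
rand_tuple = (pow(G, x, P), pow(G, y, P), pow(G, z, P))

print(ddh_distinguisher(*dh_tuple, BruteForceAdversary()))    # always 1
print(ddh_distinguisher(*rand_tuple, BruteForceAdversary()))  # 1 about half the time
```

On Diffie–Hellman tuples D succeeds exactly as often as the adversary breaks El Gamal, while on random tuples the ciphertext is independent of b, which is the gap the proof exploits.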
Let us analyze the behavior of D. There are two cases to consider:
Case 1: Say the input to D is generated by running $\mathcal{G}(1^n)$ to obtain (G, q, g), then choosing uniform $x, y, z \in \mathbb{Z}_q$, and finally setting $h_1 := g^x$, $h_2 := g^y$, and $h_3 := g^z$. Then D runs A on a public key constructed as
$$pk = \langle G, q, g, g^x \rangle$$
and a ciphertext constructed as
$$\langle c_1, c_2 \rangle = \langle g^y,\ g^z \cdot m_b \rangle.$$
We see that in this case the view of A when run as a subroutine by D is distributed identically to the view of A in experiment $\mathsf{PubK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n)$. Since D outputs 1 exactly when the output b′ of A is equal to b, we have that
$$\Pr[D(G, q, g, g^x, g^y, g^z) = 1] = \Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\widetilde{\Pi}}(n) = 1] = \frac{1}{2}.$$
Case 2: Say the input to D is generated by running $\mathcal{G}(1^n)$ to obtain (G, q, g), then choosing uniform $x, y \in \mathbb{Z}_q$, and finally setting $h_1 := g^x$, $h_2 := g^y$, and $h_3 := g^{xy}$. Then D runs A on a public key constructed as
$$pk = \langle G, q, g, g^x \rangle$$
and a ciphertext constructed as
$$\langle c_1, c_2 \rangle = \langle g^y,\ g^{xy} \cdot m_b \rangle = \langle g^y,\ (g^x)^y \cdot m_b \rangle.$$
We see that in this case the view of A when run as a subroutine by D is distributed identically to the view of A in experiment $\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n)$. Since D outputs 1 exactly when the output b′ of A is equal to b, we have that
$$\Pr[D(G, q, g, g^x, g^y, g^{xy}) = 1] = \Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1].$$
Under the assumption that the DDH problem is hard relative to $\mathcal{G}$, there is a negligible function negl such that
$$\begin{aligned}
\mathsf{negl}(n) &\ge \left| \Pr[D(G, q, g, g^x, g^y, g^z) = 1] - \Pr[D(G, q, g, g^x, g^y, g^{xy}) = 1] \right| \\
&= \left| \frac{1}{2} - \Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1] \right|.
\end{aligned}$$
This implies $\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n)$, completing the proof.
El Gamal Implementation Issues
We briefly discuss some practical issues related to El Gamal encryption.
Sharing public parameters. Our description of the El Gamal encryption scheme in Construction 11.16 requires the receiver to run $\mathcal{G}$ to generate G, q, g. In practice, it is common for these parameters to be generated and fixed “once-and-for-all,” and then shared by multiple receivers. (Of course, each receiver must choose their own secret value x and publish their own public key $h = g^x$.) For example, NIST has published a set of recommended parameters suitable for use in the El Gamal encryption scheme. Sharing parameters in this way does not impact security (assuming the parameters were generated correctly and honestly in the first place). Looking ahead, we remark that this is in contrast to the case of RSA, where parameters cannot safely be shared (see Section 11.5.1).
Choice of group. As discussed in Section 8.3.2, the group order q is generally chosen to be prime. As far as specific groups are concerned, elliptic curves are one increasingly popular choice; an alternative is to let G be a prime-order subgroup of $\mathbb{Z}^*_p$, for p prime. We refer to Section 9.3 for a tabulation of recommended key lengths for achieving different levels of security.
The message space. An inconvenient aspect of the El Gamal encryption scheme is that the message space is a group G rather than bit-strings of some specified length. For some choices of the group, it is possible to address this by defining a reversible encoding of bit-strings as group elements. In such cases, the sender can first encode their message $m \in \{0,1\}^\ell$ as a group element $\hat{m} \in G$ and then apply El Gamal encryption to $\hat{m}$. The receiver can decrypt as in Construction 11.16 to obtain the encoded message $\hat{m}$, and then reverse the encoding to recover the original message m.
A simpler approach is to use (a variant of) El Gamal encryption as part of a hybrid encryption scheme. For example, the sender could choose a uniform group element m ∈ G, encrypt this using the El Gamal encryption scheme, and then encrypt their actual message using a private-key encryption scheme and key H(m), where $H : G \to \{0,1\}^n$ is an appropriate key-derivation function (see the following section). In this case, it would be more efficient to use the DDH-based KEM that we describe next.
11.4.2 DDH-Based Key Encapsulation
At the end of the previous section we noted that El Gamal encryption can be used as part of a hybrid encryption scheme by simply encrypting a uniform group element m and using a hash of that element as a key. But this is wasteful! The proof of security for El Gamal encryption shows that $c_1^x$ (where $c_1$ is the first component of the ciphertext, and x is the private key of the receiver) is already indistinguishable from a uniform group element, so the sender/receiver may as well use that. Construction 11.19 illustrates the KEM that follows this approach. Note that the resulting encapsulation consists of just a single group element. In contrast, if we were to use El Gamal encryption of a uniform group element, the ciphertext would contain two group elements.
CONSTRUCTION 11.19
Let $\mathcal{G}$ be as in the previous section. Define a KEM as follows:
• Gen: on input $1^n$ run $\mathcal{G}(1^n)$ to obtain (G, q, g). Then choose a uniform $x \in \mathbb{Z}_q$ and set $h := g^x$. Also specify a function $H : G \to \{0,1\}^{\ell(n)}$ for some function ℓ (see text). The public key is ⟨G, q, g, h, H⟩ and the private key is ⟨G, q, g, x⟩.
• Encaps: on input a public key pk = ⟨G, q, g, h, H⟩ choose a uniform $y \in \mathbb{Z}_q$ and output the ciphertext $g^y$ and the key $H(h^y)$.
• Decaps: on input a private key sk = ⟨G, q, g, x⟩ and a ciphertext c ∈ G, output the key $H(c^x)$.
An “El Gamal-like” KEM.
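A toy version of this KEM fits in a few lines. Two things here are our assumptions rather than part of the construction: the hard-coded tiny group (quadratic residues modulo 167, order 83, generator 4), and the use of SHA-256 truncated to 16 bytes as a stand-in for the key-derivation function H, which the construction leaves open. With only 83 group elements the derived key has very little entropy, so this shows the mechanics and nothing more.

```python
import hashlib
import secrets

P, Q, G = 167, 83, 4   # toy group: squares mod 167, prime order 83, generator 4


def H(elem: int) -> bytes:
    # Assumption: truncated SHA-256 stands in for the key-derivation function H.
    return hashlib.sha256(elem.to_bytes(2, "big")).digest()[:16]


def gen():
    x = secrets.randbelow(Q)
    return (P, Q, G, pow(G, x, P)), (P, Q, G, x)   # (public key, private key)


def encaps(pk):
    """Output the ciphertext g^y together with the key H(h^y)."""
    p, q, g, h = pk
    y = secrets.randbelow(q)
    return pow(g, y, p), H(pow(h, y, p))


def decaps(sk, c):
    """Recover the key as H(c^x)."""
    p, q, g, x = sk
    return H(pow(c, x, p))


pk, sk = gen()
c, k = encaps(pk)
print(decaps(sk, c) == k)   # True: both sides derive the key from g^(xy)
```

The encapsulation is a single group element, in contrast to the two elements of an El Gamal ciphertext.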
As described, the construction leaves the key-derivation function H unspecified, and there are several options for it. (See Section 5.6.4 for more on key derivation in general.) One possibility is to choose a function $H : G \to \{0,1\}^\ell$ that is (close to) regular, meaning that for each possible key $k \in \{0,1\}^\ell$ the number of group elements that map to k is approximately the same. (Formally, we need a negligible function negl such that for each $k \in \{0,1\}^\ell$
$$2^\ell \cdot \left| \Pr[H(g) = k] - 2^{-\ell} \right| \le \mathsf{negl}(n),$$
where the probability is taken over uniform choice of g ∈ G. This ensures that the distribution on the key k is statistically close to uniform.) Both the complexity of H, as well as the achievable key length ℓ, will depend on the specific group G being used.
A second possibility is to let H be a keyed function, where the (uniform) key for H is included as part of the receiver’s public key. This works if H is a strong extractor, as mentioned briefly in Section 5.6.4. Appropriate choice of l here (to ensure that the resulting key is statistically close to uniform) will depend on the size of G.
In either of the above cases, a proof of CPA-security based on the decisional Diffie–Hellman (DDH) assumption follows easily by adapting the proof of security for the Diffie–Hellman key-exchange protocol (Theorem 10.3).
THEOREM 11.20 If the DDH problem is hard relative to $\mathcal{G}$, and H is chosen as described, then Construction 11.19 is a CPA-secure KEM.
If one is willing to model H as a random oracle, then Construction 11.19 can be proven CPA-secure based on the (weaker) computational Diffie–Hellman (CDH) assumption. We discuss this in the following section.
11.4.3 *A CDH-Based KEM in the Random-Oracle Model
In this section, we show that if one is willing to model H as a random oracle, then Construction 11.19 can be proven CPA-secure based on the CDH assumption. (Readers may want to review Section 5.5 to remind themselves of the random-oracle model.) Intuitively, the CDH assumption implies that an attacker observing $h = g^x$ (from the public key) and the ciphertext $c = g^y$ cannot compute $\mathsf{DH}_g(h, c) = h^y$. In particular, then, an attacker cannot query $h^y$ to the random oracle. But this means that the encapsulated key $H(h^y)$ is completely random from the attacker’s point of view. This intuition is turned into a formal proof below.
As indicated by the intuition above, the proof inherently relies on modeling H as a random oracle.[3] Specifically, the proof relies on the facts that (1) the only way to learn $H(h^y)$ is to explicitly query $h^y$ to H, which would mean that the attacker has solved a CDH instance (this is called “extractability” in Section 5.5.1), and (2) if an attacker does not query $h^y$ to H, then the value $H(h^y)$ is uniform from the attacker’s point of view. These properties only hold (indeed, they only make sense) if H is modeled as a random oracle.
[3] This is true as long as we wish to rely only on the CDH assumption. As noted earlier, a proof without random oracles is possible if we rely on the stronger DDH assumption.
THEOREM 11.21 If the CDH problem is hard relative to $\mathcal{G}$, and H is modeled as a random oracle, then Construction 11.19 is CPA-secure.
PROOF Let Π denote Construction 11.19, and let A be a ppt adversary. We want to show that there is a negligible function negl such that
$$\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \mathsf{negl}(n).$$
The above probability is also taken over uniform choice of the function H, to which A is given oracle access.
Consider an execution of experiment $\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n)$ in which the public key is ⟨G, q, g, h⟩ and the ciphertext is $c = g^y$, and let Query be the event that A queries $\mathsf{DH}_g(h, c) = h^y$ to H. We have
$$\begin{aligned}
\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1] &= \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \wedge \overline{\mathsf{Query}}] + \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \wedge \mathsf{Query}] \\
&\le \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \wedge \overline{\mathsf{Query}}] + \Pr[\mathsf{Query}].
\end{aligned} \tag{11.18}$$
If $\Pr[\overline{\mathsf{Query}}] = 0$ then $\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \wedge \overline{\mathsf{Query}}] = 0$. Otherwise,
$$\begin{aligned}
\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \wedge \overline{\mathsf{Query}}] &= \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \mid \overline{\mathsf{Query}}] \cdot \Pr[\overline{\mathsf{Query}}] \\
&\le \Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \mid \overline{\mathsf{Query}}].
\end{aligned}$$
In experiment $\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n)$, the adversary A is given the public key and the ciphertext, plus either the encapsulated key $k \stackrel{\text{def}}{=} H(h^y)$ or a uniform key. If Query does not occur, then k is uniformly distributed from the perspective of the adversary, and so there is no way A can distinguish between these two possibilities. This means that
$$\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1 \mid \overline{\mathsf{Query}}] = \frac{1}{2}.$$
Returning to Equation (11.18), we thus have
$$\Pr[\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \Pr[\mathsf{Query}].$$
We next show that Pr[Query] is negligible, completing the proof.
Let t = t(n) be a (polynomial) upper bound on the number of queries that A makes to the random oracle H. Define the following ppt algorithm A′ for the CDH problem relative to $\mathcal{G}$:
Algorithm A′:
The algorithm is given G, q, g, h, c as input.
• Set pk = ⟨G, q, g, h⟩ and choose a uniform $k \in \{0,1\}^\ell$.
• Run A(pk, c, k). When A makes a query to H, answer it by choosing a fresh, uniform ℓ-bit string.
• At the end of A’s execution, let $y_1, \ldots, y_t$ be the list of queries that A has made to H. Choose a uniform index $i \in \{1, \ldots, t\}$ and output $y_i$.
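The algorithm can be sketched as follows. The toy group, the 16-byte key length, and the `brute_force_adversary` are our illustrative assumptions; the sketch also keeps answers to repeated oracle queries consistent, as a random oracle would, which the description above leaves implicit.

```python
import secrets

P, Q, G = 167, 83, 4   # toy group: squares mod 167, prime order 83, generator 4
KEY_BYTES = 16


def cdh_solver(h, c, adversary):
    """Algorithm A': try to output DH_g(h, c) by watching a KEM adversary's queries."""
    pk = (P, Q, G, h)
    k = secrets.token_bytes(KEY_BYTES)     # uniform "key" handed to the adversary
    answers = {}                           # simulated random oracle, filled lazily

    def random_oracle(elem):
        if elem not in answers:
            answers[elem] = secrets.token_bytes(KEY_BYTES)
        return answers[elem]

    adversary(pk, c, k, random_oracle)     # the adversary's output bit is ignored
    queries = list(answers)
    return secrets.choice(queries) if queries else None


def brute_force_adversary(pk, c, k, oracle):
    # Hypothetical adversary: finds x by brute force (feasible only in a toy
    # group) and queries h^y = c^x to the oracle, so event Query occurs.
    p, q, g, h = pk
    x = next(i for i in range(q) if pow(g, i, p) == h)
    return 0 if oracle(pow(c, x, p)) == k else 1


x, y = secrets.randbelow(Q), secrets.randbelow(Q)
h, c = pow(G, x, P), pow(G, y, P)
print(cdh_solver(h, c, brute_force_adversary) == pow(G, x * y, P))   # True
```

Here the adversary makes a single query, so the uniformly chosen query is always the right one; with t distinct queries the guess would be correct with probability 1/t whenever Query occurs.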
We are interested in the probability with which A′ solves the CDH problem, i.e., $\Pr[A'(G, q, g, h, c) = \mathsf{DH}_g(h, c)]$, where the probability is taken over G, q, g output by $\mathcal{G}(1^n)$, uniform h, c ∈ G, and the randomness of A′. To analyze this probability, note first that event Query is still well-defined in the execution of A′, even though A′ cannot detect whether it occurs. Moreover, the probability of event Query when A is run as a sub-routine by A′ is identical to the probability of event Query in experiment $\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n)$. This follows because the view of A is identical in both cases until event Query occurs: in each case, G, q, g are output by $\mathcal{G}(1^n)$; in each case, h and c are uniform elements of G and k is a uniform, ℓ-bit string; and in each case, queries to H other than $H(\mathsf{DH}_g(h, c))$ are answered with a uniform ℓ-bit string. (In $\mathsf{KEM}^{\mathsf{cpa}}_{A,\Pi}(n)$, the query $H(\mathsf{DH}_g(h, c))$ is answered with the actual encapsulated key, which is equal to k with probability 1/2, whereas when A is run as a subroutine by A′ the query $H(\mathsf{DH}_g(h, c))$ is answered with a uniform ℓ-bit string that is independent of k. But when this query is made, event Query occurs.)
Finally, observe that when Query occurs then $\mathsf{DH}_g(h, c) \in \{y_1, \ldots, y_t\}$ by definition, and so A′ outputs the correct result $\mathsf{DH}_g(h, c)$ with probability at least 1/t. We therefore conclude that
$$\Pr[A'(G, q, g, h, c) = \mathsf{DH}_g(h, c)] \ge \Pr[\mathsf{Query}]/t,$$
or $\Pr[\mathsf{Query}] \le t \cdot \Pr[A'(G, q, g, h, c) = \mathsf{DH}_g(h, c)]$. Since the CDH problem is hard for $\mathcal{G}$, this latter probability is negligible; since t is polynomial, this implies that Pr[Query] is negligible as well. This completes the proof.
In the next section we will see that Construction 11.19 can be shown to be CCA-secure under a stronger variant of the CDH assumption (if we continue to model H as a random oracle).
11.4.4 Chosen-Ciphertext Security and DHIES/ECIES
The El Gamal encryption scheme is vulnerable to chosen-ciphertext attacks. This follows from the fact that it is malleable. Recall that an encryption scheme is malleable, informally, if given a ciphertext c that is an encryption of some unknown message m, it is possible to generate a modified ciphertext c′ that is an encryption of a message m′ having some known relation to m. In the case of El Gamal encryption, consider an adversary A who intercepts a ciphertext c = ⟨c_1, c_2⟩ encrypted using the public key pk = ⟨G, q, g, h⟩, and who then constructs the modified ciphertext c′ = ⟨c_1, c′_2⟩ where c′_2 = c_2 · α for some α ∈ G. If c is an encryption of a message m ∈ G (which may be unknown to A), we have c_1 = g^y and c_2 = h^y · m for some y ∈ Z_q. But then
c_1 = g^y and c′_2 = h^y · (α · m),
and so c′ is a valid encryption of the message α · m. In other words, A can transform an encryption of the (unknown) message m into an encryption of the (unknown) message α · m. As discussed in Scenario 3 in Section 11.2.3, this sort of attack can have serious consequences.
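The malleability argument above can be checked numerically. The following is a minimal sketch, assuming a toy group (p = 23, q = 11, g = 4, chosen here only for illustration and far too small to be secure) and hypothetical helper names; it requires Python 3.8 or later for modular inverses via pow.

```python
import secrets

# Toy parameters (assumed for illustration only): p = 2q + 1 with q prime,
# and g generates the order-q subgroup G of quadratic residues modulo p.
p, q, g = 23, 11, 4

def keygen():
    x = secrets.randbelow(q)                      # private exponent x in Z_q
    return pow(g, x, p), x                        # (h, x) with h = g^x

def encrypt(h, m):
    y = secrets.randbelow(q)                      # fresh uniform y in Z_q
    return pow(g, y, p), (pow(h, y, p) * m) % p   # (c1, c2) = (g^y, h^y * m)

def decrypt(x, c):
    c1, c2 = c
    return (c2 * pow(pow(c1, x, p), -1, p)) % p   # m = c2 / c1^x

def maul(c, alpha):
    c1, c2 = c
    return c1, (c2 * alpha) % p                   # only c2 is modified

h, x = keygen()
m, alpha = 9, 13                                  # both are elements of G
mauled = maul(encrypt(h, m), alpha)               # built without knowing m or x
recovered = decrypt(x, mauled)                    # equals alpha * m mod p
```

The adversary's step is `maul`, which uses neither the private key nor the message.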
The KEM discussed in the previous section might also be malleable depending on the specific key-derivation function H being used. If H is modeled as a random oracle, however, then such attacks no longer seem possible. In fact, one can prove in this case that Construction 11.19 is CCA-secure based on the so-called gap-CDH assumption. Recall that the CDH assumption says that given group elements g^x and g^y (for some generator g), it is infeasible to compute g^{xy}. The gap-CDH assumption says that this remains infeasible even given access to an oracle O such that O(U, V) returns 1 exactly when V = U^y. Stated differently, the CDH problem remains hard even given an oracle that solves the DDH problem. (We do not give a formal definition since we will not use this assumption in the rest of the book.) This assumption is believed to hold for the classes of groups we have discussed in this book. A proof of the following is very similar to the proof of Theorem 11.38.
THEOREM 11.22 If the gap-CDH problem is hard relative to G, and H is modeled as a random oracle, then Construction 11.19 is a CCA-secure KEM.
It is interesting to observe that the same construction (namely, Construction 11.19) can be analyzed under different assumptions and in different models, yielding different results. Assuming only that the DDH problem is hard (and for H chosen appropriately), the scheme is CPA-secure. If we model H as a random oracle (which imposes more stringent requirements on H), then under the weaker CDH assumption we obtain CPA-security, and under the stronger gap-CDH assumption we obtain CCA-security.
CONSTRUCTION 11.23
Let G be as in the text. Let Π_E = (Enc′, Dec′) be a private-key encryption scheme, and let Π_M = (Mac, Vrfy) be a message authentication code. Define a public-key encryption scheme as follows:
• Gen: On input 1^n run G(1^n) to obtain (G, q, g). Choose a uniform x ∈ Z_q, set h := g^x, and specify a function H : G → {0,1}^{2n}. The public key is ⟨G, q, g, h, H⟩ and the private key is ⟨G, q, g, x, H⟩.
• Enc: On input a public key pk = ⟨G, q, g, h, H⟩, choose a uniform y ∈ Z_q and set k_E∥k_M := H(h^y). Compute c′ ← Enc′_{k_E}(m), and output the ciphertext ⟨g^y, c′, Mac_{k_M}(c′)⟩.
• Dec: On input a private key sk = ⟨G, q, g, x, H⟩ and a ciphertext ⟨c, c′, t⟩, output ⊥ if c ∉ G. Else, compute k_E∥k_M := H(c^x). If Vrfy_{k_M}(c′, t) ≠ 1 then output ⊥; otherwise, output Dec′_{k_E}(c′).
DHIES/ECIES.
CCA-secure encryption with Construction 11.19. Combining the KEM in Construction 11.19 with any CCA-secure private-key encryption scheme yields a CCA-secure public-key encryption scheme. (See Theorem 11.14.) Instantiating this approach using Construction 4.18 for the private-key component matches what is done in DHIES/ECIES, variants of which are included in the ISO/IEC 18033-2 standard for public-key encryption. (See Construction 11.23.) Encryption of a message m in these schemes takes the form
⟨g^y, Enc′_{k_E}(m), Mac_{k_M}(c′)⟩,
where Enc′ denotes a CPA-secure private-key encryption scheme and c′ denotes Enc′_{k_E}(m). DHIES, the Diffie–Hellman Integrated Encryption Scheme, can be used generically to refer to any scheme of this form, or to refer specifically to the case when the group G is a cyclic subgroup of a finite field. ECIES, the Elliptic Curve Integrated Encryption Scheme, refers to the case when G is an elliptic-curve group. We remark that, in Construction 11.23, it is critical to check during decryption that c, the first component of the ciphertext, is in G. Otherwise, an attacker might request decryption of a malformed ciphertext ⟨c, c′, t⟩ in which c ∉ G; decrypting such a ciphertext (i.e., without returning ⊥) might leak information about the private key.
By Theorem 4.19, encrypting a message and then applying a (strong) message authentication code yields a CCA-secure private-key encryption scheme. Combining this with Theorem 11.14, we conclude:
COROLLARY 11.24 Let ΠE be a CPA-secure private-key encryption scheme, and let ΠM be a strongly secure message authentication code. If the gap-CDH problem is hard relative to G, and H is modeled as a random oracle, then Construction 11.23 is a CCA-secure public-key encryption scheme.
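To make the data flow of Construction 11.23 concrete, here is a rough sketch under several assumptions that are not part of the construction itself: a toy group (p = 23, q = 11, g = 4, insecure), SHA-512 standing in for H, HMAC-SHA256 standing in for Mac, and a nonce-plus-SHAKE-256 keystream standing in for the CPA-secure scheme Enc′. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

p, q, g = 23, 11, 4     # toy group parameters (assumed, insecure)

def H(elem):
    # Stand-in for H : G -> {0,1}^{2n}; the output is split into k_E and k_M.
    digest = hashlib.sha512(elem.to_bytes(4, "big")).digest()
    return digest[:32], digest[32:]

def enc_sym(k_e, m):
    # Stand-in for Enc': random nonce, then XOR with a derived keystream.
    nonce = secrets.token_bytes(16)
    pad = hashlib.shake_256(k_e + nonce).digest(len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, pad))

def dec_sym(k_e, c):
    nonce, body = c[:16], c[16:]
    pad = hashlib.shake_256(k_e + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, pad))

def gen():
    x = secrets.randbelow(q)
    return pow(g, x, p), x                       # (h, x) with h = g^x

def enc(h, m):
    y = secrets.randbelow(q)
    k_e, k_m = H(pow(h, y, p))                   # k_E || k_M := H(h^y)
    c_prime = enc_sym(k_e, m)
    tag = hmac.new(k_m, c_prime, hashlib.sha256).digest()
    return pow(g, y, p), c_prime, tag

def dec(x, ciphertext):
    c, c_prime, tag = ciphertext
    if not (0 < c < p and pow(c, q, p) == 1):    # reject c not in G
        return None
    k_e, k_m = H(pow(c, x, p))                   # k_E || k_M := H(c^x)
    expected = hmac.new(k_m, c_prime, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):   # Vrfy fails: output "bottom"
        return None
    return dec_sym(k_e, c_prime)
```

Here `None` plays the role of ⊥. The membership test uses the fact that, in this toy group, an element of {1, …, p − 1} lies in G exactly when its q-th power is 1 modulo p.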
11.5 RSA Encryption
In this section we turn our attention to encryption schemes based on the RSA assumption defined in Section 8.2.4. We remark that although RSA-based encryption is in widespread use today, there is also currently a gradual shift away from using RSA, and toward using CDH/DDH-based cryptosystems relying on elliptic-curve groups, because of the longer key lengths required for RSA-based schemes. We refer to Section 9.3 for further discussion.
11.5.1 Plain RSA
We begin by describing a simple encryption scheme based on the RSA problem. Although the scheme is insecure, it provides a useful starting point for the secure schemes that follow.
Let GenRSA be a ppt algorithm that, on input 1^n, outputs a modulus N that is the product of two n-bit primes, along with integers e, d satisfying ed = 1 mod φ(N). (As usual, the algorithm may fail with negligible probability but we ignore that here.) Recall from Section 8.2.4 that such an algorithm can be easily constructed from any algorithm GenModulus that outputs a composite modulus N along with its factorization; see Algorithm 11.25.
ALGORITHM 11.25
RSA key generation GenRSA
Input: Security parameter 1^n
Output: N, e, d as described in the text
(N, p, q) ← GenModulus(1^n)
φ(N) := (p − 1)(q − 1)
choose e > 1 such that gcd(e, φ(N)) = 1
compute d := [e^{−1} mod φ(N)]
return N, e, d
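The steps of Algorithm 11.25 that follow GenModulus can be sketched as below. This is an illustration only: the primes are supplied by the caller rather than generated, the function name is hypothetical, and picking the smallest suitable odd e is an assumption (the algorithm only requires some e > 1 coprime to φ(N)). Python 3.8 or later is assumed for the modular inverse.

```python
from math import gcd

def gen_rsa_from_primes(p, q):
    """Compute (N, e, d) from a given factorization, as in Algorithm 11.25."""
    N = p * q
    phi = (p - 1) * (q - 1)
    e = 3
    while gcd(e, phi) != 1:    # assumed policy: smallest odd e coprime to phi(N)
        e += 2
    d = pow(e, -1, phi)        # d := [e^{-1} mod phi(N)]
    return N, e, d

N, e, d = gen_rsa_from_primes(17, 23)
```

With the toy primes 17 and 23 this yields a modulus of 391 with e = 3.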
Let N, e, d be as above, and let c = m^e mod N. RSA encryption relies on the fact that someone who knows d can recover m from c by computing [c^d mod N]; this works because
c^d = (m^e)^d = m^{ed} = m mod N,
as discussed in Section 8.2.4. On the other hand, without knowledge of d (even if N and e are known) the RSA assumption (cf. Definition 8.46) implies that it is difficult to recover m from c, at least if m is chosen uniformly from Z*_N. This naturally suggests the public-key encryption scheme shown as Construction 11.26: The receiver runs GenRSA to obtain N, e, d; it publishes N and e as its public key, and keeps d in its private key. To encrypt a message⁴ m ∈ Z*_N, a sender computes the ciphertext c := [m^e mod N]. As we have just noted, the receiver, who knows d, can decrypt c and recover m.
CONSTRUCTION 11.26
Let GenRSA be as in the text. Define a public-key encryption scheme as follows:
• Gen: on input 1^n run GenRSA(1^n) to obtain N, e, and d. The public key is ⟨N, e⟩ and the private key is ⟨N, d⟩.
• Enc: on input a public key pk = ⟨N, e⟩ and a message m ∈ Z*_N, compute the ciphertext
c := [m^e mod N].
• Dec: on input a private key sk = ⟨N, d⟩ and a ciphertext c ∈ Z*_N, compute the message
m := [c^d mod N].
The plain RSA encryption scheme.
The following gives a worked example of the above (see also Example 8.49).
Example 11.27
Say GenRSA outputs (N, e, d) = (391, 3, 235). (Note that 391 = 17 · 23 and so φ(391) = 16 · 22 = 352. Moreover, 3 · 235 = 1 mod 352.) So the public key is (391, 3) and the private key is (391, 235).
To encrypt the message m = 158 ∈ Z*_391 using the public key (391, 3), we simply compute c := [158^3 mod 391] = 295; this is the ciphertext. To decrypt, the receiver computes [295^235 mod 391] = 158. ♦
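The numbers in this example can be reproduced with Python's built-in modular exponentiation. The function names below are illustrative only.

```python
N, e, d = 391, 3, 235          # key from the example above

def enc(m):
    return pow(m, e, N)        # c := [m^e mod N]

def dec(c):
    return pow(c, d, N)        # m := [c^d mod N]

c = enc(158)
m = dec(c)
```

Note that `enc` is deterministic: encrypting 158 always yields the same ciphertext.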
Is the plain RSA encryption scheme secure? The factoring assumption implies that it is computationally infeasible for an attacker who is given the public key to derive the corresponding private key; see Section 8.2.5. This is necessary, but not sufficient, for a public-key encryption scheme to be secure. The RSA assumption implies that if the message m is chosen uniformly from Z*_N then an eavesdropper given N, e, and c (namely, the public key and the ciphertext) cannot recover m. But these are weak guarantees, and fall far short of the level of security we want! In particular, they leave open the possibility that an attacker can recover the message when it is not chosen uniformly from Z*_N (indeed, when m is chosen from a small range it is easy to see that an attacker can compute m from the public key and ciphertext). In addition, it does not rule out the possibility that an attacker can learn partial information about the message, even when it is uniform (in fact, this is known to be possible). Moreover, plain RSA encryption is deterministic and so must be insecure, as we have already discussed in Section 11.2.1.
⁴ We assume that m ∈ Z*_N. If factoring N is hard, it is computationally difficult to find an m ∈ {1, …, N − 1} with m ∉ Z*_N (since then gcd(m, N) is a nontrivial factor of N).
More Attacks on Plain RSA
We have already noted that plain RSA encryption is not CPA-secure. Nevertheless, there may be a temptation to use plain RSA for encrypting “random messages” and/or in situations where leaking a few bits of information about the message is acceptable. We warn against this in general, and provide here just a few examples of what can go wrong.
(Some of the attacks that follow assume e = 3. In some cases the attacks can be extended, at least partially, to larger e; in any case, as noted in Section 8.2.4, setting e = 3 is often done in practice. The attacks should be taken as demonstrating that Construction 11.26 is inadequate, not as indicating that setting e = 3 is necessarily a bad choice.)
A quadratic improvement in recovering m. Since plain RSA encryption is deterministic, we know that if m < B then an attacker can determine m from the ciphertext c = [m^e mod N] in time O(B) using the brute-force attack discussed in Section 11.2.1. One might hope, however, that plain RSA encryption can be used if B is large, i.e., if the message is chosen from a reasonably large set of values. One possible scenario where this might occur is in the context of hybrid encryption (see Section 11.3), where the “message” is a random n-bit key and so B = 2^n. Unfortunately, there is a clever attack that recovers m, with high probability, in time roughly O(√B). This can make a significant difference in practice: a 2^80-time attack (say) is infeasible, but an attack running in time 2^40 is relatively easy to carry out.
A description of the attack is given as Algorithm 11.28. In our description, we assume B = 2^n and let α ∈ (1/2, 1) denote some fixed constant (see below). The time complexity of the algorithm is dominated by the time required to sort the 2^{αn} pairs (r, x_r); this can be done in time O(n · 2^{αn}). Binary search is used in the second-to-last line to check whether there exists an r with x_r = [s^e mod N].
ALGORITHM 11.28
An attack on plain RSA encryption
Input: Public key ⟨N, e⟩; ciphertext c
Output: m < 2^n such that m^e = c mod N
set T := 2^{αn}
for r = 1 to T:
    x_r := [c/r^e mod N]
sort the pairs {(r, x_r)}_{r=1}^{T} by their second component
for s = 1 to T:
    if x_r =? [s^e mod N] for some r
        return [r · s mod N]
We now sketch why the attack recovers m with high probability. Let c = m^e mod N. For appropriate choice of α > 1/2, it can be shown that if m is a uniform n-bit integer then with high probability there exist r, s with 1 < r ≤ s ≤ 2^{αn} for which m = r · s.
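Algorithm 11.28 can be sketched in a few lines. The version below is a variation rather than a literal transcription: it uses a hash table in place of sorting and binary search, takes the bound T as an explicit integer parameter, and runs on a toy modulus (65537 · 104729 with e = 3) chosen here only so the example finishes quickly. The function name is hypothetical, and Python 3.8 or later is assumed.

```python
def small_message_attack(N, e, c, T):
    """Search for r, s <= T with (r*s)^e = c mod N; return r*s mod N or None."""
    table = {}
    for r in range(1, T + 1):
        # x_r := [c / r^e mod N]; the inverse exists when gcd(r, N) = 1,
        # which holds here because T is smaller than both prime factors.
        x_r = (c * pow(pow(r, e, N), -1, N)) % N
        table[x_r] = r
    for s in range(1, T + 1):
        r = table.get(pow(s, e, N))      # is x_r = [s^e mod N] for some r?
        if r is not None:
            return (r * s) % N
    return None

N, e = 65537 * 104729, 3                 # toy public key (assumed)
m = 3001 * 331                           # a 20-bit message with two small factors
found = small_message_attack(N, e, pow(m, e, N), 1 << 12)
```

Here T = 2^12 corresponds to n = 20 and α = 0.6. The search performs about 2 · 2^12 modular operations instead of the 2^20 needed by brute force, and it fails (returns `None`) for messages that do not split into two factors of size at most T.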
The RSA-OAEP encryption scheme.
CPA-security. During encryption the sender computes
m′ := m∥0^{k_1}, s := m′ ⊕ G(r), t := r ⊕ H(s)
for uniform r; the ciphertext is [(s∥t)^e mod N]. If the attacker never queries r to G then, since we model G as a random function, the value G(r) is uniform from the attacker’s point of view and so m is masked with a uniform string just as in the one-time pad encryption scheme. Thus, if the attacker never queries r to G then no information about the message is leaked.
Can the attacker query r to G? Note that the value of r is itself masked by H(s). So the attacker has no information about r unless it first queries s to H. If the attacker does not query s to H then the attacker may get lucky and guess r anyway, but if we set the length of r (i.e., k_0) sufficiently long then the probability of this is negligible.
So, the only way for the attacker to learn anything about m is to first query s to H. This would require the attacker to compute s from the (uniform) ciphertext [(s∥t)^e mod N]. Note that computing s from [(s∥t)^e mod N] is not the RSA problem, which instead involves computing both s and t. Nevertheless, for the right settings of the parameters we can use Theorem 11.29 to show that recovering s enables recovery of t in polynomial time, and so recovering s is computationally infeasible if the RSA problem is hard.
Arguing CCA-security involves additional complications, but the basic idea is to show that every decryption-oracle query c made by the attacker falls into one of two categories: either the attacker obtained c by legally encrypting some message m (in which case the attacker learns nothing from the decryption query), or else decryption of c returns an error. This is a consequence of the fact that the receiver checks that the k_1 low-order bits of m̂ are 0 during decryption; if the attacker did not construct some ciphertext c by legally encrypting some message, the probability that this condition holds is negligible. The formal proof is complicated by the fact that the attacker’s decryption-oracle queries must be answered correctly without knowledge of the private key, which means there must be an efficient way to determine whether to return an error or not and, if not, what message to return. This is accomplished by looking at the adversary’s queries to the random oracles G, H.
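The padding transformation and the zero-padding check described above can be sketched as follows. This is only an illustration of the masking structure: the RSA exponentiation and the range check on m̂ are omitted, lengths are in bytes rather than bits, and SHAKE-256 with a one-byte prefix is an assumed stand-in for the random oracles G and H. All names and length choices are hypothetical.

```python
import hashlib
import secrets

L_BYTES, K1_BYTES, K0_BYTES = 16, 16, 16   # lengths of m, 0^{k_1}, and r (assumed)

def G(r):
    # Stand-in for G : {0,1}^{k_0} -> {0,1}^{l + k_1}
    return hashlib.shake_256(b"G" + r).digest(L_BYTES + K1_BYTES)

def H(s):
    # Stand-in for H : {0,1}^{l + k_1} -> {0,1}^{k_0}
    return hashlib.shake_256(b"H" + s).digest(K0_BYTES)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def oaep_pad(m):
    assert len(m) == L_BYTES
    r = secrets.token_bytes(K0_BYTES)       # uniform r
    m_prime = m + bytes(K1_BYTES)           # m' := m || 0^{k_1}
    s = xor(m_prime, G(r))                  # s := m' XOR G(r)
    t = xor(r, H(s))                        # t := r XOR H(s)
    return s + t                            # s || t, to be raised to e modulo N

def oaep_unpad(padded):
    s, t = padded[:L_BYTES + K1_BYTES], padded[L_BYTES + K1_BYTES:]
    r = xor(t, H(s))
    m_prime = xor(s, G(r))
    if m_prime[L_BYTES:] != bytes(K1_BYTES):
        return None                         # trailing zeros missing: reject
    return m_prime[:L_BYTES]
```

Flipping any bit of s∥t changes r or G(r), so the recovered m′ ends in the required zeros only with negligible probability; this is the check that the CCA-security argument relies on.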
Manger’s chosen-ciphertext attack on PKCS #1 v2.0. In 2001, James Manger showed a chosen-ciphertext attack against certain implementations of the RSA encryption scheme specified in PKCS #1 v2.0, even though what was specified was a variant of RSA-OAEP! Since Construction 11.36 is CCA-secure (assuming the RSA problem is hard), how is this possible?
Examining the decryption algorithm in Construction 11.36, note that there are two ways an error can occur: either m̂ ∈ Z*_N is too large, or m′ ∈ {0,1}^{l+k_1} does not have enough trailing 0s. In Construction 11.36, the receiver is supposed to return the same error (denoted ⊥) in either case. In some implementations, however, the receiver would output different errors depending on which step failed. This single bit of additional information enables an attacker to mount a chosen-ciphertext attack that recovers a message m in its entirety from an encryption of that message, using only about ∥N∥ queries to an oracle that leaks the error message upon decryption. This shows the importance of implementing cryptographic schemes exactly as specified, since the resulting proof and analysis may no longer apply if aspects of the scheme are changed.
Even if the same error is returned in both cases, an attacker can determine where the error occurs if the time to return the error is different. (This is a great example of how an attacker is not limited to examining the inputs/outputs of an algorithm, but can use side-channel information to attack a scheme.) Implementations must be careful to ensure that the time to return an error is identical regardless of where the error occurs.
11.5.5 *A CCA-Secure KEM in the Random-Oracle Model
We show here a construction of an RSA-based KEM that is CCA-secure in the random-oracle model. (Recall from Theorem 11.14 that any such construction can be used in conjunction with any CCA-secure private-key encryption scheme to give a CCA-secure public-key encryption scheme.) As compared to the RSA-OAEP scheme from the previous section, the main advantage is the simplicity of both the construction and its proof of security. Its main disadvantage is that it results in longer ciphertexts when encrypting short messages since it requires the KEM/DEM paradigm whereas RSA-OAEP does not. For encrypting long messages, however, RSA-OAEP would also be used as part of a hybrid encryption scheme, and would result in an encryption scheme having similar efficiency to what could be obtained using the KEM shown here.
The KEM we describe is included as part of the ISO/IEC 18033-2 standard for public-key encryption. In the scheme, the public key includes ⟨N, e⟩ as usual, and a function H : Z*_N → {0,1}^n is specified that will be modeled as a random oracle in the analysis. (This function can be based on some underlying cryptographic hash function, as discussed in Section 5.5. We omit the details.) To encapsulate a key, the sender chooses uniform r ∈ Z*_N and then computes the ciphertext c := [r^e mod N] and the key k := H(r). To decrypt a ciphertext c, the receiver simply recovers r in the usual way and then re-derives the same key k := H(r). See Construction 11.37.
CONSTRUCTION 11.37
Let GenRSA be as usual, and construct a KEM as follows:
• Gen: on input 1^n, run GenRSA(1^n) to compute (N, e, d). The public key is ⟨N, e⟩, and the private key is ⟨N, d⟩.
As part of key generation, a function H : Z*_N → {0,1}^n is specified, but we leave this implicit.
• Encaps: on input public key ⟨N, e⟩ and 1^n, choose a uniform r ∈ Z*_N. Output the ciphertext c := [r^e mod N] and the key k := H(r).
• Decaps: on input private key ⟨N, d⟩ and a ciphertext c ∈ Z*_N, compute r := [c^d mod N] and output the key k := H(r).
A CCA-secure KEM (in the random-oracle model).
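A minimal sketch of Construction 11.37 follows, assuming a toy modulus (65537 · 104729, far too small for real use), e = 3, and SHA-256 over a fixed-length encoding of r as a stand-in for H. These parameter choices and function names are illustrative assumptions; Python 3.8 or later is assumed.

```python
import hashlib
import secrets
from math import gcd

P, Q, E = 65537, 104729, 3                   # toy parameters (assumed, insecure)
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))            # private exponent

def H(r):
    # Stand-in for the random oracle H : Z*_N -> {0,1}^n
    return hashlib.sha256(r.to_bytes((N.bit_length() + 7) // 8, "big")).digest()

def encaps():
    while True:                              # rejection-sample uniform r in Z*_N
        r = secrets.randbelow(N)
        if r > 0 and gcd(r, N) == 1:
            break
    return pow(r, E, N), H(r)                # (c, k) = ([r^e mod N], H(r))

def decaps(c):
    r = pow(c, D, N)                         # r := [c^d mod N]
    return H(r)

c, k = encaps()
```

The key k would then be used with a private-key encryption scheme, following the KEM/DEM paradigm.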
CPA-security of the scheme is immediate. Indeed, the ciphertext c is equal to [r^e mod N] for uniform r ∈ Z*_N, and so the RSA assumption implies that an eavesdropper who observes c will be unable to compute r. This means, in turn, that the eavesdropper will not query r to H, and thus the value of the key k = H(r) remains uniform from the attacker’s point of view.
In fact, the above extends to show CCA-security as well. This is because answering a decapsulation-oracle query for any ciphertext c̃ ≠ c only involves evaluating H at some input [c̃^d mod N] = r̃ ≠ r. Thus, the attacker’s decapsulation-oracle queries do not reveal any additional information about the key H(r) encapsulated by the challenge ciphertext. (A formal proof is slightly more involved since we must show how it is possible to simulate the answers to decapsulation-oracle queries without knowledge of the private key. Nevertheless, this turns out not to be very difficult.)
THEOREM 11.38 If the RSA problem is hard relative to GenRSA and H is modeled as a random oracle, then Construction 11.37 is CCA-secure.
PROOF Let Π denote Construction 11.37, and let A be a probabilistic polynomial-time adversary. For convenience, and because this is the first proof where we use the full power of the random-oracle model, we explicitly describe the steps of experiment KEM^cca_{A,Π}(n):
1. GenRSA(1^n) is run to obtain (N, e, d). In addition, a random function H : Z*_N → {0,1}^n is chosen.
2. Uniform r ∈ Z*_N is chosen, and the ciphertext c := [r^e mod N] and key k := H(r) are computed.
3. A uniform bit b ∈ {0,1} is chosen. If b = 0 set k̂ := k. If b = 1 then choose a uniform k̂ ∈ {0,1}^n.
4. A is given pk = ⟨N, e⟩, c, and k̂, and may query H(·) (on any input) and the decapsulation oracle Decaps_{⟨N,d⟩}(·) on any ciphertext ĉ ≠ c.
5. A outputs a bit b′. The output of the experiment is defined to be 1 if b′ = b, and 0 otherwise.
In an execution of experiment KEM^cca_{A,Π}(n), let Query be the event that, at any point during its execution, A queries r to the random oracle H, and let ¬Query denote the complementary event. We let Success denote the event that b′ = b (i.e., the experiment outputs 1). Then
Pr[Success] = Pr[Success ∧ ¬Query] + Pr[Success ∧ Query]
≤ Pr[Success ∧ ¬Query] + Pr[Query],
where all probabilities are taken over the randomness used in experiment KEM^cca_{A,Π}(n). We show that Pr[Success ∧ ¬Query] ≤ 1/2 and that Pr[Query] is negligible. The theorem follows.
We first argue that Pr[Success ∧ ¬Query] ≤ 1/2. If Pr[¬Query] = 0 this is immediate. Otherwise, Pr[Success ∧ ¬Query] ≤ Pr[Success | ¬Query]. Now, conditioned on ¬Query, the value of the correct key k = H(r) is uniform because H is a random function. Consider A’s information about k in experiment KEM^cca_{A,Π}(n). The public key pk and ciphertext c, by themselves, do not contain any information about k. (They do uniquely determine r, but since H is chosen independently of anything else, this gives no information about H(r).) Queries that A makes to H also do not reveal any information about r, unless A queries r to H (in which case Query occurs); this, again, relies on the fact that H is a random function. Finally, queries that A makes to its decapsulation oracle only reveal H(r̃) for r̃ ≠ r. This follows from the fact that Decaps_{⟨N,d⟩}(c̃) = H(r̃) where r̃ = [c̃^d mod N], but c̃ ≠ c implies r̃ ≠ r. Once again, this and the fact that H is a random function mean that no information about H(r) is revealed unless Query occurs.
The above shows that, as long as Query does not occur, the value of the correct key k is uniform even given A’s view of the public key, ciphertext, and the answers to all its oracle queries. In that case, then, there is no way A can distinguish (any better than random guessing) whether k̂ is the correct key or a uniform, independent key. Therefore, Pr[Success | ¬Query] = 1/2.
We highlight that nowhere in the above argument did we rely on the fact that A is computationally bounded, and in fact Pr[Success ∧ ¬Query] ≤ 1/2 even if no computational restrictions are placed on A. This indicates part of the power of the random-oracle model.
To complete the proof of the theorem, we show
CLAIM 11.39 If the RSA problem is hard relative to GenRSA and H is modeled as a random oracle, then Pr[Query] is negligible.
To prove this, we construct an algorithm A′ that uses A as a subroutine. A′ is given an instance N, e, c of the RSA problem, and its goal is to compute r for which r^e = c mod N. To do so, it will run A, answering its queries to H and Decaps. Handling queries to H is simple, since A′ can just return a random value. Queries to Decaps are trickier, however, since A′ does not know the private key associated with the effective public key ⟨N, e⟩.
On further thought, however, decapsulation queries are also easy to answer since A′ can just return a random value here as well. That is, although the query Decaps(c̃) is supposed to be computed by first computing r̃ such that r̃^e = c̃ mod N and then evaluating H(r̃), the result is just a uniform value. Thus, A′ can simply return a random value without performing the intermediate computation. The only “catch” is that A′ must ensure consistency between its answers to H-queries and Decaps-queries; namely, it must ensure that for any r̃, c̃ with r̃^e = c̃ mod N it holds that H(r̃) = Decaps(c̃). This is handled using simple bookkeeping and lists L_H and L_Decaps that keep track of the answers A′ has given in response to the respective oracle queries. We now give the details.
Algorithm A′:
The algorithm is given (N, e, c) as input.
1. Initialize empty lists L_H, L_Decaps. Choose a uniform k ∈ {0,1}^n and store (c, k) in L_Decaps.
2. Choose a uniform bit b ∈ {0,1}. If b = 0 set k̂ := k. If b = 1 then choose a uniform k̂ ∈ {0,1}^n. Run A on ⟨N, e⟩, c, and k̂.
When A makes a query H(r̃), answer it as follows:
• If there is an entry in L_H of the form (r̃, k) for some k, return k.
• Otherwise, let c̃ := [r̃^e mod N]. If there is an entry in L_Decaps of the form (c̃, k) for some k, return k and store (r̃, k) in L_H.
• Otherwise, choose a uniform k ∈ {0,1}^n, return k, and store (r̃, k) in L_H.
When A makes a query Decaps(c̃), answer it as follows:
• If there is an entry in L_Decaps of the form (c̃, k) for some k, return k.
• Otherwise, for each entry (r̃, k) ∈ L_H, check if r̃^e = c̃ mod N and, if so, output k.
• Otherwise, choose a uniform k ∈ {0,1}^n, return k, and store (c̃, k) in L_Decaps.
3. At the end of A’s execution, if there is an entry (r, k) in L_H for which r^e = c mod N then return r.
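The bookkeeping performed by A′ can be sketched as a small class. This illustrates only the oracle simulation and the final extraction step; the adversary A, the bit b, and the key k̂ are left out. Dictionaries stand in for the lists L_H and L_Decaps, keys are byte strings of an assumed length, and the class and method names are hypothetical.

```python
import secrets

class Simulator:
    """Answers H- and Decaps-queries consistently without knowing d."""

    def __init__(self, N, e, c, key_bytes=16):
        self.N, self.e, self.c, self.key_bytes = N, e, c, key_bytes
        self.L_H = {}                                        # r~ -> k
        self.L_Decaps = {c: secrets.token_bytes(key_bytes)}  # step 1: store (c, k)

    def H(self, r):
        if r in self.L_H:
            return self.L_H[r]
        c_tilde = pow(r, self.e, self.N)                     # c~ := [r~^e mod N]
        k = self.L_Decaps.get(c_tilde)
        if k is None:
            k = secrets.token_bytes(self.key_bytes)          # fresh uniform answer
        self.L_H[r] = k
        return k

    def decaps(self, c_tilde):
        if c_tilde in self.L_Decaps:
            return self.L_Decaps[c_tilde]
        for r, k in self.L_H.items():                        # reuse an earlier H answer
            if pow(r, self.e, self.N) == c_tilde:
                return k
        k = secrets.token_bytes(self.key_bytes)
        self.L_Decaps[c_tilde] = k
        return k

    def solution(self):
        # Step 3: look for a queried r with r^e = c mod N.
        for r in self.L_H:
            if pow(r, self.e, self.N) == self.c:
                return r
        return None
```

The simulator never computes an e-th root: consistency between the two oracles is maintained purely by raising queried values to the public exponent.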
Clearly A′ runs in polynomial time, and the view of A when run as a subroutine by A′ in experiment RSA-inv_{A′,GenRSA}(n) is identical to the view of A in experiment KEM^cca_{A,Π}(n): the inputs given to A clearly have the right distribution, the answers to A’s oracle queries are consistent, and the responses to all H-queries are uniform and independent. Finally, A′ outputs the correct solution exactly when Query occurs. Hardness of the RSA problem relative to GenRSA thus implies that Pr[Query] is negligible, as required.
It is worth remarking on the various properties of the random-oracle model (see Section 5.5.1) that are used in the above proof. First, we rely on the fact that the value H(r) is uniform unless r is queried to H, even if H is queried on multiple other values r̃ ≠ r. We also, implicitly, use extractability to argue that the attacker cannot query r to H; otherwise, we could use this attacker to solve the RSA problem. Finally, the proof relies on programmability in order to simulate the adversary’s decapsulation-oracle queries.
11.5.6 RSA Implementation Issues and Pitfalls
We close this section with a brief discussion of some issues related to the implementation of RSA-based schemes, and some pitfalls to be aware of.
Using Chinese remaindering. In implementations of RSA-based encryption, the receiver can use the Chinese remainder theorem (Section 8.1.5) to speed up computation of eth roots modulo N during decryption. Specifically, let N = pq and say the receiver wishes to compute the eth root of some value y using d = [e^{−1} mod φ(N)]. The receiver can use the correspondence [y^d mod N] ↔ ([y^d mod p], [y^d mod q]) to compute the partial results
x_p := [y^d mod p] = [y^{[d mod (p−1)]} mod p] (11.19)
and
x_q := [y^d mod q] = [y^{[d mod (q−1)]} mod q], (11.20)
and then combine these to obtain x ↔ (x_p, x_q), as discussed in Section 8.1.5. Note that [d mod (p − 1)] and [d mod (q − 1)] could be pre-computed since they are independent of y.
Why is this better? Assume exponentiation modulo an l-bit integer takes γ · l^3 operations for some constant γ. If p, q are each n bits long, then naively computing [y^d mod N] takes γ · (2n)^3 = 8γ · n^3 steps (because ∥N∥ = 2n). Using Chinese remaindering reduces this to roughly 2 · (γ · n^3) steps (because ∥p∥ = ∥q∥ = n), or roughly 1/4 of the time.
Example 11.40
We revisit Example 8.49. Recall that N = 143 = 11 · 13 and d = 103, and y = 64 there. To calculate [64^103 mod 143] we compute
([64 mod 11], [64 mod 13])^103 = ([(−2)^103 mod 11], [(−1)^103 mod 13])
= ([(−2)^{[103 mod 10]} mod 11], −1)
= ([−8 mod 11], −1) = (3, −1).
We can compute 1_p = 78 ↔ (1, 0) and 1_q = 66 ↔ (0, 1), as discussed in Section 8.1.5. (Note these values can be pre-computed, as they are independent of y.) Then (3, −1) ↔ 3 · 1_p − 1_q = 3 · 78 − 66 = 168 = 25 mod 143, in agreement with the answer previously obtained. ♦
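The computation in this example can be reproduced with a short sketch of CRT-based exponentiation. The function name is hypothetical, and the values 1_p and 1_q are recomputed on each call for brevity, although in practice they would be pre-computed. Python 3.8 or later is assumed.

```python
def crt_power(y, d, p, q):
    """Compute [y^d mod pq] from the two half-size exponentiations."""
    d_p, d_q = d % (p - 1), d % (q - 1)   # independent of y, pre-computable
    x_p = pow(y % p, d_p, p)              # Equation (11.19)
    x_q = pow(y % q, d_q, q)              # Equation (11.20)
    one_p = q * pow(q, -1, p)             # 1_p <-> (1, 0)
    one_q = p * pow(p, -1, q)             # 1_q <-> (0, 1)
    return (x_p * one_p + x_q * one_q) % (p * q)

x = crt_power(64, 103, 11, 13)
```

For these inputs the intermediate values are x_p = 3 and x_q = 12 (that is, −1 mod 13), with 1_p = 78 and 1_q = 66 as in the example.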
A fault attack when using Chinese remaindering. When using Chinese remaindering as just described, one should be aware of a potential attack that can be carried out if faults occur (or can be induced to occur by an attacker, e.g., by hardware tampering) during the course of the computation.
Consider what happens if [y^d mod N] is computed twice: the first time with no error (giving the correct result x), but the second time with an error during computation of Equation (11.20) but not Equation (11.19) (the same attack applies in the opposite case). The second computation yields an incorrect result x′ for which x′ = x mod p but x′ ≠ x mod q. This means that p | (x′ − x) but q ∤ (x′ − x). But then gcd(x′ − x, N) = p, yielding the factorization of N.
One possible countermeasure is to verify correctness of the result before using it, by checking that x^e = y mod N. (Since ∥e∥ ≪ ∥d∥, using Chinese remaindering still gives better efficiency.) This is recommended in hardware implementations.
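The fault attack and the countermeasure can be illustrated with the toy parameters of Example 11.40 (N = 143 = 11 · 13, d = 103, y = 64), together with e = 7, which satisfies 7 · 103 = 1 mod 120. The fault is simulated here by adding 1 to x_q; this is an assumption for illustration, since a real fault could corrupt x_q arbitrarily.

```python
from math import gcd

p, q, e, d, y = 11, 13, 7, 103, 64
N = p * q
one_p = q * pow(q, -1, p)                    # 1_p <-> (1, 0)
one_q = p * pow(p, -1, q)                    # 1_q <-> (0, 1)

def combine(x_p, x_q):
    return (x_p * one_p + x_q * one_q) % N

x_p = pow(y, d % (p - 1), p)                 # Equation (11.19), computed correctly
x_q = pow(y, d % (q - 1), q)                 # Equation (11.20), computed correctly
x = combine(x_p, x_q)                        # correct result
x_faulty = combine(x_p, (x_q + 1) % q)       # simulated fault in (11.20) only

factor = gcd(x_faulty - x, N)                # reveals p

def checked(result):
    # Countermeasure: release the result only if result^e = y mod N.
    return result if pow(result, e, N) == y else None
```

With the check in place the faulty value is never released, so the attacker cannot form the difference x′ − x.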
Dependent public keys I. When multiple receivers wish to utilize the same encryption scheme, they should use independent public keys. This and the following attack demonstrate what can go wrong when this is not done.
Imagine a company wants to use the same modulus N for each of its employees. Since it is not desirable for messages encrypted to one employee to be read by any other employee, the company issues different (e_i, d_i) pairs to each employee. That is, the public key of the ith employee is pk_i = ⟨N, e_i⟩ and their private key is sk_i = ⟨N, d_i⟩, where e_i · d_i = 1 mod φ(N) for all i.
This approach is insecure and allows any employee to read messages encrypted to all other employees. The reason is that, as noted in Section 8.2.4, given N and e_i, d_i with e_i · d_i = 1 mod φ(N), the factorization of N can be efficiently computed. Given the factorization of N, of course, it is possible to compute d_j := [e_j^{−1} mod φ(N)] for any j.
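The text does not spell out how the factorization is obtained from (N, e_i, d_i). The sketch below uses the standard technique of searching for a nontrivial square root of 1 modulo N, which is assumed here and not taken from this section; it scans bases deterministically, whereas a random base is the usual choice. The toy key is the one from Example 11.27, and the function name is hypothetical.

```python
from math import gcd

def factor_from_exponents(N, e, d):
    """Factor N = pq given e, d with e*d = 1 mod phi(N)."""
    k = e * d - 1                      # a multiple of phi(N)
    t, u = 0, k
    while u % 2 == 0:                  # write k = 2^t * u with u odd
        u //= 2
        t += 1
    for a in range(2, N):
        if gcd(a, N) > 1:              # lucky case: a shares a factor with N
            p = gcd(a, N)
            return p, N // p
        x = pow(a, u, N)
        for _ in range(t):
            y = (x * x) % N
            if y == 1 and x not in (1, N - 1):
                p = gcd(x - 1, N)      # x is a nontrivial square root of 1
                return p, N // p
            x = y
    return None

p, q = factor_from_exponents(391, 3, 235)     # employee i knows (N, e_i, d_i)
d_j = pow(5, -1, (p - 1) * (q - 1))           # private exponent for e_j = 5
```

Once the factors are known, any other employee's private exponent follows from a single modular inversion.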
Dependent public keys II. The attack just shown allows any employee to decrypt messages sent to any other employee. This still leaves the possibility that sharing the modulus N is fine as long as all employees trust each other (or, alternatively, as long as confidentiality need only be preserved against outsiders but not against other members of the company). Here we show a scenario indicating that sharing a modulus is still a bad idea, at least when plain RSA encryption is used.
Say the same message m is encrypted and sent to two different (known) employees with public keys (N, e_1) and (N, e_2) where e_1 ≠ e_2. Assume further that gcd(e_1, e_2) = 1. Then an eavesdropper sees the two ciphertexts
c_1 = m^{e_1} mod N and c_2 = m^{e_2} mod N.
Since gcd(e_1, e_2) = 1, there exist integers X, Y such that X·e_1 + Y·e_2 = 1 by Proposition 8.2. Moreover, given the public exponents e_1 and e_2 it is possible to efficiently compute X and Y using the extended Euclidean algorithm (see Appendix B.1.2). We claim that m = [c_1^X · c_2^Y mod N], which can easily be calculated. This is true because
c_1^X · c_2^Y = m^{X·e_1} · m^{Y·e_2} = m^{X·e_1 + Y·e_2} = m^1 = m mod N.
A similar attack applies when using padded RSA or RSA-OAEP if the sender uses the same transformed message m̂ when encrypting to two users.
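The common-modulus attack on plain RSA can be sketched as follows, using the toy modulus N = 391 with exponents 3 and 5 (chosen here for illustration). One of X, Y is negative, so the corresponding ciphertext must be inverted modulo N; Python 3.8 or later handles this via pow with a negative exponent. Function names are hypothetical.

```python
def egcd(a, b):
    """Return (g, X, Y) with a*X + b*Y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def common_modulus_attack(N, e1, e2, c1, c2):
    g, X, Y = egcd(e1, e2)
    assert g == 1, "attack as described requires gcd(e1, e2) = 1"
    # A negative exponent means a modular inverse, which exists when the
    # ciphertext is invertible modulo N.
    return (pow(c1, X, N) * pow(c2, Y, N)) % N

N, e1, e2, m = 391, 3, 5, 158
recovered = common_modulus_attack(N, e1, e2, pow(m, e1, N), pow(m, e2, N))
```

The eavesdropper uses only the two public exponents and the two ciphertexts.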
Randomness quality in RSA key generation. Throughout this book, we always assume that honest parties have access to sufficient, high-quality randomness. When this assumption is violated then security may fail to hold. In particular, if an l-bit string is chosen from some set S ⊂ {0, 1}l rather than uniformly from {0, 1}l, then an attacker can perform a brute-force search (in time O(|S|)) to attack the system.
In some cases the situation may be even worse. Consider in particular the case of RSA key generation, where one sample of random bits $r_p$ is used to choose the first prime p, and a second sample $r_q$ is used to generate the second prime q. Assume further that many public/private keys are generated using the same source of poor-quality randomness, in which $r_p, r_q$ are chosen uniformly from some set S of size $2^s$. After generating roughly $2^{s/2}$ public keys (see Appendix A.4), we expect to obtain two different moduli N, N′ that were generated using identical randomness $r_p = r_p'$. These two moduli share a prime factor which can be easily found by computing gcd(N, N′). An attacker can thus scrape the Internet for a large set of RSA public keys, compute their pairwise gcd’s, and hope to factor some subset of them. Although computing pairwise gcd’s of $2^{s/2}$ moduli would naively take time $O(2^s)$, it turns out that this can be significantly improved using a “divide-and-conquer” approach that is beyond the scope of this book. The upshot is that an attacker can perform a brute-force search in time much less than $2^s$. Moreover, the attack works even if the set S is unknown to the attacker!
The above scenario was verified experimentally by two research teams working independently, who carried out exactly the above attack on public keys scraped over the Internet and were able to successfully factor a significant fraction of the keys they found.
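As a small illustration of the shared-prime attack, the following Python sketch runs the naive pairwise-gcd computation on a list of moduli. It is only the quadratic-time version mentioned above, not the faster divide-and-conquer method; the function name `factor_shared` and the toy moduli are assumptions made for this example.

```python
from itertools import combinations
from math import gcd


def factor_shared(moduli):
    """Return {N: (p, q)} for every modulus that shares a prime with another.

    Naive approach: one gcd per pair, so quadratic in the number of moduli.
    """
    factored = {}
    for n1, n2 in combinations(moduli, 2):
        p = gcd(n1, n2)
        # A nontrivial gcd (neither 1 nor a whole modulus) is a shared prime.
        if 1 < p < min(n1, n2):
            factored[n1] = (p, n1 // p)
            factored[n2] = (p, n2 // p)
    return factored


# Toy moduli: the first two were "generated" with the same first prime 101.
moduli = [101 * 103, 101 * 107, 109 * 113]
broken = factor_shared(moduli)
```

The attacker never needs to know the set S of bad random strings; a repeated prime reveals itself through the gcd.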
References and Additional Reading
The idea of public-key encryption was first proposed in the open literature by Diffie and Hellman [58]. Rivest, Shamir, and Adleman [148] introduced the RSA assumption and proposed a public-key encryption scheme based on this assumption. As pointed out in the previous chapter, other pioneers of public-key cryptography include Merkle and Rabin (in academic publications) and Ellis, Cocks, and Williamson (in classified publications).
Definition 11.2 is rooted in the seminal work of Goldwasser and Micali [80], who were also the first to recognize the necessity of probabilistic encryption for satisfying this definition. As noted in Chapter 4, chosen-ciphertext attacks were first formally defined by Naor and Yung [129] and Rackoff and Simon [147]. The expository article by Shoup [156] discusses the importance of security against chosen-ciphertext attacks. Bellare et al. give a unified, modern treatment of various security notions for public-key encryption [16].
A proof of CPA-security for hybrid encryption was first given by Blum and Goldwasser [36]. The case of CCA-security was treated in [56].
Somewhat amazingly, the El Gamal encryption scheme [70] was not suggested until 1984, even though it can be viewed as a direct transformation of the Diffie–Hellman key-exchange protocol (see Exercise 11.4). DHIES was introduced in [2]. The ISO/IEC 18033-2 standard for public-key encryption can be found at http://www.shoup.net/iso.
Plain RSA encryption corresponds to the original scheme introduced by Rivest, Shamir, and Adleman [148]. The attacks on plain RSA encryption described in Section 11.5.1 are due to [161, 55, 84, 47, 40]; see [120, Chapter 8] and [38] for additional attacks and further information. Proofs of Coppersmith’s theorem can be found in the original work [46] or several subsequent expositions (e.g., [119]).
The PKCS #1 RSA Cryptography Standards (both previous and current versions) are available at http://www.emc.com/emc-plus/rsa-labs. The chosen-plaintext attack on PKCS #1 v1.5 described here is due to [49]. A description of Bleichenbacher’s chosen-ciphertext attack on PKCS #1 v1.5 can be found in the original paper [34]. See [12] for subsequent improvements.
Proofs of Theorem 11.31, and generalizations, can be found in [8, 86, 66, 7]. See Section 13.1.2 for a general treatment of schemes of this form. Construction 11.37 appears to have been introduced and first analyzed by Shoup [157]. OAEP was introduced by Bellare and Rogaway [22]. The original proof of OAEP was later found to be flawed; the interested reader is referred to [39, 158, 68]. For details of Manger’s chosen-ciphertext attack on implementations of PKCS #1 v2.0, see [117].
The pairwise-gcd attack described in Section 11.5.6 was carried out by Lenstra et al. [112] and Heninger et al. [88].
When using any encryption scheme in practice, the question arises as to what key length to use. This issue should not be taken lightly, and we refer the reader to Section 9.3 and references therein for an in-depth treatment.
The first efficient CCA-secure public-key encryption scheme not relying on the random-oracle model was shown by Cramer and Shoup [50] based on the DDH assumption. Subsequently, Hofheinz and Kiltz have shown an efficient CCA-secure scheme without random oracles based on the RSA assumption [92].
Exercises
11.1 Assume a public-key encryption scheme for single-bit messages with no decryption error. Show that, given pk and a ciphertext c computed via $c \leftarrow \mathrm{Enc}_{pk}(m)$, it is possible for an unbounded adversary to determine m with probability 1.
11.2 Show that for any CPA-secure public-key encryption scheme for single-bit messages, the length of the ciphertext must be superlogarithmic in the security parameter.
Hint: If not, the range of possible ciphertexts has polynomial size.
11.3 Say a public-key encryption scheme (Gen, Enc, Dec) for n-bit messages is one-way if any ppt adversary A has negligible probability of success in the following experiment:
• $\mathrm{Gen}(1^n)$ is run to obtain keys (pk, sk).
• A message $m \in \{0,1\}^n$ is chosen uniformly at random, and a ciphertext $c \leftarrow \mathrm{Enc}_{pk}(m)$ is computed.
• A is given pk and c, and outputs a message m′. We say A succeeds if m′ = m.
(a) Give a construction of a CPA-secure KEM in the random-oracle model based on any one-way public-key encryption scheme.
(b) Can a deterministic public-key encryption scheme be one-way? If not, prove impossibility; if so, give a construction based on any of the assumptions introduced in this book.
11.4 Show that any two-round key-exchange protocol (that is, where each party sends a single message) satisfying Definition 10.1 can be converted into a CPA-secure public-key encryption scheme.
11.5 Show that Claim 11.7 does not hold in the setting of CCA-security.
11.6 Consider the following public-key encryption scheme. The public key is (G,q,g,h) and the private key is x, generated exactly as in the El Gamal encryption scheme. In order to encrypt a bit b, the sender does the following:
(a) If b = 0 then choose a uniform $y \in \mathbb{Z}_q$ and compute $c_1 := g^y$ and $c_2 := h^y$. The ciphertext is $\langle c_1, c_2 \rangle$.
(b) If b = 1 then choose independent uniform $y, z \in \mathbb{Z}_q$, compute $c_1 := g^y$ and $c_2 := g^z$, and set the ciphertext equal to $\langle c_1, c_2 \rangle$.
Show that it is possible to decrypt efficiently given knowledge of x. Prove that this encryption scheme is CPA-secure if the decisional Diffie–Hellman problem is hard relative to G.
11.7 Consider the following variant of El Gamal encryption. Let p = 2q + 1, let G be the group of squares modulo p (so G is a subgroup of $\mathbb{Z}_p^*$ of order q), and let g be a generator of G. The private key is (G, g, q, x) and the public key is (G, g, q, h), where $h = g^x$ and $x \in \mathbb{Z}_q$ is chosen uniformly. To encrypt a message $m \in \mathbb{Z}_q$, choose a uniform $r \in \mathbb{Z}_q$, compute $c_1 := g^r \bmod p$ and $c_2 := h^r + m \bmod p$, and let the ciphertext be $\langle c_1, c_2 \rangle$. Is this scheme CPA-secure? Prove your answer.
11.8 Consider the following protocol for two parties A and B to flip a fair coin (more complicated versions of this might be used for Internet gambling): (1) a trusted party T publishes her public key pk; (2) then A chooses a uniform bit $b_A$, encrypts it using pk, and announces the ciphertext $c_A$ to B and T; (3) next, B acts symmetrically and announces a ciphertext $c_B \neq c_A$; (4) T decrypts both $c_A$ and $c_B$, and the parties XOR the results to obtain the value of the coin.
(a) Argue that even if A is dishonest (but B is honest), the final value of the coin is uniformly distributed.
(b) Assume the parties use El Gamal encryption (where the bit b is encoded as the group element $g^b$ before being encrypted; note that efficient decryption is still possible). Show how a dishonest B can bias the coin to any value he likes.
(c) Suggest what type of encryption scheme would be appropriate to use here. Can you define an appropriate notion of security and prove that your suggestion achieves this definition?
11.9 Prove formally that the El Gamal encryption scheme is not CCA-secure.
11.10 In Section 11.4.4 we showed that El Gamal encryption is malleable, and specifically that given a ciphertext $\langle c_1, c_2 \rangle$ that is the encryption of some unknown message m, it is possible to produce a ciphertext $\langle c_1, c_2' \rangle$ that is the encryption of $\alpha \cdot m$ (for known $\alpha$). A receiver who receives both these ciphertexts might be suspicious since both ciphertexts share the first component. Show that it is possible to generate $\langle c_1', c_2' \rangle$ that is the encryption of $\alpha \cdot m$, with $c_1' \neq c_1$ and $c_2' \neq c_2$.
11.11 Prove Theorem 11.22.
11.12 One of the attacks on plain RSA discussed in Section 11.5.1 involves a sender who encrypts two related messages using the same public key. Formulate an appropriate definition of security ruling out such attacks, and show that any CPA-secure public-key encryption scheme satisfies your definition.
11.13 One of the attacks on plain RSA discussed in Section 11.5.1 involves a sender who encrypts the same message to three different receivers. Formulate an appropriate definition of security ruling out such attacks, and show that any CPA-secure public-key encryption scheme satisfies your definition.
11.14 Consider the following modified version of padded RSA encryption: Assume messages to be encrypted have length exactly $\|N\|/2$. To encrypt, first compute $\hat{m} := \texttt{0x00} \| r \| \texttt{0x00} \| m$ where r is a uniform string of length $\|N\|/2 - 16$. Then compute the ciphertext $c := [\hat{m}^e \bmod N]$. When decrypting a ciphertext c, the receiver computes $\hat{m} := [c^d \bmod N]$ and returns an error if $\hat{m}$ does not consist of 0x00 followed by $\|N\|/2 - 16$ arbitrary bits followed by 0x00. Show that this scheme is not CCA-secure. Why is it easier to construct a chosen-ciphertext attack on this scheme than on PKCS #1 v1.5?
11.15 Consider the RSA-based encryption scheme in which a user encrypts a message $m \in \{0,1\}^\ell$ with respect to the public key $\langle N, e \rangle$ by computing $\hat{m} := H(m) \| m$ and outputting the ciphertext $[\hat{m}^e \bmod N]$. (Here, let $H : \{0,1\}^\ell \to \{0,1\}^n$ and assume $\ell + n < \|N\|$.) The receiver recovers $\hat{m}$ in the usual way and verifies that it has the correct form before outputting the $\ell$ least-significant bits as m. Prove or disprove that this scheme is CCA-secure if H is modeled as a random oracle.
11.16 Show a chosen-ciphertext attack on Construction 11.34.
11.17 Let Π = (Gen, Enc, Dec) be a CPA-secure public-key encryption scheme, and let Π′ = (Gen′, Enc′, Dec′) be a CCA-secure private-key encryption scheme. Consider the following construction:
CONSTRUCTION 11.41
Let $H : \{0,1\}^n \to \{0,1\}^n$ be a function. Construct a public-key encryption scheme as follows:
• Gen∗: on input $1^n$, run $\mathrm{Gen}(1^n)$ to obtain (pk, sk). Output these as the public and private keys, respectively.
• Enc∗: on input a public key pk and a message $m \in \{0,1\}^n$, choose a uniform $r \in \{0,1\}^n$ and output the ciphertext
$$\langle \mathrm{Enc}_{pk}(r),\ \mathrm{Enc}'_{H(r)}(m) \rangle.$$
• Dec∗: on input a private key sk and a ciphertext $\langle c_1, c_2 \rangle$, compute $r := \mathrm{Dec}_{sk}(c_1)$ and set $k := H(r)$. Then output $\mathrm{Dec}'_k(c_2)$.
Does the above construction have indistinguishable encryptions under a chosen-ciphertext attack, if H is modeled as a random oracle? If yes, provide a proof. If not, where does the approach used to prove Theorem 11.38 break down?
11.18 Consider the following variant of Construction 11.32:
CONSTRUCTION 11.42
Let GenRSA be as usual, and define a public-key encryption scheme as follows:
• Gen: on input $1^n$, run $\mathrm{GenRSA}(1^n)$ to obtain (N, e, d). Output the public key $pk = \langle N, e \rangle$, and the private key $sk = \langle N, d \rangle$.
• Enc: on input a public key $pk = \langle N, e \rangle$ and a message $m \in \{0,1\}$, choose a uniform $r \in \mathbb{Z}_N^*$. Output the ciphertext $\langle [r^e \bmod N],\ \mathrm{lsb}(r) \oplus m \rangle$.
• Dec: on input a private key $sk = \langle N, d \rangle$ and a ciphertext $\langle c, b \rangle$, compute $r := [c^d \bmod N]$ and output $\mathrm{lsb}(r) \oplus b$.
Prove that this scheme is CPA-secure. Discuss its advantages and disadvantages relative to Construction 11.32.
11.19 Say three users have RSA public keys $\langle N_1, 3 \rangle$, $\langle N_2, 3 \rangle$, and $\langle N_3, 3 \rangle$ (i.e., they all use e = 3), with $N_1 < N_2 < N_3$. Consider the following method for sending the same message $m \in \{0,1\}^\ell$ to each of these parties: choose a uniform $r \leftarrow \mathbb{Z}_{N_1}^*$, and send to everyone the same ciphertext
$$\langle [r^3 \bmod N_1],\ [r^3 \bmod N_2],\ [r^3 \bmod N_3],\ H(r) \oplus m \rangle,$$
where $H : \mathbb{Z}_{N_1}^* \to \{0,1\}^\ell$. Assume $\ell \gg n$.
(a) Show that this is not CPA-secure, and an adversary can recover m from the ciphertext even when H is modeled as a random oracle.
Hint: See Section 11.5.1.
(b) Show a simple way to fix this and get a CPA-secure method that transmits a ciphertext of length $3\ell + O(n)$.
(c) Show a better approach that is still CPA-secure but with a ciphertext of length $\ell + O(n)$.
11.20 Fix an RSA public key $\langle N, e \rangle$ and assume we have an algorithm A that always correctly computes lsb(x) given $[x^e \bmod N]$. Write full pseudocode for an algorithm A′ that computes x from $[x^e \bmod N]$.
11.21 Fix an RSA public key $\langle N, e \rangle$ and define
0 if0
(c) Suggest how to modify the scheme so as to obtain a one-time-secure signature scheme.
Hint: Include two values y, y′ in the public key.
12.9 A strong one-time-secure signature scheme satisfies the following (informally): given a signature σ′ on a message m′, it is infeasible to output $(m, \sigma) \neq (m', \sigma')$ for which σ is a valid signature on m (note that m = m′ is allowed).
(a) Give a formal definition of strong one-time-secure signatures.
(b) Assuming the existence of one-way functions, show a one-way function for which Lamport’s scheme is not a strong one-time-secure signature scheme.
(c) Construct a strong one-time-secure signature scheme based on any assumption used in this book.
Hint: Use a particular one-way function in Lamport’s scheme.
12.10 Consider the Lamport signature scheme. Describe an adversary who obtains signatures on two messages of its choice and can then forge signatures on any message it likes.
12.11 The Lamport scheme uses $2\ell$ values in the public key to sign messages of length $\ell$. Consider the variant in which the private key contains $2\ell$ values $x_1, \ldots, x_{2\ell}$ and the public key contains the values $y_1, \ldots, y_{2\ell}$ with $y_i := f(x_i)$. A message $m \in \{0,1\}^{\ell'}$ is mapped in a one-to-one fashion to a subset $S_m \subset \{1, \ldots, 2\ell\}$ of size $\ell$. To sign m, the signer reveals $\{x_i\}_{i \in S_m}$. Prove that this gives a one-time-secure signature scheme. What is the maximum message length $\ell'$ that this scheme supports?
12.12 At the end of Section 12.6.3, we show how a pseudorandom function can be used to make Construction 12.20 stateless. Does a similar approach work for the chain-based scheme described in Section 12.6.2? If so, sketch a construction and proof. If not, explain why and modify the scheme to obtain a stateless variant.
12.13 Prove Theorem 12.22.
12.14 Assume revocation of certificates is handled in the following way: when a user Bob claims that the private key corresponding to his public key pkB has been stolen, the user sends to the CA a statement of this fact signed with respect to pkB. Upon receiving such a signed message, the CA revokes the appropriate certificate.
Explain why it is not necessary for the CA to check Bob’s identity in this case. In particular, explain why it is of no concern that an adversary who has stolen Bob’s private key can forge signatures with respect to pkB.
Chapter 13
*Advanced Topics in Public-Key Encryption
In Chapter 11 we saw several examples of public-key encryption schemes used in practice. Here, we explore some schemes that are currently more of theoretical interest, although in some cases it is possible that these schemes (or variants thereof) will be used more widely in the future.
We begin with a treatment of trapdoor permutations, a generalization of one-way permutations, and show how to use them to construct public-key encryption schemes. Trapdoor permutations neatly encapsulate the key characteristics of the RSA permutation that make it so useful. As such, they often provide a useful abstraction for designing new cryptosystems.
Next, we present three schemes based on problems related to factoring:
• The Paillier encryption scheme is an example of an encryption scheme that is homomorphic. This property turns out to be useful for constructing more-complex cryptographic protocols, something we touch on briefly in Section 13.3.
• The Goldwasser–Micali encryption scheme is of historical interest as the first scheme to be proven CPA-secure. It is also homomorphic, and uses some interesting number theory that can be applied in other contexts.
• Finally, we discuss the Rabin trapdoor permutation, which can be used to construct a public-key encryption scheme. Although superficially similar to the RSA trapdoor permutation, the Rabin trapdoor permutation is distinguished by the fact that its security is based directly on the hardness of factoring. (Recall from Section 8.2.5 that hardness of the RSA problem appears to be a stronger assumption.)
13.1 Encryption from Trapdoor Permutations
In Section 11.5.3 we saw how to construct a CPA-secure public-key encryption scheme based on the RSA assumption. By distilling those properties of RSA that are used in the construction, and defining an abstract notion that encapsulates those properties, we obtain a general template for constructing secure encryption schemes based on any primitive satisfying the same set of properties. Trapdoor permutations turn out to be the “right” abstraction here.
In the following section we define (families of) trapdoor permutations and observe that the RSA family of one-way permutations (Construction 8.77) satisfies the additional requirements needed to be a family of trapdoor permutations. In Section 13.1.2 we generalize the construction from Section 11.5.3 and show that public-key encryption can be constructed from any trapdoor permutation. These results will be used again in Section 13.5, where we show a second example of a trapdoor permutation, this time based directly on the factoring assumption.
In this section we rely on the material from Section 8.4.1 or, alternately, Chapter 7.
13.1.1 Trapdoor Permutations
Recall the definitions of families of functions and families of one-way permutations from Section 8.4.1. In that section, we showed that the RSA assumption naturally gives rise to a family of one-way permutations. The astute reader may have noticed that the construction we gave (Construction 8.77) has a special property that was not remarked upon there: namely, the parameter-generation algorithm Gen outputs some additional information along with I that enables efficient inversion of $f_I$. We refer to such additional information as a trapdoor, and call families of one-way permutations with this additional property families of trapdoor permutations. A formal definition follows.
DEFINITION 13.1 A tuple of polynomial-time algorithms (Gen, Samp, f, Inv) is a family of trapdoor permutations (or a trapdoor permutation) if:
• The probabilistic parameter-generation algorithm Gen, on input $1^n$, outputs (I, td) with $|I| \geq n$. Each value of I defines a set $D_I$ that constitutes the domain and range of a permutation (i.e., bijection) $f_I : D_I \to D_I$.
• Let $\mathrm{Gen}_1$ denote the algorithm that results by running Gen and outputting only I. Then $(\mathrm{Gen}_1, \mathrm{Samp}, f)$ is a family of one-way permutations.
• Let (I, td) be an output of $\mathrm{Gen}(1^n)$. The deterministic inverting algorithm Inv, on input td and $y \in D_I$, outputs $x \in D_I$. We denote this by $x := \mathrm{Inv}_{td}(y)$. It is required that with all but negligible probability over (I, td) output by $\mathrm{Gen}(1^n)$ and uniform choice of $x \in D_I$, we have
$$\mathrm{Inv}_{td}(f_I(x)) = x.$$
As shorthand, we drop explicit mention of Samp and simply refer to trapdoor permutation (Gen, f, Inv). For (I, td) output by Gen we write $x \leftarrow D_I$ to denote uniform selection of $x \in D_I$ (with the understanding that this is done by algorithm Samp).
The second condition above implies that $f_I$ cannot be efficiently inverted without td, but the final condition means that $f_I$ can be efficiently inverted with td. It is immediate that Construction 8.77 can be modified to give a family of trapdoor permutations if the RSA problem is hard relative to GenRSA, and so we refer to that construction as the RSA trapdoor permutation.
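To make the four algorithms of Definition 13.1 concrete, here is a toy Python sketch of the RSA trapdoor permutation. The function names `gen`, `samp`, `f`, and `inv`, and the fixed small primes, are assumptions for illustration only; a real instantiation would run GenRSA with large random primes.

```python
import random
from math import gcd


def gen(p=61, q=53, e=17):
    """Toy parameter generation: return (I, td) with I = (N, e), td = (N, d)."""
    N, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # trapdoor: inverse of e modulo phi(N) (Python 3.8+)
    return (N, e), (N, d)


def samp(I):
    """Sample a uniform element of the domain D_I = Z_N^* by rejection."""
    N, _ = I
    while True:
        x = random.randrange(1, N)
        if gcd(x, N) == 1:
            return x


def f(I, x):
    """Evaluate the permutation f_I(x) = x^e mod N."""
    N, e = I
    return pow(x, e, N)


def inv(td, y):
    """Invert using the trapdoor: Inv_td(y) = y^d mod N."""
    N, d = td
    return pow(y, d, N)
```

With these toy parameters one can check exhaustively that `inv(td, f(I, x)) == x` for every x in the domain, which is the third condition of the definition. The one-wayness condition, of course, cannot hold for a modulus this small.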
13.1.2 Public-Key Encryption from Trapdoor Permutations
We now sketch how a public-key encryption scheme can be constructed from an arbitrary family of trapdoor permutations. The construction is simply a generalization of what was already done for the specific RSA trapdoor permutation in Section 11.5.3.
We begin by (re-)introducing the notion of a hard-core predicate. This is the natural adaptation of Definition 7.4 to our context, and also generalizes our previous discussion of one specific hard-core predicate for the RSA trapdoor permutation in Section 11.5.3.
DEFINITION 13.2 Let Π = (Gen, f, Inv) be a family of trapdoor permutations, and let hc be a deterministic polynomial-time algorithm that, on input I and $x \in D_I$, outputs a single bit $\mathrm{hc}_I(x)$. We say that hc is a hard-core predicate of Π if for every probabilistic polynomial-time algorithm A there is a negligible function negl such that
$$\Pr[A(I, f_I(x)) = \mathrm{hc}_I(x)] \leq \frac{1}{2} + \mathrm{negl}(n),$$
where the probability is taken over the experiment in which $\mathrm{Gen}(1^n)$ is run to generate (I, td) and then x is chosen uniformly from $D_I$.
The asymmetry provided by trapdoor permutations implies that anyone who knows the trapdoor td associated with I can recover x from fI(x) and thus compute hcI(x) from fI(x). But given only I, it is infeasible to compute hcI (x) from fI (x) for a uniform x.
The following can be proved by a suitable modification of Theorem 7.5:
THEOREM 13.3 Given a family of trapdoor permutations Π, there is a family of trapdoor permutations $\hat{\Pi}$ with a hard-core predicate hc for $\hat{\Pi}$.
Given a family of trapdoor permutations Π = (Gen, f, Inv) with hard-core predicate hc, we can construct a single-bit encryption scheme via the following approach (see Construction 13.4 below, and compare to Construction 11.32): To generate keys, run $\mathrm{Gen}(1^n)$ to obtain (I, td); the public key is I and the private key is td. Given a public key I, encryption of a message $m \in \{0,1\}$ works by choosing uniform $r \in D_I$ subject to the constraint that $\mathrm{hc}_I(r) = m$, and then setting the ciphertext equal to $f_I(r)$. In order to decrypt, the receiver uses td to recover r from $f_I(r)$ and then outputs the message $m := \mathrm{hc}_I(r)$.
CONSTRUCTION 13.4
Let Π = (Gen, f, Inv) be a family of trapdoor permutations with hard-core predicate hc. Define a public-key encryption scheme as follows:
• Gen: on input $1^n$, run $\mathrm{Gen}(1^n)$ to obtain (I, td). Output the public key I and the private key td.
• Enc: on input a public key I and a message $m \in \{0,1\}$, choose a uniform $r \in D_I$ subject to the constraint that $\mathrm{hc}_I(r) = m$. Output the ciphertext $c := f_I(r)$.
• Dec: on input a private key td and a ciphertext c, compute the value $r := \mathrm{Inv}_{td}(c)$ and output the message $\mathrm{hc}_I(r)$.
Public-key encryption from any family of trapdoor permutations.
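The construction can be sketched in Python by instantiating the trapdoor permutation with toy RSA and taking the least-significant bit as the predicate hc (the specific RSA hard-core predicate discussed in Section 11.5.3). The names `keygen`, `hc`, `enc`, and `dec` and the small fixed primes are assumptions for illustration; nothing here is secure at this size. Sampling r subject to hc(r) = m is done by simple rejection sampling.

```python
import random
from math import gcd


def keygen(p=61, q=53, e=17):
    """Toy RSA trapdoor permutation: public key I = (N, e), private key td = (N, d)."""
    N = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (N, e), (N, d)


def hc(x):
    """Assumed hard-core predicate: least-significant bit of x."""
    return x & 1


def enc(pk, m):
    """Encrypt one bit: pick uniform r in Z_N^* with hc(r) = m, output f_I(r)."""
    N, e = pk
    while True:
        r = random.randrange(1, N)
        if gcd(r, N) == 1 and hc(r) == m:  # rejection sampling on both conditions
            return pow(r, e, N)


def dec(sk, c):
    """Decrypt: invert the permutation with the trapdoor, then apply hc."""
    N, d = sk
    r = pow(c, d, N)
    return hc(r)


pk, sk = keygen()
bits = [dec(sk, enc(pk, m)) for m in (0, 1, 1, 0)]
```

Because hc is (nearly) unbiased, the rejection loop in `enc` needs only about two attempts on average.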
A proof of security follows along the lines of the proof of Theorem 11.33.
THEOREM 13.5 If Π is a family of trapdoor permutations with hard-core predicate hc, then Construction 13.4 is CPA-secure.
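PROOF Let Π denote Construction 13.4. We prove that Π has indistinguishable encryptions in the presence of an eavesdropper; by Proposition 11.3, this implies it is CPA-secure.
We first observe that hc must be unbiased in the following sense. Let
$$\delta_0(n) \stackrel{\text{def}}{=} \Pr_{(I,td) \leftarrow \mathrm{Gen}(1^n);\ x \leftarrow D_I}[\mathrm{hc}_I(x) = 0]$$
and
$$\delta_1(n) \stackrel{\text{def}}{=} \Pr_{(I,td) \leftarrow \mathrm{Gen}(1^n);\ x \leftarrow D_I}[\mathrm{hc}_I(x) = 1].$$
Then there is a negligible function negl such that
$$\delta_0(n),\ \delta_1(n) \geq \frac{1}{2} - \mathrm{negl}(n);$$
if not, then an attacker who simply outputs the more frequently occurring bit would violate Definition 13.2.
Now let A be a probabilistic polynomial-time adversary. Without loss of generality, we may assume $m_0 = 0$ and $m_1 = 1$ in experiment $\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n)$. We then have
$$\Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1] = \frac{1}{2} \cdot \Pr[A(pk, c) = 0 \mid c \text{ is an encryption of } 0] + \frac{1}{2} \cdot \Pr[A(pk, c) = 1 \mid c \text{ is an encryption of } 1].$$
But then
$$\begin{aligned}
\Pr[A(I, f_I(x)) = \mathrm{hc}_I(x)]
&= \delta_0(n) \cdot \Pr[A(I, f_I(x)) = 0 \mid \mathrm{hc}_I(x) = 0] + \delta_1(n) \cdot \Pr[A(I, f_I(x)) = 1 \mid \mathrm{hc}_I(x) = 1] \\
&\geq \left(\tfrac{1}{2} - \mathrm{negl}(n)\right) \cdot \Pr[A(I, f_I(x)) = 0 \mid \mathrm{hc}_I(x) = 0] + \left(\tfrac{1}{2} - \mathrm{negl}(n)\right) \cdot \Pr[A(I, f_I(x)) = 1 \mid \mathrm{hc}_I(x) = 1] \\
&\geq \tfrac{1}{2} \cdot \Pr[A(I, f_I(x)) = 0 \mid \mathrm{hc}_I(x) = 0] + \tfrac{1}{2} \cdot \Pr[A(I, f_I(x)) = 1 \mid \mathrm{hc}_I(x) = 1] - 2 \cdot \mathrm{negl}(n) \\
&= \Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1] - 2 \cdot \mathrm{negl}(n).
\end{aligned}$$
Since hc is a hard-core predicate of the underlying family of trapdoor permutations, there is a negligible function negl′ such that $\frac{1}{2} + \mathrm{negl}'(n) \geq \Pr[A(I, f_I(x)) = \mathrm{hc}_I(x)]$; this means that
$$\Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1] \leq \frac{1}{2} + \mathrm{negl}'(n) + 2 \cdot \mathrm{negl}(n),$$
completing the proof.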
Encrypting longer messages. Using Claim 11.7, we know that we can extend Construction 13.4 to encrypt l-bit messages using ciphertexts l times as long. Better efficiency can be obtained by constructing a KEM, following along the lines of Construction 11.34. We leave the details as an exercise.
13.2 The Paillier Encryption Scheme
In this section we describe the Paillier encryption scheme, a public-key encryption scheme whose security is based on an assumption related (but not known to be equivalent) to the hardness of factoring. This encryption scheme is particularly interesting because it possesses some nice homomorphic properties, as we will discuss further in Section 13.2.3.
The Paillier encryption scheme utilizes the group $\mathbb{Z}_{N^2}^*$, the multiplicative group of elements in the range $\{1, \ldots, N^2\}$ that are relatively prime to N, for N a product of two distinct primes. To understand the scheme it is helpful to first understand the structure of $\mathbb{Z}_{N^2}^*$. A useful characterization of this group is given by the following proposition, which says, among other things, that $\mathbb{Z}_{N^2}^*$ is isomorphic to $\mathbb{Z}_N \times \mathbb{Z}_N^*$ (cf. Definition 8.23) for N of the form we will be interested in. We prove the proposition in the next section. (The reader willing to accept the proposition on faith can skip to Section 13.2.2.)
PROPOSITION 13.6 Let N = pq, where p, q are distinct odd primes of equal length. Then:
1. $\gcd(N, \phi(N)) = 1$.
2. For any integer $a \geq 0$, we have $(1+N)^a = (1 + aN) \bmod N^2$.
As a consequence, the order of $(1+N)$ in $\mathbb{Z}_{N^2}^*$ is N. That is, $(1+N)^N = 1 \bmod N^2$ and $(1+N)^a \neq 1 \bmod N^2$ for any $1 \leq a < N$.
since q is prime. But then $(p-1)/q \geq 2$, contradicting the assumption that p and q have the same length.
CLAIM 13.8 For $a \geq 0$ an integer, we have $(1+N)^a = 1 + aN \bmod N^2$. Thus, the order of $(1+N)$ in $\mathbb{Z}_{N^2}^*$ is N.
PROOF Using the binomial expansion theorem (Theorem A.1):
$$(1+N)^a = \sum_{i=0}^{a} \binom{a}{i} N^i.$$
Reducing the right-hand side modulo $N^2$, all terms with $i \geq 2$ become 0 and so $(1+N)^a = 1 + aN \bmod N^2$. The smallest nonzero a such that $(1+N)^a = 1 \bmod N^2$ is therefore a = N.
CLAIM 13.9 The group $\mathbb{Z}_N \times \mathbb{Z}_N^*$ is isomorphic to the group $\mathbb{Z}_{N^2}^*$, with isomorphism $f : \mathbb{Z}_N \times \mathbb{Z}_N^* \to \mathbb{Z}_{N^2}^*$ given by $f(a, b) = [(1+N)^a \cdot b^N \bmod N^2]$.
PROOF Note that $(1+N)^a \cdot b^N$ does not have a factor in common with $N^2$ since $\gcd((1+N), N^2) = 1$ and $\gcd(b, N^2) = 1$ (because $b \in \mathbb{Z}_N^*$). So $[(1+N)^a \cdot b^N \bmod N^2]$ lies in $\mathbb{Z}_{N^2}^*$. We now prove that f is an isomorphism.
We first show that f is a bijection. Since
$$|\mathbb{Z}_{N^2}^*| = \phi(N^2) = p \cdot (p-1) \cdot q \cdot (q-1) = pq \cdot (p-1)(q-1) = |\mathbb{Z}_N| \cdot |\mathbb{Z}_N^*| = |\mathbb{Z}_N \times \mathbb{Z}_N^*|$$
(see Theorem 8.19 for the second equality), it suffices to show that f is one-to-one. Say $a_1, a_2 \in \mathbb{Z}_N$ and $b_1, b_2 \in \mathbb{Z}_N^*$ are such that $f(a_1, b_1) = f(a_2, b_2)$. Then:
$$(1+N)^{a_1 - a_2} \cdot (b_1/b_2)^N = 1 \bmod N^2. \qquad (13.1)$$
(Note that $b_2 \in \mathbb{Z}_N^*$ and thus $b_2 \in \mathbb{Z}_{N^2}^*$, and so $b_2$ has a multiplicative inverse modulo $N^2$.) Raising both sides to the power $\phi(N)$ and using the fact that the order of $\mathbb{Z}_{N^2}^*$ is $\phi(N^2) = N \cdot \phi(N)$ we obtain
$$(1+N)^{(a_1 - a_2) \cdot \phi(N)} \cdot (b_1/b_2)^{N \cdot \phi(N)} = 1 \bmod N^2 \ \Rightarrow\ (1+N)^{(a_1 - a_2) \cdot \phi(N)} = 1 \bmod N^2.$$
By Claim 13.8, $(1+N)$ has order N modulo $N^2$. Applying Proposition 8.53, we see that $(a_1 - a_2) \cdot \phi(N) = 0 \bmod N$ and so N divides $(a_1 - a_2) \cdot \phi(N)$. Since $\gcd(N, \phi(N)) = 1$ by Claim 13.7, it follows that $N \mid (a_1 - a_2)$. Since $a_1, a_2 \in \mathbb{Z}_N$, this can only occur if $a_1 = a_2$.
Returning to Equation (13.1) and setting $a_1 = a_2$, we thus have $b_1^N = b_2^N \bmod N^2$. This implies $b_1^N = b_2^N \bmod N$. Since N is relatively prime to $\phi(N)$, the order of $\mathbb{Z}_N^*$, exponentiation to the power N is a bijection in $\mathbb{Z}_N^*$ (cf. Corollary 8.17). This means that $b_1 = b_2 \bmod N$; since $b_1, b_2 \in \mathbb{Z}_N^*$, we have $b_1 = b_2$. We conclude that f is one-to-one, and hence a bijection.
To show that f is an isomorphism, we show that $f(a_1, b_1) \cdot f(a_2, b_2) = f(a_1 + a_2, b_1 \cdot b_2)$. (Note that multiplication on the left-hand side of the equality takes place modulo $N^2$, while addition/multiplication on the right-hand side takes place modulo N.) We have:
$$f(a_1, b_1) \cdot f(a_2, b_2) = (1+N)^{a_1} \cdot b_1^N \cdot (1+N)^{a_2} \cdot b_2^N \bmod N^2 = (1+N)^{a_1 + a_2} \cdot (b_1 b_2)^N \bmod N^2.$$
Since $(1+N)$ has order N modulo $N^2$ (by Claim 13.8), we can apply Proposition 8.52 and obtain
$$f(a_1, b_1) \cdot f(a_2, b_2) = (1+N)^{a_1 + a_2} \cdot (b_1 b_2)^N \bmod N^2 = (1+N)^{[a_1 + a_2 \bmod N]} \cdot (b_1 b_2)^N \bmod N^2. \qquad (13.2)$$
We are not yet done, since $b_1 b_2$ in Equation (13.2) represents multiplication modulo $N^2$ whereas we would like it to be modulo N. Let $b_1 b_2 = r + \gamma N$, where $\gamma, r$ are integers with $1 \leq r < N$ (r cannot be 0 since $b_1, b_2 \in \mathbb{Z}_N^*$ and so their product cannot be divisible by N). Note that $r = b_1 b_2 \bmod N$. We also have
$$\begin{aligned}
(b_1 b_2)^N &= (r + \gamma N)^N \bmod N^2 \\
&= \sum_{k=0}^{N} \binom{N}{k} r^{N-k} (\gamma N)^k \bmod N^2 \\
&= r^N + N \cdot r^{N-1} \cdot (\gamma N) = r^N = ([b_1 b_2 \bmod N])^N \bmod N^2,
\end{aligned}$$
using the binomial expansion theorem as in Claim 13.8. Plugging this in to Equation (13.2) we get the desired result:
$$f(a_1, b_1) \cdot f(a_2, b_2) = (1+N)^{[a_1 + a_2 \bmod N]} \cdot (b_1 b_2 \bmod N)^N \bmod N^2 = f(a_1 + a_2, b_1 b_2),$$
proving that f is an isomorphism from $\mathbb{Z}_N \times \mathbb{Z}_N^*$ to $\mathbb{Z}_{N^2}^*$.
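For a toy modulus, Claims 13.8 and 13.9 can be checked by exhaustive computation. The Python sketch below does this; the function name `check_claims` and the choice p = 5, q = 7 are assumptions for illustration, and the check is of course no substitute for the proof.

```python
from math import gcd


def check_claims(p, q):
    """Exhaustively check Claims 13.8 and 13.9 for the toy modulus N = p*q."""
    N = p * q
    N2 = N * N
    assert gcd(N, (p - 1) * (q - 1)) == 1, "need gcd(N, phi(N)) = 1"

    # Claim 13.8: (1+N)^a = 1 + aN mod N^2, and (1+N) has order exactly N.
    powers_ok = all(pow(1 + N, a, N2) == (1 + a * N) % N2 for a in range(2 * N))
    order_ok = pow(1 + N, N, N2) == 1 and all(
        pow(1 + N, a, N2) != 1 for a in range(1, N))

    # Tabulate f(a, b) = (1+N)^a * b^N mod N^2 on Z_N x Z_N^*.
    units = [b for b in range(1, N) if gcd(b, N) == 1]
    f = {(a, b): pow(1 + N, a, N2) * pow(b, N, N2) % N2
         for a in range(N) for b in units}

    # Bijection: the images are distinct and cover exactly Z_{N^2}^*.
    target = {y for y in range(1, N2) if gcd(y, N2) == 1}
    bijective = set(f.values()) == target and len(f) == len(target)

    # Homomorphism: f(x) * f(y) mod N^2 equals f applied to the group product,
    # with addition mod N in the first component and multiplication mod N
    # in the second.
    homomorphic = all(
        f[x] * f[y] % N2 == f[((x[0] + y[0]) % N, x[1] * y[1] % N)]
        for x in f for y in f)
    return powers_ok and order_ok, bijective, homomorphic


result = check_claims(5, 7)
```

For N = 35 the table has 35 · 24 = 840 entries, matching φ(N²) = N · φ(N).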
13.2.2 The Paillier Encryption Scheme
Let N = pq be a product of two distinct primes of equal length. Proposition 13.6 says that $\mathbb{Z}_N \times \mathbb{Z}_N^*$ is isomorphic to $\mathbb{Z}_{N^2}^*$, with isomorphism given by $f(a, b) = [(1+N)^a \cdot b^N \bmod N^2]$. A consequence is that a uniform element $y \in \mathbb{Z}_{N^2}^*$ corresponds to a uniform element $(a, b) \in \mathbb{Z}_N \times \mathbb{Z}_N^*$ or, in other words, an element (a, b) with uniform $a \in \mathbb{Z}_N$ and uniform $b \in \mathbb{Z}_N^*$.
Call $y \in \mathbb{Z}_{N^2}^*$ an Nth residue modulo $N^2$ if y is an Nth power, that is, if there exists an $x \in \mathbb{Z}_{N^2}^*$ with $y = x^N \bmod N^2$. We denote the set of Nth residues modulo $N^2$ by $\mathrm{Res}(N^2)$. Let us characterize the Nth residues in $\mathbb{Z}_{N^2}^*$. Taking any $x \in \mathbb{Z}_{N^2}^*$ with $x \leftrightarrow (a, b)$ and raising it to the Nth power gives:
$$[x^N \bmod N^2] \leftrightarrow (a, b)^N = (N \cdot a \bmod N,\ b^N \bmod N) = (0,\ b^N \bmod N).$$
(Recall that the group operation in $\mathbb{Z}_N \times \mathbb{Z}_N^*$ is addition modulo N in the first component and multiplication modulo N in the second component.) Moreover, we claim that any element y with $y \leftrightarrow (0, b)$ is an Nth residue. To see this, recall that $\gcd(N, \phi(N)) = 1$ and so $d \stackrel{\text{def}}{=} [N^{-1} \bmod \phi(N)]$ exists. So
$$(a, [b^d \bmod N])^N = (Na \bmod N,\ [b^{dN} \bmod N]) = (0, b) \leftrightarrow y$$
for any $a \in \mathbb{Z}_N$. We have thus shown that $\mathrm{Res}(N^2)$ corresponds to the set $\{(0, b) \mid b \in \mathbb{Z}_N^*\}$.
The above also demonstrates that the number of Nth roots of any $y \in \mathrm{Res}(N^2)$ is exactly N, and so computing Nth powers is an N-to-1 function. As such, if $r \in \mathbb{Z}_{N^2}^*$ is uniform then $[r^N \bmod N^2]$ is a uniform element of $\mathrm{Res}(N^2)$.
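The N-to-1 property of the map $x \mapsto x^N \bmod N^2$ is easy to observe numerically for a toy modulus. The following Python sketch counts preimages; the function name `nth_residue_counts` and the parameters p = 5, q = 7 are assumptions for illustration.

```python
from collections import Counter
from math import gcd


def nth_residue_counts(p, q):
    """Map each Nth residue modulo N^2 to its number of Nth roots (toy N = p*q)."""
    N = p * q
    N2 = N * N
    group = [x for x in range(1, N2) if gcd(x, N2) == 1]  # Z_{N^2}^*
    return Counter(pow(x, N, N2) for x in group)


counts = nth_residue_counts(5, 7)
num_residues = len(counts)          # expected to equal phi(N) = |Z_N^*|
roots_per_residue = set(counts.values())  # expected to be the single value N
```

Under the characterization above, there should be exactly φ(N) residues, one for each $b \in \mathbb{Z}_N^*$, each with exactly N roots.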
The decisional composite residuosity problem, roughly speaking, is to distinguish a uniform element of $\mathbb{Z}_{N^2}^*$ from a uniform element of $\mathrm{Res}(N^2)$. Formally, let GenModulus be a polynomial-time algorithm that, on input $1^n$, outputs (N, p, q) where N = pq, and p and q are n-bit primes (except with probability negligible in n). Then:
DEFINITION 13.10 The decisional composite residuosity problem is hard relative to GenModulus if for all probabilistic polynomial-time algorithms D there is a negligible function negl such that
$$\left| \Pr[D(N, [r^N \bmod N^2]) = 1] - \Pr[D(N, r) = 1] \right| \leq \mathrm{negl}(n),$$
where in each case the probabilities are taken over the experiment in which $\mathrm{GenModulus}(1^n)$ outputs (N, p, q), and then a uniform $r \in \mathbb{Z}_{N^2}^*$ is chosen. (Recall that $[r^N \bmod N^2]$ is a uniform element of $\mathrm{Res}(N^2)$.)
The decisional composite residuosity (DCR) assumption is the assumption that there is a GenModulus relative to which the decisional composite residuosity problem is hard.
As we have discussed, elements of Z∗N2 have the form (r′,r) with r′ and r arbitrary (in the appropriate groups), whereas Nth residues have the form (0,r) with r ∈ Z∗N arbitrary. The DCR assumption is that it is hard to
distinguish uniform elements of the first type from uniform elements of the second type. This suggests the following abstract way to encrypt a message m ∈ ZN with respect to a public key N : choose a uniform N th residue (0, r) and set the ciphertext equal to
c ↔ (m, 1) · (0, r) = (m + 0, 1 · r) = (m, r).
Without worrying for now how this can be carried out efficiently by the sender, or how the receiver can decrypt, let us simply convince ourselves (on an intuitive level) that this is secure. Since a uniform Nth residue (0,r) cannot be distinguished from a uniform element (r′,r), the ciphertext as constructed above is indistinguishable (from the point of view of an eavesdropper who does not know the factorization of N) from the ciphertext
c′ ↔(m,1)·(r′,r)=([m+r′ modN],r)
for uniform r′ ∈ ZN and r ∈ Z∗N. Lemma 11.15 shows that [m+r′ modN] is uniformly distributed in ZN and so, in particular, this ciphertext c′ is independent of the message m. CPA-security follows. A formal proof that proceeds exactly along these lines is given further below.
Before turning to the formal description and proof of security, we show how encryption and decryption can be performed efficiently.
Encryption. We have described encryption above as though it is taking place in $\mathbb{Z}_N \times \mathbb{Z}^*_N$. In fact it takes place in the isomorphic group $\mathbb{Z}^*_{N^2}$. That is, the sender generates a ciphertext $c \in \mathbb{Z}^*_{N^2}$ by choosing a uniform $r \in \mathbb{Z}^*_N$ and then computing
$c := [(1+N)^m \cdot r^N \bmod N^2]$
(see footnote 1). Observe that
$c = (1+N)^m \cdot 1^N \cdot (1+N)^0 \cdot r^N \bmod N^2 \leftrightarrow (m,1) \cdot (0,r),$
and so $c \leftrightarrow (m, r)$ as desired.
Decryption. We now describe how decryption can be performed efficiently given the factorization of N. For c constructed as above, we claim that m is recovered by the following steps:
• Set $\hat{c} := [c^{\phi(N)} \bmod N^2]$.
• Set $\hat{m} := (\hat{c} - 1)/N$. (Note that this is carried out over the integers.)
• Set $m := [\hat{m} \cdot \phi(N)^{-1} \bmod N]$.
1We remark that it does not make any difference whether the sender chooses uniform $r \in \mathbb{Z}^*_N$ or uniform $r \in \mathbb{Z}^*_{N^2}$, since in either case the distribution of $[r^N \bmod N^2]$ is the same (as can be verified by looking at what happens in the isomorphic group $\mathbb{Z}_N \times \mathbb{Z}^*_N$).
To see why this works, let $c \leftrightarrow (m, r)$ for an arbitrary $r \in \mathbb{Z}^*_N$. Then
$\hat{c} := [c^{\phi(N)} \bmod N^2] \leftrightarrow (m, r)^{\phi(N)} = ([m \cdot \phi(N) \bmod N],\ [r^{\phi(N)} \bmod N]) = ([m \cdot \phi(N) \bmod N],\ 1).$
By Proposition 13.6(3), this means that $\hat{c} = (1+N)^{[m \cdot \phi(N) \bmod N]} \bmod N^2$. Using Proposition 13.6(2), we know that
$\hat{c} = (1+N)^{[m \cdot \phi(N) \bmod N]} = (1 + [m \cdot \phi(N) \bmod N] \cdot N) \bmod N^2.$
Since $1 + [m \cdot \phi(N) \bmod N] \cdot N$ is always less than $N^2$ we can drop the mod $N^2$ at the end and view the above as an equality over the integers. Thus, $\hat{m} := (\hat{c} - 1)/N = [m \cdot \phi(N) \bmod N]$ and, finally,
$m = [\hat{m} \cdot \phi(N)^{-1} \bmod N],$
as required. (Note that $\phi(N)$ is invertible modulo N since $\gcd(N, \phi(N)) = 1$.)
We give a complete description of the Paillier encryption scheme, followed by an example of the above calculations.
CONSTRUCTION 13.11
Let GenModulus be a polynomial-time algorithm that, on input $1^n$, outputs (N, p, q) where N = pq and p and q are n-bit primes (except with probability negligible in n). Define the following encryption scheme:
• Gen: on input $1^n$ run GenModulus($1^n$) to obtain (N, p, q). The public key is N, and the private key is ⟨N, φ(N)⟩.
• Enc: on input a public key N and a message $m \in \mathbb{Z}_N$, choose a uniform $r \leftarrow \mathbb{Z}^*_N$ and output the ciphertext
$c := [(1+N)^m \cdot r^N \bmod N^2].$
• Dec: on input a private key ⟨N, φ(N)⟩ and a ciphertext c, compute
$m := \left[\frac{[c^{\phi(N)} \bmod N^2] - 1}{N} \cdot \phi(N)^{-1} \bmod N\right].$
The Paillier encryption scheme.
Example 13.12
Let N = 11 · 17 = 187 (and so $N^2 = 34969$), and consider encrypting the message m = 175 and then decrypting the corresponding ciphertext. Choosing $r = 83 \in \mathbb{Z}^*_{187}$, we compute the ciphertext
$c := [(1 + 187)^{175} \cdot 83^{187} \bmod 34969] = 23911$
corresponding to (175, 83). To decrypt, note that φ(N) = 160. So we first compute $\hat{c} := [23911^{160} \bmod 34969] = 25620$. Subtracting 1 and dividing by 187 gives $\hat{m} := (25620 - 1)/187 = 137$; since $90 = [160^{-1} \bmod 187]$, the message is recovered as m := [137 · 90 mod 187] = 175. ♦
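The construction and the example above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `keygen`, `encrypt`, and `decrypt` are hypothetical, the primes are passed in directly instead of being produced by GenModulus, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
import secrets
from math import gcd

def keygen(p, q):
    """Return (public key N, private key (N, phi(N))) for given primes p, q."""
    N = p * q
    phi = (p - 1) * (q - 1)
    assert gcd(N, phi) == 1          # needed for decryption to work
    return N, (N, phi)

def encrypt(N, m, r=None):
    """c := (1+N)^m * r^N mod N^2, with r uniform in Z*_N unless supplied."""
    N2 = N * N
    while r is None or gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1   # uniform in {1, ..., N-1}, retried until a unit
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2

def decrypt(sk, c):
    """Recover m from c using the three decryption steps."""
    N, phi = sk
    c_hat = pow(c, phi, N * N)             # c^phi(N) mod N^2
    m_hat = (c_hat - 1) // N               # division over the integers
    return (m_hat * pow(phi, -1, N)) % N   # multiply by phi(N)^{-1} mod N

# Parameters of the worked example: N = 187, m = 175, r = 83.
pk, sk = keygen(11, 17)
c = encrypt(pk, 175, r=83)
recovered = decrypt(sk, c)
```

With these toy parameters `recovered` is 175, and the intermediate value `pow(c, 160, 34969)` does not depend on the choice of r.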
THEOREM 13.13 If the decisional composite residuosity problem is hard relative to GenModulus, then the Paillier encryption scheme is CPA-secure.
PROOF Let Π denote the Paillier encryption scheme. We prove that Π has indistinguishable encryptions in the presence of an eavesdropper; by Theorem 11.6 this implies that it is CPA-secure.
Let A be an arbitrary probabilistic polynomial-time adversary. Consider the following ppt algorithm D that attempts to solve the decisional composite residuosity problem relative to GenModulus:
Algorithm D:
The algorithm is given N, y as input.
• Set pk = N and run A(pk) to obtain two messages m0, m1.
• Choose a uniform bit b and set $c := [(1+N)^{m_b} \cdot y \bmod N^2]$.
• Give the ciphertext c to A and obtain an output bit b′. If b′ = b, output 1; otherwise, output 0.
Let us analyze the behavior of D. There are two cases to consider:
Case 1: Say the input to D was generated by running GenModulus($1^n$) to obtain (N, p, q), choosing uniform $r \in \mathbb{Z}^*_{N^2}$, and setting $y := [r^N \bmod N^2]$. (That is, y is a uniform element of $\mathrm{Res}(N^2)$.) In this case,
$c = [(1+N)^{m_b} \cdot r^N \bmod N^2]$
for uniform $r \in \mathbb{Z}^*_{N^2}$. Recalling that the distribution on $[r^N \bmod N^2]$ is the same whether r is chosen uniformly from $\mathbb{Z}^*_N$ or from $\mathbb{Z}^*_{N^2}$, we see that in this case the view of A when run as a subroutine by D is distributed identically to A’s view in experiment $\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n)$. Since D outputs 1 exactly when the output b′ of A is equal to b, we have
$\Pr[D(N, [r^N \bmod N^2]) = 1] = \Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1],$
where the first probability is taken over the experiment as in Definition 13.10.
Case 2: Say the input to D was generated by running GenModulus($1^n$) to obtain (N, p, q) and choosing uniform $y \in \mathbb{Z}^*_{N^2}$. We claim that the view of A in this case is independent of the bit b. This follows because y is a uniform element of the group $\mathbb{Z}^*_{N^2}$, and so the ciphertext c is uniformly distributed in $\mathbb{Z}^*_{N^2}$ (see Lemma 11.15), independent of m. Thus, the probability that b′ = b in this case is exactly 1/2. That is,
$\Pr[D(N, r) = 1] = \frac{1}{2},$
where the probability is taken over the experiment as in Definition 13.10. Combining the above, we see that
$\left|\Pr[D(N, [r^N \bmod N^2]) = 1] - \Pr[D(N, r) = 1]\right| = \left|\Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1] - \frac{1}{2}\right|.$
By the assumption that the decisional composite residuosity problem is hard relative to GenModulus, there is a negligible function negl such that
$\left|\Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1] - \frac{1}{2}\right| \le \mathrm{negl}(n).$
Thus $\Pr[\mathrm{PubK}^{\mathrm{eav}}_{A,\Pi}(n) = 1] \le \frac{1}{2} + \mathrm{negl}(n)$, completing the proof.
13.2.3 Homomorphic Encryption
The Paillier encryption scheme is useful in a number of settings because it is homomorphic. Roughly, a homomorphic encryption scheme enables (certain) computations to be performed on encrypted data, yielding a ciphertext containing the encrypted result. In the case of Paillier encryption, the computation that can be performed is (modular) addition. Specifically, fix a public key pk = N. Then the Paillier scheme has the property that multiplying an encryption of $m_1$ and an encryption of $m_2$ (with multiplication done modulo $N^2$) results in an encryption of $[m_1 + m_2 \bmod N]$; this is because
$\left((1+N)^{m_1} \cdot r_1^N\right) \cdot \left((1+N)^{m_2} \cdot r_2^N\right) = (1+N)^{[m_1 + m_2 \bmod N]} \cdot (r_1 r_2)^N \bmod N^2.$
Although the ability to add encrypted values may not seem very useful, it suffices for several interesting applications including voting, discussed below. We present a general definition, of which Paillier encryption is a special case.
DEFINITION 13.14 A public-key encryption scheme (Gen, Enc, Dec) is homomorphic if for all n and all (pk, sk) output by Gen($1^n$), it is possible to define groups M, C (depending on pk only) such that:
• The message space is M, and all ciphertexts output by $\mathrm{Enc}_{pk}$ are elements of C. For notational convenience, we write M as an additive group and C as a multiplicative group.
• For any $m_1, m_2 \in M$, any $c_1$ output by $\mathrm{Enc}_{pk}(m_1)$, and any $c_2$ output by $\mathrm{Enc}_{pk}(m_2)$, it holds that
$\mathrm{Dec}_{sk}(c_1 \cdot c_2) = m_1 + m_2.$
Moreover, the distribution on ciphertexts obtained by encrypting $m_1$, encrypting $m_2$, and then multiplying the results is identical to the distribution on ciphertexts obtained by encrypting $m_1 + m_2$.
The last part of the definition ensures that if ciphertexts c1 ← Encpk(m1) and c2 ← Encpk(m2) are generated and the result c3 := c1 · c2 is computed, then the resulting ciphertext c3 contains no more information about m1 or m2 than the sum m3.
The Paillier encryption scheme with pk = N is homomorphic with $M = \mathbb{Z}_N$ and $C = \mathbb{Z}^*_{N^2}$. This is not the first example of a homomorphic encryption scheme we have seen; El Gamal encryption is also homomorphic. Specifically, for public key pk = ⟨G, q, g, h⟩ we can take M = G and C = G × G; then
$\langle g^{y_1}, h^{y_1} \cdot m_1 \rangle \cdot \langle g^{y_2}, h^{y_2} \cdot m_2 \rangle = \langle g^{y_1+y_2}, h^{y_1+y_2} \cdot m_1 m_2 \rangle,$
where multiplication of ciphertexts is component-wise. The Goldwasser–Micali encryption scheme we will see later is also homomorphic (see Exercise 13.10).
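The additive homomorphism of Paillier encryption can be observed directly on small numbers. The sketch below is an assumption-laden illustration: it fixes the toy modulus N = 187, takes the randomness r as an explicit argument so that the result is reproducible, and uses the hypothetical helper names `enc` and `dec`.

```python
# Toy Paillier parameters (assumption): N = 11 * 17, far too small for real use.
p, q = 11, 17
N, phi = p * q, (p - 1) * (q - 1)
N2 = N * N

def enc(m, r):
    """Paillier encryption of m with explicit randomness r in Z*_N."""
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2

def dec(c):
    """Paillier decryption using phi(N)."""
    return ((pow(c, phi, N2) - 1) // N * pow(phi, -1, N)) % N

c1 = enc(100, 83)
c2 = enc(150, 5)
c3 = (c1 * c2) % N2        # multiply ciphertexts modulo N^2

total = dec(c3)            # decrypts to (100 + 150) mod 187
```

The product `c3` is itself a well-formed encryption of the sum, namely the one obtained with randomness `(83 * 5) % N`.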
A nice feature of Paillier encryption is that it is homomorphic over a large additive group (namely, ZN ). To see an application of this, consider the following distributed voting scheme, where l voters can vote “no” or “yes” and the goal is to tabulate the number of “yes” votes:
1. A voting authority generates a public key N for the Paillier encryption scheme and publicizes N.
2. Let 0 stand for a “no,” and let 1 stand for a “yes.” Each voter casts their vote by encrypting it. That is, voter i casts her vote $v_i$ by computing $c_i := [(1+N)^{v_i} \cdot (r_i)^N \bmod N^2]$ for a uniform $r_i \in \mathbb{Z}^*_N$.
3. Each voter broadcasts their vote $c_i$. These votes are then publicly aggregated by computing
$c_{total} := \left[\prod_{i=1}^{l} c_i \bmod N^2\right].$
4. The authority is given $c_{total}$. (We assume the authority has not been able to observe what goes on until now.) By decrypting it, the authority obtains the vote total
$v_{total} := \sum_{i=1}^{l} v_i \bmod N.$
If l is small (so that $v_{total} \ll N$), there is no wrap-around modulo N and $v_{total} = \sum_{i=1}^{l} v_i$.
Key features of the above are that no voter learns anyone else’s vote, and calculation of the total is publicly verifiable if the authority is trusted to correctly compute vtotal from ctotal. Also, the authority obtains the correct total without learning any individual votes. (Here, we assume the authority cannot see voters’ ciphertexts. In Section 13.3.3 we show a protocol in which votes are kept hidden from authorities even if they see all the communication.) We assume all voters act honestly (and only try to learn others’ votes based on information they observe); an entire research area of cryptography is dedicated to addressing potential threats from participants who might be malicious and not follow the protocol.
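The four steps of the voting scheme can be simulated in a single process. This is a rough sketch under stated assumptions: the toy modulus N = 187, the helper names `random_unit`, `enc`, and `dec`, and the sample list `votes` are all hypothetical, and all voters are assumed honest as in the text.

```python
import secrets
from math import gcd

# Step 1: the authority's toy Paillier key (assumption: N = 11 * 17).
p, q = 11, 17
N, phi = p * q, (p - 1) * (q - 1)
N2 = N * N

def random_unit():
    """Sample a uniform element of Z*_N by rejection sampling."""
    while True:
        r = secrets.randbelow(N)
        if gcd(r, N) == 1:
            return r

def enc(v):
    """Step 2: a voter encrypts the vote v in {0, 1}."""
    return (pow(1 + N, v, N2) * pow(random_unit(), N, N2)) % N2

def dec(c):
    """Step 4: the authority decrypts with phi(N)."""
    return ((pow(c, phi, N2) - 1) // N * pow(phi, -1, N)) % N

votes = [1, 0, 1, 1, 0, 1, 0]            # hypothetical ballots
ciphertexts = [enc(v) for v in votes]

# Step 3: anyone can aggregate the broadcast ciphertexts.
c_total = 1
for c in ciphertexts:
    c_total = (c_total * c) % N2

v_total = dec(c_total)                   # number of "yes" votes
```

Since the number of voters is far below N here, no wrap-around occurs and `v_total` equals the plain sum of the votes.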
13.3 Secret Sharing and Threshold Encryption
Motivated by the discussion of distributed voting in the previous section, we briefly consider secure (interactive) protocols. Such protocols can be significantly more complicated than the basic cryptographic primitives (e.g., encryption and signature schemes) we have focused on until now, both because they can involve multiple parties exchanging several rounds of messages, as well as because they are intended to realize more-complex security requirements.
The goal of this section is mainly to give the reader a taste of this fascinating area, and no attempt is made at being comprehensive or complete. Although the protocols presented here can be proven secure (with respect to appropriate definitions), we omit formal definitions, details, and proofs and instead rely on informal discussion.
13.3.1 Secret Sharing
Consider the following problem. A dealer holds a secret s (say, a nuclear-launch code) that it wishes to share among some set of N users $P_1, \ldots, P_N$ by giving each user a share. Any t users should be able to pool their shares and reconstruct the secret, but no coalition of fewer than t users should get any information about s from their collective shares (beyond whatever information they had about s already). We refer to such a sharing mechanism as a (t, N)-threshold secret-sharing scheme. Such a scheme ensures that s is not revealed without sufficient authorization, while also guaranteeing availability of s when needed (since any t users can reconstruct it). Beyond their direct application, secret-sharing schemes are also a building block of many cryptographic protocols.
Consider a simple solution for the case t = N, assuming $s \in \{0,1\}^l$. The dealer chooses uniform $s_1, \ldots, s_{N-1} \in \{0,1\}^l$ and sets $s_N := s \oplus \bigoplus_{i=1}^{N-1} s_i$; the share of user $P_i$ is $s_i$. Since $\bigoplus_{i=1}^{N} s_i = s$ by construction, clearly all the users together can recover s. However, the shares of any coalition of N − 1 users are (jointly) uniform and independent of s, and thus reveal no information about s. This is clear when the coalition is $P_1, \ldots, P_{N-1}$. In the general case, when the coalition includes everyone except for $P_j$ (j ≠ N), this is true because $s_1, \ldots, s_{j-1}, s_{j+1}, \ldots, s_{N-1}$ are uniform and independent of s by construction, and $s_N = s \oplus \bigoplus_{i<N} s_i$ is masked by the uniform value $s_j$, which the coalition does not hold.
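The XOR-based scheme for t = N is short enough to write out in full. The following is a minimal sketch, assuming secrets are byte strings; the function names `share` and `reconstruct` are hypothetical.

```python
import secrets
from functools import reduce

def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def share(secret, n):
    """Split `secret` into n shares; all n are needed to reconstruct."""
    # s_1, ..., s_{n-1} are uniform strings of the same length as the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # s_n is the secret XORed with all the other shares.
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]

def reconstruct(shares):
    """XOR all shares together to recover the secret."""
    return reduce(xor_bytes, shares)

shares = share(b"launch code", 5)
recovered = reconstruct(shares)
```

Any proper subset of the shares is a collection of uniform strings, which is why fewer than N users learn nothing about the secret.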
A value $x \in F$ is a root of a polynomial p if p(x) = 0. We use the well-known fact that any nonzero, degree-t polynomial over a field has at most t roots. This implies:
COROLLARY 13.15 Any two distinct degree-t polynomials p and q agree on at most t points.
PROOF If not, then the nonzero, degree-t polynomial p − q would have more than t roots.
Shamir’s scheme relies on the fact that for any t pairs of elements (x1,y1), . . . , (xt, yt) from F (with the {xi} distinct), there is a unique polynomial p of degree (t−1) such that p(xi) = yi for 1 ≤ i ≤ t. We can prove this quite easily. The fact that there exists such a p uses standard polynomial interpolation.
In detail: for i = 1, . . . , t, define the degree-(t − 1) polynomial
$\delta_i(X) := \frac{\prod_{j=1, j \ne i}^{t} (X - x_j)}{\prod_{j=1, j \ne i}^{t} (x_i - x_j)}.$
Note that $\delta_i(x_j) = 0$ for any j ≠ i, and $\delta_i(x_i) = 1$. So $p(X) := \sum_{i=1}^{t} \delta_i(X) \cdot y_i$ is a polynomial of degree (t − 1) with $p(x_i) = y_i$ for 1 ≤ i ≤ t. (We remark that this, in fact, demonstrates that the desired polynomial p can be found efficiently.) Uniqueness follows from Corollary 13.15.
2A degree-t polynomial p over F is given by $p(X) = \sum_{i=0}^{t} a_i X^i$, where $a_i \in F$ and X is a formal variable. (Note that we allow $a_t = 0$ and so we really mean a polynomial of degree at most t.) Any such polynomial naturally defines a function mapping F to itself, given by evaluating the polynomial on its input.
We now describe Shamir’s (t, N)-threshold secret-sharing scheme. Let F be a finite field that contains the domain of possible secrets, and with |F| > N. Let $x_1, \ldots, x_N \in F$ be distinct, nonzero elements that are fixed and publicly known. (Such elements exist since |F| > N.) The scheme works as follows:
Sharing: Given a secret $s \in F$, the dealer chooses uniform $a_1, \ldots, a_{t-1} \in F$ and defines the polynomial $p(X) := s + \sum_{i=1}^{t-1} a_i X^i$. This is a uniform degree-(t − 1) polynomial with constant term s. The share of user $P_i$ is $s_i := p(x_i) \in F$.
Reconstruction: Say t users $P_{i_1}, \ldots, P_{i_t}$ pool their shares $s_{i_1}, \ldots, s_{i_t}$. Using polynomial interpolation, they compute the unique degree-(t − 1) polynomial p′ for which $p'(x_{i_j}) = s_{i_j}$ for 1 ≤ j ≤ t. The secret is p′(0).
It is clear that reconstruction works since p′ = p and p(0) = s.
It remains to show that any t − 1 users learn nothing about the secret s from their shares. By symmetry, it suffices to consider the shares of users $P_1, \ldots, P_{t-1}$. We claim that for any secret s, the shares $s_1, \ldots, s_{t-1}$ are (jointly) uniform. Since the dealer chooses $a_1, \ldots, a_{t-1}$ uniformly, this follows if we show that there is a one-to-one correspondence between the polynomial p chosen by the dealer and the shares $s_1, \ldots, s_{t-1}$. But this is a direct consequence of Corollary 13.15.
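Sharing and reconstruction in Shamir's scheme can be sketched as follows. The field is taken to be the integers modulo the Mersenne prime 2^127 − 1 and the public points are x_i = i; both are assumptions made for illustration, as are the function names.

```python
import secrets

P = 2**127 - 1   # prime field size (assumption); secrets must lie in Z_P

def share(s, t, n):
    """Return n points (x_i, p(x_i)) of a uniform degree-(t-1) polynomial with p(0) = s."""
    coeffs = [s] + [secrets.randbelow(P) for _ in range(t - 1)]
    def p(x):
        return sum(a * pow(x, i, P) for i, a in enumerate(coeffs)) % P
    return [(x, p(x)) for x in range(1, n + 1)]    # public points x_i = i

def reconstruct(points):
    """Evaluate the interpolating polynomial at 0 via the delta_i polynomials."""
    s = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (-xj) % P        # numerator of delta_i(0)
                den = den * (xi - xj) % P    # denominator of delta_i(0)
        s = (s + yi * num * pow(den, -1, P)) % P
    return s

shares = share(42, t=3, n=5)
secret_a = reconstruct(shares[:3])                        # users 1, 2, 3
secret_b = reconstruct([shares[0], shares[2], shares[4]]) # users 1, 3, 5
```

Any three of the five shares give the same value; with only two shares the interpolated constant term is unrelated to the secret.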
13.3.2 Verifiable Secret Sharing
So far we have considered passive attacks in which t − 1 users may try to use their shares to learn information about the secret. But we may also be concerned about active, malicious behavior. Here there are two separate concerns: First, a corrupted dealer may give inconsistent shares to the users, i.e., such that different secrets are recovered depending on which t users pool their shares. Second, in the reconstruction phase a malicious user may present a different share from the one given to them by the dealer, and thus affect the recovered secret. (While this could be addressed by having the dealer sign the shares, this does not work when the dealer itself may be dishonest.) Verifiable secret-sharing (VSS) schemes prevent both these attacks.
More formally, we allow any t − 1 users to be corrupted and to collude with each other and, possibly, the dealer. We require (1a) at the end of the sharing phase, a secret s is defined such that any collection that includes t uncorrupted users (whether or not this collection also includes some corrupted users) will successfully recover s in the reconstruction phase; moreover, (1b) if the dealer is honest, then s corresponds to the dealer’s secret. In addition, (2) when the dealer is honest then, as before, the t − 1 corrupted users learn nothing about the secret from their shares and any public information the dealer publishes. Since we want there to be t uncorrupted users even if t − 1 users are corrupted, we require N ≥ t + (t − 1) > 2(t − 1); in other words, we assume a majority of the users remain uncorrupted.
We describe a VSS scheme due to Feldman that relies on an algorithm G relative to which the discrete-logarithm problem is hard. For simplicity, we describe it in the random-oracle model and let H denote a function to be modeled as a random oracle. We also assume that some trusted parameters (G,q,g), generated using G(1n), are published in advance, where q is prime and so Zq is a field. Finally, we assume that all users have access to a broadcast channel, such that a message broadcast by any user is heard by everyone.
The sharing phase now involves the N users running an interactive protocol with the dealer that proceeds as follows:
1. To share a secret s, the dealer chooses uniform $a_0 \in \mathbb{Z}_q$ and then shares $a_0$ as in Shamir’s scheme. That is, the dealer chooses uniform $a_1, \ldots, a_{t-1} \in \mathbb{Z}_q$ and defines the polynomial $p(X) := \sum_{i=0}^{t-1} a_i X^i$. The dealer sends the share $s_i := p(i) = \sum_{j=0}^{t-1} a_j \cdot i^j$ to user $P_i$ (see footnote 3).
In addition, the dealer publicly broadcasts the values $A_0 := g^{a_0}, \ldots, A_{t-1} := g^{a_{t-1}}$, and the “masked secret” $c := H(a_0) \oplus s$.
2. Each user $P_i$ verifies that its share $s_i$ satisfies
$g^{s_i} \stackrel{?}{=} \prod_{j=0}^{t-1} (A_j)^{i^j}.$ (13.3)
If not, $P_i$ publicly broadcasts a complaint.
Note that if the dealer is honest, we have
$\prod_{j=0}^{t-1} (A_j)^{i^j} = \prod_{j=0}^{t-1} (g^{a_j})^{i^j} = g^{\sum_{j=0}^{t-1} a_j \cdot i^j} = g^{p(i)} = g^{s_i},$
and so no honest user will complain. Since there are at most t − 1 corrupted users, there are at most t − 1 complaints if the dealer is honest.
3. If more than t − 1 users complain, the dealer is disqualified and the protocol is aborted. Otherwise, the dealer responds to a complaint from $P_i$ by broadcasting $s_i$. If this share does not satisfy Equation (13.3) (or if the dealer refuses to respond to a complaint at all), the dealer is disqualified and the protocol is aborted. Otherwise, $P_i$ uses the broadcast value (rather than the value it received in the first round) as its share.
3Note that we are now setting xi = i, which is fine since we are using the field Zq.
In the reconstruction phase, say a group of users (that includes at least t uncorrupted users) pool their shares. A share si provided by a user Pi is discarded if it does not satisfy Equation (13.3). Among the remaining shares, any t of them are used to recover a0 exactly as in Shamir’s scheme. The original secret is then computed as s := c ⊕ H(a0).
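The dealing step, the share check of Equation (13.3), and reconstruction can be sketched as below. Everything concrete here is an assumption for illustration: the toy group is the order-11 subgroup of Z*_23 generated by 2, SHA-256 stands in for the random oracle H, complaints and broadcast are not modeled, and the function names are hypothetical.

```python
import hashlib
import secrets

P, Q, G = 23, 11, 2     # toy parameters (assumption): G has prime order Q in Z*_P

def H(a0, length):
    """Stand-in for the random oracle H (assumption: truncated SHA-256)."""
    return hashlib.sha256(str(a0).encode()).digest()[:length]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def deal(secret, t, n):
    """Return (shares, commitments A_0..A_{t-1}, masked secret c)."""
    assert n < Q and len(secret) <= 32
    coeffs = [secrets.randbelow(Q) for _ in range(t)]            # a_0, ..., a_{t-1}
    shares = {i: sum(a * pow(i, j, Q) for j, a in enumerate(coeffs)) % Q
              for i in range(1, n + 1)}                           # s_i = p(i)
    commitments = [pow(G, a, P) for a in coeffs]                  # A_j = g^{a_j}
    return shares, commitments, xor(H(coeffs[0], len(secret)), secret)

def verify(i, s_i, commitments):
    """Check g^{s_i} == prod_j A_j^{i^j}, i.e. Equation (13.3)."""
    rhs = 1
    for j, A in enumerate(commitments):
        rhs = rhs * pow(A, pow(i, j, Q), P) % P
    return pow(G, s_i, P) == rhs

def recover(points, commitments, c):
    """Discard bad shares, interpolate a_0 = p(0) over Z_Q, and unmask the secret."""
    good = {i: s for i, s in points.items() if verify(i, s, commitments)}
    a0 = 0
    for i, s_i in good.items():
        num, den = 1, 1
        for j in good:
            if j != i:
                num, den = num * (-j) % Q, den * (i - j) % Q
        a0 = (a0 + s_i * num * pow(den, -1, Q)) % Q
    return xor(H(a0, len(c)), c)

shares, commitments, c = deal(b"hi", t=3, n=5)
recovered = recover({i: shares[i] for i in (1, 3, 5)}, commitments, c)
```

A share that has been altered by a malicious user fails `verify` and is dropped before interpolation, which is the behavior the reconstruction phase relies on.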
We now argue that this protocol meets the desired security requirements.
We first show that, assuming the dealer is not disqualified, the value recovered in the reconstruction phase is uniquely determined by the public information; specifically, the recovered value is $c \oplus H(\log_g A_0)$. (Combined with the fact that an honest dealer is never disqualified, this proves that conditions (1a) and (1b) hold.) Define $a_i := \log_g A_i$ for 0 ≤ i ≤ t − 1; the $\{a_i\}$ cannot be computed efficiently if the discrete-logarithm problem is hard, but they are still well-defined. Define the polynomial $p(X) := \sum_{i=0}^{t-1} a_i X^i$. Any share $s_i$, contributed by party $P_i$, that is not discarded during the reconstruction phase must satisfy Equation (13.3), and hence satisfies $s_i = p(i)$. It follows that, regardless of which shares are used, the parties will reconstruct polynomial p, compute $a_0 = p(0)$, and then recover $s = c \oplus H(a_0)$.
It is also possible to show that condition (2) holds for computationally bounded adversaries if the discrete-logarithm problem is hard for G. (In contrast to Shamir’s secret-sharing scheme, secrecy here is no longer unconditional. Unconditionally secure VSS schemes are possible, but are beyond the scope of our treatment.) Intuitively, this is because the secret s is masked by the random value $H(a_0)$, and the information given to any t − 1 users in the sharing phase (namely, their shares and the public values $\{A_i\}$) reveals only $g^{a_0}$, from which it is hard to compute $a_0$. This intuition can be made rigorous, but we do not do so here.
13.3.3 Threshold Encryption and Electronic Voting
In Section 13.2.3 we introduced the notion of homomorphic encryption schemes and gave the Paillier encryption scheme as an example. Here we show a different homomorphic encryption scheme that is a variant of El Gamal encryption. Specifically, given a public key pk = ⟨G, q, g, h⟩ as in regular El Gamal encryption, we now encrypt a message $m \in \mathbb{Z}_q$ by setting $M := g^m$, choosing a uniform $y \in \mathbb{Z}_q$, and sending the ciphertext $c := \langle g^y, h^y \cdot M \rangle$. To decrypt, the receiver recovers M as in standard El Gamal decryption and then computes $m := \log_g M$. Although this is not efficient if m comes from a large domain, if m is from a small domain (as it will be in our application) then the receiver can compute $\log_g M$ efficiently using exhaustive search. The advantage of this variant scheme is that it is homomorphic with respect to addition in $\mathbb{Z}_q$. That is,
$\langle g^{y_1}, h^{y_1} \cdot g^{m_1} \rangle \cdot \langle g^{y_2}, h^{y_2} \cdot g^{m_2} \rangle = \langle g^{y_1+y_2}, h^{y_1+y_2} \cdot g^{m_1+m_2} \rangle.$
Recall that the basic approach to electronic voting using homomorphic encryption has each voter i encrypt her vote $v_i \in \{0, 1\}$ to obtain a ciphertext $c_i$. Once everyone has voted, the ciphertexts are multiplied to obtain an encryption of the sum $v_{total} := \sum_i v_i \bmod q = \sum_i v_i$. (The value q is, in practice, large enough so that no wrap-around modulo q occurs.) Since $0 \le v_{total} \le l$, where l is the total number of voters, an authority with the private key can efficiently decrypt the final ciphertext and recover $v_{total}$.
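The El Gamal variant and its use for tallying can be sketched as follows. The group is a toy assumption (the order-11 subgroup of Z*_23 generated by 2), so the total must stay below 11; the function names and the sample `votes` are hypothetical.

```python
import secrets

P, Q, G = 23, 11, 2     # toy group (assumption): G has prime order Q in Z*_P

def keygen():
    x = secrets.randbelow(Q)
    return pow(G, x, P), x                    # public h = g^x, private x

def enc(h, m):
    """Encrypt m in Z_Q as <g^y, h^y * g^m>."""
    y = secrets.randbelow(Q)
    return pow(G, y, P), pow(h, y, P) * pow(G, m, P) % P

def mul(ca, cb):
    """Component-wise product of two ciphertexts."""
    return ca[0] * cb[0] % P, ca[1] * cb[1] % P

def dec(x, c, max_m):
    """Recover M = c2 / c1^x, then find log_g M by exhaustive search up to max_m."""
    c1, c2 = c
    M = c2 * pow(pow(c1, x, P), -1, P) % P
    for m in range(max_m + 1):
        if pow(G, m, P) == M:
            return m
    raise ValueError("message outside the searched range")

h, x = keygen()
votes = [1, 0, 1, 1, 0, 1]                    # hypothetical ballots
c_total = (1, 1)                              # identity ciphertext
for v in votes:
    c_total = mul(c_total, enc(h, v))

v_total = dec(x, c_total, max_m=len(votes))
```

The exhaustive search in `dec` takes at most l + 1 steps for l voters, which is why the small message domain matters.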
A drawback of this approach is that the authority is trusted, both to (correctly) decrypt the final ciphertext as well as not to decrypt any of the individual voters’ ciphertexts. (In Section 13.2.3 we assumed the authority could not see the individual voters’ ciphertexts.) We might instead prefer to distribute trust among a set of N authorities, such that any set of t authorities is able to jointly decrypt an agreed-upon ciphertext (this ensures availability even if some authorities are down or unwilling to help decrypt), but no collection of t − 1 authorities is able to decrypt any ciphertext on their own (this ensures privacy as long as fewer than t authorities are corrupted).
At first glance, it may seem that secret sharing solves the problem. If we share the private key among the N authorities, then no set of t − 1 authorities learns the private key and so they cannot decrypt. On the other hand, any t authorities can pool their shares, recover the private key, and then decrypt any desired ciphertext.
A little thought shows that this does not quite work. If the authorities reconstruct the private key in order to decrypt some ciphertext, then as part of this process all the authorities learn the private key! Thus, afterward, any authority could decrypt any ciphertext of its choice, on its own.
We need instead a modified approach in which the “secret” (namely, the private key) is never reconstructed in the clear, yet is implicitly reconstructed only enough to enable decryption of one, agreed-upon ciphertext. We can achieve this for the specific case of El Gamal encryption in the following way. Fix a public key pk = ⟨G, q, g, h⟩, and let $x \in \mathbb{Z}_q$ be the private key, i.e., $g^x = h$. Each authority is given a share $x_i \in \mathbb{Z}_q$ exactly as in Shamir’s secret-sharing scheme. That is, a uniform degree-(t − 1) polynomial p with p(0) = x is chosen, and the ith authority is given $x_i := p(i)$. (We assume a trusted dealer who knows x and securely deletes it once it is shared. It is possible to eliminate the dealer entirely, but this is beyond our present scope.)
Now, say some t authorities $i_1, \ldots, i_t$ wish to jointly decrypt a ciphertext $\langle c_1, c_2 \rangle$. To do so, authority $i_j$ first publishes the value $w_j := c_1^{x_{i_j}}$. Recall from the previous section that there exist publicly computable polynomials $\{\delta_j(X)\}$ (that depend on the identities of these t authorities) such that $p(X) = \sum_{j=1}^{t} \delta_j(X) \cdot x_{i_j}$. Setting $\delta_j := \delta_j(0)$, we see that there exist publicly computable values $\delta_1, \ldots, \delta_t \in \mathbb{Z}_q$ for which $x = p(0) = \sum_{j=1}^{t} \delta_j \cdot x_{i_j}$. Any authority can then compute
$M' := \frac{c_2}{\prod_{j=1}^{t} w_j^{\delta_j}}.$
(They can then each compute $\log_g M$, if desired.) To see that this correctly recovers the message, say $c_1 = g^y$ and $c_2 = h^y \cdot M$. Then
$\prod_{j=1}^{t} w_j^{\delta_j} = \prod_{j=1}^{t} c_1^{x_{i_j} \delta_j} = c_1^{\sum_{j=1}^{t} x_{i_j} \delta_j} = c_1^{p(0)} = c_1^x,$
and so
$M' := \frac{c_2}{\prod_{j=1}^{t} w_j^{\delta_j}} = \frac{c_2}{c_1^x} = \frac{h^y \cdot M}{(g^y)^x} = \frac{(g^x)^y \cdot M}{(g^y)^x} = M.$
Note that any set of t − 1 corrupted authorities learns nothing about the private key x from their shares. Moreover, it is possible to show that they learn nothing from the decryption process beyond the recovered value M.
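The joint decryption procedure can be sketched in code. All concrete choices are assumptions for illustration: the toy group (order-11 subgroup of Z*_23 generated by 2), a trusted dealer simulated in-process, honest authorities, and the hypothetical function names `share_key`, `lagrange_at_zero`, and `combine`.

```python
import secrets

P, Q, G = 23, 11, 2     # toy group (assumption): G has prime order Q in Z*_P

def share_key(x, t, n):
    """Shamir-share the private key x over Z_Q; authority i receives x_i = p(i)."""
    coeffs = [x] + [secrets.randbelow(Q) for _ in range(t - 1)]
    return {i: sum(a * pow(i, k, Q) for k, a in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}

def lagrange_at_zero(ids):
    """Publicly computable coefficients delta_i with x = sum_i delta_i * x_i."""
    deltas = {}
    for i in ids:
        num, den = 1, 1
        for j in ids:
            if j != i:
                num, den = num * (-j) % Q, den * (i - j) % Q
        deltas[i] = num * pow(den, -1, Q) % Q
    return deltas

def combine(c2, partials):
    """Compute M' = c2 / prod_i w_i^{delta_i} from the published values w_i."""
    deltas = lagrange_at_zero(list(partials))
    denom = 1
    for i, w in partials.items():
        denom = denom * pow(w, deltas[i], P) % P
    return c2 * pow(denom, -1, P) % P

x = 7                                   # private key (known only to the dealer)
h = pow(G, x, P)
key_shares = share_key(x, t=3, n=5)

M = pow(G, 4, P)                        # encoded message g^4
y = secrets.randbelow(Q)
c1, c2 = pow(G, y, P), pow(h, y, P) * M % P

# Authorities 2, 3, and 5 each publish w = c1^{x_i}; x itself is never rebuilt.
partials = {i: pow(c1, key_shares[i], P) for i in (2, 3, 5)}
M_prime = combine(c2, partials)
```

Only the values `c1 ** x_i` are ever published, so the authorities decrypt this one ciphertext without any of them learning x.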
Malicious (active) adversaries. Our treatment above assumes that the authorities decrypting some ciphertext all behave correctly. (If they do not, it would be easy for any of them to cause an incorrect result by publishing an arbitrary value wj .) We also assume that voters behave honestly, and encrypt a vote of either 0 or 1. (Note that a voter could unfairly sway the election by encrypting a large value or a negative value.) Potential malicious behavior of this sort can be prevented using techniques beyond the scope of this book.
13.4 The Goldwasser–Micali Encryption Scheme
Before we present the Goldwasser–Micali encryption scheme, we need to develop a better understanding of quadratic residues. We first explore the easier case of quadratic residues modulo a prime p, and then look at the slightly more complicated case of quadratic residues modulo a composite N.
Throughout this section, p and q denote odd primes, and N = pq denotes a product of two distinct, odd primes.
13.4.1 Quadratic Residues Modulo a Prime
In a group G, an element $y \in G$ is a quadratic residue if there exists an $x \in G$ with $x^2 = y$. In this case, we call x a square root of y. An element that is not a quadratic residue is called a quadratic non-residue. In an abelian group, the set of quadratic residues forms a subgroup.
In the specific case of $\mathbb{Z}^*_p$, we have that y is a quadratic residue if there exists an x with $x^2 = y \bmod p$. We begin with an easy observation.
PROPOSITION 13.16 Let p > 2 be prime. Every quadratic residue in Z∗p has exactly two square roots.
PROOF Let $y \in \mathbb{Z}^*_p$ be a quadratic residue. Then there exists an $x \in \mathbb{Z}^*_p$ such that $x^2 = y \bmod p$. Clearly, $(-x)^2 = x^2 = y \bmod p$. Furthermore, $-x \ne x \bmod p$: if $-x = x \bmod p$ then $2x = 0 \bmod p$, which implies $p \mid 2x$. Since p is prime, this would mean that either $p \mid 2$ (which is impossible since p > 2) or $p \mid x$ (which is impossible since 0 < x < p). So, $[x \bmod p]$ and $[-x \bmod p]$ are distinct elements of $\mathbb{Z}^*_p$, and y has at least two square roots.
Let $x' \in \mathbb{Z}^*_p$ be a square root of y. Then $x^2 = y = (x')^2 \bmod p$, implying that $x^2 - (x')^2 = 0 \bmod p$. Factoring the left-hand side we obtain
$(x - x')(x + x') = 0 \bmod p,$
so that (by Proposition 8.3) either $p \mid (x - x')$ or $p \mid (x + x')$. In the first case, $x' = x \bmod p$ and in the second case $x' = -x \bmod p$, showing that y indeed has only $[\pm x \bmod p]$ as square roots.
Let $\mathrm{sq}_p : \mathbb{Z}^*_p \to \mathbb{Z}^*_p$ be the function $\mathrm{sq}_p(x) := [x^2 \bmod p]$. The above shows that $\mathrm{sq}_p$ is a two-to-one function when p > 2 is prime. This immediately implies that exactly half the elements of $\mathbb{Z}^*_p$ are quadratic residues. We denote the set of quadratic residues modulo p by $QR_p$, and the set of quadratic non-residues by $QNR_p$. We have just seen that for p > 2 prime
$|QR_p| = |QNR_p| = \frac{|\mathbb{Z}^*_p|}{2} = \frac{p-1}{2}.$
Define $J_p(x)$, the Jacobi symbol of x modulo p, as follows (see footnote 4). Let p > 2 be prime, and $x \in \mathbb{Z}^*_p$. Then
$J_p(x) := \begin{cases} +1 & \text{if } x \text{ is a quadratic residue modulo } p \\ -1 & \text{if } x \text{ is not a quadratic residue modulo } p. \end{cases}$
The notation can be extended in the natural way for any x relatively prime to p by setting $J_p(x) := J_p([x \bmod p])$.
Can we characterize the quadratic residues in $\mathbb{Z}^*_p$? We begin with the fact that $\mathbb{Z}^*_p$ is a cyclic group of order p − 1 (see Theorem 8.56). Let g be a generator of $\mathbb{Z}^*_p$. This means that
$\mathbb{Z}^*_p = \{g^0, g^1, g^2, \ldots, g^{\frac{p-1}{2}-1}, g^{\frac{p-1}{2}}, g^{\frac{p-1}{2}+1}, \ldots, g^{p-2}\}$
(recall that p is odd, so p − 1 is even). Squaring each element in this list and reducing modulo p − 1 in the exponent (cf. Corollary 8.15) yields a list of all the quadratic residues in $\mathbb{Z}^*_p$:
$QR_p = \{g^0, g^2, g^4, \ldots, g^{p-3}, g^0, g^2, \ldots, g^{p-3}\}.$
Each quadratic residue appears twice in this list. Therefore, the quadratic residues in $\mathbb{Z}^*_p$ are exactly those elements that can be written as $g^i$ with $i \in \{0, \ldots, p-2\}$ an even integer.
4For p prime, $J_p(x)$ is also sometimes called the Legendre symbol of x and denoted by $L_p(x)$; we have chosen our notation to be consistent with notation introduced later.
The above characterization leads to a simple way to compute the Jacobi symbol and thus tell whether an element x ∈ Z∗p is a quadratic residue or not.
PROPOSITION 13.17 Let p > 2 be a prime. Then $J_p(x) = x^{\frac{p-1}{2}} \bmod p$.
PROOF Let g be an arbitrary generator of $\mathbb{Z}^*_p$. If x is a quadratic residue modulo p, our earlier discussion shows that $x = g^i$ for some even integer i. Writing i = 2j with j an integer we then have
$x^{\frac{p-1}{2}} = \left(g^{2j}\right)^{\frac{p-1}{2}} = g^{(p-1)j} = \left(g^{p-1}\right)^j = 1^j = 1 \bmod p,$
and so $x^{\frac{p-1}{2}} = +1 = J_p(x) \bmod p$ as claimed.
On the other hand, if x is not a quadratic residue then $x = g^i$ for some odd integer i. Writing i = 2j + 1 with j an integer we have
$x^{\frac{p-1}{2}} = \left(g^{2j+1}\right)^{\frac{p-1}{2}} = \left(g^{2j}\right)^{\frac{p-1}{2}} \cdot g^{\frac{p-1}{2}} = 1 \cdot g^{\frac{p-1}{2}} = g^{\frac{p-1}{2}} \bmod p.$
Now,
$\left(g^{\frac{p-1}{2}}\right)^2 = g^{p-1} = 1 \bmod p,$
and so $g^{\frac{p-1}{2}} = \pm 1 \bmod p$ since $[\pm 1 \bmod p]$ are the two square roots of 1 (cf. Proposition 13.16). Since g is a generator, it has order p − 1 and so $g^{\frac{p-1}{2}} \ne 1 \bmod p$. It follows that $x^{\frac{p-1}{2}} = -1 = J_p(x) \bmod p$.
Proposition 13.17 directly gives a polynomial-time algorithm (cf. Algorithm 13.18) for testing whether an element $x \in \mathbb{Z}^*_p$ is a quadratic residue.
ALGORITHM 13.18
Deciding quadratic residuosity modulo a prime
Input: A prime p; an element x ∈ Z∗p
Output: Jp(x) (or, equivalently, whether x is a quadratic residue or quadratic non-residue)
$b := [x^{\frac{p-1}{2}} \bmod p]$
if b = 1 return “quadratic residue”
else return “quadratic non-residue”
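Algorithm 13.18 translates almost line for line into Python. The sketch below assumes p is an odd prime and x is not divisible by p (neither is checked); the function name `jacobi_mod_prime` is hypothetical.

```python
def jacobi_mod_prime(x, p):
    """Return +1 if x is a quadratic residue modulo the odd prime p, else -1."""
    b = pow(x, (p - 1) // 2, p)       # b = x^((p-1)/2) mod p
    return 1 if b == 1 else -1

p = 11
# Collect the quadratic residues in Z*_11 using the test above.
residues = {x for x in range(1, p) if jacobi_mod_prime(x, p) == 1}

# Cross-check against the definition: the set of all squares modulo p.
squares = {x * x % p for x in range(1, p)}
```

For p = 11 both sets contain exactly (p − 1)/2 = 5 elements, as the counting argument above predicts.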
We conclude this section by noting a nice multiplicative property of quadratic residues and non-residues modulo p.
PROPOSITION 13.19 Let p > 2 be a prime, and $x, y \in \mathbb{Z}^*_p$. Then
$J_p(xy) = J_p(x) \cdot J_p(y).$
PROOF Using the previous proposition,
$J_p(xy) = (xy)^{\frac{p-1}{2}} = x^{\frac{p-1}{2}} \cdot y^{\frac{p-1}{2}} = J_p(x) \cdot J_p(y) \bmod p.$
Since $J_p(xy), J_p(x), J_p(y) = \pm 1$, equality holds over the integers as well.
COROLLARY 13.20 Let p > 2 be prime, and say $x, x' \in QR_p$ and $y, y' \in QNR_p$. Then:
1. $[xx' \bmod p] \in QR_p$.
2. $[yy' \bmod p] \in QR_p$.
3. $[xy \bmod p] \in QNR_p$.
13.4.2 Quadratic Residues Modulo a Composite
We now turn our attention to quadratic residues in the group Z∗N , where N = pq. Characterizing the quadratic residues modulo N is easy if we use the results of the previous section in conjunction with the Chinese remainder theorem. Recall that the Chinese remainder theorem says that Z∗N ≃ Z∗p × Z∗q , and we let y ↔ (yp, yq) denote the correspondence guaranteed by the theorem (i.e., yp = [y mod p] and yq = [y mod q]). The key observation is:
PROPOSITION 13.21 Let N = pq with p, q distinct primes, and y ∈ Z∗N with y ↔ (yp,yq). Then y is a quadratic residue modulo N if and only if yp is a quadratic residue modulo p and yq is a quadratic residue modulo q.
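PROOF If y is a quadratic residue modulo N then, by definition, there exists an $x \in \mathbb{Z}_N^*$ such that $x^2 = y \bmod N$. Let $x \leftrightarrow (x_p, x_q)$. Then
$$(y_p, y_q) \leftrightarrow y = x^2 \leftrightarrow (x_p, x_q)^2 = \left([x_p^2 \bmod p],\, [x_q^2 \bmod q]\right),$$
where $(x_p, x_q)^2$ is simply the square of the element $(x_p, x_q)$ in the group $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$. We have thus shown that
$$y_p = x_p^2 \bmod p \quad\text{and}\quad y_q = x_q^2 \bmod q \qquad (13.4)$$
and $y_p, y_q$ are quadratic residues (with respect to the appropriate moduli).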
Conversely, if y ↔ (yp, yq) and yp, yq are quadratic residues modulo p and q, respectively, then there exist xp ∈ Z∗p and xq ∈ Z∗q such that Equation (13.4) holds. Let x ∈ Z∗N be such that x ↔ (xp,xq). Reversing the above steps shows that x is a square root of y modulo N.
The above proposition characterizes the quadratic residues modulo N. A careful examination of the proof yields another important observation: each quadratic residue y ∈ Z∗N has exactly four square roots. To see this, let y ↔ (yp,yq) be a quadratic residue modulo N and let xp,xq be square roots of yp and yq modulo p and q, respectively. Then the four square roots of y are given by the elements in Z∗N corresponding to
$$(x_p, x_q),\ (-x_p, x_q),\ (x_p, -x_q),\ (-x_p, -x_q). \qquad (13.5)$$
Each of these is a square root of y since
$$(\pm x_p, \pm x_q)^2 = \left([(\pm x_p)^2 \bmod p],\, [(\pm x_q)^2 \bmod q]\right) = \left([x_p^2 \bmod p],\, [x_q^2 \bmod q]\right) = (y_p, y_q) \leftrightarrow y$$
(where again the notation $(\cdot, \cdot)^2$ refers to squaring in the group $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$). The Chinese remainder theorem guarantees that the four elements in Equation (13.5) correspond to distinct elements of $\mathbb{Z}_N^*$, since $x_p$ and $-x_p$ are distinct modulo p (and similarly for $x_q$ and $-x_q$ modulo q).
Example 13.22
Consider Z∗15 (the correspondence given by the Chinese remainder theorem is tabulated in Example 8.25). Element 4 is a quadratic residue modulo 15 with square root 2. Since 2 ↔ (2, 2), the other square roots of 4 are given by
• (2, [−2 mod 3]) = (2, 1) ↔ 7;
• ([−2 mod 5], 2) = (3, 2) ↔ 8; and
• ([−2 mod 5], [−2 mod 3]) = (3, 1) ↔ 13.
One can verify that $7^2 = 8^2 = 13^2 = 4 \bmod 15$. ♦
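The example can be reproduced mechanically. The sketch below is illustrative only: the helper names are ours, and the explicit Chinese-remainder formula it uses is the standard one rather than something derived in this section. It maps the four pairs of Equation (13.5) back to elements of $\mathbb{Z}_{15}^*$, with the first coordinate taken modulo 5 and the second modulo 3, as in the example.

```python
def crt_pair(ap: int, aq: int, p: int, q: int) -> int:
    """Return the unique y in Z_N (N = p*q) with y = ap mod p and y = aq mod q.

    Assumes p and q are distinct primes. pow(a, -1, m) computes a modular
    inverse and requires Python 3.8 or later.
    """
    N = p * q
    return (ap * q * pow(q, -1, p) + aq * p * pow(p, -1, q)) % N


def four_square_roots(xp: int, xq: int, p: int, q: int) -> list:
    """Given square roots xp mod p and xq mod q, return the elements of Z_N^*
    corresponding to (+-xp, +-xq), in increasing order."""
    return sorted({crt_pair(sp * xp, sq * xq, p, q)
                   for sp in (1, -1) for sq in (1, -1)})


# Example 13.22: the square root 2 of 4 modulo 15 corresponds to (2, 2).
roots = four_square_roots(2, 2, 5, 3)
```

Here `roots` contains exactly the four values appearing in the example, and squaring each of them modulo 15 gives 4.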
Let $\mathcal{QR}_N$ denote the set of quadratic residues modulo N. Since squaring modulo N is a four-to-one function, we see that exactly 1/4 of the elements of $\mathbb{Z}_N^*$ are quadratic residues. Alternately, we could note that since $y \in \mathbb{Z}_N^*$ is a quadratic residue if and only if $y_p, y_q$ are quadratic residues, there is a one-to-one correspondence between $\mathcal{QR}_N$ and $\mathcal{QR}_p \times \mathcal{QR}_q$. Thus, the fraction of quadratic residues modulo N is
$$\frac{|\mathcal{QR}_N|}{|\mathbb{Z}_N^*|} = \frac{|\mathcal{QR}_p| \cdot |\mathcal{QR}_q|}{|\mathbb{Z}_N^*|} = \frac{\frac{p-1}{2} \cdot \frac{q-1}{2}}{(p-1)(q-1)} = \frac{1}{4},$$
in agreement with the above.
FIGURE 13.1:
The structure of Z∗p and Z∗N .
In the previous section, we defined the Jacobi symbol Jp(x) for p > 2 prime. We extend the definition to the case of N a product of distinct, odd primes p and q as follows. For any x relatively prime to N = pq,
$$J_N(x) \stackrel{\text{def}}{=} J_p(x) \cdot J_q(x) = J_p([x \bmod p]) \cdot J_q([x \bmod q]).$$
We define $\mathcal{J}_N^{+1}$ as the set of elements in $\mathbb{Z}_N^*$ having Jacobi symbol +1, and define $\mathcal{J}_N^{-1}$ analogously.
We know from Proposition 13.21 that if x is a quadratic residue modulo N, then [x mod p] and [x mod q] are quadratic residues modulo p and q, respectively; that is, $J_p(x) = J_q(x) = +1$. So $J_N(x) = +1$ and we see that:
If x is a quadratic residue modulo N, then $J_N(x) = +1$.
However, $J_N(x) = +1$ can also occur when $J_p(x) = J_q(x) = -1$, that is, when both [x mod p] and [x mod q] are not quadratic residues modulo p and q (and so x is not a quadratic residue modulo N). This turns out to be useful for the Goldwasser–Micali encryption scheme, and we therefore introduce the notation $\mathcal{QNR}_N^{+1}$ for the set of elements of this type. That is,
$$\mathcal{QNR}_N^{+1} \stackrel{\text{def}}{=} \left\{ x \in \mathbb{Z}_N^* \;\middle|\; x \text{ is not a quadratic residue modulo } N, \text{ but } J_N(x) = +1 \right\}.$$
It is now easy to prove the following (see Figure 13.1):
PROPOSITION 13.23 Let N = pq with p, q distinct, odd primes. Then:
1. Exactly half the elements of $\mathbb{Z}_N^*$ are in $\mathcal{J}_N^{+1}$.
2. $\mathcal{QR}_N$ is contained in $\mathcal{J}_N^{+1}$.
3. Exactly half the elements of $\mathcal{J}_N^{+1}$ are in $\mathcal{QR}_N$ (the other half are in $\mathcal{QNR}_N^{+1}$).
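PROOF We know that $J_N(x) = +1$ if either $J_p(x) = J_q(x) = +1$ or $J_p(x) = J_q(x) = -1$. We also know (from the previous section) that exactly half the elements of $\mathbb{Z}_p^*$ have Jacobi symbol +1, and half have Jacobi symbol −1 (and similarly for $\mathbb{Z}_q^*$). Defining $\mathcal{J}_p^{+1}$, $\mathcal{J}_p^{-1}$, $\mathcal{J}_q^{+1}$, and $\mathcal{J}_q^{-1}$ in the natural way, we thus have
$$\left|\mathcal{J}_N^{+1}\right| = \left|\mathcal{J}_p^{+1} \times \mathcal{J}_q^{+1}\right| + \left|\mathcal{J}_p^{-1} \times \mathcal{J}_q^{-1}\right|$$
$$= \left|\mathcal{J}_p^{+1}\right| \cdot \left|\mathcal{J}_q^{+1}\right| + \left|\mathcal{J}_p^{-1}\right| \cdot \left|\mathcal{J}_q^{-1}\right|$$
$$= \frac{(p-1)}{2}\frac{(q-1)}{2} + \frac{(p-1)}{2}\frac{(q-1)}{2} = \frac{\phi(N)}{2}.$$
So $|\mathcal{J}_N^{+1}| = |\mathbb{Z}_N^*|/2$, proving that half the elements of $\mathbb{Z}_N^*$ are in $\mathcal{J}_N^{+1}$.
We have noted earlier that all quadratic residues modulo N have Jacobi symbol +1, showing that $\mathcal{QR}_N \subseteq \mathcal{J}_N^{+1}$.
Since $x \in \mathcal{QR}_N$ if and only if $J_p(x) = J_q(x) = +1$, we have
$$|\mathcal{QR}_N| = \left|\mathcal{J}_p^{+1} \times \mathcal{J}_q^{+1}\right| = \frac{(p-1)}{2}\frac{(q-1)}{2} = \frac{\phi(N)}{4},$$
and so $|\mathcal{QR}_N| = |\mathcal{J}_N^{+1}|/2$. Since $\mathcal{QR}_N$ is a subset of $\mathcal{J}_N^{+1}$, this proves that half the elements of $\mathcal{J}_N^{+1}$ are in $\mathcal{QR}_N$.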
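The counts in Proposition 13.23 are easy to confirm by brute force for a small modulus. The following sketch is a sanity check only; the choice p = 5, q = 7 and all variable names are ours.

```python
from math import gcd


def jp(x: int, p: int) -> int:
    """J_p(x) for an odd prime p and x not divisible by p (Proposition 13.17)."""
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


p, q = 5, 7  # illustrative toy primes
N = p * q

units = [x for x in range(1, N) if gcd(x, N) == 1]               # Z_N^*
qr = {x * x % N for x in units}                                   # QR_N
j_plus = {x for x in units if jp(x, p) * jp(x, q) == 1}           # J_N^{+1}
qnr_plus = j_plus - qr                                            # QNR_N^{+1}
```

For N = 35 this gives 24 invertible elements, of which 12 have Jacobi symbol +1; those 12 split evenly into 6 quadratic residues and 6 elements of $\mathcal{QNR}_N^{+1}$.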
The next two results are analogues of Proposition 13.19 and Corollary 13.20.
PROPOSITION 13.24 Let N = pq be a product of distinct, odd primes, and $x, y \in \mathbb{Z}_N^*$. Then $J_N(xy) = J_N(x) \cdot J_N(y)$.
PROOF Using the definition of JN (·) and Proposition 13.19:
JN (xy) = Jp(xy) · Jq(xy) = Jp(x) · Jp(y) · Jq(x) · Jq(y)
= Jp(x) · Jq(x) · Jp(y) · Jq(y) = JN (x) · JN (y).
COROLLARY 13.25 Let N = pq be a product of distinct, odd primes, and say $x, x' \in \mathcal{QR}_N$ and $y, y' \in \mathcal{QNR}_N^{+1}$. Then:
1. $[xx' \bmod N] \in \mathcal{QR}_N$.
2. $[yy' \bmod N] \in \mathcal{QR}_N$.
3. $[xy \bmod N] \in \mathcal{QNR}_N^{+1}$.
PROOF We prove the final claim; proofs of the others are similar. Since $x \in \mathcal{QR}_N$, we have $J_p(x) = J_q(x) = +1$. Since $y \in \mathcal{QNR}_N^{+1}$, we have $J_p(y) = J_q(y) = -1$. Using Proposition 13.19,
$$J_p(xy) = J_p(x) \cdot J_p(y) = -1 \quad\text{and}\quad J_q(xy) = J_q(x) \cdot J_q(y) = -1,$$
and so $J_N(xy) = +1$. But xy is not a quadratic residue modulo N, since $J_p(xy) = -1$ and so [xy mod p] is not a quadratic residue modulo p. We conclude that $xy \in \mathcal{QNR}_N^{+1}$.
In contrast to Corollary 13.20, it is not true that $y, y' \in \mathcal{QNR}_N$ implies $yy' \in \mathcal{QR}_N$. (Instead, as indicated in the corollary, this is only guaranteed if $y, y' \in \mathcal{QNR}_N^{+1}$.) For example, we could have $J_p(y) = +1$, $J_q(y) = -1$ and $J_p(y') = -1$, $J_q(y') = +1$, so $J_p(yy') = J_q(yy') = -1$ and yy′ is not a quadratic residue even though $J_N(yy') = +1$.
13.4.3 The Quadratic Residuosity Assumption
In Section 13.4.1, we showed an efficient algorithm for deciding whether an input x is a quadratic residue modulo a prime p. Can we adapt the algorithm to work modulo a composite number N? Proposition 13.21 gives an easy solution to this problem provided the factorization of N is known. See Algorithm 13.26.
ALGORITHM 13.26
Deciding quadratic residuosity modulo a composite of known factorization
Input: Composite N = pq; the factors p and q; element $x \in \mathbb{Z}_N^*$
Output: A decision as to whether $x \in \mathcal{QR}_N$
compute $J_p(x)$ and $J_q(x)$
if $J_p(x) = J_q(x) = +1$ return “quadratic residue”
else return “quadratic non-residue”
(As always, we assume the factors of N are distinct odd primes.) A simple modification of the above algorithm allows for computing JN(x) when the factorization of N is known.
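A possible rendering of Algorithm 13.26 in Python follows, together with the simple modification for computing $J_N(x)$ mentioned above. The function names are hypothetical, and we assume the caller passes distinct odd primes p and q and an x that is invertible modulo N.

```python
def jacobi_prime(x: int, p: int) -> int:
    """J_p(x) for an odd prime p and x not divisible by p (Proposition 13.17)."""
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def is_qr_known_factorization(x: int, p: int, q: int) -> bool:
    """Algorithm 13.26: x is in QR_N iff it is a residue modulo both p and q."""
    return jacobi_prime(x, p) == 1 and jacobi_prime(x, q) == 1


def jacobi_known_factorization(x: int, p: int, q: int) -> int:
    """J_N(x) = J_p(x) * J_q(x) for N = p*q."""
    return jacobi_prime(x, p) * jacobi_prime(x, q)
```

For N = 15 (p = 3, q = 5), the element 4 is reported as a quadratic residue, while 2 is a non-residue with Jacobi symbol +1, i.e., an element of $\mathcal{QNR}_N^{+1}$.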
When the factorization of N is unknown, however, there is no known polynomial-time algorithm for deciding whether a given x is a quadratic residue modulo N or not. Somewhat surprisingly, a polynomial-time algorithm is known for computing $J_N(x)$ without the factorization of N. (Although the algorithm itself is not that complicated, its proof of correctness is beyond the scope of this book and we therefore do not present the algorithm at all. The interested reader can refer to the references listed at the
end of this chapter.) This leads to a partial test of quadratic residuosity: if, for a given input x, it holds that JN(x) = −1, then x cannot possibly be a quadratic residue. (See Proposition 13.23.) This test says nothing when JN (x) = +1, and there is no known polynomial-time algorithm for deciding quadratic residuosity in that case (that does better than random guessing).
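Although the text omits the algorithm for computing $J_N(x)$ without the factorization, readers who wish to experiment with the partial test may find a sketch useful. The code below is the standard procedure based on quadratic reciprocity; it is not taken from this book, is stated without proof, and the function names are our own.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol of a modulo an odd positive integer n.

    Returns +1 or -1 when gcd(a, n) = 1, and 0 otherwise.
    The factorization of n is never used.
    """
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of 2 from a; each one flips the sign iff n = 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity step: swapping flips the sign iff both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def certainly_not_qr(x: int, N: int) -> bool:
    """Partial test: True means x is definitely not a quadratic residue mod N.
    False means the test is inconclusive (J_N(x) = +1)."""
    return jacobi(x, N) == -1
```

For example, `certainly_not_qr(7, 15)` holds, whereas for x = 2 and x = 4 modulo 15 the test is inconclusive even though only 4 is a quadratic residue.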
We now formalize the assumption that this problem is hard. Let GenModulus be a polynomial-time algorithm that, on input $1^n$, outputs (N, p, q) where N = pq, and p and q are n-bit primes except with probability negligible in n.
DEFINITION 13.27 We say deciding quadratic residuosity is hard relative to GenModulus if for all probabilistic polynomial-time algorithms D there exists a negligible function negl such that
$$\Bigl|\Pr[D(N, \mathsf{qr}) = 1] - \Pr[D(N, \mathsf{qnr}) = 1]\Bigr| \leq \mathsf{negl}(n),$$
where in each case the probabilities are taken over the experiment in which GenModulus($1^n$) is run to give (N, p, q), qr is chosen uniformly from $\mathcal{QR}_N$, and qnr is chosen uniformly from $\mathcal{QNR}_N^{+1}$.
It is crucial in the above that qnr is chosen from $\mathcal{QNR}_N^{+1}$ rather than $\mathcal{QNR}_N$; if qnr were chosen from $\mathcal{QNR}_N$ then with probability 2/3 it would be the case that $J_N(\mathsf{qnr}) = -1$ and so distinguishing qnr from a uniform quadratic residue would be easy. (Recall that $J_N(x)$ can be computed efficiently even without the factorization of N.)
The quadratic residuosity assumption is simply the assumption that there exists a GenModulus relative to which deciding quadratic residuosity is hard. It is easy to see that if deciding quadratic residuosity is hard relative to GenModulus, then factoring must be hard relative to GenModulus as well.
13.4.4 The Goldwasser–Micali Encryption Scheme
The preceding section immediately suggests a public-key encryption scheme for single-bit messages based on the quadratic residuosity assumption:
• The public key is a modulus N, and the private key is its factorization.
• To encrypt a ‘0,’ send a uniform quadratic residue; to encrypt a ‘1,’ send
a uniform quadratic non-residue with Jacobi symbol +1.
• The receiver can decrypt a ciphertext c with its private key by using the
factorization of N to decide whether c is a quadratic residue or not.
CPA-security of this scheme follows almost trivially from the hardness of the quadratic residuosity problem as formalized in Definition 13.27.
One thing missing from the above description is a specification of how the sender, who does not know the factorization of N, can choose a uniform element of $\mathcal{QR}_N$ (to encrypt a 0) or a uniform element of $\mathcal{QNR}_N^{+1}$ (to encrypt a 1). The first of these is easy, while the second requires some ingenuity.
Choosing a uniform quadratic residue. Choosing a uniform element $y \in \mathcal{QR}_N$ is easy: simply pick a uniform $x \in \mathbb{Z}_N^*$ (see Appendix B.2.5) and set $y := [x^2 \bmod N]$. Clearly $y \in \mathcal{QR}_N$. The fact that y is uniformly distributed in $\mathcal{QR}_N$ follows from the facts that squaring modulo N is a 4-to-1 function (see Section 13.4.2) and that x is chosen uniformly from $\mathbb{Z}_N^*$. In more detail, fix any $\hat{y} \in \mathcal{QR}_N$ and let us compute the probability that $y = \hat{y}$ after the above procedure. Denote the four square roots of $\hat{y}$ by $\pm\hat{x}, \pm\hat{x}'$. Then:
$$\Pr[y = \hat{y}] = \Pr[x \text{ is a square root of } \hat{y}] = \Pr[x \in \{\pm\hat{x}, \pm\hat{x}'\}] = \frac{4}{|\mathbb{Z}_N^*|} = \frac{1}{|\mathcal{QR}_N|}.$$
Since the above holds for every $\hat{y} \in \mathcal{QR}_N$, we see that y is distributed uniformly in $\mathcal{QR}_N$.
Choosing a uniform element of $\mathcal{QNR}_N^{+1}$. In general, it is not known how to choose a uniform element of $\mathcal{QNR}_N^{+1}$ if the factorization of N is unknown. What saves us in the present context is that the receiver can help by including certain information in the public key. Specifically, we modify the scheme so that the receiver additionally chooses a uniform $z \in \mathcal{QNR}_N^{+1}$ and includes z as part of its public key. (This is easy for the receiver to do since it knows the factorization of N; see Exercise 13.6.) The sender can choose a uniform element $y \in \mathcal{QNR}_N^{+1}$ by choosing a uniform $x \in \mathbb{Z}_N^*$ (as above) and setting $y := [z \cdot x^2 \bmod N]$. It follows from Corollary 13.25 that $y \in \mathcal{QNR}_N^{+1}$. We leave it as an exercise to show that y is uniformly distributed in $\mathcal{QNR}_N^{+1}$; we do not use this fact directly in the proof of security given below.
We give a complete description of the Goldwasser–Micali encryption scheme, implementing the above ideas, in Construction 13.28.
CONSTRUCTION 13.28
Let GenModulus be as usual. Construct a public-key encryption scheme as follows:
• Gen: on input $1^n$, run GenModulus($1^n$) to obtain (N, p, q), and choose a uniform $z \in \mathcal{QNR}_N^{+1}$. The public key is pk = ⟨N, z⟩ and the private key is sk = ⟨p, q⟩.
• Enc: on input a public key pk = ⟨N, z⟩ and a message m ∈ {0, 1}, choose a uniform $x \in \mathbb{Z}_N^*$ and output the ciphertext $c := [z^m \cdot x^2 \bmod N]$.
• Dec: on input a private key sk and a ciphertext c, determine whether c is a quadratic residue modulo N using, e.g., Algorithm 13.26. If yes, output 0; otherwise, output 1.
The Goldwasser–Micali encryption scheme.
THEOREM 13.29 If the quadratic residuosity problem is hard relative to GenModulus, then the Goldwasser–Micali encryption scheme is CPA-secure.
PROOF Let Π denote the Goldwasser–Micali encryption scheme. We prove that Π has indistinguishable encryptions in the presence of an eavesdropper; by Theorem 11.6 this implies that it is CPA-secure.
Let A be an arbitrary probabilistic polynomial-time adversary. Consider the following ppt adversary D that attempts to solve the quadratic residuosity problem relative to GenModulus:
Algorithm D:
The algorithm is given N and z as input, and its goal is to determine if $z \in \mathcal{QR}_N$ or $z \in \mathcal{QNR}_N^{+1}$.
• Set pk = ⟨N, z⟩ and run A(pk) to obtain two single-bit messages $m_0, m_1$.
• Choose a uniform bit b and a uniform $x \in \mathbb{Z}_N^*$, and then set $c := [z^{m_b} \cdot x^2 \bmod N]$.
• Give the ciphertext c to A, who in turn outputs a bit b′. If b′ = b, output 1; otherwise, output 0.
Let us analyze the behavior of D. There are two cases to consider:
Case 1: Say the input to D was generated by running GenModulus($1^n$) to obtain (N, p, q) and then choosing a uniform $z \in \mathcal{QNR}_N^{+1}$. Then D runs A on a public key constructed exactly as in Π, and we see that in this case the view of A when run as a subroutine by D is distributed identically to A's view in experiment $\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n)$. Since D outputs 1 exactly when the output b′ of A is equal to b, we have
$$\Pr[D(N, \mathsf{qnr}) = 1] = \Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1],$$
where qnr represents a uniform element of $\mathcal{QNR}_N^{+1}$ as in Definition 13.27.
Case 2: Say the input to D was generated by running GenModulus($1^n$) to obtain (N, p, q) and then choosing a uniform $z \in \mathcal{QR}_N$. We claim that the view of A in this case is independent of the bit b. To see this, note that the ciphertext c given to A is a uniform quadratic residue regardless of whether a 0 or a 1 is encrypted:
• When a 0 is encrypted, $c = [x^2 \bmod N]$ for a uniform $x \in \mathbb{Z}_N^*$, and so c is a uniform quadratic residue.
• When a 1 is encrypted, $c = [z \cdot x^2 \bmod N]$ for a uniform $x \in \mathbb{Z}_N^*$. Let $\hat{x} \stackrel{\text{def}}{=} [x^2 \bmod N]$, and note that $\hat{x}$ is a uniformly distributed element of the group $\mathcal{QR}_N$. Since $z \in \mathcal{QR}_N$, we can apply Lemma 11.15 to conclude that c is uniformly distributed in $\mathcal{QR}_N$ as well.
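Since A's view is independent of b, the probability that b′ = b in this case is exactly 1/2. That is,
$$\Pr[D(N, \mathsf{qr}) = 1] = \frac{1}{2},$$
where qr represents a uniform element of $\mathcal{QR}_N$ as in Definition 13.27. Thus,
$$\Bigl|\Pr[D(N, \mathsf{qr}) = 1] - \Pr[D(N, \mathsf{qnr}) = 1]\Bigr| = \Bigl|\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1] - \tfrac{1}{2}\Bigr|.$$
By the assumption that the quadratic residuosity problem is hard relative to GenModulus, there is a negligible function negl such that
$$\Bigl|\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1] - \tfrac{1}{2}\Bigr| \leq \mathsf{negl}(n);$$
thus, $\Pr[\mathsf{PubK}^{\mathsf{eav}}_{A,\Pi}(n) = 1] \leq \frac{1}{2} + \mathsf{negl}(n)$. This completes the proof.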
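To make Construction 13.28 concrete, here is a toy sketch of the Goldwasser–Micali scheme. It is illustrative only and far from a secure implementation: the primes are tiny and supplied by the caller in place of GenModulus, and the function names `gen`, `enc`, `dec`, and `sample_unit` are our own.

```python
import secrets
from math import gcd


def jacobi_prime(x: int, p: int) -> int:
    """J_p(x) for an odd prime p and x not divisible by p."""
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def sample_unit(N: int) -> int:
    """Uniform element of Z_N^* by rejection sampling."""
    while True:
        x = secrets.randbelow(N - 1) + 1  # uniform in {1, ..., N-1}
        if gcd(x, N) == 1:
            return x


def gen(p: int, q: int):
    """Key generation from given distinct odd primes (stand-in for GenModulus).

    z is uniform in QNR_N^{+1}: sample units until J_p(z) = J_q(z) = -1.
    """
    N = p * q
    while True:
        z = sample_unit(N)
        if jacobi_prime(z, p) == -1 and jacobi_prime(z, q) == -1:
            return (N, z), (p, q)


def enc(pk, m: int) -> int:
    """c := [z^m * x^2 mod N] for a uniform x in Z_N^* and a bit m."""
    N, z = pk
    x = sample_unit(N)
    return (pow(z, m, N) * x * x) % N


def dec(sk, c: int) -> int:
    """Output 0 if c is a quadratic residue modulo N = p*q, and 1 otherwise."""
    p, q = sk
    is_qr = jacobi_prime(c, p) == 1 and jacobi_prime(c, q) == 1
    return 0 if is_qr else 1


pk, sk = gen(499, 547)  # toy primes chosen for illustration only
```

With these keys, `dec(sk, enc(pk, m))` returns m for both m = 0 and m = 1, since a ciphertext is a quadratic residue exactly when m = 0.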
13.5 The Rabin Encryption Scheme
As mentioned at the beginning of this chapter, the Rabin encryption scheme is attractive because its security is equivalent to the assumption that factoring is hard. An analogous result is not known for RSA-based encryption, and the RSA problem may potentially be easier than factoring. (The same is true of the Goldwasser–Micali encryption scheme, and it is possible that deciding quadratic residuosity modulo N is easier than factoring N.)
Interestingly, the Rabin encryption scheme is (superficially, at least) very similar to the RSA encryption scheme yet has the advantage of being based on a potentially weaker assumption. The fact that RSA is more widely used than Rabin seems to be due more to historical factors than technical ones; we discuss this further at the end of this section.
We begin with some preliminaries about computing modular square roots. We then introduce a trapdoor permutation that can be based directly on the assumption that factoring is hard. The Rabin encryption scheme (or, at least, one instantiation of it) is then obtained by applying the results from Section 13.1. Throughout this section, we continue to let p and q denote odd primes, and let N = pq denote a product of two distinct, odd primes.
13.5.1 Computing Modular Square Roots
The Rabin encryption scheme requires the receiver to compute modular square roots, and so in this section we explore the algorithmic complexity of
this problem. We first show an efficient algorithm for computing square roots modulo a prime p, and then extend this algorithm to enable computation of square roots modulo a composite N of known factorization. The reader willing to accept the existence of these algorithms on faith can skip to the following section, where we show that computing square roots modulo a composite N with unknown factorization is equivalent to factoring N.
Let p be an odd prime. Computing square roots modulo p is relatively simple when p = 3 mod 4, but much more involved when p = 1 mod 4. (The easier case is all we need for the Rabin encryption scheme as presented in Section 13.5.3; we include the second case for completeness.) In both cases, we show how to compute one of the square roots of a quadratic residue a ∈ Z∗p. Note that if x is one of the square roots of a, then [−x mod p ] is the other.
We tackle the easier case first. Say p = 3 mod 4, meaning we can write p = 4i + 3 for some integer i. Since $a \in \mathbb{Z}_p^*$ is a quadratic residue, we have $J_p(a) = 1 = a^{\frac{p-1}{2}} \bmod p$ (see Proposition 13.17). Multiplying both sides by a we obtain
$$a = a^{\frac{p-1}{2}+1} = a^{2i+2} = \left(a^{i+1}\right)^2 \bmod p,$$
and so $a^{i+1} = a^{\frac{p+1}{4}} \bmod p$ is a square root of a. That is, we obtain a square root of a modulo p by simply computing $x := [a^{\frac{p+1}{4}} \bmod p]$.
It is crucial above that (p + 1)/2 is even because this ensures that (p + 1)/4 is an integer (this is necessary in order for $a^{\frac{p+1}{4}} \bmod p$ to be well-defined; recall that the exponent must be an integer). This approach does not succeed when p = 1 mod 4, in which case p + 1 is an integer that is not divisible by 4.
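The easy case translates directly into code. The sketch below is illustrative; the function name is ours, and the residuosity check is an added safeguard rather than part of the derivation above.

```python
def sqrt_mod_p_3mod4(a: int, p: int) -> int:
    """Return a square root of the quadratic residue a modulo a prime p = 3 mod 4.

    Computes x := a^((p+1)/4) mod p. Assumption: p is prime (not verified).
    """
    if p % 4 != 3:
        raise ValueError("this method requires p = 3 mod 4")
    a %= p
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a quadratic residue modulo p")
    return pow(a, (p + 1) // 4, p)


# Example with p = 23 = 4*5 + 3 and a = 2 (a quadratic residue, since 5^2 = 2 mod 23).
x = sqrt_mod_p_3mod4(2, 23)
```

Here x = 18, and the other square root is [−18 mod 23] = 5.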
When p = 1 mod 4 we proceed slightly differently. Motivated by the above approach, we might hope to find an odd integer r for which it holds that $a^r = 1 \bmod p$. Then, as above, $a^{r+1} = a \bmod p$ and $a^{\frac{r+1}{2}} \bmod p$ would be a square root of a with (r + 1)/2 an integer. Although we will not be able to do this, we can do something just as good: we will find an odd integer r along with an element $b \in \mathbb{Z}_p^*$ and an even integer r′ such that
$$a^r \cdot b^{r'} = 1 \bmod p.$$
Then $a^{r+1} \cdot b^{r'} = a \bmod p$ and $a^{\frac{r+1}{2}} \cdot b^{\frac{r'}{2}} \bmod p$ is a square root of a (with the exponents (r + 1)/2 and r′/2 being integers).
We now describe the general approach to finding r, b, and r′ with the stated properties. Let $\frac{p-1}{2} = 2^l \cdot m$ where l, m are integers with l ≥ 1 and m odd.⁵ Since a is a quadratic residue, we know that
$$a^{2^l m} = a^{\frac{p-1}{2}} = 1 \bmod p. \qquad (13.6)$$
This means that $a^{2^l m/2} = a^{2^{l-1} m} \bmod p$ is a square root of 1. The square roots of 1 modulo p are ±1 mod p, so $a^{2^{l-1} m} = \pm 1 \bmod p$. If $a^{2^{l-1} m} = 1 \bmod p$, we are in the same situation as in Equation (13.6) except that the exponent of a is now divisible by a smaller power of 2. This is progress in the right direction: if we can get to the point where the exponent of a is not divisible by any power of 2 (as would be the case here if l = 1), then the exponent of a is odd and we can compute a square root as discussed earlier. We give an example, and discuss in a moment how to deal with the case when $a^{2^{l-1} m} = -1 \bmod p$.
⁵The integers l and m can be computed easily by taking out factors of 2 from (p − 1)/2.
Example 13.30
Take p = 29 and a = 7. Since 7 is a quadratic residue modulo 29, we have $7^{14} \bmod 29 = 1$ and we know that $7^7 \bmod 29$ is a square root of 1. In fact,
$$7^7 = 1 \bmod 29,$$
and the exponent 7 is odd. So $7^{(7+1)/2} = 7^4 = 23 \bmod 29$ is a square root of 7 modulo 29. ♦
To summarize the algorithm so far: we begin with $a^{2^l m} = 1 \bmod p$ and we pull out factors of 2 from the exponent until one of two things happens: either $a^m = 1 \bmod p$, or $a^{2^{l'} m} = -1 \bmod p$ for some l′
Show that fN is a permutation over S.
(b) Define a family of trapdoor permutations based on factoring using
fN as defined above.
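13.20 Let N be a Blum integer. Define the function $\mathsf{half}_N : \mathbb{Z}_N^* \to \{-1, +1\}$ as
$$\mathsf{half}_N(x) = \begin{cases} -1 & \text{if } x < N/2 \\ +1 & \text{if } x > N/2 \end{cases}$$
Show that the function $f : \mathbb{Z}_N^* \to \mathcal{QR}_N \times \{-1, +1\}^2$ defined as
$$f(x) = \left([x^2 \bmod N],\ J_N(x),\ \mathsf{half}_N(x)\right)$$
is one-to-one.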
Index of Common Notation
General notation:
• := refers to deterministic assignment
• If S is a set, then x ← S denotes that x is chosen uniformly from S
• If A is a randomized algorithm, then y ← A(x) denotes running A on input x with a uniform random tape and assigning the output to y. We write y := A(x; r) to denote running A on input x using random tape r and assigning the output to y
• ∧ denotes Boolean conjunction (the AND operator)
• ∨ denotes Boolean disjunction (the OR operator)
• ⊕ denotes the exclusive-or (XOR) operator; this operator can be applied to single bits or entire strings (in the latter case, the XOR is bitwise)
• $\{0,1\}^n$ is the set of all bit-strings of length n
• $\{0,1\}^{\leq n}$ is the set of all bit-strings of length at most n
• $\{0,1\}^*$ is the set of all finite bit-strings
• $0^n$ (resp., $1^n$) denotes the string comprised of n zeroes (resp., n ones)
• ∥x∥ denotes the length of the binary representation of the (positive) integer x, written with leading bit 1. Note that log x < ∥x∥ ≤ log x + 1
• |x| denotes the length of the binary string x (which may have leading 0s), or the absolute value of the real number x
• O(·), Θ(·), Ω(·), ω(·) see Appendix A.2
• 0x denotes that digits are being represented in hexadecimal
• x∥y denotes unambiguous concatenation of the strings x and y (“unambiguous” means that x and y can be recovered from x∥y)
• Pr[X] denotes the probability of event X
• log x denotes the base-2 logarithm of x
Crypto-specific notation:
• n is the security parameter
• ppt stands for “probabilistic polynomial time”
• $A^{O(\cdot)}$ denotes the algorithm A with oracle access to O
• k typically denotes a secret key (as in private-key encryption and MACs)
• (pk, sk) denotes a public/private key-pair (for public-key encryption and digital signatures)
• negl denotes a negligible function; see Definition 3.4
• poly(n) denotes an arbitrary polynomial
• polylog(n) denotes poly(log(n))
• Funcn denotes the set of functions mapping n-bit strings to n-bit strings
• Permn denotes the set of bijections on n-bit strings
• IV denotes an initialization vector (used for modes of operation and collision-resistant hash functions)
Algorithms and procedures:
• G denotes a pseudorandom generator
• F denotes a keyed function that is typically a pseudorandom function
or permutation
• (Gen, Enc, Dec) denote the key-generation, encryption, and decryption procedures, respectively, for both private- and public-key encryption. For the case of private-key encryption, when Gen is unspecified then Gen($1^n$) outputs a uniform $k \in \{0,1\}^n$
• (Gen, Mac, Vrfy) denote the key-generation, tag-generation, and verification procedures, respectively, for a message authentication code. When Gen is unspecified then Gen($1^n$) outputs a uniform $k \in \{0,1\}^n$
• (Gen, Sign, Vrfy) denote the key-generation, signature-generation, and verification procedures, respectively, for a digital signature scheme
• GenPrime denotes a ppt algorithm that, on input $1^n$, outputs an n-bit prime except with probability negligible in n
• GenModulus denotes a ppt algorithm that, on input $1^n$, outputs (N, p, q) where N = pq and (except with negligible probability) p and q are n-bit primes
• GenRSA denotes a ppt algorithm that, on input $1^n$, outputs (except with negligible probability) a modulus N, an integer e > 0 with gcd(e, φ(N)) = 1, and an integer d satisfying ed = 1 mod φ(N)
• $\mathcal{G}$ denotes a ppt algorithm that, on input $1^n$, outputs (except with negligible probability) a description of a cyclic group G, the group order q (with ∥q∥ = n), and a generator g ∈ G.
Number theory:
• Z denotes the set of integers
• a|b means a divides b
• a ∤ b means that a does not divide b
• gcd(a, b) denotes the greatest common divisor of a and b
• [a mod b] denotes the remainder of a when divided by b. Note that 0 ≤ [a mod b] < b.
• $x_1 = x_2 = \cdots = x_n \bmod N$ means that $x_1, \ldots, x_n$ are all congruent modulo N
Note: x = y mod N means that x and y are congruent modulo N, whereas x = [y mod N] means that x is equal to the remainder of y when divided by N.
• ZN denotes the additive group of integers modulo N as well as the set {0,...,N −1}
• Z∗N denotes the multiplicative group of invertible integers modulo N (i.e., those that are relatively prime to N)
• φ(N) denotes the size of Z∗N
• G and H denote groups
• $G_1 \simeq G_2$ means that groups $G_1$ and $G_2$ are isomorphic. If this isomorphism is given by f and $f(x_1) = x_2$ then we write $x_1 \leftrightarrow x_2$
• g is typically a generator of a group
• $\log_g h$ denotes the discrete logarithm of h to the base g
• ⟨g⟩ denotes the group generated by g
• p and q usually denote primes
• N typically denotes the product of two distinct primes p and q of equal length
• $\mathcal{QR}_p$ is the set of quadratic residues modulo p
• $\mathcal{QNR}_p$ is the set of quadratic non-residues modulo p
• $J_p(x)$ is the Jacobi symbol of x modulo p
• $\mathcal{J}_N^{+1}$ is the set of elements with Jacobi symbol +1 modulo N
• $\mathcal{J}_N^{-1}$ is the set of elements with Jacobi symbol −1 modulo N
• $\mathcal{QNR}_N^{+1}$ is the set of quadratic non-residues modulo N having Jacobi symbol +1
Appendix A Mathematical Background
A.1 Identities and Inequalities
We list some standard identities and inequalities that are used at various points throughout the text.
THEOREM A.1 (Binomial expansion theorem) Let x, y be real numbers, and let n be a positive integer. Then
$$(x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}.$$
PROPOSITION A.2 For all x ≥ 1 it holds that $(1 - 1/x)^x \leq e^{-1}$.
PROPOSITION A.3 For all x it holds that $1 - x \leq e^{-x}$.
PROPOSITION A.4 For all x with 0 ≤ x ≤ 1 it holds that
$$e^{-x} \leq 1 - \left(1 - \frac{1}{e}\right) \cdot x \leq 1 - \frac{x}{2}.$$
A.2 Asymptotic Notation
We use standard notation for expressing asymptotic behavior of functions.
DEFINITION A.5 Let f(n), g(n) be functions from non-negative integers to non-negative reals. Then:
• f(n) = O(g(n)) means that there exist positive integers c and n′ such that for all n > n′ it holds that f(n) ≤ c · g(n).
• f(n) = Ω(g(n)) means that there exist a positive real constant c and a positive integer n′ such that for all n > n′ it holds that f(n) ≥ c · g(n).
• f(n) = Θ(g(n)) means that there exist positive real constants $c_1, c_2$ and a positive integer n′ such that for all n > n′ it holds that $c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n)$.
• f(n) = o(g(n)) means that $\lim_{n\to\infty} \frac{f(n)}{g(n)} = 0$.
• f(n) = ω(g(n)) means that $\lim_{n\to\infty} \frac{f(n)}{g(n)} = \infty$.
Example A.6
Let $f(n) = n^4 + 3n + 500$. Then:
• $f(n) = O(n^4)$.
• $f(n) = O(n^5)$. In fact, $f(n) = o(n^5)$.
• $f(n) = \Omega(n^3 \log n)$. In fact, $f(n) = \omega(n^3 \log n)$.
• $f(n) = \Theta(n^4)$. ♦
A.3 Basic Probability
We assume the reader is familiar with basic probability theory, on the level of what is covered in a typical undergraduate course on discrete mathematics. Here we simply remind the reader of some notation and basic facts.
If E is an event, then $\bar{E}$ denotes the complement of that event; i.e., $\bar{E}$ is the event that E does not occur. By definition, $\Pr[E] = 1 - \Pr[\bar{E}]$. If $E_1$ and $E_2$ are events, then $E_1 \wedge E_2$ denotes their conjunction; i.e., $E_1 \wedge E_2$ is the event that both $E_1$ and $E_2$ occur. By definition, $\Pr[E_1 \wedge E_2] \leq \Pr[E_1]$. Events $E_1$ and $E_2$ are said to be independent if $\Pr[E_1 \wedge E_2] = \Pr[E_1] \cdot \Pr[E_2]$.
If E1 and E2 are events, then E1 ∨ E2 denotes the disjunction of E1 and E2; that is, E1 ∨ E2 is the event that either E1 or E2 occurs. It follows from the definition that Pr[E1 ∨ E2] ≥ Pr[E1]. The union bound is often a very useful upper bound of this quantity.
PROPOSITION A.7 (Union Bound)
Pr[E1 ∨ E2] ≤ Pr[E1] + Pr[E2].
Repeated application of the union bound for any events $E_1, \ldots, E_k$ gives
$$\Pr\left[\bigvee_{i=1}^{k} E_i\right] \leq \sum_{i=1}^{k} \Pr[E_i].$$
The conditional probability of $E_1$ given $E_2$, denoted $\Pr[E_1 \mid E_2]$, is defined as
$$\Pr[E_1 \mid E_2] \stackrel{\text{def}}{=} \frac{\Pr[E_1 \wedge E_2]}{\Pr[E_2]}$$
as long as $\Pr[E_2] \neq 0$. (If $\Pr[E_2] = 0$ then $\Pr[E_1 \mid E_2]$ is undefined.) This represents the probability that event $E_1$ occurs, given that event $E_2$ has occurred. It follows immediately from the definition that
Pr[E1 ∧E2]=Pr[E1 |E2]·Pr[E2];
equality holds even if Pr[E2] = 0 as long as we interpret multiplication by zero on the right-hand side in the obvious way.
We can now easily derive Bayes’ theorem.
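THEOREM A.8 (Bayes’ Theorem) If $\Pr[E_2] \neq 0$ then
$$\Pr[E_1 \mid E_2] = \frac{\Pr[E_2 \mid E_1] \cdot \Pr[E_1]}{\Pr[E_2]}.$$
PROOF This follows because
$$\Pr[E_1 \mid E_2] = \frac{\Pr[E_1 \wedge E_2]}{\Pr[E_2]} = \frac{\Pr[E_2 \wedge E_1]}{\Pr[E_2]} = \frac{\Pr[E_2 \mid E_1] \cdot \Pr[E_1]}{\Pr[E_2]}.$$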
Let $E_1, \ldots, E_n$ be events such that $\Pr[E_1 \vee \cdots \vee E_n] = 1$ and $\Pr[E_i \wedge E_j] = 0$ for all i ≠ j. That is, the $\{E_i\}$ partition the space of all possible events, so that with probability 1 exactly one of the events $E_i$ occurs. Then for any F
$$\Pr[F] = \sum_{i=1}^{n} \Pr[F \wedge E_i].$$
A special case is when n = 2 and $E_2 = \bar{E}_1$, giving
$$\Pr[F] = \Pr[F \wedge E_1] + \Pr[F \wedge \bar{E}_1] = \Pr[F \mid E_1] \cdot \Pr[E_1] + \Pr[F \mid \bar{E}_1] \cdot \Pr[\bar{E}_1].$$
Taking $F = E_1 \vee E_2$, we get a tighter version of the union bound:
$$\Pr[E_1 \vee E_2] = \Pr[E_1 \vee E_2 \mid E_1] \cdot \Pr[E_1] + \Pr[E_1 \vee E_2 \mid \bar{E}_1] \cdot \Pr[\bar{E}_1] \leq \Pr[E_1] + \Pr[E_2 \mid \bar{E}_1].$$
Extending this to events $E_1, \ldots, E_k$ we obtain
PROPOSITION A.9
$$\Pr\left[\bigvee_{i=1}^{k} E_i\right] \leq \Pr[E_1] + \sum_{i=2}^{k} \Pr[E_i \mid \bar{E}_1 \wedge \cdots \wedge \bar{E}_{i-1}].$$
* Useful Probability Bounds
We review some terminology and state probability bounds that are standard, but may not be encountered in a basic discrete mathematics course. The material here is used only in Section 7.3.
A (discrete, real-valued) random variable X is a variable whose value is assigned probabilistically from some finite set S of real numbers. X is non-negative if it does not take negative values; it is a 0/1-random variable if S = {0, 1}. The 0/1-random variables $X_1, \ldots, X_k$ are independent if for all $b_1, \ldots, b_k$ it holds that $\Pr[X_1 = b_1 \wedge \cdots \wedge X_k = b_k] = \prod_{i=1}^{k} \Pr[X_i = b_i]$.
We let Exp[X] denote the expectation of a random variable X; if X takes values in a set S then $\mathsf{Exp}[X] \stackrel{\text{def}}{=} \sum_{s \in S} s \cdot \Pr[X = s]$. One of the most important facts is that expectation is linear; for random variables $X_1, \ldots, X_k$ (with arbitrary dependencies) we have $\mathsf{Exp}[\sum_i X_i] = \sum_i \mathsf{Exp}[X_i]$. If $X_1, X_2$ are independent, then $\mathsf{Exp}[X_1 \cdot X_2] = \mathsf{Exp}[X_1] \cdot \mathsf{Exp}[X_2]$.
Markov’s inequality is useful when little is known about X.
PROPOSITION A.10 (Markov’s inequality) Let X be a non-negative random variable and v > 0. Then Pr[X ≥ v] ≤ Exp[X]/v.
PROOF
Say X takes values in a set S. We have Exp[X] = s · Pr[X = s]
s∈S
≥ Pr[X =s]·0+ v·Pr[X =s] x∈S, x
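$$\Pr\bigl[|X - \mathsf{Exp}[X]| \geq \delta\bigr] \leq \frac{\mathsf{Var}[X]}{\delta^2}.$$
PROOF Define the non-negative random variable $Y \stackrel{\text{def}}{=} (X - \mathsf{Exp}[X])^2$ and then apply Markov’s inequality. So,
$$\Pr\bigl[|X - \mathsf{Exp}[X]| \geq \delta\bigr] = \Pr\bigl[(X - \mathsf{Exp}[X])^2 \geq \delta^2\bigr] \leq \frac{\mathsf{Exp}[(X - \mathsf{Exp}[X])^2]}{\delta^2} = \frac{\mathsf{Var}[X]}{\delta^2}.$$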
The 0/1-random variables X1, . . . , Xm are pairwise independent if for every i ̸= j and every bi,bj ∈ {0,1} it holds that
Pr[Xi =bi ∧ Xj =bj]=Pr[Xi =bi]·Pr[Xj =bj].
If $X_1, \ldots, X_m$ are pairwise independent then $\mathsf{Var}[\sum_{i=1}^{m} X_i] = \sum_{i=1}^{m} \mathsf{Var}[X_i]$. (This follows since $\mathsf{Exp}[X_i \cdot X_j] = \mathsf{Exp}[X_i] \cdot \mathsf{Exp}[X_j]$ when i ≠ j, using pairwise independence.) An important corollary of Chebyshev’s inequality follows.
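COROLLARY A.12 Let $X_1, \ldots, X_m$ be pairwise-independent random variables with the same expectation μ and variance $\sigma^2$. Then for every δ > 0,
$$\Pr\left[\left|\frac{\sum_{i=1}^{m} X_i}{m} - \mu\right| \geq \delta\right] \leq \frac{\sigma^2}{\delta^2 m}.$$
PROOF By linearity of expectation, $\mathsf{Exp}[\sum_{i=1}^{m} X_i/m] = \mu$. Applying Chebyshev’s inequality to the random variable $\sum_{i=1}^{m} X_i/m$, we have
$$\Pr\left[\left|\frac{\sum_{i=1}^{m} X_i}{m} - \mu\right| \geq \delta\right] \leq \frac{\mathsf{Var}\left[\frac{1}{m} \cdot \sum_{i=1}^{m} X_i\right]}{\delta^2}.$$
Using pairwise independence, it follows that
$$\mathsf{Var}\left[\frac{1}{m} \cdot \sum_{i=1}^{m} X_i\right] = \frac{1}{m^2} \sum_{i=1}^{m} \mathsf{Var}[X_i] = \frac{1}{m^2} \sum_{i=1}^{m} \sigma^2 = \frac{\sigma^2}{m}.$$
The inequality is obtained by combining the above two equations.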
Say 0/1-random variables X1, . . . , Xm each provides an estimate of some fixed (unknown) bit b. That is, Pr[Xi = b] ≥ 1/2+ε for all i, where ε > 0.
We can estimate b by looking at the value of X1; this estimate will be correct with probability Pr[X1 = b]. A better estimate can be obtained by looking at the values of X1,…,Xm and taking the value that occurs the majority of the time. We analyze how well this does when X1, . . . , Xm are pairwise independent.
PROPOSITION A.13 Fix ε > 0 and b ∈ {0, 1}, and let {Xi} be pairwise-independent, 0/1-random variables for which $\Pr[X_i = b] \ge \frac{1}{2} + \varepsilon$ for all i. Consider the process in which m values X1, . . . , Xm are recorded and X is set to the value that occurs a strict majority of the time. Then
$$\Pr[X \ne b] \le \frac{1}{4 \cdot \varepsilon^2 \cdot m}.$$
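PROOF Assume b = 1; by symmetry, this is without loss of generality. Then $\mathrm{Exp}[X_i] = \frac{1}{2} + \varepsilon$. Let X denote the strict majority of the {Xi} as in the proposition, and note that X ≠ 1 if and only if $\sum_{i=1}^{m} X_i \le m/2$. So
$$\begin{aligned}
\Pr[X \ne 1] &= \Pr\left[\sum_{i=1}^{m} X_i \le m/2\right] \\
&= \Pr\left[\frac{\sum_{i=1}^{m} X_i}{m} - \frac{1}{2} \le 0\right] \\
&= \Pr\left[\frac{\sum_{i=1}^{m} X_i}{m} - \left(\frac{1}{2} + \varepsilon\right) \le -\varepsilon\right] \\
&\le \Pr\left[\,\left|\frac{\sum_{i=1}^{m} X_i}{m} - \left(\frac{1}{2} + \varepsilon\right)\right| \ge \varepsilon\,\right].
\end{aligned}$$
Since Var[Xi] ≤ 1/4 for all i, applying the previous corollary shows that $\Pr[X \ne 1] \le \frac{1}{4\varepsilon^2 m}$ as claimed.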
A better bound is obtained if the {Xi} are independent:
PROPOSITION A.14 (Chernoff bound) Fix ε > 0 and b ∈ {0, 1}, and let {Xi} be independent 0/1-random variables with $\Pr[X_i = b] = \frac{1}{2} + \varepsilon$ for all i. The probability that their majority value is not b is at most $e^{-\varepsilon^2 m/2}$.
A.4 The “Birthday” Problem
If we choose q elements y1, . . . , yq uniformly from a set of size N, what is the probability that there exist distinct i, j with yi = yj? We refer to the stated event as a collision, and denote the probability of this event by coll(q, N). This problem is related to the so-called birthday problem, which asks what size group of people we need such that with probability 1/2 some pair of people in the group share a birthday. To see the relationship, let yi denote the birthday of the ith person in the group. If there are q people in the group then we have q values y1, . . . , yq chosen uniformly from {1, . . . , 365}, making the simplifying assumption that birthdays are uniformly and independently distributed among the 365 days of a non-leap year. Furthermore, matching birthdays correspond to a collision, i.e., distinct i, j with yi = yj. So the desired solution to the birthday problem is given by the minimal (integer) value of q for which coll(q, 365) ≥ 1/2. (The answer may surprise you: taking q = 23 people suffices!)
In this section, we prove lower and upper bounds on coll(q, N). Taken together and summarized at a high level, they show that if $q < \sqrt{N}$ then the probability of a collision is $\Theta(q^2/N)$; alternately, for $q = \Theta(\sqrt{N})$ the probability of a collision is constant.
An upper bound for the collision probability is easy to obtain.
LEMMA A.15 Fix a positive integer N, and say q elements y1, . . . , yq are chosen uniformly and independently at random from a set of size N. Then the probability that there exist distinct i, j with yi = yj is at most $\frac{q^2}{2N}$. That is,
$$\mathrm{coll}(q, N) \le \frac{q^2}{2N}.$$
PROOF The proof is a simple application of the union bound (Proposition A.7). Recall that a collision means that there exist distinct i, j with yi = yj. Let Coll denote the event of a collision, and let $\mathrm{Coll}_{i,j}$ denote the event that yi = yj. It is immediate that $\Pr[\mathrm{Coll}_{i,j}] = 1/N$ for any distinct i, j. Furthermore, $\mathrm{Coll} = \bigvee_{i \ne j} \mathrm{Coll}_{i,j}$ and so repeated application of the union bound implies that
$$\Pr[\mathrm{Coll}] = \Pr\Big[\bigvee_{i \ne j} \mathrm{Coll}_{i,j}\Big] \le \sum_{i \ne j} \Pr[\mathrm{Coll}_{i,j}] = \binom{q}{2} \cdot \frac{1}{N} \le \frac{q^2}{2N}.$$
LEMMA A.16 Fix a positive integer N, and say $q \le \sqrt{2N}$ elements y1, . . . , yq are chosen uniformly and independently at random from a set of size N. Then the probability that there exist distinct i, j with yi = yj is at least $\frac{q(q-1)}{4N}$. In fact,
$$\mathrm{coll}(q, N) \ge 1 - e^{-q(q-1)/2N} \ge \frac{q(q-1)}{4N}.$$
PROOF Recall that a collision means that there exist distinct i, j with yi = yj. Let Coll denote this event. Let $\mathrm{NoColl}_i$ be the event that there is no collision among y1, . . . , yi; that is, yj ≠ yk for all j < k ≤ i. Then $\mathrm{NoColl}_q = \overline{\mathrm{Coll}}$ is the event that there is no collision at all.
If $\mathrm{NoColl}_q$ occurs then $\mathrm{NoColl}_i$ must also have occurred for all i ≤ q. Thus,
$$\Pr[\mathrm{NoColl}_q] = \Pr[\mathrm{NoColl}_1] \cdot \Pr[\mathrm{NoColl}_2 \mid \mathrm{NoColl}_1] \cdots \Pr[\mathrm{NoColl}_q \mid \mathrm{NoColl}_{q-1}].$$
Now, $\Pr[\mathrm{NoColl}_1] = 1$ since y1 cannot collide with itself. Furthermore, if event $\mathrm{NoColl}_i$ occurs then {y1, . . . , yi} contains i distinct values; so, the probability that $y_{i+1}$ collides with one of these values is $\frac{i}{N}$ and hence the probability that $y_{i+1}$ does not collide with any of these values is $1 - \frac{i}{N}$. This means
$$\Pr[\mathrm{NoColl}_{i+1} \mid \mathrm{NoColl}_i] = 1 - \frac{i}{N},$$
and so
$$\Pr[\mathrm{NoColl}_q] = \prod_{i=1}^{q-1} \left(1 - \frac{i}{N}\right).$$
Since i/N < 1 for all i, we have $1 - \frac{i}{N} \le e^{-i/N}$ (by Inequality A.3) and so
$$\Pr[\mathrm{NoColl}_q] \le \prod_{i=1}^{q-1} e^{-i/N} = e^{-\sum_{i=1}^{q-1} (i/N)} = e^{-q(q-1)/2N}.$$
We conclude that
$$\Pr[\mathrm{Coll}] = 1 - \Pr[\mathrm{NoColl}_q] \ge 1 - e^{-q(q-1)/2N} \ge \frac{q(q-1)}{4N},$$
using Inequality A.4 in the last step (note that q(q − 1)/2N < 1).
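The two lemmas can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `coll`, `upper_bound`, and `lower_bound` are chosen here for illustration, and exact rational arithmetic is used so that the comparisons are not affected by rounding. It computes coll(q, N) from the product formula in the proof of Lemma A.16 and compares it with both bounds for the birthday setting N = 365.

```python
from fractions import Fraction

def coll(q, N):
    """Exact collision probability for q uniform, independent samples from a set of size N."""
    no_coll = Fraction(1)
    for i in range(1, q):
        # Pr[NoColl_{i+1} | NoColl_i] = 1 - i/N
        no_coll *= Fraction(N - i, N)
    return 1 - no_coll

def upper_bound(q, N):
    """Upper bound of Lemma A.15: q^2 / 2N."""
    return Fraction(q * q, 2 * N)

def lower_bound(q, N):
    """Lower bound of Lemma A.16: q(q-1) / 4N (valid for q <= sqrt(2N))."""
    return Fraction(q * (q - 1), 4 * N)

N = 365
for q in (10, 22, 23, 27):          # all satisfy q <= sqrt(2 * 365)
    p = coll(q, N)
    assert lower_bound(q, N) <= p <= upper_bound(q, N)
    print(q, float(lower_bound(q, N)), float(p), float(upper_bound(q, N)))
```

With these definitions, q = 23 is the smallest value for which coll(q, 365) reaches 1/2, matching the remark about the birthday problem above.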
A.5 *Finite Fields
We use finite fields only sparingly in the book, but we include a definition and some basic facts for completeness. Further details can be found in any textbook on abstract algebra.
DEFINITION A.17 A (finite) field is a (finite) set F along with two binary operations +, · for which the following hold:
• F is an abelian group with respect to the operation ‘+.’ We let 0 denote the identity element of this group.
• F \ {0} is an abelian group with respect to the operation ‘·.’ We let 1 denote the identity element of this group. As usual, we often write ab in place of a · b.
• (Distributivity:) For all a, b, c ∈ F, we have a · (b + c) = ab + ac.
The additive inverse of a ∈ F, denoted by −a, is the unique element satisfying a + (−a) = 0; we write b − a in place of b + (−a). The multiplicative inverse of a ∈ F \ {0}, denoted by $a^{-1}$, is the unique element satisfying $a a^{-1} = 1$; we often write b/a in place of $b a^{-1}$.
Example A.18
It follows from the results of Section 8.1.4 that for any prime p the set {0, . . . , p − 1} is a finite field with respect to addition and multiplication modulo p. We denote this field by $\mathbb{F}_p$. ♦
Finite fields have a rich theory. For our purposes, we need only a few basic facts. The order of F is the number of elements in F (assuming it is finite). Recall also that q is a prime power if $q = p^r$ for some prime p and integer r ≥ 1.
THEOREM A.19 If F is a finite field, then the order of F is a prime power. Conversely, for every prime power q there is a finite field of order q, which is moreover the unique such field (up to relabeling of the elements).
For $q = p^r$ with p prime, we let $\mathbb{F}_q$ denote the (unique) field of order q. We call p the characteristic of $\mathbb{F}_q$. The preceding theorem tells us that the characteristic of any finite field is prime.
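As in the case of groups, if n is a positive integer and a ∈ F then
$$n \cdot a \stackrel{\text{def}}{=} \underbrace{a + \cdots + a}_{n \text{ times}} \quad \text{and} \quad a^n \stackrel{\text{def}}{=} \underbrace{a \cdots a}_{n \text{ times}}.$$
The notation is extended for n ≤ 0 in the natural way.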
THEOREM A.20 Let $\mathbb{F}_q$ be a finite field of characteristic p. Then for all $a \in \mathbb{F}_q$ we have p · a = 0.
Let $q = p^r$ with p prime. For r = 1, we have seen in Example A.18 that $\mathbb{F}_q = \mathbb{F}_p$ can be taken to be the set {0, . . . , p − 1} under addition and multiplication modulo p. We caution, however, that for r > 1 the set {0, . . . , q − 1} is not a field under addition and multiplication modulo q. For example, if we take $q = 3^2 = 9$ then the element 3 does not have a multiplicative inverse modulo 9.
Finite fields of characteristic p can be represented using polynomials over $\mathbb{F}_p$. We give an example to demonstrate the flavor of the construction, without discussing why the construction works or describing the general case. We construct the field $\mathbb{F}_4$ by working with polynomials over $\mathbb{F}_2$. Fix the polynomial $r(x) = x^2 + x + 1$, and note that r(x) has no roots over $\mathbb{F}_2$ since r(0) = r(1) = 1 (recall that we are working in $\mathbb{F}_2$, which means that all operations are carried out modulo 2). In the same way that we can introduce the imaginary number i to be a root of $x^2 + 1$ over the reals, we can introduce a value ω to be a root of r(x) over $\mathbb{F}_2$; that is, $\omega^2 = -\omega - 1$. We then define $\mathbb{F}_4$ to be the set of all polynomials in ω of degree at most 1 over $\mathbb{F}_2$; that is, $\mathbb{F}_4 = \{0, 1, \omega, \omega + 1\}$. Addition in $\mathbb{F}_4$ will just be regular polynomial addition, remembering that operations on the coefficients are done in $\mathbb{F}_2$ (that is, modulo 2). Multiplication in $\mathbb{F}_4$ will be polynomial multiplication (again, with operations on the coefficients carried out modulo 2) followed by the substitution $\omega^2 = -\omega - 1$; this also ensures that the result lies in $\mathbb{F}_4$. So, for example,
$$\omega + (\omega + 1) = 2\omega + 1 = 1$$
and
$$(\omega + 1) \cdot (\omega + 1) = \omega^2 + 2\omega + 1 = (-\omega - 1) + 1 = -\omega = \omega.$$
Although not obvious, one can check that this is a field; the only difficult condition to verify is that every nonzero element has a multiplicative inverse.
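The inverse condition can be verified exhaustively for this small example. The sketch below is an illustration only: the pair encoding `(c1, c0)` for c1·ω + c0 and the names `f4_add` and `f4_mul` are assumptions made here, not notation from the text. It uses the fact that −1 = 1 in F2, so the substitution ω² = −ω − 1 becomes ω² = ω + 1.

```python
from itertools import product

# Represent c1*w + c0 in F_4 as the pair (c1, c0) with c1, c0 in {0, 1}.
ELEMENTS = [(0, 0), (0, 1), (1, 0), (1, 1)]   # 0, 1, w, w + 1
ONE, W, W1 = (0, 1), (1, 0), (1, 1)

def f4_add(a, b):
    """Coefficient-wise addition modulo 2."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def f4_mul(a, b):
    """Polynomial multiplication followed by the substitution w^2 = w + 1."""
    a1, a0 = a
    b1, b0 = b
    # (a1 w + a0)(b1 w + b0) = a1 b1 w^2 + (a1 b0 + a0 b1) w + a0 b0
    hi = a1 * b1                      # coefficient of w^2, folded into w and 1
    return ((hi + a1 * b0 + a0 * b1) % 2, (hi + a0 * b0) % 2)

# Search all pairs for products equal to 1.
inverses = {a: b for a, b in product(ELEMENTS, repeat=2) if f4_mul(a, b) == ONE}
print(inverses)
```

Every nonzero element appears as a key of `inverses`, which is the one field condition the text singles out as non-obvious.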
We need only one other result.
THEOREM A.21 Let Fq be a finite field of order q. Then the abelian group Fq \ {0} with respect to ‘·’ is a cyclic group of order q − 1.
Appendix B
Basic Algorithmic Number Theory
For the cryptographic constructions given in this book to be efficient (i.e., to run in time polynomial in the lengths of their inputs), it is necessary for these constructions to utilize efficient (that is, polynomial-time) algorithms for performing basic number-theoretic operations. Although in some cases there exist “trivial” algorithms that would work, it is still worthwhile to carefully consider their efficiency since for cryptographic applications it is not uncommon to use integers that are thousands of bits long. In other cases obtaining any polynomial-time algorithm requires a bit of cleverness, and an analysis of their performance may rely on non-trivial group-theoretic results.
In Appendix B.1 we describe basic algorithms for integer arithmetic. Here we cover the familiar algorithms for addition, subtraction, etc., as well as the Euclidean algorithm for computing greatest common divisors. We also discuss the extended Euclidean algorithm, assuming there that the reader has covered the material in Section 8.1.1.
In Appendix B.2 we show various algorithms for modular arithmetic. In addition to a brief discussion of basic modular operations (i.e., modular reduction, addition, multiplication, and inversion), we also describe Montgomery multiplication, which can greatly simplify (and speed up) implementations of modular arithmetic. We then discuss algorithms for problems that are less common outside the field of cryptography: exponentiation modulo N (as well as in arbitrary groups) and choosing a uniform element of $\mathbb{Z}_N$ or $\mathbb{Z}_N^*$ (or in an arbitrary group). This section assumes familiarity with the basic group theory covered in Section 8.1.
The material above is used implicitly throughout the second half of the book, although it is not absolutely necessary to read this material in order to follow the book. (In particular, the reader willing to accept the results of this Appendix without proof can simply read the summary of those results in the theorems below.) Appendix B.3, which discusses finding generators in cyclic groups (when the factorization of the group order is known) and assumes the results of Section 8.3.1, contains material that is hardly used at all; it is included for completeness and reference.
Since our goal is only to establish that certain problems can be solved in polynomial time, we have opted for simplicity rather than efficiency in our selection of algorithms and their descriptions (as long as the algorithms run in polynomial time). For this reason, we generally will not be interested in the exact running times of the algorithms we present beyond establishing that they indeed run in polynomial time. The reader who is seriously interested in implementing these algorithms is forewarned to look at other sources for more efficient alternatives as well as various techniques for speeding up the necessary computations.
The results in this Appendix are summarized by the theorems that follow. Throughout, we assume that any integer a provided as input is written using exactly ∥a∥ bits; i.e., the high-order bit is 1. In Appendix B.1 we show:
THEOREM B.1 (Integer operations) Given integers a and b, it is possible to perform the following operations in time polynomial in ∥a∥ and ∥b∥:
1. Computing the sum a + b and the difference a − b;
2. Computing the product ab;
3. Computing positive integers q and r < b such that a = qb + r (i.e., computing division with remainder);
4. Computing the greatest common divisor of a and b, gcd(a, b);
5. Computing integers X, Y with X a + Y b = gcd(a, b).
The following results are proved in Appendix B.2:
THEOREM B.2 (Modular operations) Given integers N > 1, a, and b, it is possible to perform the following operations in time polynomial in ∥a∥, ∥b∥, and ∥N∥:
1. Computing the modular reduction [a mod N];
2. Computing the sum [(a + b) mod N], the difference [(a − b) mod N], and the product [ab mod N];
3. Determining whether a is invertible modulo N;
4. Computing the multiplicative inverse $[a^{-1} \bmod N]$, assuming a is invertible modulo N;
5. Computing the exponentiation $[a^b \bmod N]$.
The following generalizes Theorem B.2(5) to arbitrary groups:
THEOREM B.3 (Group exponentiation) Let G be a group, written multiplicatively. Let g be an element of the group and let b be a non-negative integer. Then $g^b$ can be computed using poly(∥b∥) group operations.
THEOREM B.4 (Choosing uniform elements) There exists a randomized algorithm with the following properties: on input N,
• The algorithm runs in time polynomial in ∥N∥;
• The algorithm outputs fail with probability negligible in ∥N∥; and
• Conditioned on not outputting fail, the algorithm outputs a uniformly distributed element of ZN .
An algorithm with analogous properties exists for Z∗N as well.
Since the probability that either algorithm referenced in the above theorem outputs fail is negligible, we ignore this possibility (and instead leave it implicit). In Appendix B.2 we also discuss generalizations of the above to the case of selecting a uniform element from any finite group (subject to certain requirements on the representation of group elements).
A proof of the following is in Appendix B.3:
THEOREM B.5 (Testing and finding generators) Let G be a cyclic group of order q, and assume that the group operation and selection of a uniform group element can be carried out in unit time.
1. There is an algorithm that on input q, the prime factorization of q, and an element g ∈ G, runs in poly(∥q∥) time and decides whether g is a generator of G.
2. There is a randomized algorithm that on input q and the prime factorization of q, runs in poly(∥q∥) time and outputs a generator of G except with probability negligible in ∥q∥. Conditioned on the output being a generator, it is uniformly distributed among the generators of G.
B.1 Integer Arithmetic
B.1.1 Basic Operations
We begin our exploration of algorithmic number theory with a discussion of integer addition/subtraction, multiplication, and division with remainder. A little thought shows that all these operations can be carried out in time polynomial in the input length using the standard “grade-school” algorithms for these problems. For example, addition of two positive integers a and b with a > b can be done in time linear in ∥a∥ by stepping one-by-one through the bits of a and b, starting with the low-order bits, and computing the corresponding output bit and a “carry bit” at each step. (Details are omitted.) Multiplication of two n-bit integers a and b, to take another example, can be done by first generating a list of n integers of length at most 2n (each of which is equal to $a \cdot 2^{i-1} \cdot b_i$, where $b_i$ is the ith bit of b) and then adding these n integers together to obtain the final result. (Division with remainder is trickier to implement efficiently, but can also be done.)
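As an illustration of the multiplication procedure just described, here is a minimal sketch. The function name `schoolbook_multiply` is chosen here for illustration; Python's built-in integers are used as the underlying bit strings, and bits of b are indexed from 0 rather than from 1.

```python
def schoolbook_multiply(a, b):
    """Multiply non-negative integers by adding one shifted copy of a per 1-bit of b."""
    result = 0
    shift = 0
    while b > 0:
        if b & 1:                  # current low-order bit of b is 1
            result += a << shift   # add a * 2^shift to the running sum
        b >>= 1                    # move on to the next bit of b
        shift += 1
    return result

print(schoolbook_multiply(12345, 6789) == 12345 * 6789)
```

For n-bit inputs the loop runs at most n times, and each iteration adds integers of length at most 2n, which gives the O(n²) running time of the grade-school method.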
Although these grade-school algorithms suffice to demonstrate that the aforementioned problems can be solved in polynomial time, it is interesting to note that these algorithms are in some cases not the best ones available. As an example, the simple algorithm for multiplication given above multiplies two n-bit numbers in time $O(n^2)$, but there exists a better algorithm running in time $O(n^{\log_2 3})$ (and even that is not the best possible). While the difference is insignificant for numbers of the size we encounter daily, it becomes noticeable when the numbers are large. In cryptographic applications it is not uncommon to use integers that are thousands of bits long (i.e., n > 1000), and a judicious choice of which algorithms to use then becomes critical.
B.1.2 The Euclidean and Extended Euclidean Algorithms
Recall from Section 8.1 that gcd(a, b), the greatest common divisor of two integers a and b, is the largest integer d that divides both a and b. We state an easy proposition regarding the greatest common divisor, and then show how this leads to an efficient algorithm for computing gcd’s.
PROPOSITION B.6 Let a, b > 1 with b ∤ a. Then gcd(a, b) = gcd(b, [a mod b]).
PROOF If b > a the stated claim is immediate. So assume a > b. Write a = qb + r for q, r positive integers and r < b (cf. Proposition 8.1); note that r > 0 because b ∤ a. Since r = [a mod b], we prove the proposition by showing that gcd(a, b) = gcd(b, r).
Let d = gcd(a,b). Then d divides both a and b, and so d also divides r = a − qb. By definition of the greatest common divisor, we thus have gcd(b, r) ≥ d = gcd(a, b).
Let d′ = gcd(b,r). Then d′ divides both b and r, and so d′ also divides a = qb + r. By definition of the greatest common divisor, we thus have gcd(a, b) ≥ d′ = gcd(b, r).
Since d ≥ d′ and d′ ≥ d, we conclude that d = d′.
The above suggests the recursive Euclidean algorithm (Algorithm B.7) for computing the greatest common divisor gcd(a, b) of two integers a and b. Correctness of the algorithm follows readily from Proposition B.6. As for its running time, we show below that on input (a, b) the algorithm makes fewer than 2 · ∥b∥ recursive calls. Since checking whether b divides a and computing [a mod b] can both be done in time polynomial in ∥a∥ and ∥b∥, this implies that the entire algorithm runs in polynomial time.

ALGORITHM B.7
The Euclidean algorithm GCD
Input: Integers a, b with a ≥ b > 0
Output: The greatest common divisor of a and b
if b divides a
    return b
else return GCD(b, [a mod b])
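A direct transcription of the recursive Euclidean algorithm is sketched below. It is not part of the original text: the name `gcd_calls` and the extra return value counting recursive calls are additions made here so that the bound on the number of calls can be observed on concrete inputs.

```python
def gcd_calls(a, b):
    """Return (gcd(a, b), number of recursive calls) for integers a >= b > 0."""
    assert a >= b > 0
    if a % b == 0:                 # b divides a
        return b, 0
    d, calls = gcd_calls(b, a % b)
    return d, calls + 1

for a, b in [(1071, 462), (89, 55)]:
    d, calls = gcd_calls(a, b)
    # The text claims at most 2 * ||b|| - 2 recursive calls.
    print(a, b, d, calls, 2 * b.bit_length() - 2)
```

Consecutive Fibonacci numbers such as (89, 55) are a convenient stress case, since each step reduces the arguments as slowly as possible.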
PROPOSITION B.8 Consider an execution of GCD($a_0$, $b_0$), and let $a_i, b_i$ (for i = 1, . . . , l) denote the arguments to the ith recursive call of GCD. Then $b_{i+2} \le b_i/2$ for 0 ≤ i ≤ l − 2.
PROOF First note that for any a > b we have [a mod b] < a/2. To see this, consider the two cases: If b ≤ a/2 then [a mod b] < b ≤ a/2 is immediate. On the other hand, if b > a/2 then [a mod b] = a − b < a/2.
Now fix arbitrary i with 0 ≤ i ≤ l − 2. Then $b_{i+2} = [a_{i+1} \bmod b_{i+1}] < a_{i+1}/2 = b_i/2$.
COROLLARY B.9 In an execution of algorithm GCD(a, b), there are at most 2 ∥b∥ − 2 recursive calls to GCD.
PROOF Let $a_i, b_i$ (for i = 1, . . . , l) denote the arguments to the ith recursive call of GCD. The {$b_i$} are always greater than zero, and the algorithm makes no further recursive calls if it ever happens that $b_i = 1$ (since then $b_i \mid a_i$). The previous proposition indicates that the {$b_i$} decrease by a multiplicative factor of (at least) 2 in every two iterations. It follows that the number of recursive calls to GCD is at most 2 · (∥b∥ − 1).
The Extended Euclidean Algorithm
By Proposition 8.2, we know that for positive integers a, b there exist integers X, Y with Xa + Y b = gcd(a, b). A simple modification of the Euclidean algorithm, called the extended Euclidean algorithm, can be used to find X, Y in addition to computing gcd(a, b); see Algorithm B.10. You are asked to show correctness of the extended Euclidean algorithm in Exercise B.1, and to prove that the algorithm runs in polynomial time in Exercise B.2.
ALGORITHM B.10
The extended Euclidean algorithm eGCD
Input: Integers a, b with a ≥ b > 0
Output: (d, X, Y) with d = gcd(a, b) and Xa + Yb = d
if b divides a
    return (b, 0, 1)
else
    Compute integers q, r with a = qb + r and 0 < r < b
    (d, X, Y) := eGCD(b, r)    // note that Xb + Yr = d
    return (d, Y, X − Yq)
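The extended Euclidean algorithm translates almost line for line into code. The following is a sketch under the same input assumption a ≥ b > 0; the name `egcd` is chosen here, and the final check simply confirms the identity Xa + Yb = d on one example rather than proving correctness.

```python
def egcd(a, b):
    """Return (d, X, Y) with d = gcd(a, b) and X*a + Y*b == d, for a >= b > 0."""
    assert a >= b > 0
    if a % b == 0:
        return b, 0, 1             # 0*a + 1*b == b
    q, r = divmod(a, b)            # a = q*b + r with 0 < r < b
    d, X, Y = egcd(b, r)           # X*b + Y*r == d
    # Substitute r = a - q*b:  X*b + Y*(a - q*b) = Y*a + (X - Y*q)*b
    return d, Y, X - Y * q

d, X, Y = egcd(240, 46)
print(d, X, Y, X * 240 + Y * 46 == d)
```

The comment before the return statement is the one-line substitution that underlies the correctness argument.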
B.2 Modular Arithmetic
We now turn our attention to basic arithmetic operations modulo N > 1. We will use ZN to refer both to the set {0,…,N −1} as well as to the group that results by considering addition modulo N among the elements of this set.
B.2.1 Basic Operations
Efficient algorithms for the basic arithmetic operations over the integers immediately imply efficient algorithms for the corresponding arithmetic operations modulo N. For example, computing the modular reduction [a mod N] can be done in time polynomial in ∥a∥ and ∥N∥ by computing division-with-remainder over the integers. Next consider modular operations on two elements $a, b \in \mathbb{Z}_N$ where ∥N∥ = n. (Note that a, b have length at most n. Actually, it is convenient to simply assume that all elements of $\mathbb{Z}_N$ have length exactly n, padding to the left with 0s if necessary.) Addition of a and b modulo N can be done by first computing a + b, an integer of length at most n + 1, and then reducing this intermediate result modulo N. Similarly, multiplication modulo N can be performed by first computing the integer ab of length at most 2n and then reducing the result modulo N. Since addition, multiplication, and division-with-remainder can all be done in polynomial time, these give polynomial-time algorithms for addition and multiplication modulo N.
B.2.2 Computing Modular Inverses
Our discussion thus far has shown how to add, subtract, and multiply modulo N. One operation we are missing is “division” or, equivalently, computing multiplicative inverses modulo N. Recall from Section 8.1.2 that the multiplicative inverse (modulo N) of an element $a \in \mathbb{Z}_N$ is an element $a^{-1} \in \mathbb{Z}_N$ such that $a \cdot a^{-1} = 1 \bmod N$. Proposition 8.7 shows that a has an inverse if and only if gcd(a, N) = 1, i.e., if and only if $a \in \mathbb{Z}_N^*$. Thus, using the Euclidean algorithm we can easily determine whether a given element a has a multiplicative inverse modulo N.
Given N and a ∈ ZN with gcd(a,N) = 1, Proposition 8.2 tells us that there exist integers X, Y with Xa + Y N = 1. This means that [X mod N] is the multiplicative inverse of a. Integers X and Y satisfying Xa + Y N = 1 can be found efficiently using the extended Euclidean algorithm eGCD shown in Section B.1.2. This leads to the following polynomial-time algorithm for computing multiplicative inverses:
ALGORITHM B.11
Computing modular inverses
Input: Modulus N; element a
Output: [a^(−1) mod N] (if it exists)
(d, X, Y) := eGCD(a, N)    // note that Xa + YN = gcd(a, N)
if d ≠ 1 return “a is not invertible modulo N”
else return [X mod N]
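The inversion procedure can be sketched as follows. Two choices here are assumptions of this example rather than part of the text: an iterative variant of the extended Euclidean algorithm is swapped in (it also accepts a first argument smaller than the second, as happens when a < N), and `None` stands in for the message that a is not invertible. The names `egcd_iter` and `mod_inverse` are illustrative.

```python
def egcd_iter(a, b):
    """Iterative extended Euclid: return (d, X, Y) with X*a + Y*b == d == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def mod_inverse(a, N):
    """Return [a^(-1) mod N], or None if a is not invertible modulo N."""
    d, X, _ = egcd_iter(a % N, N)  # X*a + Y*N == gcd(a, N)
    if d != 1:
        return None
    return X % N

inv = mod_inverse(17, 3120)
print(inv, (17 * inv) % 3120)
print(mod_inverse(6, 9))           # gcd(6, 9) = 3, so no inverse exists
```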
B.2.3 Modular Exponentiation
A more challenging task is that of exponentiation modulo N, that is, computing $[a^b \bmod N]$ for base $a \in \mathbb{Z}_N$ and integer exponent b > 0. (When b = 0 the problem is easy. When b < 0 and $a \in \mathbb{Z}_N^*$ then $a^b = (a^{-1})^{-b} \bmod N$ and the problem is reduced to the case of exponentiation with a positive exponent given that we can compute inverses, as discussed in the previous section.) Notice that the basic approach used in the case of addition and multiplication (i.e., computing the integer $a^b$ and then reducing this intermediate result modulo N) does not work here: the integer $a^b$ has length $\|a^b\| = \Theta(\log a^b) = \Theta(b \cdot \|a\|)$, and so even storing the intermediate result $a^b$ would require time exponential in $\|b\| = \Theta(\log b)$.
We can address this problem by reducing modulo N at all intermediate steps of the computation, rather than only reducing modulo N at the end. This has the effect of keeping the intermediate results “small” throughout the computation. Even with this important initial observation, it is still non-trivial to design a polynomial-time algorithm for modular exponentiation. Consider the naïve approach of Algorithm B.12, which simply performs b multiplications by a. This still runs in time that is exponential in ∥b∥.
ALGORITHM B.12
A naïve algorithm for modular exponentiation
Input: Modulus N; base a ∈ Z_N; integer exponent b > 0
Output: [a^b mod N]
x := 1
for i = 1 to b:
    x := [x · a mod N]
return x

This naïve algorithm can be viewed as relying on the following recurrence:
$$[a^b \bmod N] = [a \cdot a^{b-1} \bmod N] = [a \cdot a \cdot a^{b-2} \bmod N] = \cdots$$
Any algorithm based on this relationship will require Θ(b) time. We can do better by relying on the following recurrence:
$$[a^b \bmod N] = \begin{cases} \left[\left(a^{b/2}\right)^2 \bmod N\right] & \text{when } b \text{ is even} \\ \left[a \cdot \left(a^{(b-1)/2}\right)^2 \bmod N\right] & \text{when } b \text{ is odd.} \end{cases}$$
Doing so leads to an algorithm (called, for obvious reasons, “square-and-multiply” or “repeated squaring”) that requires only O(log b) = O(∥b∥) modular squarings/multiplications; see Algorithm B.13. In this algorithm, the length of b decreases by 1 in each iteration; it follows that the number of iterations is ∥b∥, and so the overall algorithm runs in time polynomial in ∥a∥, ∥b∥, and ∥N∥. More precisely, the number of modular squarings is exactly ∥b∥, and the number of additional modular multiplications is exactly the Hamming weight of b (i.e., the number of 1s in the binary representation of b). This explains the preference, discussed in Section 8.2.4, for choosing the public RSA exponent e to have small length/Hamming weight.
ALGORITHM B.13
Algorithm ModExp for efficient modular exponentiation
Input: Modulus N; base a ∈ Z_N; integer exponent b > 0
Output: [a^b mod N]
x := a
t := 1
// maintain the invariant that the answer is [t · x^b mod N]
while b > 0 do:
    if b is odd
        t := [t · x mod N], b := b − 1
    x := [x^2 mod N], b := b/2
return t
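The square-and-multiply loop can be sketched in code as follows. The function name `mod_exp` is chosen here, and Python's built-in three-argument `pow` is used only as an independent check of the result.

```python
def mod_exp(a, b, N):
    """Square-and-multiply for b > 0, keeping the invariant answer == t * x^b mod N."""
    x, t = a % N, 1
    while b > 0:
        if b & 1:                  # b is odd: move one factor of x into t
            t = (t * x) % N
            b -= 1
        x = (x * x) % N            # square the base ...
        b //= 2                    # ... and halve the (now even) exponent
    return t

print(mod_exp(4, 13, 497), pow(4, 13, 497))
```

Each pass through the loop removes one bit from b, so the number of iterations equals the bit length of the exponent, in line with the operation count given above.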
Fix a and N and consider the modular exponentiation function given by $f_{a,N}(b) = [a^b \bmod N]$. We have just seen that computing $f_{a,N}$ is easy. In contrast, computing the inverse of this function (that is, computing b given a, N, and $[a^b \bmod N]$) is believed to be hard for appropriate choice of a and N. Inverting this function requires solving the discrete-logarithm problem, something we discuss in detail in Section 8.3.2.
Using precomputation. If the base a is known in advance, and there is a bound on the length of the exponent b, then one can use precomputation and a small amount of memory to speed up computation of $[a^b \bmod N]$. Say ∥b∥ ≤ n. Then we precompute and store the n values
$$x_0 := a, \quad x_1 := [a^2 \bmod N], \quad \ldots, \quad x_{n-1} := [a^{2^{n-1}} \bmod N].$$
Given exponent b with binary representation $b_{n-1} \cdots b_0$ (written from most to least significant bit), we then have
$$a^b = a^{\sum_{i=0}^{n-1} 2^i \cdot b_i} = \prod_{i=0}^{n-1} x_i^{b_i} \bmod N.$$
Since $b_i \in \{0, 1\}$, the number of multiplications needed to compute the result is exactly one less than the Hamming weight of b.
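The precomputation idea can be sketched as two small functions. The names `precompute` and `exp_with_table` are chosen here for illustration; for simplicity the running product starts from 1, so this sketch performs one multiplication per 1-bit of b rather than one fewer.

```python
def precompute(a, N, n):
    """Return [x_0, ..., x_{n-1}] with x_i = a^(2^i) mod N."""
    table = [a % N]
    for _ in range(n - 1):
        table.append((table[-1] * table[-1]) % N)   # x_{i+1} = x_i^2 mod N
    return table

def exp_with_table(table, b, N):
    """Compute a^b mod N as the product of the x_i for which bit i of b is 1."""
    assert b.bit_length() <= len(table)
    result = 1
    for i, x_i in enumerate(table):
        if (b >> i) & 1:
            result = (result * x_i) % N
    return result

table = precompute(7, 1009, 16)     # supports exponents of up to 16 bits
print(exp_with_table(table, 12345, 1009), pow(7, 12345, 1009))
```

The table is computed once per base, after which no squarings are needed at exponentiation time.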
Exponentiation in Arbitrary Groups
The efficient modular exponentiation algorithm given above carries over in a straightforward way to enable efficient exponentiation in any group, as long as the underlying group operation can be performed efficiently. Specifically, if G is a group and g is an element of G, then $g^b$ can be computed using at most 2 · ∥b∥ applications of the underlying group operation. Precomputation could also be used, exactly as described above.
If the order q of G is known, then $a^b = a^{[b \bmod q]}$ (cf. Proposition 8.52) and this can be used to speed up the computation by reducing b modulo q first.
Considering the (additive) group $\mathbb{Z}_N$, the group exponentiation algorithm just described gives a method for computing the “exponentiation”
$$[b \cdot g \bmod N] \stackrel{\text{def}}{=} [\underbrace{g + \cdots + g}_{b \text{ times}} \bmod N]$$
that differs from the method discussed earlier that relies on standard integer multiplication followed by a modular reduction. In comparing the two approaches to solving the same problem, note that the original algorithm uses specific information about $\mathbb{Z}_N$; in particular, it (essentially) treats the “exponent” b as an element of $\mathbb{Z}_N$ (possibly by reducing b modulo N first). In contrast, the “square-and-multiply” algorithm just presented treats $\mathbb{Z}_N$ only as an abstract group. (Of course, the group operation of addition modulo N relies on the specifics of $\mathbb{Z}_N$.) The point of this discussion is merely to illustrate that some group algorithms are generic (i.e., they apply equally well to all groups) while some group algorithms rely on specific properties of a particular group or class of groups. We saw some examples of this phenomenon in Chapter 9.
B.2.4 *Montgomery Multiplication
Although division over the integers (and hence modular reduction) can be done in polynomial time, algorithms for integer division are slow in comparison to, say, algorithms for integer multiplication. Montgomery multiplication provides a way to perform modular multiplication without carrying out any expensive modular reductions. Since pre- and postprocessing is required, the method is advantageous only when several modular multiplications will be done in sequence as, e.g., when computing a modular exponentiation.
Fix an odd modulus N with respect to which modular operations are to be done. Let R > N be a power of two, say $R = 2^w$, and note that gcd(R, N) = 1. The key property we will exploit is that division by R is fast: the quotient of x upon division by R is obtained by simply shifting x to the right w positions, and [x mod R] is just the w least-significant bits of x.
Define the Montgomery representation of $x \in \mathbb{Z}_N^*$ by $\bar{x} \stackrel{\text{def}}{=} [xR \bmod N]$. Montgomery multiplication of $\bar{x}, \bar{y} \in \mathbb{Z}_N^*$ is defined as
$$\mathrm{Mont}(\bar{x}, \bar{y}) \stackrel{\text{def}}{=} [\bar{x}\bar{y}R^{-1} \bmod N].$$
(We show below how this can be computed without any expensive modular reductions.) Note that
$$\mathrm{Mont}(\bar{x}, \bar{y}) = \bar{x}\bar{y}R^{-1} = (xR)(yR)R^{-1} = (xy)R = \overline{xy} \bmod N.$$
This means we can multiply several values in $\mathbb{Z}_N$ by (1) converting to the Montgomery representation, (2) carrying out all multiplications using Montgomery multiplication to obtain the final result, and then (3) converting the result from Montgomery representation back to the standard representation.
Let $\alpha \stackrel{\text{def}}{=} [-N^{-1} \bmod R]$, a value which can be precomputed. (Computation of α, and conversion to/from Montgomery representation, can also be done without any expensive modular reductions; details are beyond our scope.) To compute $c \stackrel{\text{def}}{=} \mathrm{Mont}(x, y)$ without any expensive modular reductions do:
1. Let z := x · y (over the integers).
2. Set c′ := (z + [zα mod R] · N)/R.
3. If c′ < N then set c := c′; else set c := c′ − N.
To see that this works, we first need to verify that step 2 is well-defined, namely, that the numerator is divisible by R. This follows because
$$z + [z\alpha \bmod R] \cdot N = z + z\alpha N = z - zN^{-1}N = 0 \bmod R.$$
Next, note that c′ = z/R mod N after step 2; moreover, since $z < N^2 < RN$ and [zα mod R] < R, we have $0 \le c' < (RN + RN)/R = 2N$. Thus [c′ mod N] equals c′ if c′ < N and equals c′ − N otherwise, and so after step 3 we have
$$c = [c' \bmod N] = [z/R \bmod N] = [xyR^{-1} \bmod N],$$
as desired.
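The three steps can be sketched in code as follows. This is an illustration under stated assumptions, not an optimized implementation: the helper names `montgomery_setup` and `mont` are chosen here, R is taken to be the smallest power of two exceeding N, α is obtained with Python's built-in `pow(N, -1, R)` (available from Python 3.8), and conversion into Montgomery form uses an ordinary `%` for simplicity even though the text notes that this too can be done without expensive reductions.

```python
def montgomery_setup(N):
    """For odd N > 1, pick R = 2^w > N and precompute alpha = [-N^(-1) mod R]."""
    assert N > 1 and N % 2 == 1
    w = N.bit_length()
    R = 1 << w
    alpha = (-pow(N, -1, R)) % R
    return w, R, alpha

def mont(x, y, N, w, R, alpha):
    """Return [x*y*R^(-1) mod N] for 0 <= x, y < N using only masks and shifts by R."""
    z = x * y                          # step 1: product over the integers
    m = (z * alpha) & (R - 1)          # [z*alpha mod R] = low w bits
    c = (z + m * N) >> w               # step 2: exact division by R
    return c if c < N else c - N       # step 3: one conditional subtraction

N = 97
w, R, alpha = montgomery_setup(N)
x, y = 5, 7
x_bar, y_bar = (x * R) % N, (y * R) % N        # Montgomery representations
prod_bar = mont(x_bar, y_bar, N, w, R, alpha)  # representation of x*y
product = mont(prod_bar, 1, N, w, R, alpha)    # multiply by R^(-1) to convert back
print(product, (x * y) % N)
```

Converting back is itself a Montgomery multiplication by 1, since Mont(x̄, 1) = x̄ · R⁻¹ = x mod N.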
B.2.5 Choosing a Uniform Group Element
For cryptographic applications, it is often necessary to choose a uniform element of a group G. We first treat the problem in an abstract setting, and then focus specifically on the cases of ZN and Z∗N .
Note that if G is a cyclic group of order q, and a generator g ∈ G is known, then choosing a uniform element h ∈ G reduces to choosing a uniform integer $x \in \mathbb{Z}_q$ and setting $h := g^x$. In what follows we make no assumptions on G.
Elements of a group G must be specified using some representation of these elements as bit-strings, where we assume without any real loss of generality that all elements are represented using strings of the same length. (It is also crucial that there is a unique string representing each group element.) For example, if ∥N∥ = n then elements of ZN can all be represented as strings of length n, where the integer a ∈ ZN is padded to the left with 0s if ∥a∥ < n.
We do not focus much on the issue of representation, since for all the groups considered in this text the representation can simply be taken to be the “natural” one (as in the case of $\mathbb{Z}_N$, above). Note, however, that different representations of the same group can affect the complexity of performing various computations, and so choosing the “right” representation for a given group is often important in practice. Since our goal is only to show polynomial-time algorithms for each of the operations we need (and not to show the most efficient algorithms known), the exact representation used is less important for our purposes. Moreover, most of the “higher-level” algorithms we present use the group operation in a “black-box” manner, so that as long as the group operation can be performed in polynomial time (in some parameter), the resulting algorithm will run in polynomial time as well.
Given a group G where elements are represented by strings of length l, a uniform group element can be selected by choosing uniform l-bit strings until the first string that corresponds to a group element is found. (Note this assumes that testing group membership can be done efficiently.) To obtain an algorithm with bounded running time, we introduce a parameter t bounding the maximum number of times this process is repeated; if all t iterations fail to find an element of G, then the algorithm outputs fail. (An alternative is to output an arbitrary element of G.) That is:
ALGORITHM B.14
Choosing a uniform group element
Input: A (description of a) group G; length-parameter l; parameter t
Output: A uniform element of G
for i = 1 to t:
    Choose uniform $x \in \{0,1\}^l$
    if x ∈ G return x
return “fail”
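Algorithm B.14 can be sketched in Python as below. This is a minimal sketch, assuming that group elements are encoded as integers in the range $[0, 2^l)$ and that the membership test is passed in as a predicate. The names `sample_uniform_element` and `is_member` are illustrative only, and `None` stands in for the output fail.

```python
import secrets
from typing import Callable, Optional

def sample_uniform_element(is_member: Callable[[int], bool],
                           l: int, t: int) -> Optional[int]:
    """Rejection sampling in the style of Algorithm B.14.

    Draws uniform l-bit strings (as integers in [0, 2**l)) until one
    passes the membership test. Returns None (standing in for "fail")
    if all t attempts are rejected.
    """
    for _ in range(t):
        x = secrets.randbits(l)   # uniform over {0,1}^l
        if is_member(x):
            return x
    return None

# Example (assumed toy group): Z_N for N = 11, where membership is x < N.
N = 11
h = sample_uniform_element(lambda x: x < N, l=N.bit_length(), t=40)
```

The membership predicate is the only group-specific part, which mirrors the black-box treatment of the group in the algorithm.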
It is clear that whenever the above algorithm does not output fail, it outputs a uniformly distributed element of G. This is simply because each element of G is equally likely to be chosen in any iteration. Formally, if we let Fail be the event that the algorithm outputs fail, then for any element g ∈ G we have
$$\Pr\left[\text{output of the algorithm equals } g \mid \overline{\mathsf{Fail}}\right] = \frac{1}{|G|}.$$
What is the probability that the algorithm outputs fail? In any iteration the probability that x ∈ G is exactly $|G|/2^l$, and so the probability that x does not lie in G in any of the t iterations is
$$\left(1 - \frac{|G|}{2^l}\right)^t. \tag{B.1}$$
There is a trade-off between the running time of Algorithm B.14 and the probability that the algorithm outputs fail: increasing t decreases the probability of failure but increases the worst-case running time. For cryptographic applications we need an algorithm where the worst-case running time is polynomial in the security parameter n, while the failure probability is negligible in n.
Let $K \stackrel{\text{def}}{=} 2^l/|G|$. If we set $t := K \cdot n$ then the probability that the algorithm outputs fail is:
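$$\left(1 - \frac{1}{K}\right)^{K \cdot n} = \left(\left(1 - \frac{1}{K}\right)^{K}\right)^{n} \le \left(e^{-1}\right)^{n} = e^{-n},$$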
using Proposition A.2. Thus, if K = poly(n) (we assume some group-generation algorithm that depends on the security parameter n, and so both |G| and l are functions of n), we obtain an algorithm with the desired properties.
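The bound can be checked numerically on a small instance. The sketch below uses hypothetical toy parameters (a group of 11 elements encoded in 6 bits, with n = 10), and it rounds $K \cdot n$ up to an integer, which is an assumption made here because the number of iterations must be a whole number.

```python
import math

def failure_probability(group_size: int, l: int, t: int) -> float:
    """Probability that all t iterations miss G: (1 - |G|/2^l)^t, as in (B.1)."""
    return (1 - group_size / 2**l) ** t

# Hypothetical toy parameters: |G| = 11 elements encoded in l = 6 bits.
group_size, l, n = 11, 6, 10
K = 2**l / group_size        # K = 2^l / |G|
t = math.ceil(K * n)         # rounded up, so t >= K * n
p_fail = failure_probability(group_size, l, t)
bound = math.exp(-n)         # the claimed upper bound e^{-n}
```

Rounding t up can only decrease the failure probability, so the bound $e^{-n}$ still applies.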
The case of $\mathbb{Z}_N$. Consider the group $\mathbb{Z}_N$, with n = ∥N∥. Checking whether an n-bit string x (interpreted as a non-negative integer of length at most n) is an element of $\mathbb{Z}_N$ simply requires checking whether x < N. Furthermore,
$$\frac{2^n}{|\mathbb{Z}_N|} = \frac{2^n}{N} \le \frac{2^n}{2^{n-1}} = 2,$$
and so we can sample a uniform element of $\mathbb{Z}_N$ in poly(n) time and with failure probability negligible in n.
The case of $\mathbb{Z}_N^*$. Consider next the group $\mathbb{Z}_N^*$, with n = ∥N∥ as before. Determining whether an n-bit string x is an element of $\mathbb{Z}_N^*$ is also easy (see the exercises). Moreover,
$$\frac{2^n}{|\mathbb{Z}_N^*|} = \frac{2^n}{\phi(N)} = \frac{2^n}{N} \cdot \frac{N}{\phi(N)} \le 2 \cdot \frac{N}{\phi(N)}.$$
A poly(n) upper-bound is a consequence of the following theorem.
THEOREM B.15 For N ≥ 3 of length n, we have $\frac{N}{\phi(N)} < 2n$.
(Stronger bounds are known, but the above suffices for our purpose.) The theorem can be proved using Bertrand’s Postulate (Theorem 8.32), but we content ourselves with a proof in two special cases: when N is prime and when N is a product of two equal-length (distinct) primes.
The analysis is easy when N is an odd prime. Here φ(N) = N − 1 and so
$$\frac{N}{\phi(N)} \le \frac{2^n}{\phi(N)} = \frac{2^n}{N-1} \le \frac{2^n}{2^{n-1}} = 2$$
(using the fact that N is odd for the second inequality). Consider next the case of N = pq for p and q distinct, odd primes. Then
$$\frac{N}{\phi(N)} = \frac{pq}{(p-1)(q-1)} = \frac{p}{p-1} \cdot \frac{q}{q-1} \le \frac{3}{2} \cdot \frac{5}{4} < 2.$$
We conclude that when N is prime or the product of two distinct, odd primes, there is an algorithm for generating a uniform element of $\mathbb{Z}_N^*$ that runs in time polynomial in n = ∥N∥ and outputs fail with probability negligible in n.
Throughout this book, when we speak of sampling a uniform element of $\mathbb{Z}_N$ or $\mathbb{Z}_N^*$ we simply ignore the negligible probability of outputting fail with the understanding that this has no significant effect on the analysis.
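As an illustration of the $\mathbb{Z}_N^*$ case, the sketch below specializes the rejection-sampling approach. It rests on assumptions that go beyond the text: membership is tested by checking 1 ≤ x < N and gcd(x, N) = 1 (the standard test, which the text leaves to the exercises), and the iteration bound is taken to be $t = 4n^2$, which is at least $K \cdot n$ because $K = 2^n/\phi(N) \le 2 \cdot N/\phi(N) < 4n$ by Theorem B.15. The function name `sample_zn_star` is hypothetical.

```python
import math
import secrets
from typing import Optional

def sample_zn_star(N: int) -> Optional[int]:
    """Sample a uniform element of Z_N^* by rejection sampling (N >= 3)."""
    n = N.bit_length()            # n = ||N||
    t = 4 * n * n                 # assumed bound: t >= K*n since K < 4n
    for _ in range(t):
        x = secrets.randbits(n)   # uniform n-bit string
        if 1 <= x < N and math.gcd(x, N) == 1:
            return x
    return None                   # stands in for "fail"

x = sample_zn_star(15)            # toy modulus N = 3 * 5
```

For N = 15 the accepted values are exactly the eight integers below 15 that are coprime to 15.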
B.3 *Finding a Generator of a Cyclic Group
In this section we address the problem of finding a generator of an arbitrary cyclic group G of order q. Here, q does not necessarily denote a prime number; indeed, finding a generator when q is prime is trivial by Corollary 8.55.
We actually show how to sample a uniform generator, proceeding in a manner very similar to that of Section B.2.5. Here, we repeatedly sample uniform elements of G until we find an element that is a generator. As in Section B.2.5, an analysis of this method requires understanding two things:
• How to efficiently test whether a given element is a generator; and
• the fraction of group elements that are generators.
In order to understand these issues, we first develop a bit of additional group-theoretic background.
B.3.1 Group-Theoretic Background
We tackle the second issue first. Recall that the order of an element h is the smallest positive integer i for which $h^i = 1$. Let g be a generator of a group G of order q > 1; this means the order of g is q. Consider an element h ∈ G that is not the identity (the identity cannot be a generator of G), and let us ask whether h might also be a generator of G. Since g generates G, we can write $h = g^x$ for some x ∈ {1, …, q − 1} (note x ≠ 0 since h is not the identity). Consider two cases:
Case 1: gcd(x, q) = r > 1. Write x = α · r and q = β · r with α, β non-zero integers less than q. Then:
$$h^{\beta} = (g^x)^{\beta} = g^{\alpha r \beta} = (g^q)^{\alpha} = 1.$$
So the order of h is at most β < q, and h cannot be a generator of G.
Case 2: gcd(x, q) = 1. Let i ≤ q be the order of h. Then
$$g^0 = 1 = h^i = (g^x)^i = g^{xi},$$
implying xi = 0 mod q by Proposition 8.53. This means that q | xi. Since gcd(x, q) = 1, however, Proposition 8.3 shows that q | i and so i = q. We conclude that h is a generator of G.
Summarizing the above, we see that for x ∈ {1, …, q − 1} the element $h = g^x$ is a generator of G exactly when gcd(x, q) = 1. We have thus proved the following:
THEOREM B.16 Let G be a cyclic group of order q > 1 with generator g. There are φ(q) generators of G, and these are exactly given by $\{g^x \mid x \in \mathbb{Z}_q^*\}$.
In particular, if G is a group of prime order q, then it has φ(q) = q − 1 generators—exactly in agreement with Corollary 8.55.
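Theorem B.16 can be checked by brute force on a tiny group. The sketch below assumes the toy group $\mathbb{Z}_{13}^*$, which is cyclic of order q = 12 with generator g = 2. These parameters and the helper `order` are chosen here for illustration and are not from the text.

```python
from math import gcd

# Toy example (assumed): Z_13^* is cyclic of order q = 12, generated by 2.
p, g = 13, 2
q = p - 1

def order(h: int) -> int:
    """Order of h in Z_p^* by brute force (only sensible for tiny groups)."""
    i, acc = 1, h % p
    while acc != 1:
        acc = (acc * h) % p
        i += 1
    return i

# Generators found directly from the definition ...
by_brute_force = {h for h in range(1, p) if order(h) == q}
# ... and as predicted by the theorem: g^x for x coprime to q.
by_theorem = {pow(g, x, p) for x in range(1, q) if gcd(x, q) == 1}
```

Both sets contain φ(12) = 4 elements and coincide, as the theorem predicts.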
We turn next to the first issue, that of deciding whether a given element h is a generator of G. Of course, one way to check whether h generates G is to enumerate $\{h^0, h^1, \ldots, h^{q-1}\}$ and see whether this list includes every element of G. This requires time linear in q (i.e., exponential in ∥q∥) and is therefore unacceptable for our purposes. Another approach, if we already know a generator g, is to compute the discrete logarithm $x = \log_g h$ and then apply the previous theorem; in general, however, we may not have such a g, and anyway computing the discrete logarithm may itself be a hard problem.
If we know the factorization of q, we can do better.
PROPOSITION B.17 Let G be a group of order q, and let $q = \prod_{i=1}^{k} p_i^{e_i}$ be the prime factorization of q, where the $\{p_i\}$ are distinct primes and $e_i \ge 1$. Set $q_i = q/p_i$. Then h ∈ G is a generator of G if and only if $h^{q_i} \ne 1$ for i = 1, …, k.
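The test in Proposition B.17 can be sketched for the concrete group $\mathbb{Z}_p^*$ with p prime, whose order is q = p − 1. The function name `is_generator` and the toy parameters (p = 13, so q = 12 with prime factors 2 and 3) are assumptions made for illustration. The caller is assumed to supply the distinct prime factors of q.

```python
from typing import Iterable

def is_generator(h: int, p: int, prime_factors: Iterable[int]) -> bool:
    """Test whether h generates Z_p^* (p prime), given the distinct
    prime factors p_i of the group order q = p - 1.

    h is a generator iff h^(q / p_i) != 1 for every prime factor p_i.
    """
    q = p - 1
    return all(pow(h, q // p_i, p) != 1 for p_i in prime_factors)

# Toy example (assumed): p = 13, q = 12 = 2^2 * 3.
generators = {h for h in range(1, 13) if is_generator(h, 13, [2, 3])}
```

Each test costs one modular exponentiation per distinct prime factor of q, in contrast to the enumeration approach, which takes time linear in q.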
PROOF One direction is easy. Say $h^{q_i} = 1$ for some i. Then the order of h is at most $q_i < q$, and so h cannot be a generator.
Conversely, say h is not a generator but instead has order $q' < q$. By Proposition 8.54, we know $q' \mid q$. This implies that $q'$ can be written as $q' = \prod_{i=1}^{k} p_i^{e'_i}$, where $e'_i \ge 0$ and for at least one index j we have $e'_j < e_j$. But then $q'$ divides $q_j = p_j^{e_j - 1} \cdot \prod_{i \ne j} p_i^{e_i}$, and so (using Proposition 8.53) $h^{q_j} =$