Cryptography Part I
Feb 26 Introduction to Cryptography
Instructor Information
gerardo.pelosi@polimi.it
alessandro.barenghi@polimi.it
Introduction to cryptography
cryptography: how to design and implement cryptographic algorithms
cryptanalysis: how to break a cipher (recover key/message), analyzing weak mathematical assumptions or advancements, bad implementations
Modern Cryptography
Kerckhoff’s Principle A cryptosystem should be secure even if everything about the system is publicly known, except a single parameter (a.k.a., the cryptographic key)
Security Models
Unconditional Security (Perfect Secrecy): assumes an adversary with unlimited computing power and proves that she does not have enough information to infer either the cryptographic key or the original message
Computational Security: assumes that any adversary is computationally limited (. . .as all adversaries are in practice). It gives a lower-bound on the computational complexity of the best methods available to break the cipher
Security Properties
Confidentiality[保密性]
Confidentiality provides encrypted information, thus making it unreadable to anyone except for the intended receivers, who are able to reverse the encryption
Authenticity[真实性]
Authenticity indicate that systems use to identify their users. It answers the questions of Who is the user/counterpart? and is the user/counterpart really who she claims herself to be?
When a system includes a communication protocol:
The parties may need to identify themselves (entity authentication)
The parties want to make sure that data is really exchanged between the intended endpoints (i.e., No-one in middle is masquerading as one of them) (data origin authentication)
Data integrity[完整性]
Data integrity guarantees that data has not been tampered with.(Insertion/Replacement/Deletion)
Data Authentication (data origin authentication + data integrity)
data origin authentication (the fact that you know who the sender is)
data integrity (the fact that data has not been modified)
Non-repudiation[不可否认性]
Non-repudiation prevents an entity from denying previous commitments or actions later on.
Basic Terminology and Concepts
Alphabet A: a finite set of symbols
The binary alphabet, A={0, 1}, is the most common choice as any other alphabet can be encoded over it
Message Space M: consists of set of strings over an alphabet
Plain Text: an element of M
Ciphertext Space C: consists of a set of strings over an alphabet (which may differ from the alphabet for M).
iphertext: an element of C
KeySpace K: a set of elements called keys.(a set of keys)
Encryption Transformation: Given an element e∈K, the encryption transformation Ee : M→C, uniquely determines a bijective map from M to C.
Decryption transformation: Given an element d∈K, the decryption transformation Ed : C→M, uniquely determines a bijective map from C to M.
Cryptoscheme: e : e ∈ K}, {Dd : d ∈ K}>
Cryptographic Paradigms
Symmetric (or Secret-Key) Cryptosystem
Every user is bound to a single secret key, with fixed length, i.e. encryption Ee and decryption Dd algorithms use the same key: e = d
There are two main strategies for defining E and D, which lead to
Block ciphers: act on a fixed length plain/ciphertext (e.g., AES, 3DES2, CAST5, Camellia, Gost, BlowFish)
Stream ciphers: act on an arbitrary length plain/ciphertext (e.g., RC4, Trivium, A5/1, A5/2, A5/3)
Asymmetric (or Public-Key) Cryptosystem
Every user is bound to a key pair:
Public Key e ∈ K: is employed to encrypt plaintexts (Ee (m), m ∈ M). Can be known to everybody
Private Key d ∈ K: is employed to decrypt ciphertexts (Dd (c), c ∈ D). Must be known to the recipient only (and rigorously kept secret from everybody else)
The public key needs authentication to avoid identity theft. Which guarantee do we have that the public key is really bound to the intended user?
Digital Signatures and Certification Authorities (CAs)
To provide the Non-Repudiation service we need: Digital Signatures and Certification Authorities (CAs)
Digital Signature
A tool to authenticate the public key. It allows to verify the unambiguous association of a user to her public-key
A digital signature can be obtained applying the decryption transformation(using his/her private key d) to the message m, which should be authenticated: s = Dd (m)
Everyone can check the validity of the signed message
(by the public key e) through checking whether Ee (s) = m or not, as e is a publicly known value 非对称加密的原理就好像有一把锁,每个人都可以用一把钥匙(公钥e)锁上,但是要想开这把锁,必须用一把特殊的钥匙(私钥d)开锁,而数字签名的原理刚好相反,一个用户需要向所有的用户展示自己的身份,最好的办法就是他举起他的钥匙大喊,“看!这唯一的一把私钥在我手上”。而在计算机科学的领域,他的做法就是将一条消息解密(这是只有拥有私钥的人才能办得到的事情!)然后向每个人公布解密后的消息,每个人都可以拿到他解密后的消息s和原来的消息m,通过公钥加密s后可以得到m,就证明了这个私钥用户的身份。
Certification Authority (CA)
An independent third party (that all users trust) certifies the binding of a public-key to the identity of the corresponding user。
The CA digitally signs a document (Digital Certificate) saving in public repositories, containing
The identity of the user
The public-key of the user
Metadata (e.g., expiration date, CA name, etc…)
The assumption is that everybody knows the public keys of the CAs, thus every certificate can be checked through verifying a CA signature
Identity Based Cryptosystem
A Public-Key system where the User Identity is employed to uniquely derive the public key.
Identity: any previously recognized and publicly known piece of information bound to a specified user (e.g., a SSN, an email address, passport No., driving licence No., position in a company)
Public Key: is uniquely derived from the Identity chosen by the user. May be known to everybody
Private Key: is released to each user by a Trusted Authority (TA) who combines the user Identity and a master secret parameter to compute it
Attacks on Cryptoscheme
Basic assumptions: Kerckhoff’s principle
The alphabet A, the structure of the plaintexts (i.e., the form of M) and the details of the encryption/decryption algorithms are known
Brute force attack (Exhaustive key search)
Given a ciphertext, it checks all possible keys until the correct one is found. The attacker must be able to distinguish the correct plaintext from a valid but incorrect one
Categrories
Ciphertext-only attack (COA)
The attacker knows the ciphertext of a number of messages encrypted with same key (he doesn’t known any plaintext)
Known-plaintext attack (KPA)
The attacker knowns plain-ciphertext pairs, encrypted with same key
Chosen-plaintext attack (CPA)
The attacker chooses a number of plaintexts to be encrypted (all with the same unknown key) and obtains the corresponding ciphertexts
Mar 1 Historical Ciphers
Substitution Ciphers
Monoalphabetic Substitution Cipher(eg: Shift Cipher, The Gold Bug Cryptogram):
There is a 1-to-1 correspondence between plaintext (ptx) and ciphertext (ctx) letters (also digrams, trigrams, etc). Thus, assuming the Kerchoff’s principle holds, a Ciphertext-Only attack (COA) easily reveals the key(Based on the statistic of frequency of each letter).
The statistics of the plaintext language (frequency distribution of the symbols of Am in M) is known
The statistics of the ciphertext space, i.e., the frequency distribution of the symbols of Ac in C, can be computed over the available ciphertexts.
The substitution map (i.e., the key) can be inferred easily by matching the symbols occurring with similar frequencies
Most common letters: E,A,T,O,N,I…
Most common bigrams: TH,HE,IN,ER…
Most common trigrams:THE,ING,AND…
Polyalphabetic Cipher(eg: Vigenere Cipher):
The plaintext and ciphertext spaces include finite sequence of letters (words) from the alphabets Am and Ac , respectively.
The encryption transformation is defined as the application of “L > 1 bijective maps”(L is the length of the key, may be reused and reused) between the two alphabets: µ0, µ1,… here each µ means a mapping from Am to Ac.
here it means that one µ is a crypto-system that mapping A to C, B to D … so the keyspace is (|Am|!)^L
The key k = (µ0, µ1, . . . , µL−1) is constituted by the L maps employed to encrypt the ciphertext
Vigenere Cipher
If the plain text is ATTACKATDAWN
Choose the key as LEMON and get the length same with plain text LEMONLEMONLE
Here each letter in the key noted as a sequence, for example,
K1 = L, and the M1 is A, so the C1 is L(the 1st one noted by L).
K2 = E, and the M2 is T, so the C2 is X(the 20st one noted by E).
… …
And get the final cipher text LXFOPVEFRNHR
The cipher is still easy to break with a Ciphertext Only Attack using Kasisky Test
It bases on the fact that two identical segments of 2 ≤ l ≤ L plaintext letters (l-gram), will be encrypted to the same sequence of l ciphertext letters (when properly aligned with the keyword)
Deduction: The distance d between two repeated sequences of l chars in the ciphertext may suggest a multiple of the key length L (find the GCD)
Beale’s Cipher/Book Cipher
Permutation Ciphers/Transposition Cipher
The encryption transformation of this cipher consists of a permutation of the positions of the plaintext letters
a permutation cipher is not equivalent to a substitution map as each plaintext letter may corresponds to multiple ciphertext letters depending on the position
COA may not effective, but KPA/CPA still work
Affine Ciphers
- The Hill Cipher
Perfect Secrecy
A perfectly secret cipher should be unbreakable regardless of the (computational) effort thrown at it This implies that the ciphertext alone provides no information (no clue) to an attacker
A perfectly secure cipher is proven to be resistant to COAs, KPAs, and CCA(2)s.
Definition
A symmetric-key cryptosystem is Perfectly Secure if the ciphertext does not reveal any information about the plaintext
Pr(P = m|C = c) = Pr(P = m), ∀m ∈ M, c ∈ C
LEMMA1: Pr(C = c|P = m) = Pr(C = c), ∀m ∈ M, c ∈ C
LEMMA2: Given a Perfectly Secure symmetric key cryptosystem, the followingconditions hold |K| ≥ |C| ≥ |M|
Theorem (C. Shannon)
(i) every key is used with probability 1/|K|
(ii) ∀ (m, c)∈M×C there is a unique key k∈K s.t. Ek (m)=c
Vernam Cipher/One-Time-Pad (OTP)[一次性密码本]
Modified Shift Cipher
To encrypt plain text [This is an example]
Using the OTP of [MASKL NSFLD FKJPQ]
First transform two strings into numbers
This is an example → 19 7 8 18 8 18 0 13 4 23 0 12 15 11 4
MASKL NSFLD FKJPQ → 12 0 18 10 11 13 18 5 11 3 5 10 9 15 16
Sum them up and get
31 7 26 28 19 31 18 18 15 26 5 22 24 26 20
Mod 26 get
5 7 0 2 19 5 18 18 11 0 5 22 24 0 20
Transform back to English and get
FHACTFSSLAFWYAU
The aformentioned OTP system employed with binary keys and messages is the most effective implementation of a perfectly secure cryptoscheme
This system is perfectly secure (M=C=K={0, . . . , 25}^5; ptxs and keys are independent):
each key is chosen with uniform probability ⇒ Pr(K = k) = 1/26^5
for each ptx-ctx pair
there is a unique key: ki=(ci−mi) mod 26, ∀ i
Computationally secure ciphers
Entropy
Definition
Let X be a random variable (generally means an alphabet) which takes values in {x1, x2, . . . xn} with probability distribution pi=Pr(X = xi), ∀ 1≤i≤n. The Entropy of X is defined as:
H(X) = −Sigma(i from 1 to n) pilog2pi
assuming conventionally that pilog2pi=0, if pi=0.
H(X, Y )≤H(X)+H(Y ), the equality holds if X, Y are independent
H(X, Y )=H(Y )+H(X|Y );
H(X|Y )≤H(X), the equality holds if X, Y are independent
H(P|K, C)=0: if you know the ctx and the key you do not have any uncertainty in deriving the ptx
H(C|P,K)=0: if you know the ptx and the key you do not have any uncertainty in deriving the ctx
H(C, P,K)=H(P,K)+H(C|P,K)=H(P,K)=H(P)+H(K)
H(C, P,K)=H(K, C)+H(P|K, C)=H(K, C)
Key Equivocation
It defines the amount of information (uncertainty) about the key, that you got by the knowledge of a ctx: H(K|C)
H(K|C) = H(K, C) − H(C) = H(P) + H(K) − H(C)
Definition of Language Redundancy
RL = 1 − HL/log2|M|
Unicity Distance
It is the length of ciphertext words (i.e., the number of ctx) n=n0 such that the number of spurious keys is equal to zero,
n0 ≈ log2|K|/(RLlog2|M|)