Cryptography Part I

Cesare (MINGJU LI) Mar 15, 2018

Feb 26 Introduction to Cryptography

Instructor Information

gerardo.pelosi@polimi.it

alessandro.barenghi@polimi.it

Introduction to cryptography

cryptography: how to design and implement cryptographic algorithms

cryptanalysis: how to break a cipher (recover key/message), analyzing weak mathematical assumptions or advancements, bad implementations

Modern Cryptography

Kerckhoff’s Principle A cryptosystem should be secure even if everything about the system is publicly known, except a single parameter (a.k.a., the cryptographic key)

Security Models

Unconditional Security (Perfect Secrecy): assumes an adversary with unlimited computing power and proves that she does not have enough information to infer either the cryptographic key or the original message

Computational Security: assumes that any adversary is computationally limited (. . .as all adversaries are in practice). It gives a lower-bound on the computational complexity of the best methods available to break the cipher

Security Properties

Confidentiality[保密性]

Confidentiality provides encrypted information, thus making it unreadable to anyone except for the intended receivers, who are able to reverse the encryption
Authenticity[真实性]

Authenticity indicate that systems use to identify their users. It answers the questions of Who is the user/counterpart? and is the user/counterpart really who she claims herself to be?
When a system includes a communication protocol:
- The parties may need to identify themselves (entity authentication)
- The parties want to make sure that data is really exchanged between the intended endpoints (i.e., No-one in middle is masquerading as one of them) (data origin authentication)
Data integrity[完整性]

Data integrity guarantees that data has not been tampered with.(Insertion/Replacement/Deletion)
Data Authentication (data origin authentication + data integrity)
- data origin authentication (the fact that you know who the sender is)
- data integrity (the fact that data has not been modified)
Non-repudiation[不可否认性]

Non-repudiation prevents an entity from denying previous commitments or actions later on.

Basic Terminology and Concepts

Alphabet A: a finite set of symbols

The binary alphabet, A={0, 1}, is the most common choice as any other alphabet can be encoded over it
Message Space M: consists of set of strings over an alphabet
Plain Text: an element of M
Ciphertext Space C: consists of a set of strings over an alphabet (which may differ from the alphabet for M).
iphertext: an element of C
KeySpace K: a set of elements called keys.(a set of keys)
Encryption Transformation: Given an element e∈K, the encryption transformation E_e : M→C, uniquely determines a bijective map from M to C.
Decryption transformation: Given an element d∈K, the decryption transformation E_d : C→M, uniquely determines a bijective map from C to M.
Cryptoscheme: e : e ∈ K}, {D_d : d ∈ K}>

Cryptographic Paradigms

Symmetric (or Secret-Key) Cryptosystem

Every user is bound to a single secret key, with fixed length, i.e. encryption E_e and decryption D_d algorithms use the same key: e = d
There are two main strategies for defining E and D, which lead to
- Block ciphers: act on a fixed length plain/ciphertext (e.g., AES, 3DES2, CAST5, Camellia, Gost, BlowFish)
- Stream ciphers: act on an arbitrary length plain/ciphertext (e.g., RC4, Trivium, A5/1, A5/2, A5/3)
Asymmetric (or Public-Key) Cryptosystem

Every user is bound to a key pair:
- Public Key e ∈ K: is employed to encrypt plaintexts (E_e (m), m ∈ M). Can be known to everybody
- Private Key d ∈ K: is employed to decrypt ciphertexts (D_d (c), c ∈ D). Must be known to the recipient only (and rigorously kept secret from everybody else)
  
  The public key needs authentication to avoid identity theft. Which guarantee do we have that the public key is really bound to the intended user?
Digital Signatures and Certification Authorities (CAs)

To provide the Non-Repudiation service we need: Digital Signatures and Certification Authorities (CAs)
- Digital Signature
  
  A tool to authenticate the public key. It allows to verify the unambiguous association of a user to her public-key
  - A digital signature can be obtained applying the decryption transformation(using his/her private key d) to the message m, which should be authenticated: s = D_d (m)
  - Everyone can check the validity of the signed message (by the public key e) through checking whether E_e (s) = m or not, as e is a publicly known value
    
    非对称加密的原理就好像有一把锁，每个人都可以用一把钥匙（公钥e）锁上，但是要想开这把锁，必须用一把特殊的钥匙（私钥d）开锁，而数字签名的原理刚好相反，一个用户需要向所有的用户展示自己的身份，最好的办法就是他举起他的钥匙大喊，“看！这唯一的一把私钥在我手上”。而在计算机科学的领域，他的做法就是将一条消息解密（这是只有拥有私钥的人才能办得到的事情！）然后向每个人公布解密后的消息，每个人都可以拿到他解密后的消息s和原来的消息m，通过公钥加密s后可以得到m，就证明了这个私钥用户的身份。
- Certification Authority (CA)
  
  An independent third party (that all users trust) certifies the binding of a public-key to the identity of the corresponding user。
  
  The CA digitally signs a document (Digital Certificate) saving in public repositories, containing
  - The identity of the user
  - The public-key of the user
  - Metadata (e.g., expiration date, CA name, etc…)
    
    The assumption is that everybody knows the public keys of the CAs, thus every certificate can be checked through verifying a CA signature
- Identity Based Cryptosystem
  
  A Public-Key system where the User Identity is employed to uniquely derive the public key.
  - Identity: any previously recognized and publicly known piece of information bound to a specified user (e.g., a SSN, an email address, passport No., driving licence No., position in a company)
  - Public Key: is uniquely derived from the Identity chosen by the user. May be known to everybody
  - Private Key: is released to each user by a Trusted Authority (TA) who combines the user Identity and a master secret parameter to compute it

Attacks on Cryptoscheme

Basic assumptions: Kerckhoff’s principle

The alphabet A, the structure of the plaintexts (i.e., the form of M) and the details of the encryption/decryption algorithms are known

Brute force attack (Exhaustive key search)

Given a ciphertext, it checks all possible keys until the correct one is found. The attacker must be able to distinguish the correct plaintext from a valid but incorrect one

Categrories

Ciphertext-only attack (COA)

The attacker knows the ciphertext of a number of messages encrypted with same key (he doesn’t known any plaintext)
Known-plaintext attack (KPA)

The attacker knowns plain-ciphertext pairs, encrypted with same key
Chosen-plaintext attack (CPA)

The attacker chooses a number of plaintexts to be encrypted (all with the same unknown key) and obtains the corresponding ciphertexts

Mar 1 Historical Ciphers

Substitution Ciphers

Monoalphabetic Substitution Cipher(eg: Shift Cipher, The Gold Bug Cryptogram):

There is a 1-to-1 correspondence between plaintext (ptx) and ciphertext (ctx) letters (also digrams, trigrams, etc). Thus, assuming the Kerchoff’s principle holds, a Ciphertext-Only attack (COA) easily reveals the key(Based on the statistic of frequency of each letter).

The statistics of the plaintext language (frequency distribution of the symbols of Am in M) is known

The statistics of the ciphertext space, i.e., the frequency distribution of the symbols of Ac in C, can be computed over the available ciphertexts.

The substitution map (i.e., the key) can be inferred easily by matching the symbols occurring with similar frequencies

Most common letters: E,A,T,O,N,I…

Most common bigrams: TH,HE,IN,ER…

Most common trigrams:THE,ING,AND…
Polyalphabetic Cipher(eg: Vigenere Cipher):

The plaintext and ciphertext spaces include finite sequence of letters (words) from the alphabets A_m and A_c , respectively.

The encryption transformation is defined as the application of “L > 1 bijective maps”(L is the length of the key, may be reused and reused) between the two alphabets: µ0, µ1,… here each µ means a mapping from A_m to A_c.

here it means that one µ is a crypto-system that mapping A to C, B to D … so the keyspace is (|A_m|!)^L

The key k = (µ0, µ1, . . . , µL−1) is constituted by the L maps employed to encrypt the ciphertext
- Vigenere Cipher
  
  If the plain text is ATTACKATDAWN
  
  Choose the key as LEMON and get the length same with plain text LEMONLEMONLE
  
  Here each letter in the key noted as a sequence, for example,
  
  K1 = L, and the M1 is A, so the C1 is L(the 1st one noted by L).
  
  K2 = E, and the M2 is T, so the C2 is X(the 20st one noted by E).
  
  … …
  
  And get the final cipher text LXFOPVEFRNHR
  
  The cipher is still easy to break with a Ciphertext Only Attack using Kasisky Test
  
  It bases on the fact that two identical segments of 2 ≤ l ≤ L plaintext letters (l-gram), will be encrypted to the same sequence of l ciphertext letters (when properly aligned with the keyword)
  
  Deduction: The distance d between two repeated sequences of l chars in the ciphertext may suggest a multiple of the key length L (find the GCD)
- Beale’s Cipher/Book Cipher

Permutation Ciphers/Transposition Cipher

The encryption transformation of this cipher consists of a permutation of the positions of the plaintext letters

a permutation cipher is not equivalent to a substitution map as each plaintext letter may corresponds to multiple ciphertext letters depending on the position

COA may not effective, but KPA/CPA still work

Affine Ciphers

The Hill Cipher

Perfect Secrecy

A perfectly secret cipher should be unbreakable regardless of the (computational) effort thrown at it This implies that the ciphertext alone provides no information (no clue) to an attacker

A perfectly secure cipher is proven to be resistant to COAs, KPAs, and CCA(2)s.

Definition

A symmetric-key cryptosystem is Perfectly Secure if the ciphertext does not reveal any information about the plaintext

Pr(P = m|C = c) = Pr(P = m), ∀m ∈ M, c ∈ C

LEMMA1: Pr(C = c|P = m) = Pr(C = c), ∀m ∈ M, c ∈ C

LEMMA2: Given a Perfectly Secure symmetric key cryptosystem, the followingconditions hold |K| ≥ |C| ≥ |M|

Theorem (C. Shannon)

Let k (), k ∈ K}, {D_k (), k ∈ K}> denote a symmetric key cryptosystem where the keys are picked independently of plaintexts values and |K| = |C| = |M|, The cryptosystem is perfectly secure iff

(i) every key is used with probability 1/|K|

(ii) ∀ (m, c)∈M×C there is a unique key k∈K s.t. E_k (m)=c

Vernam Cipher/One-Time-Pad (OTP)[一次性密码本]

Modified Shift Cipher

To encrypt plain text [This is an example]

Using the OTP of [MASKL NSFLD FKJPQ]

First transform two strings into numbers

This is an example → 19 7 8 18 8 18 0 13 4 23 0 12 15 11 4

MASKL NSFLD FKJPQ → 12 0 18 10 11 13 18 5 11 3 5 10 9 15 16

Sum them up and get

31 7 26 28 19 31 18 18 15 26 5 22 24 26 20

Mod 26 get

5 7 0 2 19 5 18 18 11 0 5 22 24 0 20

Transform back to English and get

FHACTFSSLAFWYAU

The aformentioned OTP system employed with binary keys and messages is the most effective implementation of a perfectly secure cryptoscheme

This system is perfectly secure (M=C=K={0, . . . , 25}^5; ptxs and keys are independent):

each key is chosen with uniform probability ⇒ Pr(K = k) = 1/26^5

for each ptx-ctx pair there is a unique key: k_i=(c_i−m_i) mod 26, ∀ i

Computationally secure ciphers

Entropy

Definition

Let X be a random variable (generally means an alphabet) which takes values in {x1, x2, . . . xn} with probability distribution p_i=Pr(X = x_i), ∀ 1≤i≤n. The Entropy of X is defined as:

H(X) = −Sigma(i from 1 to n) p_ilog₂p_i

assuming conventionally that p_ilog₂p_i=0, if p_i=0.

H(X, Y )≤H(X)+H(Y ), the equality holds if X, Y are independent

H(X, Y )=H(Y )+H(X|Y );

H(X|Y )≤H(X), the equality holds if X, Y are independent

H(P|K, C)=0: if you know the ctx and the key you do not have any uncertainty in deriving the ptx

H(C|P,K)=0: if you know the ptx and the key you do not have any uncertainty in deriving the ctx

H(C, P,K)=H(P,K)+H(C|P,K)=H(P,K)=H(P)+H(K)

H(C, P,K)=H(K, C)+H(P|K, C)=H(K, C)

Key Equivocation

It defines the amount of information (uncertainty) about the key, that you got by the knowledge of a ctx: H(K|C)

H(K|C) = H(K, C) − H(C) = H(P) + H(K) − H(C)
Definition of Language Redundancy

R_L = 1 − H_L/log₂|M|
Unicity Distance

It is the length of ciphertext words (i.e., the number of ctx) n=n0 such that the number of spurious keys is equal to zero,

n0 ≈ log₂|K|/(R_Llog₂|M|)