Cryptography Part I

Feb 26 Introduction to Cryptography

Instructor Information

gerardo.pelosi@polimi.it

alessandro.barenghi@polimi.it

Introduction to cryptography

cryptography: how to design and implement cryptographic algorithms

cryptanalysis: how to break a cipher (recover key/message), analyzing weak mathematical assumptions or advancements, bad implementations

Modern Cryptography

Kerckhoff’s Principle A cryptosystem should be secure even if everything about the system is publicly known, except a single parameter (a.k.a., the cryptographic key)

Security Models

Unconditional Security (Perfect Secrecy): assumes an adversary with unlimited computing power and proves that she does not have enough information to infer either the cryptographic key or the original message

Computational Security: assumes that any adversary is computationally limited (. . .as all adversaries are in practice). It gives a lower-bound on the computational complexity of the best methods available to break the cipher

Security Properties

  1. Confidentiality[保密性]

    Confidentiality provides encrypted information, thus making it unreadable to anyone except for the intended receivers, who are able to reverse the encryption

  2. Authenticity[真实性]

    Authenticity indicate that systems use to identify their users. It answers the questions of Who is the user/counterpart? and is the user/counterpart really who she claims herself to be?

    When a system includes a communication protocol:

    • The parties may need to identify themselves (entity authentication)

    • The parties want to make sure that data is really exchanged between the intended endpoints (i.e., No-one in middle is masquerading as one of them) (data origin authentication)

  3. Data integrity[完整性]

    Data integrity guarantees that data has not been tampered with.(Insertion/Replacement/Deletion)

    Data Authentication (data origin authentication + data integrity)

    • data origin authentication (the fact that you know who the sender is)

    • data integrity (the fact that data has not been modified)

  4. Non-repudiation[不可否认性]

    Non-repudiation prevents an entity from denying previous commitments or actions later on.

Basic Terminology and Concepts

  1. Alphabet A: a finite set of symbols

    The binary alphabet, A={0, 1}, is the most common choice as any other alphabet can be encoded over it

  2. Message Space M: consists of set of strings over an alphabet

  3. Plain Text: an element of M

  4. Ciphertext Space C: consists of a set of strings over an alphabet (which may differ from the alphabet for M).

  5. iphertext: an element of C

  6. KeySpace K: a set of elements called keys.(a set of keys)

  7. Encryption Transformation: Given an element e∈K, the encryption transformation Ee : M→C, uniquely determines a bijective map from M to C.

  8. Decryption transformation: Given an element d∈K, the decryption transformation Ed : C→M, uniquely determines a bijective map from C to M.

  9. Cryptoscheme: e : e ∈ K}, {Dd : d ∈ K}>

Cryptographic Paradigms

  1. Symmetric (or Secret-Key) Cryptosystem

    Every user is bound to a single secret key, with fixed length, i.e. encryption Ee and decryption Dd algorithms use the same key: e = d

    There are two main strategies for defining E and D, which lead to

    • Block ciphers: act on a fixed length plain/ciphertext (e.g., AES, 3DES2, CAST5, Camellia, Gost, BlowFish)

    • Stream ciphers: act on an arbitrary length plain/ciphertext (e.g., RC4, Trivium, A5/1, A5/2, A5/3)

  2. Asymmetric (or Public-Key) Cryptosystem

    Every user is bound to a key pair:

    • Public Key e ∈ K: is employed to encrypt plaintexts (Ee (m), m ∈ M). Can be known to everybody

    • Private Key d ∈ K: is employed to decrypt ciphertexts (Dd (c), c ∈ D). Must be known to the recipient only (and rigorously kept secret from everybody else)

      The public key needs authentication to avoid identity theft. Which guarantee do we have that the public key is really bound to the intended user?

  3. Digital Signatures and Certification Authorities (CAs)

    To provide the Non-Repudiation service we need: Digital Signatures and Certification Authorities (CAs)

    • Digital Signature

      A tool to authenticate the public key. It allows to verify the unambiguous association of a user to her public-key

      • A digital signature can be obtained applying the decryption transformation(using his/her private key d) to the message m, which should be authenticated: s = Dd (m)

      • Everyone can check the validity of the signed message (by the public key e) through checking whether Ee (s) = m or not, as e is a publicly known value

        非对称加密的原理就好像有一把锁,每个人都可以用一把钥匙(公钥e)锁上,但是要想开这把锁,必须用一把特殊的钥匙(私钥d)开锁,而数字签名的原理刚好相反,一个用户需要向所有的用户展示自己的身份,最好的办法就是他举起他的钥匙大喊,“看!这唯一的一把私钥在我手上”。而在计算机科学的领域,他的做法就是将一条消息解密(这是只有拥有私钥的人才能办得到的事情!)然后向每个人公布解密后的消息,每个人都可以拿到他解密后的消息s和原来的消息m,通过公钥加密s后可以得到m,就证明了这个私钥用户的身份。

    • Certification Authority (CA)

      An independent third party (that all users trust) certifies the binding of a public-key to the identity of the corresponding user。

      The CA digitally signs a document (Digital Certificate) saving in public repositories, containing

      • The identity of the user

      • The public-key of the user

      • Metadata (e.g., expiration date, CA name, etc…)

        The assumption is that everybody knows the public keys of the CAs, thus every certificate can be checked through verifying a CA signature

    • Identity Based Cryptosystem

      A Public-Key system where the User Identity is employed to uniquely derive the public key.

      • Identity: any previously recognized and publicly known piece of information bound to a specified user (e.g., a SSN, an email address, passport No., driving licence No., position in a company)

      • Public Key: is uniquely derived from the Identity chosen by the user. May be known to everybody

      • Private Key: is released to each user by a Trusted Authority (TA) who combines the user Identity and a master secret parameter to compute it

Attacks on Cryptoscheme

Basic assumptions: Kerckhoff’s principle

The alphabet A, the structure of the plaintexts (i.e., the form of M) and the details of the encryption/decryption algorithms are known

Brute force attack (Exhaustive key search)

Given a ciphertext, it checks all possible keys until the correct one is found. The attacker must be able to distinguish the correct plaintext from a valid but incorrect one

Categrories

  • Ciphertext-only attack (COA)

    The attacker knows the ciphertext of a number of messages encrypted with same key (he doesn’t known any plaintext)

  • Known-plaintext attack (KPA)

    The attacker knowns plain-ciphertext pairs, encrypted with same key

  • Chosen-plaintext attack (CPA)

    The attacker chooses a number of plaintexts to be encrypted (all with the same unknown key) and obtains the corresponding ciphertexts

Mar 1 Historical Ciphers

Substitution Ciphers

  • Monoalphabetic Substitution Cipher(eg: Shift Cipher, The Gold Bug Cryptogram):

    There is a 1-to-1 correspondence between plaintext (ptx) and ciphertext (ctx) letters (also digrams, trigrams, etc). Thus, assuming the Kerchoff’s principle holds, a Ciphertext-Only attack (COA) easily reveals the key(Based on the statistic of frequency of each letter).

    The statistics of the plaintext language (frequency distribution of the symbols of Am in M) is known

    The statistics of the ciphertext space, i.e., the frequency distribution of the symbols of Ac in C, can be computed over the available ciphertexts.

    The substitution map (i.e., the key) can be inferred easily by matching the symbols occurring with similar frequencies

    Most common letters: E,A,T,O,N,I…

    Most common bigrams: TH,HE,IN,ER…

    Most common trigrams:THE,ING,AND…

  • Polyalphabetic Cipher(eg: Vigenere Cipher):

    The plaintext and ciphertext spaces include finite sequence of letters (words) from the alphabets Am and Ac , respectively.

    The encryption transformation is defined as the application of “L > 1 bijective maps”(L is the length of the key, may be reused and reused) between the two alphabets: µ0, µ1,… here each µ means a mapping from Am to Ac.

    here it means that one µ is a crypto-system that mapping A to C, B to D … so the keyspace is (|Am|!)^L

    The key k = (µ0, µ1, . . . , µL−1) is constituted by the L maps employed to encrypt the ciphertext

    • Vigenere Cipher

      If the plain text is ATTACKATDAWN

      Choose the key as LEMON and get the length same with plain text LEMONLEMONLE

      Here each letter in the key noted as a sequence, for example,

      K1 = L, and the M1 is A, so the C1 is L(the 1st one noted by L).

      K2 = E, and the M2 is T, so the C2 is X(the 20st one noted by E).

      … …

      And get the final cipher text LXFOPVEFRNHR

      The cipher is still easy to break with a Ciphertext Only Attack using Kasisky Test

      It bases on the fact that two identical segments of 2 ≤ l ≤ L plaintext letters (l-gram), will be encrypted to the same sequence of l ciphertext letters (when properly aligned with the keyword)

      Deduction: The distance d between two repeated sequences of l chars in the ciphertext may suggest a multiple of the key length L (find the GCD)

    • Beale’s Cipher/Book Cipher

Permutation Ciphers/Transposition Cipher

The encryption transformation of this cipher consists of a permutation of the positions of the plaintext letters

a permutation cipher is not equivalent to a substitution map as each plaintext letter may corresponds to multiple ciphertext letters depending on the position

COA may not effective, but KPA/CPA still work

Affine Ciphers

  • The Hill Cipher

Perfect Secrecy

A perfectly secret cipher should be unbreakable regardless of the (computational) effort thrown at it This implies that the ciphertext alone provides no information (no clue) to an attacker

A perfectly secure cipher is proven to be resistant to COAs, KPAs, and CCA(2)s.

Definition

A symmetric-key cryptosystem is Perfectly Secure if the ciphertext does not reveal any information about the plaintext

Pr(P = m|C = c) = Pr(P = m), ∀m ∈ M, c ∈ C

LEMMA1: Pr(C = c|P = m) = Pr(C = c), ∀m ∈ M, c ∈ C

LEMMA2: Given a Perfectly Secure symmetric key cryptosystem, the followingconditions hold |K| ≥ |C| ≥ |M|

Theorem (C. Shannon)

Let k (), k ∈ K}, {Dk (), k ∈ K}> denote a symmetric key cryptosystem where the keys are picked independently of plaintexts values and |K| = |C| = |M|, The cryptosystem is perfectly secure iff

(i) every key is used with probability 1/|K|

(ii) ∀ (m, c)∈M×C there is a unique key k∈K s.t. Ek (m)=c

Vernam Cipher/One-Time-Pad (OTP)[一次性密码本]

  • Modified Shift Cipher

    To encrypt plain text [This is an example]

    Using the OTP of [MASKL NSFLD FKJPQ]

    First transform two strings into numbers

    This is an example → 19 7 8 18 8 18 0 13 4 23 0 12 15 11 4

    MASKL NSFLD FKJPQ → 12 0 18 10 11 13 18 5 11 3 5 10 9 15 16

    Sum them up and get

    31 7 26 28 19 31 18 18 15 26 5 22 24 26 20

    Mod 26 get

    5 7 0 2 19 5 18 18 11 0 5 22 24 0 20

    Transform back to English and get

    FHACTFSSLAFWYAU

    The aformentioned OTP system employed with binary keys and messages is the most effective implementation of a perfectly secure cryptoscheme

    This system is perfectly secure (M=C=K={0, . . . , 25}^5; ptxs and keys are independent):

    each key is chosen with uniform probability ⇒ Pr(K = k) = 1/26^5

    for each ptx-ctx pair there is a unique key: ki=(ci−mi) mod 26, ∀ i

Computationally secure ciphers

Entropy

Definition

Let X be a random variable (generally means an alphabet) which takes values in {x1, x2, . . . xn} with probability distribution pi=Pr(X = xi), ∀ 1≤i≤n. The Entropy of X is defined as:

H(X) = −Sigma(i from 1 to n) pilog2pi

assuming conventionally that pilog2pi=0, if pi=0.

H(X, Y )≤H(X)+H(Y ), the equality holds if X, Y are independent

H(X, Y )=H(Y )+H(X|Y );

H(X|Y )≤H(X), the equality holds if X, Y are independent

H(P|K, C)=0: if you know the ctx and the key you do not have any uncertainty in deriving the ptx

H(C|P,K)=0: if you know the ptx and the key you do not have any uncertainty in deriving the ctx

H(C, P,K)=H(P,K)+H(C|P,K)=H(P,K)=H(P)+H(K)

H(C, P,K)=H(K, C)+H(P|K, C)=H(K, C)

  • Key Equivocation

    It defines the amount of information (uncertainty) about the key, that you got by the knowledge of a ctx: H(K|C)

    H(K|C) = H(K, C) − H(C) = H(P) + H(K) − H(C)

  • Definition of Language Redundancy

    RL = 1 − HL/log2|M|

  • Unicity Distance

    It is the length of ciphertext words (i.e., the number of ctx) n=n0 such that the number of spurious keys is equal to zero,

    n0 ≈ log2|K|/(RLlog2|M|)