**Summary**

- Shannon's First Theorem
- Huffman Coding
- Lempel-Ziv Coding

**Shannon's First Theorem** states **the compression limit** that a code can reach without introducing distortion (**distortion-less coding**).

The **minimum average number of binary symbols** needed to represent an arbitrary source symbol (the average length of the codewords) cannot be smaller than the **source entropy**.

Let S be a DMS with source alphabet A={a_{1},a_{2},…,a_{N}}

According to the **Law of Large Numbers**, for a symbol sequence of length **n** large enough (n→∞), the symbol a_{i} occurs n_{i} ≈ p_{i}n times, where p_{i} = Pr(a_{i}).

**Definition**

- A **typical sequence** is a sequence satisfying the **Law of Large Numbers**, namely n_{i} = p_{i}n for n→∞.

All typical sequences have the same probability, equal to 2^{-nH(X)}.

We will need only **N_{T} = 2^{nH(X)}** codewords, namely, if we use binary codes, nH(X) binary symbols for any sequence of n source symbols.

**Resource saving!!!**

**The saving gets larger and larger as H(X) gets smaller and smaller than log N!**
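As a numerical sketch (the probabilities are assumed examples, not from the original slides), for a binary DMS one can check directly that every typical sequence has probability 2^{-nH(X)} and compare the nH(X) bits needed for the typical sequences with the n = n log N bits of the uncoded representation:

```python
# Assumed example: binary DMS with Pr(0)=0.9, Pr(1)=0.1.
from math import log2

p = [0.9, 0.1]                       # assumed symbol probabilities
H = -sum(pi * log2(pi) for pi in p)  # source entropy, bits/symbol

n = 100
# a typical sequence has n_i = p_i * n occurrences of each symbol,
# hence probability prod p_i^(p_i n) = 2^{-nH(X)}
p_typical = p[0] ** (p[0] * n) * p[1] ** (p[1] * n)
assert abs(p_typical - 2 ** (-n * H)) < 1e-12

print(f"H(X) = {H:.3f} bits/symbol")
print(f"codeword bits: nH(X) = {n*H:.1f}  vs uncoded: n log N = {n}")
```

Here H(X) ≈ 0.469 bits/symbol, so about 47 bits suffice instead of 100.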

**What happens if the source output is a non-typical sequence?**

No codeword has been assigned!

An error occurs, namely a distortion!

For n→∞ such an event occurs with vanishing probability.

Such a coding is asymptotically (**n→∞**) distortion-less.

**Statement**

A memoryless source with entropy rate H(X) can be coded with arbitrarily small distortion for **n→∞** with a **coding rate** R (bits/source symbol) such that:

*R ≥ H(X)*

On the contrary, if **R < H(X)**, the distortion cannot be made asymptotically null.

Shannon's First Theorem is an existence theorem: it just assures us that if we need a code with coding rate *R ≥ H(X)*, such a code exists; it does not tell us how to construct it.

**How to single out a code ?**

Huffman Algorithm

**Assumption**: we know the probabilities p_{i} of the DMS symbols a_{i} (**entropic coding**).

The Huffman code is a fixed-length/variable-length (block-to-variable-length) code.

The codewords have different lengths:

- The more probable source symbols are mapped to the shorter codewords.
- The less probable source symbols are mapped to the longer codewords.

**Procedure**

1. Sort the source outputs in decreasing order of their probabilities.
2. Merge the two least probable source outputs into a single output whose probability is the sum of the corresponding probabilities.
3. If the number of remaining outputs is 2, go to the next step; otherwise go to step 1.
4. Arbitrarily assign "0" and "1" as codewords for the two remaining outputs.
5. If an output is the result of the merger of two outputs in a preceding step, append a 0 and a 1 to its current codeword to obtain the codewords for the two preceding outputs, and repeat.
6. If no output is preceded by another output in a preceding step, stop.
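The merging procedure above can be sketched in Python (an assumed illustrative implementation, not the one from the slides), using a heap to pick the two least probable outputs at each step:

```python
import heapq

def huffman_code(probs):
    """probs: dict symbol -> probability; returns dict symbol -> codeword."""
    # heap entries: (probability, tiebreaker, list of merged symbols)
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)  # least probable output
        p2, i, syms2 = heapq.heappop(heap)  # second least probable output
        # prepend one more bit to every codeword in each merged group
        for s in syms1:
            code[s] = "0" + code[s]
        for s in syms2:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p1 + p2, i, syms1 + syms2))
    return code

# assumed example probabilities (dyadic, so the code is exactly optimal)
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
avg = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(f"average length = {avg} bits")  # equals H(X) = 1.75 bits here
```

Note how the more probable symbol "a" receives the shortest codeword, as stated above.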

**Example**: consider a DMS with 3 equiprobable symbols. Taking blocks of n = 2 symbols, we have 3^{2} = 9 extended source symbols, each with probability p_{i} = 1/9.

Building a Huffman code on this extended source, the average codeword length per source symbol approaches the entropy H(X) = log_{2} 3.
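The slide's numerical conclusion is truncated; as a hedged reconstruction, the average length for the 9 equiprobable block symbols can be computed directly, since the optimal (Huffman) code here assigns 7 codewords of length 3 and 2 of length 4 (the Kraft sum equals 1, so the code is complete):

```python
from math import log2

# assumed optimal codeword lengths for 9 equiprobable symbols
lengths = [3] * 7 + [4] * 2
assert sum(2 ** -l for l in lengths) == 1.0  # Kraft inequality with equality

avg_block = sum(l / 9 for l in lengths)  # bits per block of n = 2 symbols
avg_symbol = avg_block / 2               # bits per source symbol
print(f"{avg_block:.3f} bits/block, {avg_symbol:.3f} bits/symbol")
print(f"entropy log2(3) = {log2(3):.3f} bits/symbol")
```

This gives 29/9 ≈ 3.222 bits per block, i.e. about 1.611 bits per source symbol, already close to the entropy log₂ 3 ≈ 1.585.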

The Huffman coding requires the knowledge of the symbol probabilities.

The Lempel-Ziv coding does not require such knowledge.

The procedure is self-learning: it is a **universal coding**.

The Lempel-Ziv coding is a variable-length/fixed-length code.

It can be shown that, for a stationary and ergodic source, the number of bits required for **n→∞** is equal to nH(X).

**nH(X) < n because H(X) < 1** (for a binary source, H(X) ≤ 1 bit/symbol, with equality only for equiprobable symbols).
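As an illustration (the slides do not fix a specific variant), an LZ78-style parse builds the dictionary on the fly and emits, for each phrase, the fixed-length index of its longest known prefix plus one new character; this is the sense in which it maps variable-length input phrases to fixed-length output words:

```python
def lz78_parse(text):
    """Split text into phrases = (index of longest known prefix, next char)."""
    dictionary = {"": 0}  # phrase -> index; learned as the input is read
    phrases = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch  # keep extending the longest known match
        else:
            phrases.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)  # learn new phrase
            current = ""
    if current:  # flush the final, possibly already-known, phrase
        phrases.append((dictionary[current[:-1]], current[-1]))
    return phrases

print(lz78_parse("abababab"))
# → [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a'), (0, 'b')]
```

Note that no symbol probabilities are needed anywhere: the dictionary itself adapts to the source statistics.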

**Problems**

- How many phrases can be inserted in the dictionary? How is it possible to avoid overflow?
- By removing from the dictionary the phrases which did not occur in the recent past.

**It is widely used to compress computer files, e.g., in routines under UNIX and in archivers such as ZIP, ZOO, LZH, ARJ, etc.**

**Compression ratio**:

- it depends on the characteristics of the source.
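This dependence on the source can be observed with Python's standard `zlib` module (DEFLATE combines LZ-style parsing with Huffman coding); the two inputs below are assumed examples of a highly redundant source and an incompressible one:

```python
import os
import zlib

redundant = b"abababab" * 1000  # highly predictable source, low entropy
random_like = os.urandom(8000)  # incompressible source, maximum entropy

for name, data in [("redundant", redundant), ("random", random_like)]:
    ratio = len(data) / len(zlib.compress(data))
    print(f"{name}: compression ratio = {ratio:.2f}")
```

The redundant input compresses by orders of magnitude, while the random input does not compress at all, exactly as the entropy bound predicts.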


"Campus Virtuale" project of the Università degli Studi di Napoli Federico II, realized with co-funding from the European Union. Axis V - Information Society - Operational Objective 5.1 e-Government and e-Inclusion.