
Introduction to Data Compression: Lossy and Lossless, Modeling and Coding, Lecture notes of Data Compression

An overview of data compression, focusing on lossy and lossless techniques. It discusses the importance of data compression in various applications, such as communications and multimedia. The document also introduces the concepts of modeling and coding, including markov models and composite source models. It is a valuable resource for students and professionals in computer science, engineering, and related fields.




1. A
2. A
3. B
4. D
5. C
6. D
7. B
8. A
9. C
10. B

11. Lossy and Lossless

12. Compact

13. Integrity

14. Information

15. Shorter

16. WinZip

17. JPEG

18. $H = -\sum P(A) \log P(A)$

19. $B_0/B_1$, where $B_0$ and $B_1$ are the number of bits before and after compression

20. Repeated, 1

21. 4 bits

23. Suppose we have an event A, which is a set of outcomes of some random experiment. If P(A) is the probability that the event A will occur, then the self-information associated with A is given by $i(A) = -\log P(A)$.


24. The difference between the original and the reconstruction is often called the distortion.
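To make items 18, 19, and 23 above concrete, here is a minimal Python sketch of self-information, first-order entropy, and the compression ratio. The probabilities and bit counts used are invented illustrative values, not taken from the notes.

```python
import math

def self_information(p):
    """Self-information of an event with probability p, in bits: i(A) = -log2 P(A)."""
    return -math.log2(p)

def entropy(probs):
    """First-order entropy H = -sum P(a) * log2 P(a), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def compression_ratio(b0, b1):
    """Compression ratio B0 / B1, where B0 and B1 are the number of bits
    before and after compression."""
    return b0 / b1

# Illustrative values (not from the notes):
print(self_information(0.5))            # 1.0 bit: a 50/50 event carries 1 bit
print(entropy([0.5, 0.25, 0.25]))       # 1.5 bits per symbol
print(compression_ratio(65536, 16384))  # 4.0, i.e. 4:1 compression
```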

  1. In copy
  2. 1. Physical Models: If we know something about the physics of the data generation process, we can use that information to construct a model.

For Ex. In speech- related applications, knowledge about the physics of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can be encoded using this model.

Real life Application: Residential electrical meter readings

  2. Probability Models: The simplest statistical model for the source is to assume that each letter generated by the source is independent of every other letter, and that each occurs with the same probability. We could call this the ignorance model, as it would generally be useful only when we know nothing about the source. The next step up in complexity is to keep the independence assumption but remove the equal probability assumption and assign a probability of occurrence to each letter in the alphabet.

For a source that generates letters from an alphabet $\mathcal{A} = \{a_1, a_2, \ldots, a_M\}$ we can have a probability model $P = \{P(a_1), P(a_2), \ldots, P(a_M)\}$.
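As a small illustration of the idea above, the sketch below estimates a probability model $P(a_i)$ from the relative letter frequencies of a source text and computes the resulting entropy. The sample string is an arbitrary stand-in for a real source.

```python
from collections import Counter
import math

def probability_model(text):
    """Estimate P(a_i) for each letter from its relative frequency."""
    counts = Counter(text)
    total = len(text)
    return {letter: n / total for letter, n in counts.items()}

def entropy(model):
    """First-order entropy of the model, in bits per letter."""
    return -sum(p * math.log2(p) for p in model.values())

text = "this is an arbitrary sample source text"   # stand-in source
model = probability_model(text)
print(model['t'], model[' '])
print(f"entropy: {entropy(model):.3f} bits/letter")
```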

  1. In copy
  2. Lossless and lossy compression
  3. Data compression squeezes data so it requires less disk space for storage and less bandwidth on a data transmission channel. Communications equipment like modems, bridges, and routers use compression schemes to improve throughput over standard phone lines or leased lines. Compression is also used to compress voice telephone calls transmitted over leased lines so that more calls can be placed on those lines. In addition, compression is essential for videoconferencing applications that run over data networks.

Most compression schemes take advantage of the fact that data contains a lot of repetition. For example, alphanumeric characters are normally
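To make the repetition point concrete, here is a toy run-length encoder (not a scheme from the notes): each run of a repeated symbol is replaced by a (symbol, count) pair, so highly repetitive data shrinks.

```python
def run_length_encode(data):
    """Replace each run of identical symbols with a (symbol, run_length) pair."""
    if not data:
        return []
    runs = []
    current, count = data[0], 1
    for symbol in data[1:]:
        if symbol == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = symbol, 1
    runs.append((current, count))
    return runs

print(run_length_encode("AAAAABBBCCCCCCCCD"))
# [('A', 5), ('B', 3), ('C', 8), ('D', 1)]
```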

Modeling and Coding

The development of data compression algorithms for a variety of data can be divided into two phases.

- Modeling: In this phase we try to extract information about any redundancy that exists in the data and describe the redundancy in the form of a model.

- Coding: A description of the model and a "description" of how the data differ from the model are encoded, generally using a binary alphabet.

The difference between the data and the model is often referred to as the residual.
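A hedged sketch of the two-phase idea: model the data with a simple straight-line predictor and encode only the residuals (data minus model). The sequence and the linear model below are invented for illustration; the point is that the small residuals need far fewer bits than the raw values.

```python
# Illustrative data, not from the notes: values that grow roughly linearly.
data = [9, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]

# Modeling phase: assume a simple model x_hat[n] = n + 8
# (an assumption chosen by eye for this toy sequence).
model = [n + 8 for n in range(1, len(data) + 1)]

# Coding phase: encode the model description plus the residuals,
# i.e. how the data differ from the model.
residuals = [x - x_hat for x, x_hat in zip(data, model)]
print(residuals)   # [0, 1, 0, 2, 0, 1, 2, 0, 0, 2, 2]
```

The residuals take only the three values {0, 1, 2}, so they can be coded with about 2 bits each, whereas the raw values (up to 21) would need 5 bits each.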

3. Markov Models: Markov models are particularly useful in text compression, where the probability of the next letter is heavily influenced by the preceding letters. In current text compression, the $k^{th}$-order Markov models are more widely known as finite context models, with the word context being used for what we have earlier defined as state. Consider the word 'preceding'. Suppose we have already processed 'precedin' and we are going to encode the next letter. If we take no account of the context and treat each letter as a surprise, the probability of the letter 'g' occurring is relatively low. If we use a first-order Markov model or single-letter context, we can see that the probability of 'g' would increase substantially. As we increase the context size (go from n to in to din and so on), the probability of the alphabet becomes more and more skewed, which results in lower entropy.
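A rough sketch of a first-order (single-letter) context model, under the assumption that counting letter pairs in a small sample text is an acceptable stand-in for the probabilities discussed above: the distribution of the next letter given one letter of context is more skewed than the unconditioned distribution, so its entropy is usually lower.

```python
from collections import Counter, defaultdict
import math

def entropy(counts):
    """Entropy in bits of an empirical distribution given as symbol counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

text = "preceding letters heavily influence the preceding probabilities"  # toy sample

# Order-0 model: a single distribution over all letters.
order0 = Counter(text)

# Order-1 (single-letter context) model: one distribution per preceding letter.
order1 = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    order1[prev][nxt] += 1

h0 = entropy(order0)
# Average the per-context entropies, weighting each context by how often it occurs.
total_pairs = len(text) - 1
h1 = sum(sum(c.values()) / total_pairs * entropy(c) for c in order1.values())

print(f"order-0 entropy: {h0:.3f} bits/letter")
print(f"order-1 entropy: {h1:.3f} bits/letter (usually lower: context skews the probabilities)")
```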

4. Composite Source Model: In many applications it is not easy to use a single model to describe the source. In such cases, we can define a composite source, which can be viewed as a combination or composition of several sources, with only one source being active at any given time. A composite source can be represented as a number of individual sources $S_i$ , each with its own model $M_i$ and a switch that selects a source $S_i$ with probability $P_i$. This is an exceptionally rich model and can be used to describe some very complicated processes.

Figure 1.1 Composite Source Model
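A minimal sketch of a composite source as described above: a switch picks one of the individual sources $S_i$ with probability $P_i$, and only that source emits the next symbol. The component sources and probabilities here are invented for illustration.

```python
import random

# Each "source" S_i is a function that emits one symbol according to its
# own model M_i (invented toy models).
def text_source():
    return random.choice("abcdefgh ")

def silence_source():
    return "."

sources = [text_source, silence_source]
switch_probs = [0.8, 0.2]   # P_i: probability that the switch selects source S_i

def composite_source(n):
    """Emit n symbols; at each step only one source is active."""
    out = []
    for _ in range(n):
        src = random.choices(sources, weights=switch_probs)[0]
        out.append(src())
    return "".join(out)

print(composite_source(40))
```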

  1. In copy