One of my brothers recorded his birthday celebration on his phone and later shared some pictures and videos with me through Facebook. Strangely, the video became dramatically smaller after I downloaded it to my phone. He was astonished, and I think everyone runs into this situation frequently. So the question arises: how can downloaded photos and videos be so small while still looking impressive?
This is the work of a technology called data compression. We keep a huge number of files, images, audio, and video saved in the memory and storage of our devices, and that is only practical because the data is compressed.
A brief introduction to data compression will make things clearer, and you may come to appreciate how much this technology has made our digital lives more convenient.
Data compression is a term used very frequently in computer science. It is the process of encoding, modifying, or reorganizing data so that it uses as little storage as possible. The data is re-encoded so that it takes up less space than the raw file. This is done by a program that applies an algorithm to work out how the size of the data can be reduced efficiently.
Image compression is a good example. An uncompressed image can easily require 10-20 megabytes of storage, whereas the compressed version of the same image takes only a fraction of that space. Data compression is also known as compaction. It plays a crucial role both in saving data to computer storage and in sharing data over communication networks.
We always want our devices to stay fast and to keep as much storage space free as possible. Data compression helps with both requirements: by reducing the size of files such as images, audio, videos, and text documents, it frees up storage and reduces the amount of data the device has to move around.
Moreover, compression improves your internet browsing experience, because smaller audio and video files take less time to load and play.
Data compression is crucial because compressed data takes much less time to transmit over any communication network. Just as importantly, compressed data occupies far less storage. As a result, users save money by not having to buy extra storage, and devices can be used more productively.
However, compression has a drawback: compressing and decompressing data consumes computing resources such as processor time and memory.
Compressing data is a process in which a program works behind the scenes, following a structured procedure to find how the size of the data can be reduced.
For text compression, the program detects redundant characters and repeated patterns and stores them in a shorter form. A text file can often be reduced to about half of its original size, or at least to something smaller than the raw file. For an image, an algorithm scans the picture for pixels that share the same color and rebuilds those repeated strings of data into a shorter representation that still holds the original information of the image.
A computer understands only 0s and 1s, so every character that is going to be compressed must be encoded as bits (binary digits). For example, if a word uses only four distinct characters and all of them are equally probable, each character can be encoded in two bits.
Any type of data can be compressed, such as text documents, audio, and video. There are two different types of compression:
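The two-bits-per-character case can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the word `abcdabcd` and the helper names `fixed_length_code` and `encode` are made up for the example, and real compressors use more sophisticated codes.

```python
def fixed_length_code(text):
    """Give every distinct character a binary code of the same length."""
    alphabet = sorted(set(text))
    # Number of bits needed to tell the distinct characters apart (at least 1).
    width = max(1, (len(alphabet) - 1).bit_length())
    return {ch: format(i, f"0{width}b") for i, ch in enumerate(alphabet)}

def encode(text, code):
    """Replace each character by its binary code."""
    return "".join(code[ch] for ch in text)

word = "abcdabcd"  # hypothetical input with four distinct characters
code = fixed_length_code(word)
bits = encode(word, code)

print(code)  # {'a': '00', 'b': '01', 'c': '10', 'd': '11'}
print(len(bits), "bits instead of", 8 * len(word))  # 16 bits instead of 64
```

With four distinct characters, two bits are enough to tell them apart, so the eight-character word fits in 16 bits rather than the 64 bits it would take at one byte per character.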
- Lossy Compression
- Lossless Compression
Lossless compression focuses on the quality rather than the size of the file. It does not remove any information from the original file; it only encodes the redundant data more compactly, so the file can be restored exactly to its original form.
This compression is necessary for text documents and spreadsheets, where no character can be removed without corrupting the information. ALAC and FLAC audio codecs, ZIP archives, and PNG images are some examples of lossless compression. Here are some lossless data compression methods:
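The claim that a losslessly compressed file can be brought back exactly can be checked with Python's standard `zlib` module. This is a minimal sketch; the repetitive input text is an assumption chosen to make the effect obvious.

```python
import zlib

original = b"the same phrase, " * 200  # hypothetical, highly repetitive input
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "->", len(compressed), "bytes")
print(restored == original)  # True: every byte comes back unchanged
```

The compressed bytes are much shorter than the input, yet decompressing them reproduces the input byte for byte, which is what makes the method lossless.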
- Run Length Encoding (RLE)
- Lempel-Ziv-Welch (LZW)
- Huffman Coding
- Arithmetic Coding
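Of the methods listed above, run length encoding is the simplest to illustrate. The sketch below is one possible way to write it in Python, with hypothetical helper names `rle_encode` and `rle_decode`; real formats store the runs in a more compact binary layout.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand (character, count) pairs back into the original text."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

Run length encoding only pays off when the data contains long runs of the same value, such as large areas of a single color in an image.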
Lossy compression is the type of compression where the system permanently removes data from the original file to make it as small as possible. The system scans the file and finds details that are not very noticeable and can be discarded. Lossy compression is applicable to images, audio, and video. JPEG images and MP3 audio files are typical examples of lossy compression. In a highly compressed picture, the quality will not be very satisfying because details and clarity may be lost.
Similarly, highly compressed audio may not sound as clear and accurate as the original version, because some details, such as parts of the bass line, have been altered by the compression. However, not all JPEG images and compressed audio files are bad in quality. It depends on the level of compression applied to a particular file: the less compression is applied, the less the quality suffers. Here are some lossy data compression methods:
- Transform coding
- Discrete Cosine Transform (DCT)
- Discrete Wavelet Transform (DWT)
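To give a feel for how transform coding loses information, here is a small pure-Python sketch of a one-dimensional DCT. It is a simplification under stated assumptions: the sample values are made up, the helper names are hypothetical, and small coefficients are simply set to zero, whereas real codecs such as JPEG work on two-dimensional blocks and quantize the coefficients instead.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        out.append(scale * total)
    return out

def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def lossy_roundtrip(signal, keep):
    """Keep only the `keep` largest-magnitude coefficients, then reconstruct."""
    coeffs = dct(signal)
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    truncated = [c if k in kept else 0.0 for k, c in enumerate(coeffs)]
    return idct(truncated)

samples = [10, 11, 12, 12, 13, 15, 16, 18]  # hypothetical smooth samples
approx = lossy_roundtrip(samples, keep=3)
print([round(v, 1) for v in approx])
```

Only three of the eight coefficients are stored, so the reconstruction is close to the original samples but not identical. Keeping more coefficients lowers the error at the cost of a larger file.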
The table below summarizes the differences between lossless and lossy compression.
|Comparison Index|Lossless Compression|Lossy Compression|
|---|---|---|
|Data Elimination|No data is removed from the file; redundant data is only encoded more compactly.|Data that is not noticeable or has little impact on the file is eliminated.|
|Restoration|The file can be restored to the original version.|The discarded data cannot be recovered, so the original cannot be restored exactly.|
|Quality|Data quality is not compromised and remains high.|The original quality is not maintained, in order to make the file very small.|
|Size|To preserve quality, file size is rarely reduced below about half of the original.|File sizes are kept very small so that they take minimum storage on the device.|
|Algorithm|Lempel-Ziv-Welch (LZW), Huffman Coding, Arithmetic Coding|Transform coding, Discrete Cosine Transform (DCT), and Discrete Wavelet Transform (DWT) are the most common algorithms.|
|Usage|Lossless compression is applied to text, images, sound, etc.|Lossy compression is used to reduce the size of images, audio, video, etc.|
|Data Savings|Lossless compression saves less storage space.|Lossy compression saves more storage space than the lossless technique.|
BONUS INFO: Lossless is also known as reversible compression whereas lossy is known as irreversible compression.
Frequently Asked Questions (FAQs)
- What are the data compression formats?
The most common data compression formats include GIF, ZIP, JPEG, MP3, WavPack, lossless JPEG, PDF, and Apple Lossless Audio Codec (ALAC), along with various other audio and video file formats.
- What is used for data compression?
DCT is the most popular lossy compression method. It is widely used in multimedia formats for photos, videos, and audio.
- What is a good compression ratio for data?
Highly compressible data, meaning data with a lot of redundancy, can be reduced in size by up to 90%. Data with little redundancy compresses far less.
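How much space is saved can be measured directly. The sketch below uses Python's standard `zlib` module; the helper name `space_savings` and both inputs are assumptions made for the example, and the exact percentages will vary with the data and the compressor.

```python
import os
import zlib

def space_savings(original_size, compressed_size):
    """Percentage of the original size removed by compression."""
    return 100 * (1 - compressed_size / original_size)

repetitive = b"ABCD" * 5000      # hypothetical, highly redundant input
random_like = os.urandom(20000)  # random bytes have almost no redundancy

for label, data in [("repetitive", repetitive), ("random", random_like)]:
    packed = zlib.compress(data)
    print(label, f"{space_savings(len(data), len(packed)):.1f}% saved")
```

The repetitive input shrinks by well over 90%, while the random input barely changes and may even grow slightly, because there is no redundancy for the compressor to exploit.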
- Should I use data compression?
Since compressed files occupy very little storage space, it is generally better to use data compression. Smaller data, lower bandwidth use, and shorter transmission times are the main advantages of data compression.
- What is the limitation of data compression?
According to Shannon, lossless data compression has a fundamental limit called the entropy rate, denoted by H. The value of H depends on the statistical nature of the data.
Reducing data sizes with the help of data compression has made our hardware more useful, because we can now store far more files on it. Without worrying about storage space, we can keep our important files, images, videos, audio, and other documents, since they occupy so little space in memory. Of course, lossless and lossy compression each have their own usefulness in saving storage space and maintaining the quality of files.
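As a rough illustration of this limit, the entropy of a piece of text can be estimated from how often each character occurs. The function below is a simplified sketch with a hypothetical name: it treats every character as independent, so it only approximates the true entropy rate, which also accounts for dependencies between characters.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(text):
    """Shannon entropy of the character frequencies in text, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(entropy_bits_per_symbol("abcdabcd"))  # 2.0: four equally likely characters
print(entropy_bits_per_symbol("aaaaaaab"))  # about 0.54: one character dominates
```

The lower the entropy, the fewer bits per character a lossless compressor needs on average, which is why predictable data compresses so much better than varied data.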