Compression

Two definitions of the word “compress” from the New Oxford American Dictionary serve as useful starting places for a discussion of compression. The first has to do with the way sound, images, and videos are stored and transferred over the internet and between devices: “Computing alter the form of (data) to reduce the amount of storage necessary.” The second, “be squeezed or pressed together or into a smaller space,”[1] is pertinent when thinking about how much data can be placed within the physical bounds or constraints of a physical medium. This definition is also relevant if we think of a unit of time as having space to fill. This paper approaches the topic of compression in these two ways as they pertain to media.

I propose compression as one of the most important but overlooked technological advancements used to deliver content via the internet. Despite the ever-increasing speeds of our internet connections, compression is still necessary for delivering non-textual content. Without it, the sheer size of images, sound, and (especially) video would have crippled the internet and kept it from developing into what we live with today.

In the first definition above it is important to take note of the word “alter.” Compression is often, simply put, a process of discarding information and organizing what remains. The original is changed in some way to reduce how much information is necessary to represent it. This discarding of data to decrease a file’s footprint is referred to as “lossy” compression. There is also “lossless” compression. An example of lossless compression we may all be familiar with is zip compression. We wouldn’t be willing to zip our files if we allowed the zip compressor to throw out bits of data in order to save space. There are many specifications for compressing files using a zip format, but most amount to finding redundancies in the data and reducing them to a single description.
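
To make the lossless round trip concrete, here is a minimal sketch using Python’s standard zlib module, which implements DEFLATE, the same family of redundancy-eliminating algorithms used by the zip format:

```python
import zlib

# A string with obvious redundancy compresses well.
original = b"the rain in spain falls mainly on the plain " * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
# Lossless: every byte of the input survives the round trip.
assert restored == original
```

Because the compression is lossless, the final assertion always holds: decompression recovers every byte of the original.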

Compressing a file yields a blob of data that is not in any usable state without knowledge of how to decompress it and display the information. We need to know how to take the .jpg file and convert it into a grid of RGB colors to be displayed on a screen or printed onto paper. The codec is the software capable of performing this task. The word “codec” is a blend of “compressor” and “decompressor” (or, more generally, “coder” and “decoder”). Knowing it is a combination of these words makes its meaning and function pretty clear.
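
As an illustration of the decompression half of that task, the following sketch decodes a JPEG into a grid of RGB values. It assumes the third-party Pillow imaging library, and photo.jpg is a hypothetical input file:

```python
from PIL import Image  # third-party Pillow imaging library

# Decode the compressed JPEG blob into a grid of RGB values.
# "photo.jpg" is a hypothetical input file.
img = Image.open("photo.jpg").convert("RGB")
pixels = img.load()

width, height = img.size
r, g, b = pixels[0, 0]  # the color of the top-left pixel
print(f"{width}x{height} grid; top-left pixel is RGB({r}, {g}, {b})")
```

The codec’s job is exactly this translation: from an opaque compressed blob to a grid of colors a screen or printer can use.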

Before computers and the internet, compression often dealt with problems associated with the recording and dissemination of text. One form of text compression with a long history that has survived into modern times is Braille. Braille takes the form of a 2×3 grid of raised or flat bumps embossed on a flat surface. This simple grid yields 6 bits of data, or 2⁶ = 64 possible patterns, enough to encode all the letters of the alphabet, common punctuation, and several common words. With Braille we are able to communicate full bodies of text via a simple, spatially efficient system.[2]
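
A Braille cell maps naturally onto a 6-bit integer. This small sketch (the cell helper is illustrative; the letter assignments shown are the standard ones) makes clear where the 64 comes from:

```python
# Model a Braille cell as 6 bits: raised dot i sets bit i.
# Dots are conventionally numbered 1-6 in two columns of three.
def cell(*raised_dots):
    value = 0
    for dot in raised_dots:
        value |= 1 << (dot - 1)
    return value

# Standard Braille assignments for the first few letters.
a = cell(1)      # 0b000001
b = cell(1, 2)   # 0b000011
c = cell(1, 4)   # 0b001001

print(2 ** 6)  # 64 distinct patterns fit in one cell
```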

Another form of text compression came about well before the internet and used compression in two ways to deal with the difficulties of writing text. Stenography, the act of writing in shorthand, was, and is to this day, used to compress text both spatially and temporally. The spatial efficiency of stenography allows an individual to record more information in less space with fewer characters. The temporal efficiency of stenography allows an individual to quickly commit speech to paper. Many types of shorthand have been created and optimized to compress either the amount of information that can fit on a page or the amount of data one can capture within a given amount of time.

A final example of pre-internet compression is Morse Code. In this case compression is used to improve the speed with which an individual can transmit text characters, numbers, and punctuation. To aid in the compression of characters sent over time, Samuel Morse (and possibly Alfred Vail) assigned the shortest combinations of dots and dashes to the most commonly used characters and the longest combinations to the least used.[3] The most efficiently compressed characters are E and T, represented by a single dot and a single dash, respectively. The technological advances of the telegraph, along with the efficiencies gained through the compression of text into Morse Code, allowed us to send information over long distances at unprecedented speeds.
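
This frequency-based assignment of variable-length codes is the same idea that underlies modern entropy coders such as Huffman coding. Here is a minimal sketch of the encoding side, using a handful of real International Morse assignments (the table is deliberately partial and the encode helper is illustrative):

```python
# A few International Morse Code assignments: the most frequent
# letters (E and T) get the shortest codes, rarer ones get longer.
MORSE = {
    "E": ".",    "T": "-",
    "A": ".-",   "N": "-.",
    "I": "..",   "M": "--",
    "S": "...",  "O": "---",
    "Q": "--.-", "Z": "--..",
}

def encode(text):
    # Separate letters with spaces so the variable-length
    # codes can be told apart on the receiving end.
    return " ".join(MORSE[ch] for ch in text.upper())

print(encode("notes"))  # -. --- - . ...
```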

This brings us to an important aspect of compression: it exists to reduce the burdens of data flow, between the components of a device, between devices, and over networks. At first compression simply made it possible to send content via the web. Now compression allows us to send more. The ability to send more data than was previously possible brings me to the second meaning of compression as it pertains to media. We have, for a while now, been experiencing an exponential compression of the amount of data available for consumption in a single unit of time. According to Karl Fisch, “It is estimated that a week’s worth of New York Times contains more information than a person was likely to come across in a lifetime in the 18th century.”[4] Hearing temporal compression put in these terms gives us a visceral understanding of the explosion of information at our fingertips. And the New York Times accounts for only a fraction of the total spectrum of news, data, and information we have at our disposal.

Due to numerous converging factors (consumerism, monetization, shorter journalistic deadlines, etc.), the weekly and daily news cycles have been accelerating. The front pages of many news websites are updated not in days, or even hours, but in minutes. After the short time it takes to read an article, you can return to the home page of a news website to find bits of new content sprinkled around the page. The internet has not only allowed the familiar news behemoths to take their content online; it has also lowered the bar, allowing individuals to speak their minds and reach millions. Anyone can set up a site with little to no money and begin to express opinions on any given topic. So, not only do we have more coming at us from the usual news sources, but we now have a large number of bloggers and individuals’ sites creating new content, information, and opinion for our consumption. The internet has become a democratizing force, allowing nearly anyone to easily participate in the creation of more media.

To further complicate the modern information landscape, new social sites and conventions have developed, giving us yet more ways of consuming and disseminating media. Two of the most important and influential contemporary social phenomena competing for our attention are Facebook and Twitter.

Before even describing what Facebook is or how it works, one can begin to get a feel for its importance from the sheer number of its users. If Facebook were a nation, its 500 million active users[5] would make it the third largest in the world, surpassed in population only by China and India. Facebook opens a floodgate of constant social interaction. We are able to check in and interact with far more people than would be physically possible if we chose to interact with our “friends” in an analogue, face-to-face manner. Because of this we have greatly increased the number of social interactions we have in any given day. This constant social distraction has begun to affect the profitability of businesses. According to Nucleus Research, nearly half of office employees access Facebook during work, costing companies an average of 1.5 percent of total office productivity.[6]

It would seem that Twitter’s 140-character limit on tweets would keep it from developing into a service with much influence. One might ask, “How much of worth can be said in such short bites?” But the link is what gives Twitter its true power. URL shorteners allow us to cram numerous links within the 140-character limit, and each link in a tweet can expand into an article, image, video, or full site. Tweets have become condensed pointers to vast amounts of information.

The rapid development of smartphones, iPads, and other internet-connected portable devices has given us a way to drink from the firehose of the internet whenever we want and practically wherever we are. These devices not only serve our consumption; they have a positive feedback effect on the production of media as a whole, creating new demand for even more of it. After all, how could we live without having all those sites, images, and videos reformatted to fit the tiny screens we all carry around?

This compression of media has a number of benefits and caveats. Below are some of the points I find important or interesting, though they are by no means an exhaustive examination of the effects of temporal media compression.

The sheer number of people contributing content to the world via the internet means we have a much higher chance of finding very specific content. If you are looking for something, chances are someone out there has posted about it or made a site for it. And as time goes on and more people contribute, this becomes ever more true.

High numbers of contributors result in a much wider range of opinions on nearly any subject. Simply searching the internet with Google, one can find viewpoints covering every point on the gamut of nearly any topic: from the most fundamentalist position on any religion to the opinions of diehard atheists, from the most leftist liberal to the most right-leaning conservative. And you can find an opinion on, and experience with, just about any product made. One might say that having such varied opinions and experiences available would result in a much more informed and enlightened public. This, however, is predicated on the notion that the media we consume is factually accurate, which is a questionable assumption. The acceleration of media production and the quickening of deadlines in news organizations incentivize individuals to skimp on fact checking in the name of expediency. It also makes essays and articles espousing nuanced, researched positions on complicated matters harder to find in organizations with accelerated media output.

We all have a tendency to seek out information that confirms what we already believe. Confirmation bias requires effort to overcome; after all, we never like being told we are wrong. When one has fewer media options to pick from, the chances of finding an opinion that exactly matches one’s own are smaller. But now, with such informational riches at our disposal, we have to consciously seek out discordant opinions and information. Making an act both unpleasant and effortful is not exactly a great recipe for getting people to perform it.

Over the years a number of technologies and strategies have developed to deal with the ever-accelerating abundance of media. Aggregators take the form of sites like the Huffington Post that curate content created by others, RSS (Really Simple Syndication) feeds where one curates a list of feeds of interest, or even a Twitter feed or list where one curates by choosing specific individuals to follow. Aggregation allows one to gather information in one place, limiting what is seen and how many places one needs to go to consume it. Again, I see a danger in this: aggregation can build walls that allow only certain kinds of information in.
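
As an illustration of how simple feed aggregation can be, here is a minimal sketch assuming the third-party feedparser library; the feed URLs are placeholders standing in for whatever sources a reader has chosen to follow:

```python
import feedparser  # third-party library: pip install feedparser

# Hypothetical list of feeds the reader has curated.
FEEDS = [
    "http://example.com/news/rss",
    "http://example.org/blog/feed",
]

# Pull the latest headlines from every feed into one place.
for url in FEEDS:
    feed = feedparser.parse(url)
    for entry in feed.entries[:3]:
        print(f"{feed.feed.get('title', url)}: {entry.title}")
```

The walls the paragraph above warns about are visible right in the sketch: only what appears in FEEDS ever reaches the reader.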

The temporal compression of media requires one to be active in finding sources of information, and it places the burden of fact checking in one’s own hands. It also provides the opportunity to be better informed and more learned, should one decide to test one’s own beliefs, opinions, and positions by seeking out those counter to them.

Travis Saul

Works Cited

1. New Oxford American Dictionary, 2nd ed. Oxford University Press, 2005.

2. Salomon, David. Data Compression. Springer, 2007.

3. Salomon, David. Variable-length Codes for Data Compression. Springer, 2007.

4. Fisch, Karl. “Did You Know?” [PDF file]. http://www.lps.k12.co.us/schools/arapahoe/fisch/didyouknow/didyouknowtext.pdf

5. “Facebook Statistics.” Web. 15 Oct. 2010. http://www.facebook.com/press/info.php?statistics

6. Nucleus Research. “Facebook: Measuring the Cost to Business of Social Notworking.” [PDF file]. http://nucleusresearch.com/research/notes-and-reports/facebook-measuring-the-cost-to-business-of-social-notworking/