Information Management in the Information Age - Part 1

2011-02-12

Context

This post is the first in a series focusing on managing and finding content and information. It provides a context for the remaining posts and a justification for a few systems that I will describe.

Bit Hoarding

Hoarding is built into our genes. We have an innate compulsion to acquire more of everything. Ten thousand years ago, this was a useful trait. Our ancestors would gather as much food and fresh water as they could carry. The more they could carry, the longer they could survive without leaving their adobe dwelling and exposing themselves to predators.

We've evolved. We've moved out of our adobe huts and into cubicles. The problem is, we are still hoarding. We hoard food and clothing. We hoard Tupperware and plastic bags. We also hoard information. I don't think I am unique in this regard. I have the impression that the majority of computer users hoard bits in every crevice of their hard drives. Take a peek inside your documents folder. How many layers of folders constitute the backbone of your information "archive"?

This isn't a condemnation. In fact, I suggest we celebrate that the information age has granted us this one small victory. We can fulfill our natural hoarding compulsion without taking up physical space. But to what end do we hoard data? My personal reasons fall into four categories.

Categories of Data

1. Crucial information that I will need in the future. This includes financial documents like scans of W-2 forms, bank statements and investment summaries. Additionally, I include family photos and home videos as well as my movie and music collection in this category. I'll explain why in a future post.
2. Information that I may need again. This category includes receipts, warranties, utility bills, and the rest of my photo and video collection not in category #1.
3. Information that I would never save in the physical world, but for various reasons I feel compelled to keep. This is a somewhat nonsensical category but nevertheless accounts for a large amount of data. Examples of this category would be snippets of code that I find interesting or educational, color combinations that I find particularly pleasing, or writing that is of high quality.
4. Scratch information. This data is mostly temporary but serves some practical purpose. This category includes web addresses, code snippets, partial blog posts and meeting notes.

It's taken considerable self-reflection and value assessment to codify these categories. It is far too easy to simply keep things without thinking. Understanding these categories has huge value. For example, it makes backup planning and execution much more logical and efficient. I can now sort my backup strategies into buckets, one per category.

Category 1: Always maintain a contemporaneous backup of every bit. Do not use syncing, though. Files can be accidentally deleted on the master disk, and I do not want to lose the same file in my backup. This backup category archives deletions. This category also does not depend on any cloud services for recovery.

Category 2: Maintain regular, consistent backups, typically once a week. These backups are always available with minimal effort.

Category 3: Back up the data periodically but do not waste too much disk space or time. A monthly backup schedule is fine. The backup can be off-site for safekeeping, but I do not need multiple backup copies available. Cloud syncing can suffice for this category.

Category 4: No backup necessary. Cloud services do most of the heavy lifting for this backup category by virtue of syncing rather than data preservation. If the cloud service shuts down, there is no real loss.
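
One way to make these buckets concrete is to write them down as a small lookup table. The Python sketch below is only an illustration under stated assumptions, not a real backup tool: the names BackupPolicy, POLICIES and backup_due are hypothetical, "monthly" is approximated as 30 days, and the flags for category 2 are guesses because the rules above do not state them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical summary of one backup bucket."""
    max_age_days: Optional[int]  # 0 = continuous, None = never backed up
    archive_deletions: bool      # keep files that were deleted on the master disk
    cloud_sync_ok: bool          # a cloud sync service is an acceptable copy


# One policy per data category described in this post.
POLICIES = {
    1: BackupPolicy(max_age_days=0, archive_deletions=True, cloud_sync_ok=False),
    # Category 2 flags are assumptions; the post only specifies "once a week".
    2: BackupPolicy(max_age_days=7, archive_deletions=False, cloud_sync_ok=False),
    # "Monthly" is approximated as 30 days (assumption).
    3: BackupPolicy(max_age_days=30, archive_deletions=False, cloud_sync_ok=True),
    4: BackupPolicy(max_age_days=None, archive_deletions=False, cloud_sync_ok=True),
}


def backup_due(category: int, days_since_last_backup: int) -> bool:
    """Return True when data in this category should be backed up now."""
    policy = POLICIES[category]
    if policy.max_age_days is None:
        return False  # category 4: no backup is ever scheduled
    return days_since_last_backup >= policy.max_age_days
```

With this table, backup_due(2, 10) is true because a weekly backup is three days overdue, while backup_due(4, 1000) is always false. Keeping the rules in data rather than scattered through scripts makes it easy to see at a glance how each category is treated.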

Do not take this as a detailed backup strategy but rather as the high-level plan for a backup system.

What is important to understand, though, is the value that can be derived from understanding your system, your motivations and your weaknesses.