All hail the distributed era of the web that's coming—an era where data sticks to a user like gum on their shoe and they carry it where they go.
Why am I excited for this? Well, a shower thought this morning went like so:
"Why did Facebook remind me of a post from 15 years ago? What good is this information to me (or anyone else) now? What a waste to store all this useless data of mine on a server somewhere that takes up physical space on the planet."
Social media has ushered in a world of ephemerality. (Love that word.) We think a thought, pass it along to a platform, and never revisit it again.
How often do you trudge back through things you posted 1 year ago, let alone 7 or 8? If not for platforms surfacing information like "On this day in April 2015 you said this", would you ever go back to look at old posts? As we age and gain more experience and perspective, most of us despise reading things we wrote years and years ago.
Yes, there's a minority for whom these posts generate buzz and get people socializing (not to mention earn them money). But this only lasts a day or two before the posts are relegated to the past, sitting around and taking up storage space in a database somewhere, never to be recalled again.
Regardless of whether there are outlier reasons for doing it, I imagine the number of people that would find any use for having this information stored is abysmally low.
My own personal data aside, Facebook's 3 billion users across planet Earth generate buckets of their own data. This is data that's rotting away in the deep recesses of racks upon racks of servers. All of it is utterly useless, meaningless information. The thing with social media is that the information posted is often "useful" for a couple days after being posted. That's it.
I'm aware that the amount of data they have on me would amount to very little space taken up on a server. Then again, they also likely use data replication so it can be served fastest from anywhere in the world, among other benefits. With that in mind, perhaps I take up more space than you think.
But why do we keep it around for so long?
The addiction angle
I'm not so naive as to think that Facebook resurfacing this information doesn't stir something in my psyche, leading me to associate good times with the platform and want to keep using it. Contributing to the petabytes of data (definitely more than that) is part of their business model, after all. And what's that business model?
The advertising angle
Advertising is their business model. It's their primary means of making money. One of my favourite (and harrowing) quotes is:
If you don't pay for the product, you are the product.
This is as true of Facebook as it is of Google and so on and so forth. They want to keep my data on file so they can harvest it to learn about me, add me to a demographic, and sell better advertisement targeting to businesses.
The legal angle
Having worked at Canada's largest bank, I know there are legal obligations related to data retention. Often, that time frame is 7 years. Well, my personal life posts aren't financial data, so if I join Facebook in 2008 and make some posts for a couple of years, nothing obliges them to store that data even until 2015, let alone today.
If I happen to get audited by the federal government, my bank needs to be able to pull up my transactions from up to 7 years ago, but social posts are not financial data.
The scientific angle
"But Dan," people cry, "this information should be studied for the benefit of the human race!"
Yes, there are many angles from which we could study this huge bucket of sociological data. One interesting angle to me would be from the point of view of linguistics. We could look at how language is being used: the various grammatical choices people make and how that ties in to where they live, who they've intermingled with, their interests, and their upbringing. I personally find sociolinguistic research like this fascinating, but it's not a strong enough reason to keep data around for years and years.
Data storage and the environment
Data centers run on electricity. The servers inside, which take incoming requests and store information, run on electricity too. The cooling technology that keeps the data center from getting too hot inside also runs on electricity.
Not only this, but just imagine how often these companies need to purchase more and more physical storage and expand their physical footprint just to house all their user data. What for? To surface "memories" for people on the platform on the anniversary of a post they made 12 years ago? So that someone can watch a YouTube video from 2009 with 110 views?
My proposal
My first proposal was: OK, maybe after 7 years the platform can notify users that it's going to delete all their old data, then offer them the opportunity to pay to keep that data stored.
But then I thought: why would we allow that? Instead of offering people the chance to store it for longer, simply allow them to download the data to store how they wish.
Yes, I understand this could very well mean the data just ends up moving from one company's servers to another's, but that's OK with me.
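For the developers reading, here's a minimal sketch of what that rule might look like in Python. Everything in it is hypothetical: the `Post` shape, the function names, and the 7-year window (borrowed from the banking figure above) are my own assumptions for illustration, not how Facebook or any other platform actually handles retention.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date

RETENTION_YEARS = 7  # assumption: reuses the 7-year figure mentioned earlier


@dataclass
class Post:
    """Hypothetical shape of a stored post."""
    post_id: int
    created: date
    text: str


def cutoff_date(today: date, years: int = RETENTION_YEARS) -> date:
    """Return the date before which posts count as expired."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


def split_by_retention(posts, today):
    """Separate posts into (kept, expired) using the retention window."""
    cutoff = cutoff_date(today)
    kept = [p for p in posts if p.created >= cutoff]
    expired = [p for p in posts if p.created < cutoff]
    return kept, expired


def export_for_download(posts) -> str:
    """Serialize posts as JSON so the user can store them however they wish."""
    rows = [{**asdict(p), "created": p.created.isoformat()} for p in posts]
    return json.dumps(rows, indent=2)


# Usage: offer the expired posts as a download, then delete them server-side.
posts = [
    Post(1, date(2008, 5, 1), "Just joined!"),
    Post(2, date(2023, 4, 2), "Something recent"),
]
kept, expired = split_by_retention(posts, today=date(2024, 4, 2))
archive = export_for_download(expired)
```

The point of the sketch is the ordering: the user gets a portable copy first, and only then does the platform drop what's past the window.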