Data
The oil spill few care about
Apologies for the size of that image, but it captures the magnitude and essence of this post remarkably well, so I am going to use it anyway, even though I also try to keep the narrative coherent without the pictures. I do this for those, like myself, who turn off image-loading in emails. It isn’t that I don’t like photos - I quite do, actually. I mean, hey… I’ve built a career in media processing and delivery. I digress…
This post was going to be a part of Sorry, not Sorry, but Substack warned me I was approaching a limit where people would not be able to read that post completely in email form. I am certainly not going to be the one who forces you off the only truly democratic digital platform to sully yourself elsewhere, even if it is a website.
Those who know me well know that I cut a different path with regard to technology, platforms, privacy, and, frankly, stubbornness. I don’t pretend that this is endearing, cute, or clever. But it is genuine. I wrestle with these topics in my own inner dialogue as well as with those who will permit me both grace and opportunity - you’d be amazed how infrequently these two intersect anymore.
The Data
As I wrote in my History of Ai, I had never truly appreciated the bottleneck that artificial intelligence was facing before Steve Jobs landed a telemetry device in all of our hands with the iPhone. The system was primed and ready to go, but needed fuel. This concept has been debated extensively, which I won’t get into right now, but it was insightfully captured by Clive Humby’s remark in 2006 at the Association of National Advertisers Conference, ‘data is the new oil.’ Well, guess what!? Apple decided not only to fill the tanks but also to lubricate the world.
Microsoft spent USD 26.2 billion in 2016 to purchase LinkedIn, which today would be USD 35.5 billion adjusted for inflation. Their stated purpose was to integrate it into their enterprise software. You know, so that you could see your LinkedIn photo in your Outlook. Remember? Yes, that, but really it was about the data and, of course, the advertising tie-in, which for LinkedIn represented about USD 6 billion in 2023 - a tidy sum, but a paltry amount compared to those other folks by the bay, or is it sound? Wait… how about we just say West Coast US tech titans… yeah, them.
See, what most people don’t realize, or perhaps don’t care about, is that while you’re on the site you are not only exposing yourself to their targeted advertising, you are also contributing to demographic profile data which says a lot about you. In addition, you provide further data, which they scrape up as you connect, whether via browser (as I do, from private windows, with restricted settings, etc.) or via their native platform apps (which are the worst offenders - despite what you may think because you are on an iPhone). So I thought I’d dip in a bit from my Incomp* post and share a specific application - yes, real data.
The Dump
Do not think you have stumbled upon the beginnings of your data brokering start-up here. I assure you the individual data are less important than the themes. We only tend to remember 4-5 things at any one time anyway, so the details tend to get lost. My hope is that these themes will strike your Ah-ha! bone, or at least make you shake your head.
It began with LinkedIn’s terms-of-service change, which I referenced before this post, and my disagreement with their intention to break the silo between LinkedIn profile data and Microsoft’s advertising stack (one of the biggies, mind you). So I initiated my data dump.
It did take a bit of time and was actually two different downloads. I initiated these the week before their Nov 4th deadline, which I’m sure was really Nov 3rd at 23:59, but I had no idea what timezone they’d use, so I wanted to be safe. The image above is the page that allows you to dump your data, which I encourage you to do to see for yourself.
Caveat emptor
If you do, be careful! The dump will contain a surprising amount of information, which, unfortunately, is becoming increasingly difficult to keep private. Yeah, I wrote about that too in Evacuate, Evacuate.
The Data
This data journey starts, unsurprisingly, with your LinkedIn connections. This is something that you’d expect them to have, and I was pleasantly surprised by how legible it was. I was able to download a CSV file with my connection information - sans email addresses unless those were explicitly shared. The CSV did, however, include the LinkedIn URL for each user. I haven’t looked into whether there is a clean way to use these to re-connect, or whether I even will.
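If you want to poke at your own connections file, here is a minimal Python sketch of the idea. Everything specific in it is an assumption on my part rather than something LinkedIn documents here: the column names `URL` and `Email Address`, the possibility of a few preamble lines before the header row, and the `load_connections` helper itself, which is hypothetical. Adjust the names to match whatever your file actually contains.

```python
import csv
import io


def load_connections(text, url_column="URL", email_column="Email Address"):
    """Parse connections CSV text into (profile_url, email_or_None) pairs.

    The column names are assumptions; pass your own if your header differs.
    Raises StopIteration if no row contains the URL column name.
    """
    rows = list(csv.reader(io.StringIO(text)))
    # Skip any preamble rows: the header is the first row naming the URL column.
    header_at = next(i for i, row in enumerate(rows) if url_column in row)
    header = rows[header_at]
    # Pair each remaining non-empty row with the header names.
    records = [dict(zip(header, row)) for row in rows[header_at + 1:] if row]
    # An empty email cell means the address was not shared.
    return [(rec.get(url_column, ""), rec.get(email_column) or None) for rec in records]
```

Reading the file with `open(path, newline="", encoding="utf-8").read()` and passing the text in keeps the helper easy to try on a small sample before you point it at the real thing.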
From connections we move on to Inferences, which is the basis for all of this. This contains what LinkedIn is willing to share with you about the profile they’ve built for targeting you with ads. I’ve redacted mine (which was pretty boring anyway, as I didn’t give them many of the things they like to have for this). They’ll use ASL (Age, Sex, Location) and any of those other handy things you’ve incorporated into your profile that you share with your connections.
A bit of set-up for the next bit. Well before LinkedIn even disclosed they were going to start using your information to train their AI tools, I decided to delete much of my content. This was no easy task, I tell you (by design). I manually went back through decades of data, clicking buttons. There are some tools out there that will do this for you via cloud APIs, but think about it. If you’re not paying for it, do you really think they aren’t stealing your data? That’s rhetorical, obviously…
So I eliminated a lot of this data: posts, comments, etc. What there was /no/ way to eliminate, that I could find, was Reactions. Those nearly passive clicks that trigger, for so many today, their digiX receptors, washing over their consciousness like X does for candy kids at music festivals, if only for a fleeting moment. But… what the data-nerds love about those things is that they demonstrate engagement, which is crucial because, against all the other passive things you do not engage with, they represent a response: an action that is measurable, can be correlated, and goes right into that Inferences model I just described.
So my reason for deleting my profile was all this data that I couldn’t see, with no idea how far back it would go. As a previous Premium customer, part of a segment representing about USD 6.44 billion, just a few hundred million more than their advertising (which do you think has more upside?), I decided to contact Premium Support. What this really means is that they’ll send a ticket to a human, eventually, somewhere. TL;DR on this? They weren’t saying.
As I was saying, I was concerned about how long they’d been keeping data and what it might be. So after I downloaded my data, this was the primary question I was seeking to answer.
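I went through the files by eye, but the same question can be asked of a dump with a few lines of Python. This is only a sketch under stated assumptions: it assumes the export unpacks into a folder of CSV files, the timestamp layouts in `DATE_FORMATS` are guesses rather than LinkedIn’s documented formats, and `earliest_per_file` is a hypothetical helper name. It checks every cell of every CSV and keeps the oldest date it can parse per file.

```python
import csv
from datetime import datetime
from pathlib import Path

# Assumed timestamp layouts; adjust to whatever your export actually uses.
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%m/%d/%y, %I:%M %p", "%d %b %Y")


def parse_date(text):
    """Return a datetime for the first matching format, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def earliest_per_file(export_dir):
    """Map each CSV file name to the oldest parseable date found in it."""
    oldest = {}
    for path in sorted(Path(export_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                for cell in row:
                    when = parse_date(cell)
                    if when and (path.name not in oldest or when < oldest[path.name]):
                        oldest[path.name] = when
    return oldest
```

Calling `earliest_per_file` on the unpacked folder and sorting the result by date puts the longest-retained category at the top. Files with no parseable dates simply do not appear, which usually means a format is missing from the list.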
Ads Clicked
Again, this is to be expected and of little value, as who knows what their adIDs are. What is important here is the date: just over a year ago, in September 2024.
Reactions
Comments
I edited this version to improve readability. Notice how far back that date goes in comparison: now we are seeing stored data back to December 2013.
Shares
Rewinding further, now we hit early 2012 for posts shared with others. So, there it is. LinkedIn’s data goes back at least 13 years, perhaps even longer. Data which would have gone over into Microsoft’s advertising platform. I find the retention policies across these various categories interesting and somewhat helter-skelter. Maybe that’s why they are unwilling to disclose more.
As I reflect, I find myself wondering if 2005 Kreig would have ever agreed to sign up for a platform like LinkedIn had they described this as their business model back then. What I come to realize is that, like the proverbial frog and the pot, I, like many others, would have just opted out. What digital platforms have learned, and it is a spin on a well-known sysadmin’s adage, is the sticky principle. Once it’s launched, it sticks. So, turn the heat up slowly, pay careful attention, a/b test enough, and eventually you have frog soup. Ribbit.










