Channel: Whitneys Week

My Take on Data Mining


Well well well.  A post about policy and current events…

I never thought I’d be writing about something like this, but I recently got into a discussion with my favorite human about Facebook’s data scandal, data mining, and the government’s responsibility to protect people from these sorts of breaches.  When we started talking, my opinion was strong.

I firmly stated that data mining is a common marketing technique that almost everyone who uses a computer will take advantage of to advertise, that nobody did anything wrong in targeting voters in our last election, and that the government has no responsibility to protect those who hand over their personal information like candy to a baby.  Needless to say, my opinion has changed.

I read a ton of articles, many academically driven, and took notes on both sides of the story.  Austin helped jump start me with this video.

 

The biggest thing I didn’t realize until watching that video is that Cambridge Analytica did indeed break Facebook’s rules. It also seemed to be more about how they used the data than actually obtaining it in the first place, but it was still vague. So, my research was (and still is) cut out for me.

Let’s start with a few definitions of this thing called data mining.

Wikipedia: Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.[1]

SAS Insights: Data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes. Using a broad range of techniques, you can use this information to increase revenues, cut costs, improve customer relationships, reduce risks and more.

Dictionary.com: the process of collecting, searching through, and analyzing a large amount of data in a database, as to discover patterns or relationships
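
To make those definitions a little more concrete, here is a tiny sketch of what "discovering patterns" can look like in practice. Everything in it is made up for illustration: the users, their likes, and the `count_pairs` helper are my own assumptions, not anything Facebook or Cambridge Analytica actually used. It just counts which interests tend to show up together.

```python
from collections import Counter
from itertools import combinations

# Made-up "likes" for five imaginary users (not real data).
likes_by_user = {
    "user_a": {"hiking", "coffee", "dogs"},
    "user_b": {"hiking", "dogs"},
    "user_c": {"coffee", "books"},
    "user_d": {"hiking", "dogs", "books"},
    "user_e": {"coffee", "dogs"},
}


def count_pairs(likes):
    """Count how many users like each pair of interests together."""
    pair_counts = Counter()
    for interests in likes.values():
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(interests), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_pairs(likes_by_user)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)  # ('dogs', 'hiking') 3
```

With only five pretend users the "pattern" is trivial, but the same idea scaled up to millions of profiles is roughly what lets someone guess that a person who likes one thing will probably respond to an ad about another.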

At first glance, it seems to make sense and be completely harmless.  It’s common for anyone trying to reach an audience to search for and understand patterns of behavior and preferences from a sample size of actual people.  Now, why does data mining have a bad rap?

After searching further, I have come to understand that the wrongdoing of data mining stems from a lack of transparency.  When businesses collect and use people’s data, they are doing so to improve business practices and make money.  This makes our information a monetized tool.  Information is a valuable resource, and when businesses use it, we should at least know and agree to effectively sign over our rights.  It is OUR information after all.

During my research, I stumbled upon one of Khan Academy’s Help Center pages where a gentleman made a few great points in his long-winded, sarcastic response to a suggestion that they offer a class on data mining.  He said about the nonprofit tutoring service, “It’s free. But users do pay a price: In effect, they trade their data for the tutoring.”

He went on to explain the ways in which sites like Khan Academy will write up intentionally confusing terms of service or privacy policies to vaguely explain how they collect and use user data. “Parents and teachers typically turn to companies’ privacy policies to try to figure out what student data is being collected and how it could be used.”

“Clarity is a rarity.”

To understand why this matters and why we so often click the “agree” button, we’ve got to know what the terms of service document really is.  Here’s a pretty accurate definition written by Christian Sandvig and Karrie Karahalios in an article about the Computer Fraud and Abuse Act (CFAA).

“Terms of service are corporate documents written by individual websites to advantage themselves. They often prohibit research by third parties, they prohibit visitors to websites from copying the information found there, and they have even prohibited users from criticizing a system’s owner publicly.”

Basically, if a company leaves their terms of service or privacy policy document in a grey area that cannot be understood, then it doesn’t count as letting people know.  Nevertheless, we agree to it pretty much every time we visit the wonderful world wide web.  The second and most awful part about these privacy policies is that they can instantly change without notice.

That proves that this game is absolutely unfair from the get-go.  The entire structure of data collection is wrong, but it’s not punishable because we “agree” to it. In fact, we as individuals can be punished for breaching any part of the terms of service contract without even knowing it because we aren’t aware that it has changed. Anyway, that’s a blog post for another day.  So, how do sites get in trouble for (or more often get away with) improperly using people’s data? The way the data is used is what actually gets people in trouble.

The reason Facebook is in trouble is that Cambridge Analytica was allowing Russia (allegedly) to target voters and push people toward Trump (allegedly).  That crosses a few lines that the government is not ok with, and rightly so. Facebook let Cambridge Analytica do whatever they wanted, and therefore is wrong in subjecting its users to basically being hacked without knowing. Even if Cambridge Analytica was breaking Facebook’s own rules, Facebook still cashed the check so to speak, letting it go until the scandal broke.

Of course, I’m only scratching the surface here, but this is a pretty concerning topic of conversation, especially if we as people have no control over what we see, what we don’t see, and what is happening behind our screens.  What do you think?

 

Disclaimer:  Now, you know I’m not a public policy scholar or a computer science nerd.  Take my opinions with a grain of salt.  I’m out of school, so I am not looking to be graded on accuracy or persuasiveness.  Buh bye.

