This is an individual post from E Pluribus Unum


A Data Miner's perspective on the NSA Database

Sorceress Sarah:

First, let me qualify myself. I own a small data mining company in California. I use off-the-shelf software and hardware that I built into a powerful data mining cluster. I apply considerable computing firepower to assist political candidates and PACs.

What it is:

Now, let's talk about data mining. I should begin with what it is not: Data mining is not magic, though the results can frequently resemble it. Data mining is math. Nothing more. It is math applied to seemingly unrelated or only tangentially related datasets that reveals patterns within the data that may not be evident to even the most rigorous scrutiny. Data mining has been used to find the genetic causes of disease, predict credit card fraud, understand global warming, and a host of other applications from the beneficial to the benign, from the unscrupulous to the malign. It is a tool, and like any other tool it can be used for good or for evil....

Who are they really spying on?

All of this brings us to ask who the real targets of all of this spying are. In truth, it could be the terrorists. To identify them, you need to know an awful lot about those who are not terrorists, which helps to eliminate false positives. However, the data on terrorists is so sparse that even if a possible terrorist is identified, the algorithms used will rarely generate both a high probability and a high confidence. In other words, they yield little, if any, actionable intelligence. On the other hand, if you want to predict how a person will vote in a given election, you can get an amazingly accurate prediction from the high-quality data on Joe and Jane Sixpack.
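
One way to see part of the point about sparse targets is the base-rate effect in Bayes' rule. The sketch below is only an illustration, not the method described in the quoted post: the function name and every number in it (the prior, the true-positive rate, the false-positive rate) are hypothetical values chosen to show the shape of the problem.

```python
def posterior_probability(prior, true_positive_rate, false_positive_rate):
    """Bayes' rule: probability that a flagged person really is a target."""
    flagged_and_target = prior * true_positive_rate
    flagged_and_not_target = (1.0 - prior) * false_positive_rate
    return flagged_and_target / (flagged_and_target + flagged_and_not_target)


# Hypothetical classifier quality, identical in both scenarios.
TRUE_POSITIVE_RATE = 0.99   # assumed: flags 99% of real targets
FALSE_POSITIVE_RATE = 0.01  # assumed: wrongly flags 1% of everyone else

# Scenario 1 (assumed): the target class is one person in a million.
rare = posterior_probability(1e-6, TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE)

# Scenario 2 (assumed): the target class is about half the population,
# as with a two-way vote.
common = posterior_probability(0.5, TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE)

print(f"rare target class:   {rare:.6f}")
print(f"common target class: {common:.6f}")
```

Under these assumed numbers, a flag on the rare class is right roughly one time in ten thousand, while the same classifier applied to the common class is right 99% of the time. The rarity of the target, not the quality of the math, is what drives the difference.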

Read the rest.


This weblog is licensed under a Creative Commons License.