Guest Post: False Positives & The Limits Of Predictive Analysis

Submitted by Charles Hugh-Smith of the OfTwoMinds blog,

Analytic systems share system limits with financial markets.

Correspondent Lew G. recently sent me a thought-provoking commentary on the limits of "total information awareness" in terms of any information system's intrinsic rate of generating false positives.

In essence, the rate of false positives limits the effectiveness of any predictive system. Attempting to eliminate false positives is inherently a process of diminishing returns: even with no expense spared, the effort runs into the limits imposed by signal noise, and the system continues to generate some irreducible number of false positives.

To the degree that financial markets are ultimately predictive systems, this suggests a systemic cause of "unexpected" market crashes: signal noise and the intrinsic generation of false positives lead to a false sense of confidence in the system's stability and its ability to predict continued stability.

Here are Lew's comments: 

Resources to deal with reality are inherently limited by that reality.
Information, to the contrary, is inherently infinite, because of the fractal nature of reality.


A property of that information reality is that 'meaning' is relative to other items of info, and that any single item can change the interpretation of a big set of facts. E.g., the meaning of "Muslim, bought pipes, bought gun powder, visits jihadi sites, attends the Mosque weekly, tithes ..." can be completely changed by a fact such as 'belongs to the Libertarian Party', or even 'is a plumber' or 'is a target shooting enthusiast'.


This will continue to be true no matter how much info the NSA gathers: it will be a small subset of the information needed to answer the question 'possible terrorist?'.


Thus the NSA's tradeoff of privacy vs security is inconsistent with reality: no matter how much info they gather, no matter how sophisticated their filters, they can never detect terrorists without a false positive rate so high that there will be insufficient resources to follow up on all the people flagged.

In other words, if the system's lower boundary is one false positive per million, no additional amount of information gathering or predictive analysis will lower that rate of false positive generation to zero.
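To make the arithmetic concrete, here is a minimal sketch in Python of how a fixed false positive rate plays out across a large population. Every figure in it is a hypothetical assumption chosen for illustration (the population size, the number of genuine targets, the candidate false positive rates and the perfect detection rate); none of them comes from Lew or from any real system. The function name screening_outcome is likewise invented for this example.

```python
def screening_outcome(population, true_targets, false_positive_rate,
                      detection_rate=1.0):
    """Return (false_alarms, true_hits, precision) for a screening system.

    All inputs are hypothetical. precision is the share of flagged
    cases that are genuine targets.
    """
    innocents = population - true_targets
    # Expected number of innocent people wrongly flagged.
    false_alarms = innocents * false_positive_rate
    # Expected number of genuine targets correctly flagged.
    true_hits = true_targets * detection_rate
    flagged = false_alarms + true_hits
    precision = true_hits / flagged if flagged else 0.0
    return false_alarms, true_hits, precision


# Assumed figures for illustration only: 300 million people screened,
# 100 genuine targets, and a filter that never misses a genuine target.
POPULATION = 300_000_000
TRUE_TARGETS = 100

for rate in (1e-6, 1e-4, 1e-2):
    false_alarms, true_hits, precision = screening_outcome(
        POPULATION, TRUE_TARGETS, rate
    )
    print(
        f"false positive rate {rate:g}: "
        f"{false_alarms:,.0f} false alarms, "
        f"{true_hits:,.0f} true hits, "
        f"{precision:.4%} of flagged cases are genuine"
    )
```

The point of the sketch is that the number of false alarms scales with the size of the population screened, while the number of genuine targets does not. Gathering more data enlarges the first number unless the false positive rate falls in step, and Lew's argument is that the rate has a floor.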

Why does this matter? It matters because it reveals that large-scale analytic systems are limited by their very nature. It isn't a matter of a lack of political will or funding; there are limits to the practical effectiveness of information gathering and predictive analysis.

Though Lew applied this to the NSA's "total information awareness" program, couldn't it also be applied to other large-scale information gathering and analysis projects such as analyzing financial markets?

This was the conclusion drawn by the father of fractals, Benoit Mandelbrot, in his book The (Mis)Behavior of Markets. As Mandelbrot observed: "When the weather changes, nobody believes the laws of physics have changed. Similarly, I don't believe that when the stock market goes into terrible gyrations its rules have changed."

All this should arouse a sense of humility about our ability to predict events, risks and crashes of one kind or another. In other words, risk cannot be entirely eliminated. Beyond a certain point, we're sacrificing treasure, civil liberties and energy for not just zero gain but negative return, as the treasure squandered on the quixotic quest for zero risk carries a steep opportunity cost: what else could we have accomplished with that treasure, effort and energy?

This entry was drawn from the Musings Reports, which are sent weekly to subscribers and major contributors.

