Big Data Algos 'Are' The Singularity & They're Coming To A Stock Market Near You


Submitted by Ben Hunt via Salient Partners' Epsilon Theory blog,

For the life of me, I don't understand the debate [over the NSA metadata program].

– Jeb Bush, February 18, 2015





The Central Intelligence Agency played a crucial role in helping the Justice Department develop technology that scans data from thousands of US cellphones at a time, part of a secret high-tech alliance between the spy agency and domestic law enforcement, according to people familiar with the work.

– The Wall Street Journal, front page story, March 10, 2015

Athena:  You wish to be called righteous rather than act right.

Aeschylus, "The Oresteia" (458 BC)

Point72 Asset Management, the successor to Cohen's hedge fund SAC Capital Advisors, has hired about 30 employees since the start of last year to build computer models that collect publicly available data and analyze it for patterns, according to two people with knowledge of the matter.
Cohen, whose SAC Capital shut down last year and paid a record fine to settle charges of insider trading, joins Ray Dalio's Bridgewater Associates in pushing into computer-driven investing, an area dominated by a handful of big firms such as the $25 billion Renaissance Technologies and the $24 billion Two Sigma. The money managers are seeking to take advantage of advances in computing power and data availability to analyze large amounts of information.

Bloomberg, March 10, 2015

Cassandra:  Have I missed the mark, or, like true archer, do I strike my quarry? Or am I prophet of lies, a babbler from door to door?

– Aeschylus, "The Oresteia" (458 BC)





I know, I know … I’m a broken record and a Cassandra, with 2 successive notes on Big Data. But I don’t care. This is a much larger structural risk for markets and investors than HFT and the whole Flash Boys brouhaha, it’s just totally under the radar and hasn’t surfaced yet. And unfortunately, just as I think Jeb Bush speaks for most Americans – Democrat and Republican alike – when he says that he doesn’t get what all the fuss is about when it comes to metadata collection and Big Data technologies, so do I think that most investors – institutional and individual alike – are blithely unaware of how their market identities can be stolen and their market behaviors influenced, all in plain sight.

Jeb Bush should know better. I think he probably does. Investors may not know better yet, but they will soon, one way or another. As you read this note, a small group of hedge fund managers are doing to you exactly what the NSA is doing to “terrorists”.

Today a handful of governments use Big Data to identify individual behavioral patterns so that certain individuals can be killed. Today a handful of hedge funds use Big Data to identify investor behavioral patterns so that certain investors can be crushed. Today Big Data is primarily an instrument of social information gathering, with a powerful but punctuated impact on those individuals on the receiving end of a drone strike or a targeted trade.

Tomorrow a handful of governments will influence aggregate political behaviors by triggering small communications that Big Data tells them will be voluntarily magnified by individual citizens, snowballing into outsized, long-lasting, and untraceable “popular” actions. Tomorrow a handful of hedge funds will influence aggregate market behaviors by triggering small trades that Big Data tells them will be voluntarily magnified by individual traders, snowballing into outsized, long-lasting, and untraceable “market” actions. Tomorrow Big Data will be primarily an instrument of social control, with a powerful and ubiquitous impact on all citizens and all investors.

Q: How can I protect myself?
A: You can’t.

But WE can protect ourselves, to some extent at least, by working together to raise voter and investor awareness of the risk and pressing for regulatory reform to shield our behavioral data from commercial use AND bureaucratic collection. I’ll leave the voter awareness piece to others, and use Epsilon Theory to focus on investor awareness.

Trust me, I know how this sounds, to write to an audience of free market-oriented investors and call for stronger regulatory intervention to prevent the collection or sale of “anonymous” investment data. And if you think that any mutually agreed upon transaction should be allowed, no matter how large the gulf in knowledge between the buyer and seller … if you would buy an original Honus Wagner baseball card from a 10-year old kid for a quarter, telling him that you were doing him a favor to pay him that much for such a ratty card … then I’m never going to convince you of the merits of my argument. If that’s you, then I’m sure Stevie Cohen sends his best regards from the Grand Duchy of Fairfield County. But if you believe, as Adam Smith did, that it is government’s appropriate role to prevent transactions that are massively lop-sided from an informational perspective and that directly subvert the small-l liberal institutions of free elections and free markets, then I think you will find this a proposal worth considering.

It’s by no means a perfect solution, but I like more than I dislike about the way our personal medical data is protected through HIPAA. As an initial step, I’d like to see federal financial data legislation equivalent to HIPAA, where both private AND public sector use of our investment history, no matter how scrubbed or “anonymized”, is prohibited.

Such a law would cause a lot of pain. For-profit exchanges, all of which have transformed themselves from trading venues into “data companies”, would no longer be able to sell disaggregated transaction data. Mega-asset managers would no longer be able to sell anonymized client portfolio data. Ubiquitous financial information companies that may or may not share a name with a former mayor of New York would be subject to a regulatory scrutiny that is sorely lacking today.

Yes, a lot of pain. But it’s a fraction of the pain we will ALL feel if for-profit exchanges, mega-asset managers, and ubiquitous financial information companies are allowed to continue producing weapons-grade plutonium for the handful of hedge funds that are building their instruments of market control.

Unfortunately, like Cassandra, I’m predicting future pain, and that’s rarely successful as a goad to current action. To quote Aeschylus once more:

Nothing forces us to know
What we do not want to know
Except pain.

I don’t think we investors have suffered enough … yet … to force us to accept the unwanted knowledge we need to spark effective collective action. Instead, I can just hear the apologists, the lobbyists, and the bought-and-paid-for spouting the Big Lie when it comes to Big Data: “But it’s anonymous data we’re talking about, so you have nothing to worry about.”

I hope I’m wrong, but I’m not optimistic.

Pessimism and hope may seem to be odd bedfellows, but for 2,500 years that’s been the best prescription for dealing with a tragic world, where external forces threaten at every turn to sweep us off our moorings. I’ve used a lot of quotes this week from Aeschylus because, as the inventor of tragedy as an art form, he was the guy who first proposed that bittersweet tonic.
Aeschylus had an interesting life and an interesting death. As the story goes, in middle age a fortune teller warned him he would be killed by something dropped on his head. From then on, Aeschylus famously stayed out of cities, where someone might accidentally knock a chamber pot or some such out from an open window. Sure enough, though, in the best tradition of the inescapable-destiny trope that Aeschylus helped invent, he was killed outside a Sicilian town when an eagle mistook his bald head for a rock and dropped a turtle on it. As I recall, there was a CSI episode that used this as a plot device to resolve an inexplicable death in the desert outside of Las Vegas … my estimation of the show runners went up immensely when they showed their surprising knowledge of classical history!

But it’s his life that I want to commemorate here. You see, first and foremost Aeschylus was a patriot. He fought the Persians at Marathon, Salamis, and Plataea, where he was recognized for bravery in all three battles. His epitaph says nothing about being a playwright, only about being a soldier. One of his two brothers was killed at Marathon, the other lost his hand at Salamis. Aeschylus himself bore terrible scars from the victory at Marathon. We know that he had these scars because he showed them to the jury when he was put on trial for treason after supposedly revealing some of the Eleusinian Mysteries – essentially state secrets – in one of his plays. Fortunately for the world, Aeschylus was acquitted, and Athens went on to experience a golden age that inspires us still.

Aeschylus argued that you can question your government’s policy on secrecy without being a traitor, that he was in fact still a patriot – perhaps even more of a patriot – for the tragedies he wrote. I’d hope that we can be as wise today as that Athenian jury was more than 2,500 years ago. I’d hope that we can question both our government’s policy and our private sector’s policy on behavioral data collection without being accused of treason or (worse in some investor circles) socialism. I’d hope. But I’m not optimistic.

So here’s Plan B, a plan for a crowd-sourcing world.

If we can’t cut off the supply of plutonium for these weapons of mass market destruction, then we can at least provide the blueprints for the Bomb so that anyone can build one. Or, better yet, we can build a collective early warning system, an open-source Bomb detector … a Big Data market intelligence available to everyone. It’s not an instrument of social control and it’s not a spoofer; the former is the enemy and the latter is really, really expensive. It’s a collection of highly sensitive risk antennae, sensitive enough to identify the likelihood of otherwise untraceable market manipulation in real time.

Recursive inference engine [A] comprised of thousands of “bots” (static data models) executes small trades to test market reaction to different stimuli. Game/learning implementation [B] serves as dynamic data model to recognize and calculate arbitrage likelihood functions. Analytics platform [C] operating within real-time database architecture governs [A] and [B].

This is a basic schematic for what I think could function as a rudimentary Big Data market intelligence. When I sketched this out 4 years ago I pegged the hardware cost at close to $5 million; today I figure it’s closer to $1 million. Host it somewhere like my friend Gary King’s Institute for Quantitative Social Science and the total cost, both to build and maintain, becomes very manageable. What’s costly is the time required to program the system, but there’s no shortage of Big Data wizards coming out of Harvard, MIT, Stanford, etc. every year.

Yes, I know that this schematic will be gobbledygook to almost all of my readers, and the few readers who are immersed in this stuff will undoubtedly find it overly simplistic. But it’s a start on Plan B. It’s a start on demystifying the powerful non-human intelligences that will soon be used … I suspect are already being used … by all-too-human institutions to shape our political and commercial behavior in pervasive and unwanted ways. And yes, I know that this is what all-too-human institutions have always done to the madding crowd. But what’s different today is the scale and scope of what’s possible. Big Data non-human intelligences ARE the Singularity, and they are coming soon to a stock market near you. I’d like to starve them out with legislation establishing a financial data equivalent to HIPAA (Plan A), or failing that enlist one of their own to share the information as widely as possible and thus diffuse their market impact (Plan B). But if we do nothing, then the Stevie Cohens of the world are going to conquer our capital markets just as surely as Agamemnon sacked Troy. That’s my prediction.

I don’t really know what to expect by putting these ideas out there on Epsilon Theory, and I’m really curious to see the reaction this note will get. Support for Plan A? Enthusiasm for Plan B? Both? I hope it’s both. But I’m not optimistic. I fear that like Cassandra, my blessing is to see the future clearly and my curse is that no one believes me.


ExpendableOne's picture

Great, now we will get trampled by sheeple stampedes.  Cuts both ways though.  Get that model wrong and bad things can happen on either end of the spectrum.

CrazyCooter's picture

And this fixes the monstrous "debts that can't be paid, won't be paid" assets of pretty much every investment entity out there?

Big data ain't gonna do shit when this thing gets momentum to the downside. The only option is to be completely out, at least to the extent it is possible.



SafelyGraze's picture

during spring break, paw-paw took me on a ride to the country. out where his parents used to live.

it was spooky.

no hotels. no mcdonalds. dumpy houses and warehouses.

no visible commerce.

nobody out at night.

I asked him what all the people *do* out here.

he said they go to the hospitals. and they hire in-home respiratory therapists. and they go to the pharmacy. so that helps keep the local economy going.

but what about the young people?

he said they have school programs and stuff. and they can go to the community college. and borrow money to buy cars. plus they qualify for assistance programs.

but what keeps the whole thing afloat?

tv, he said. 

everybody has a tv.


Bananamerican's picture

"Tomorrow a handful of governments will influence aggregate political behaviors by triggering small communications that Big Data tells them will be voluntarily magnified by individual citizens, snowballing into outsized, long-lasting, and untraceable “popular” actions."

This is the single most inscrutable fucking thing I've ever read on ZH....Salut.


dlmaniac's picture

Big-Data is mostly hype and garbage. 

NaN's picture

Think of it this way. If the executive branch can take advantage of embarrassing info gathered by NSA to target political enemies, then central planning by the FED could go to the next level with sufficient big data. A nudge here, a nudge there, targeting particular investors, is possible. The goal might be to manipulate the whole market or perhaps punish a firm that speaks the truth.

StateofFraud's picture

Stop fighting in the box they give you pal. All those words and flow charts end in regulatory capture. End the Fed. Or hope it and the rest of the printers eat themselves before they can activate skynet.

Edit. Jeb is a crotch swab.

heisenberg991's picture

Sleaze Cohen is a fucking scuzzball who deserves to die.

Occident Mortal's picture

I call Horse Shit on this entire Big Data sales pitch.

It's just the latest and greatest 2 and 20 sales pitch for Cohen and fellow criminals to scam UHNWI heirs.

sam i am's picture

Election platform of the Mariupol mayor wannabe fascist-NATO battalion commander Aleksey a.k.a. "Worm" - "First we will talk to people, afterward we will shoot them."

Mariupol is a port city on the NATO occupied territory of former Ukraine

Beware the NATO worms bearing gifts. Soon, there will be no Ukrainians left to enjoy the fruits of the Western democracy.

disabledvet's picture

The one thing that is certain is that we are not in charge here...we are being manipulated to create what is clearly real but still a mistake.


That means there is an ENEMY amongst and indeed perhaps even WITHIN us.


"Panic towards liquidity" (oil) is normative.  However such behavior is not "rigorous" (meaning following in the scientific method) thus causing what the Government terms "procedural errors."


In this case...TRILLIONS OF THEM.


There is no way to put this "peculiar" genie (the internet and its nefarious underpinnings)  back in the bottle.


In short a few nukes might be necessary here.

i_call_you_my_base's picture

The author doesn't really understand Big Data and talks about it in an odd way.

buzzy_the_pirate_dog's picture

damn, just google Hadoop.  It is a tool that works very well for some use-cases.  The magic is in the human using it.  No more, no less.

NaN's picture

First step is to make it a felony to misuse data collected or aggregated by the government.

asierguti's picture

Big Data is a concept that many don't really understand.


It comes in 2 parts. The first is a distributed database capable of holding billions of records across hundreds or thousands of computers: replicated, fault tolerant and extremely fast. The second part is a distributed computing system, a way to use hundreds of computers to break down a very complex task into smaller tasks, complete them and put all the pieces together.
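
To make that second part concrete, here is a minimal single-machine sketch of the split / work / combine pattern. It only illustrates the idea: the function names and the made-up trade records are assumptions for this example, and a real cluster (Hadoop or similar) would spread the chunks across many machines rather than across threads.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Partition the records into roughly equal chunks, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_tickers(chunk):
    """Small task: count the trades per ticker inside one chunk."""
    return Counter(ticker for ticker, _quantity in chunk)


def merge_counts(left, right):
    """Combine step: add two partial counts together."""
    return left + right


def distributed_count(records, n_workers=4):
    """Split the job, run the small tasks in parallel, merge the pieces."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_tickers, chunks))
    return reduce(merge_counts, partials, Counter())


# Made-up (ticker, quantity) records, purely for illustration.
trades = [("AAPL", 100), ("IBM", 50), ("AAPL", 25),
          ("XOM", 10), ("IBM", 5), ("AAPL", 1)]
totals = distributed_count(trades, n_workers=3)
print(totals)
```

Threads stand in for separate computers here; the point is only that each worker sees a slice of the data and the final answer is assembled from the partial results.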


So, why do companies use Big Data? To do data mining, which this article failed to mention. The idea is pretty simple: if you have thousands of records of data, you can probably use a mathematical model, let's say a probability distribution, to extract information. If the data set is enormous, you are better off creating a custom mathematical model. This is what data mining does: it goes through all those records, tries to create a custom model to analyse or classify them, and then, with that model, we can try to predict future events.
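
As a rough sketch of the simplest case, "fit a probability distribution, then ask it a question", the snippet below fits a normal distribution to a handful of made-up daily returns and estimates how likely a large drop is. The numbers, the -1% threshold and the choice of a normal distribution are all assumptions made for illustration; real data mining would use far more records and a custom model.

```python
from statistics import NormalDist

# Made-up daily returns in percent, standing in for "thousands of records".
history = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.3, -0.2]


def fit_model(samples):
    """Estimate the mean and standard deviation from the observed records."""
    return NormalDist.from_samples(samples)


def probability_below(model, threshold):
    """Ask the fitted model how likely a value at or below the threshold is."""
    return model.cdf(threshold)


model = fit_model(history)
p_big_drop = probability_below(model, -1.0)
print(f"mean={model.mean:.3f} stdev={model.stdev:.3f} P(return <= -1%)={p_big_drop:.6f}")
```

The model calls a 1% drop vanishingly unlikely only because nothing like it appears in the sample it was fitted on, so the prediction is only as good as the history it has seen.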


This is what the NSA does, but it's also what Google does (in instant search, for example). This is very interesting for companies, because they can learn how to target individuals in advertising campaigns.


Here is the thing: trading and investment are nothing like that. If you think you can apply data mining and make a lot of money, you're wrong. Maybe for some time, but you will go bust eventually. These algorithms fail to see data from 20 or 50 years ago. There is a reason why cycles exist, and it's because humans don't learn from past experiences. And algos have very limited memory.


The best investment strategy is to read extensively in history, philosophy, psychology and some maths.


My 2 cents.

Sanity Bear's picture

The day after tomorrow, Big Data will discover human beings aren't quite as mechanical and deterministic as was assumed.

Ballin D's picture

Can someone explain to me the moral issue with trading on trends predicted by anonymous and legally collected data?


I'm not supporting the collection/sale of personal data or government spying by any means, I just don't see how it is wrong to use data to predict trends.  It sounds like an honest strategy that is closer to the fundamentally driven analysis that piqued my interest in the capital markets in the first place.  It's certainly better than using inside info from DC or using your size and political clout to get a faster connection than the rest of the world and effectively frontrunning the markets.

asierguti's picture

There is nothing wrong with using data mining to extract market information, but what is wrong is to rescue these institutions when their mathematical models fail and they go bankrupt.


This is what happened to Long Term Capital Management. They were using some mathematical models by Myron Scholes, which worked beautifully and made a lot of money, until the Asian and Russian crises hit, volatility picked up and the company was bankrupt very quickly. Then we had Wall Street, backed by Greenspan, rescuing LTCM.


I bet many of the algorithmic trading firms are in a similar situation now.

armageddon addahere's picture

LTCM was killed by one of those once in a million years crises that hit the markets every 5 to 10 years.

fauxhammer's picture

Say...isn't that Chris Christie behind those Finster Glints?

Encroaching Darkness's picture

(1) Act randomly, and occasionally irrationally; buy 1 share of stock you don't want, just to break your own patterns.

(2) Have multiple accounts at multiple brokers, with different patterns / objectives; use them all at all times, to diversify patterns to meaninglessness.

(3) Work through overseas brokers, to buy and sell at odd times when the NYSE is closed.

(4) HIPAA protects NOTHING: do you ever read the fine print? Almost anyone can have your data who asks for it (mostly government agencies), so that the blackmail can continue unimpeded if you kick up any trouble.

ALL these bastards are corrupt; act unpredictably, just to screw up their models and ideas. After the Crunch hits, all models will invalidate spontaneously, but none of the control freaks in power understand that.

Act unaccordingly.

armageddon addahere's picture

Manipulating or influencing investor behavior by wash trades, spreading rumors, etc has been illegal since the 1930s. Wonder when they will start enforcing the securities laws again.

Kirk2NCC1701's picture

"So here’s Plan B, a plan for a crowd-sourcing world."

You'd be amazed, but blogs are also used for Crowd-Sourcing. E.g. don't think for a moment that ZH isn't useful for that purpose to both sides (US and Russia).

SuperVinci's picture

say NO to BUSH




virtualInsanity's picture

I don't think we really know nor understand what was "put online" in October 2014. QE was just feeding this monster with energy and it has taken off since reacting to press releases from the controlled media.

Other countries/regions have followed suit with QE, even China will. All in for feeding SkyNet and eventually joining it. One ring to rule them all! Only the ring is a qubit!


John Doeman's picture

Nature long ago foresaw all of this and came up with a shrewd and simple solution: Mutation.

That is the only true defense to AI and its big data mindset.

Throw in a dash of hubris and the recipe is complete.

Things have a way of taking care of themselves.

The way in has always been the way out.

DipshitMiddleClassWhiteKid's picture

I don't see what the big deal is here


there are plenty of quant funds that use this stuff and a lot have blown up


RenTech beats the market because they're a CIA front and some of the profits go to the CIA/NSA for their clandestine activities


I work in data mining. It works pretty well on data that companies have stored on their servers, but the difference with a data point on a customer who bought Doritos with his credit card at 3:00 PM on a Saturday is that there aren't a bunch of people manipulating that data, and the relationship exists between that person and the company.


The markets work through brokers and the prices are constantly changing.


Hedge funds using this stuff is NOTHING new.


If I had the $$ and the data I could use a pretty simple machine learning algo to predict corporate bankruptcies. How come no one's writing an article about me? lol



armageddon addahere's picture

Or you could predict corporate bankruptcies by reading their annual reports, including the footnotes.

enloe creek's picture

Wait you want the government to make a level playing field.   Oh they will level it OK. Right at the back of our heads

Livermore Legend's picture

The "Black Box" is a Pure Illusion, and this is NOT Opinion, but FACT:

Human Nature Cannot be Quantified, and Human Nature is THE CRITICAL INPUT.

As I've explained:

People Want to Believe that the Answers are In Math, So They Can Simply Push that "Magic" Button, and Out Comes Money.......

This is the Timeless Lure.......

Four Letters Summarize ALL of the "Technology" and "Advances":


Prometheus Unbound's picture

The real cold war at the moment is in supercomputers. China is currently on top. [This news is three years old: hedge accordingly] (Via Hackernews 2 days ago)


I'll unpack this joke, since no-one on HN did (muzzled little ones they are):

1) CAPTCHAs have always existed to program machine learning. It was literally crowd-sourcing image recognition - Google are srs smarts

2) Machine reading can already do text reading (apart from things like.. burns = bums etc when scanning old font documents) and is fairly damn decent at linking identity across multiple platforms* and **

3) Military usages of text processing have been going on for a while now. The Defense Advanced Research Projects Agency (DARPA) Machine Translation (MT) Initiative spanned four years. [WARNING: PDF]


We present highlights from three experiments that test the readability of current state-of-the-art system output from (1) an automated English speech-to-text system (2) a text-based Arabic-to-English machine translation system and (3) an audio-based Arabic-to-English MT process. [WARNING: PDF]


Now, there's a reason I'm linking the MIT papers instead of the links, cause.. well. Meh, people tend to spoil the real booty. But, yeah, that's why Google's translate is really good at English <> Arabic.


4) Told you 60% HFTs were about to be culled. 'Cause it's coming.


*Some really smart oldsters pretend to be Schizoid online for this very reason. Others prefer Drag. Or Aliens. YMMV.

** Never ever ask about the color badges @ Google. Yellow = peons / workers who do the scanning; Red = morale / emotional investment / sexy for PR (hint: current Yahoo CEO? a RED); Green = creatives; Blue = Strategy; White = architects. If I were you, I'd never wonder why the EU and Google share the same color coding for their workers. Ho-hum.