How To Lose $172,222 Per Second For 45 Minutes

Originally posted at Python Sweetness blog,

This is probably the most painful bug report I’ve ever read, describing in glorious technicolor the steps leading to Knight Capital’s $460m trading loss due to a software bug that struck on August 1, 2012, effectively bankrupting the company.

The tale has all the hallmarks of technical debt in a huge, unmaintained, bitrotten codebase (the bug itself due to code that hadn’t been used for almost 9 years), and a really poor, undisciplined dev-ops story.

Highlights:

To enable its customers’ participation in the Retail Liquidity Program (“RLP”) at the New York Stock Exchange, which was scheduled to commence on August 1, 2012, Knight made a number of changes to its systems and software code related to its order handling processes. These changes included developing and deploying new software code in SMARS. SMARS is an automated, high speed, algorithmic router that sends orders into the market for execution. A core function of SMARS is to receive orders passed from other components of Knight’s trading platform (“parent” orders) and then, as needed based on the available liquidity, send one or more representative (or “child”) orders to external venues for execution.

 

13. Upon deployment, the new RLP code in SMARS was intended to replace unused code in the relevant portion of the order router. This unused code previously had been used for functionality called “Power Peg,” which Knight had discontinued using many years earlier. Despite the lack of use, the Power Peg functionality remained present and callable at the time of the RLP deployment. The new RLP code also repurposed a flag that was formerly used to activate the Power Peg code. Knight intended to delete the Power Peg code so that when this flag was set to “yes,” the new RLP functionality—rather than Power Peg—would be engaged.

 

14. When Knight used the Power Peg code previously, as child orders were executed, a cumulative quantity function counted the number of shares of the parent order that had been executed. This feature instructed the code to stop routing child orders after the parent order had been filled completely. In 2003, Knight ceased using the Power Peg functionality. In 2005, Knight moved the tracking of cumulative shares function in the Power Peg code to an earlier point in the SMARS code sequence. Knight did not retest the Power Peg code after moving the cumulative quantity function to determine whether Power Peg would still function correctly if called.

 

15. Beginning on July 27, 2012, Knight deployed the new RLP code in SMARS in stages by placing it on a limited number of servers in SMARS on successive days. During the deployment of the new code, however, one of Knight’s technicians did not copy the new code to one of the eight SMARS computer servers. Knight did not have a second technician review this deployment and no one at Knight realized that the Power Peg code had not been removed from the eighth server, nor the new RLP code added. Knight had no written procedures that required such a review.

 

16. On August 1, Knight received orders from broker-dealers whose customers were eligible to participate in the RLP. The seven servers that received the new code processed these orders correctly. However, orders sent with the repurposed flag to the eighth server triggered the defective Power Peg code still present on that server. As a result, this server began sending child orders to certain trading centers for execution.

 

19. On August 1, Knight also received orders eligible for the RLP but that were designated for pre-market trading. SMARS processed these orders and, beginning at approximately 8:01 a.m. ET, an internal system at Knight generated automated e-mail messages (called “BNET rejects”) that referenced SMARS and identified an error described as “Power Peg disabled.” Knight’s system sent 97 of these e-mail messages to a group of Knight personnel before the 9:30 a.m. market open. Knight did not design these types of messages to be system alerts, and Knight personnel generally did not review them when they were received.
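
To make paragraph 14 concrete, here is a minimal Python sketch of a router loop whose only stop condition is the parent order's cumulative fill count. It is an illustration, not Knight's code: `ParentOrder`, `route_children`, the `record_fills` switch and the `max_children` cap are hypothetical names invented for this example, and every child order is assumed to execute in full.

```python
from dataclasses import dataclass


@dataclass
class ParentOrder:
    symbol: str
    quantity: int      # total shares the customer asked for
    filled: int = 0    # cumulative shares executed so far


def route_children(parent, child_size, record_fills, max_children=1000):
    """Send child orders until the parent looks filled; return how many were sent.

    record_fills=False mimics a router whose fill counter is updated somewhere
    else, so this loop never sees any progress. max_children exists only so
    that the sketch terminates.
    """
    sent = 0
    while parent.filled < parent.quantity and sent < max_children:
        sent += 1  # one more child order goes out to the market
        if record_fills:
            # Assumption: each child executes in full, up to the parent size.
            parent.filled += min(child_size, parent.quantity - parent.filled)
    return sent


healthy = route_children(ParentOrder("XYZ", 1000), 100, record_fills=True)
broken = route_children(ParentOrder("XYZ", 1000), 100, record_fills=False)
print(healthy, broken)  # 10 1000
```

With the counter wired in, ten child orders fill the parent and the loop stops. Without it, the loop stops only at the artificial cap, which a production router would not have.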

It gets better:

27. On August 1, Knight did not have supervisory procedures concerning incident response. More specifically, Knight did not have supervisory procedures to guide its relevant personnel when significant issues developed. On August 1, Knight relied primarily on its technology team to attempt to identify and address the SMARS problem in a live trading environment. Knight’s system continued to send millions of child orders while its personnel attempted to identify the source of the problem. In one of its attempts to address the problem, Knight uninstalled the new RLP code from the seven servers where it had been deployed correctly. This action worsened the problem, causing additional incoming parent orders to activate the Power Peg code that was present on those servers, similar to what had already occurred on the eighth server.

The remainder of the document is definitely worth a read, but notably it recommends new human processes to avoid a similar tragedy. Yet the ops failures leading to the bug were not really about human process; they were most likely due to horrible deployment scripts and woeful production monitoring. What kind of cowboy shop doesn’t even have monitoring to ensure a cluster is running a consistent software release!? Not to mention deployment scripts that check return codes.
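
As a rough illustration of the two checks mentioned above, here is a minimal Python sketch. It assumes each server can report some release identifier; the host names (`smars1` to `smars8`), the release labels, and the helpers `hosts_not_running` and `run_step` are hypothetical and not taken from the SEC order.

```python
import subprocess


def hosts_not_running(expected, reported):
    """Return the sorted hosts whose reported release differs from `expected`.

    `reported` maps host name -> release identifier (for example a build hash).
    """
    return sorted(host for host, release in reported.items() if release != expected)


def run_step(argv):
    """Run one deployment command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(argv, check=True)


# Hypothetical cluster state: seven servers upgraded, one left behind.
reported = {f"smars{i}": "rlp" for i in range(1, 8)}
reported["smars8"] = "power-peg"

stale = hosts_not_running("rlp", reported)
print(stale)  # ['smars8']
```

A monitor that pages someone whenever `stale` is non-empty, combined with deployment steps that go through something like `run_step`, would turn a silently skipped copy into a loud failure.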

We can also only hope that references to "written test procedures" for the unused code refer to systematic tests, as opposed to a 10-year-old wiki page.

The best part is the fine: $12m, despite the resulting audit also revealing that the system was systematically sending naked shorts.

Tue, 10/22/2013 - 21:44 | 4081510 NoDebt

I feel sorry for these guys.  If they were big enough they could have gotten all the trades reversed, like Goldman does.

Sucks being them, I guess.

Moral of the story:  be systemically important before you screw up.

Tue, 10/22/2013 - 21:56 | 4081531 chump666

The NY Fed is Goldman Sachs

Tue, 10/22/2013 - 22:11 | 4081557 ZerOhead

"Sucks being them, I guess"


Good thing it probably wasn't their own money they were losing... unless you count the lost bonus checks that is...

Tue, 10/22/2013 - 22:38 | 4081635 Godisanhftbot

Doubt it. The only trades that are reversed are those that are clearly erroneous. These trades were all kosher and at the market.

Wed, 10/23/2013 - 00:13 | 4081813 ebworthen

Kosher, as long as Rabbi Blankfein blesses them.

Wed, 10/23/2013 - 02:12 | 4081943 fourchan

wasnt algoagogo an elvis flick?

Tue, 10/22/2013 - 23:06 | 4081705 icanhasbailout

The US business community has an incredible inability to recognize the value of programming talent, so it ends up with idiots who run bankrupt-the-company type risks without anyone else knowing about it.

If your average businessperson knew how much power IT people in their individual discretion really have over their businesses, they'd shit their pants.

Tue, 10/22/2013 - 23:18 | 4081736 NickVegas

I regularly have their business on the pointy end of my keyboard. The business people think IT is a commodity, until me, or someone sitting in my seat, shows them who really is in charge. Outsource all your IT to India, or China, or Backmanistan, or La La Land, I'm all for it, cause you gonna pay up when you come back begging, if you make it back. It has always been labor vs. capital, but now my labor is capital, hmmmm, the knowledge worker rises, as the parasites look for new ways to deceive, and enslave.

Tue, 10/22/2013 - 21:44 | 4081511 ninja247

Awesome article

Tue, 10/22/2013 - 22:11 | 4081575 Ignatius

This is nothing.  I once misplaced my keys and searched for them for almost an hour.

Tue, 10/22/2013 - 21:54 | 4081526 Charles Nelson ...

Jaime Dimon printed this one out, taped it to his chest and had an Asian hooker shit on him/it for pleasure.

Tue, 10/22/2013 - 21:57 | 4081537 NoDebt

I don't care who you are, that's funny right there.

Tue, 10/22/2013 - 21:54 | 4081528 Joebloinvestor

HAHAHAHA

I bet there was a guy who got a bonus that cut out all the human shit.

 

Probably works for HHS.

Tue, 10/22/2013 - 21:58 | 4081540 RafterManFMJ

Hello! I am Samuel welcome to tex support! How am I about to be helping you?

Wed, 10/23/2013 - 00:40 | 4081859 JuicedGamma

.

Tue, 10/22/2013 - 21:57 | 4081539 0b1knob

This sounds a little like "the dog ate my homework".

More interesting would be a report on who MADE the $460 million that they "lost".  It's a zero sum game after all in the short term.  Was someone aware of the bug, or even perhaps planted it, and decided to get rich rather than get a good employee evaluation?

Tue, 10/22/2013 - 22:19 | 4081594 ZerOhead

Unless you're a banker it's far easier to crash and burn something than it is to make it a success. Either outcome can make you money if you know what you are doing.

Enormous fortunes no doubt will be made when our sociopathic CEO's figure that little shortcut out...

Tue, 10/22/2013 - 22:36 | 4081633 Harbanger

"Unless you're a banker it's far easier to crash and burn something than it is to make it a success."

 

It's always easier to crash and burn something than it is to build something of value, that's the long history of failed collectivism.  But what are you sayin?  Everyone except Bankers wants to crash and burn something? 

Tue, 10/22/2013 - 23:10 | 4081716 icanhasbailout

He's saying this could have always been a planned destruction, with its principal movers having large positions on the other side of the trade, and the "rogue computer code" being nothing more than a convenient excuse for the destruction. Someone DID get the money that Knight lost - who got it and how much? $460m is more than enough to be worth pulling off a major scam for.

Tue, 10/22/2013 - 23:23 | 4081742 Harbanger

Techies rule the modern world!  There's rogue computer code everywhere.  Thanks for clarifying what Zerohead is sayin.  What do you think of Alan Grayson's recent attack on the Tea Party?

Wed, 10/23/2013 - 01:13 | 4081890 zhandax

Hate to ruin a good conspiracy, but read article III (1.) of the administrative proceeding.  "While processing 212 small retail orders...".  Their retail clients were your average piss-poor stock pickers.  Besides, any insider who wanted to take the other side of the trade would have to know where it was routed.  Since the purpose of an order routing system is to find the best bid/offer, the duplicates would have been routed all over the street.

Wed, 10/23/2013 - 03:51 | 4082000 Harbanger

Yes.  Your particular answer is in the details.  Keep searchin.......

What do you think of Alan Grayson's recent attack on the Tea Party?

Wed, 10/23/2013 - 13:29 | 4083214 zhandax

Grayson's a harvard lawyer.  I wouldn't trust him to take my garbage out.

Tue, 10/22/2013 - 23:38 | 4081770 aerojet

I doubt it.  I work with the same kind of people.  Too much complacence and not enough people who act like real engineers.  Just a lot of buffoons who don't know how to do their jobs or what the consequences for failure are.  Here's a hint:  You can get away with being a fuckup for as long as it doesn't put your company out of business.  Then you stop getting a paycheck altogether.

Tue, 10/22/2013 - 23:22 | 4081741 NickVegas

You are so right, sir. It is the pink elephant in the room. Who was on the other side of that trade, baby? How to win by losing. Shucks, there goes 460 million down the durn drain. I'm sorry I programmed it wrong. 

Tue, 10/22/2013 - 23:39 | 4081772 aerojet

It's a nano-traded bot world now, the other side of the trade was all the algos whose programmers and ops people had their shit wired tight.

Wed, 10/23/2013 - 05:40 | 4082072 StandardDeviant

Who was on the other side?  Everyone and anyone who saw and hit their dodgy bids/offers.  Sheesh...

Tue, 10/22/2013 - 21:59 | 4081543 lasvegaspersona

def: "systemically important":  dangerous, too risky to be allowed to continue, capable of annihilation, also...well connected, major contributor, and also so large as to be able to change the very definition of itself at will. New Mahem Dictionary

Tue, 10/22/2013 - 22:10 | 4081556 Yen Cross

  How does one obtain a copy of that tainted "Power Peg" code? I can think of a few .gov assholes that I would like to anonymously share it with. ;-)

 

Tue, 10/22/2013 - 22:22 | 4081598 zorba THE GREEK

Yen... That code is not available at this time. It is being used for the Obamacare website.

Tue, 10/22/2013 - 22:39 | 4081640 Yen Cross

lol good one. :-)

Wed, 10/23/2013 - 01:41 | 4081920 dunce

Were they both written by the same bunch of penis puffers?

Wed, 10/23/2013 - 09:18 | 4082371 Grinder74

Did they get the proper licensing or just steal it like the rest of their software?

Tue, 10/22/2013 - 22:21 | 4081580 Atomizer

If you download the new Apple Maverick software, your purchasing buying habits will assist us in developing new semi-periphery manufacturing zones. We appreciate the text/email questionnaire feedback sent to your smartphone.

Tue, 10/22/2013 - 22:17 | 4081591 geotrader

Knight!  Where the deal gets done.

Tue, 10/22/2013 - 22:18 | 4081592 freedogger

That they didn't automate the deploy, i.e., someone has to manually copy files, is a red flag. Automate that shit and automate the rollback. Test the deployment scripts. Even written procedures for deployment are a sign of failure; documents are seldom read and followed, especially if the process is repeated often. No, this has to be automatic, fully vetted and documented in executable and well tested code. The testing of the deployment code should be automated. Changes in source control trigger automated tests to run. Failing tests or code without tests stop the release cold until it is addressed.

The problem with many companies is that executive technology decision makers are really not at all qualified or competent enough to make the decisions they make. Why is this? The usual nepotism, fraternal and rotten from the top down answers apply. 
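
The automated deploy-and-rollback idea in the comment above could look roughly like this in Python. It is only a sketch under stated assumptions: `deploy_all` and the `install`, `verify` and `rollback` callables are hypothetical stand-ins for whatever a real deployment tool would do on each server, and the eight `server` names are made up.

```python
def deploy_all(hosts, install, verify, rollback):
    """Install on every host; on any failure, roll back every host touched so far.

    Returns True only if every host was installed and verified.
    """
    touched = []
    try:
        for host in hosts:
            touched.append(host)
            install(host)
            if not verify(host):
                raise RuntimeError(f"verification failed on {host}")
    except Exception:
        # Undo in reverse order so the cluster ends up on a single release.
        for host in reversed(touched):
            rollback(host)
        return False
    return True


# Toy run: the copy to the eighth server fails, so all eight are rolled back.
state = {}


def install(host):
    if host == "server8":
        raise OSError("copy failed")
    state[host] = "new"


def verify(host):
    return state.get(host) == "new"


def rollback(host):
    state[host] = "old"


hosts = [f"server{i}" for i in range(1, 9)]
ok = deploy_all(hosts, install, verify, rollback)
print(ok, sorted(set(state.values())))  # False ['old']
```

The point of the all-or-nothing shape is that the cluster never sits in a mixed state; what "old" means still has to be a release that is known to be safe to run.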

Tue, 10/22/2013 - 23:42 | 4081775 aerojet

I've been trying to make the people in my company do this for two years now.  They still don't get it.  

I can't get eight servers set up exactly the same way no matter what I do, they always fuck up something!  It isn't that fucking hard to automate, even.

The problem is that most companies are HR organized, in other words, not organized to be successful and efficient.

Wed, 10/23/2013 - 00:16 | 4081819 freedogger

If you can't change your company, change your company.

vagrant up!

Wed, 10/23/2013 - 05:44 | 4082076 StandardDeviant

Absolutely, "freedogger".  They set up seven machines properly, but screwed up the eighth?!  This sort of thing absolutely needs to be deployed and verified automatically.

Oh, and the bit about "repurposing" parts of the configuration?  That's just sloppiness, and laziness, and is begging for Murphy to come and pay you a visit.

Tue, 10/22/2013 - 22:38 | 4081637 4 wheel drift

no worries mate.....

 

compared to oblamascare........

 

just a flesh wound.....    :)

Wed, 10/23/2013 - 01:56 | 4081933 Element

More like a paper cut really.

Tue, 10/22/2013 - 22:54 | 4081669 RaceToTheBottom

I bet this place used domain specific tech outfits that were familiar with NYC WS.  This reduces the number of firms that work there and makes the ones that do complacent.  I don't think financial outfits have the intelligence to hire the best, reduces their bonuses....

Tue, 10/22/2013 - 22:55 | 4081674 Stuck on Zero

The problem was traced down to their lack of skilled Cobol programmers.

 

Wed, 10/23/2013 - 01:32 | 4081909 zhandax

The SEC complaint was due to their lack of paid congressdouches.

Tue, 10/22/2013 - 22:56 | 4081676 infinity8

It's all mediocracy, all the time, everywhere now.

 

Wed, 10/23/2013 - 00:14 | 4081814 I Write Code

werd

Wed, 10/23/2013 - 00:24 | 4081831 Bruce Flea

Here's a bit of code that would have saved $460 million:

/* */

Problem solved.

Wed, 10/23/2013 - 00:57 | 4081881 cwwang

LOL agreed!

Whoever does build and deploy on that code base needs to be hung.  I am surprised they tried to debug the problem "real-time" while the problem was occurring. Complete operations issue, since it seems they didn't even have a roll-back plan.

I wouldn't even blame it on the software or the programmer completely!

 

 

Wed, 10/23/2013 - 03:41 | 4082001 New World Chaos

I'm surprised they didn't just pull out the ethernet cables as soon as they realized something had gone horribly wrong.  Were they afraid of losing money during their 45 minutes of running around like headless chickens?  Bwahahahah

Wed, 10/23/2013 - 01:07 | 4081891 jballz

 

I see titties.

Hey I want to learn to write code and make a shitload of money. What's the fastest way to get from the above looking like tities, and making a shitload of money writing code?

Thanks for your guidance!
