
Innovations in Trading and Technology

18 January 2013

Are We Testing Our Systems Enough?

Several recent failures indicate that we are not sufficiently investing in Quality Assurance at the system and market level.

We have recently witnessed several serious trading failures caused by software defects, which makes you wonder: Are we investing enough in testing to deliver a safe and stable market?

Bug-free software is a fallacy.  We can test all we want, but we will never identify all bugs nor have the resources to remove them.  In fact, if you have no known bugs you probably are not testing enough or properly.  More worrying, the number of defects will not decrease over time, but rather increase.  As trading continues to shift towards electronic markets and system complexity and interconnectivity increase, so will the number and severity of defects.  How do we handle this?

[Related: "BATS’s Pricing Errors Fuel Complexity Debate"]

We can learn from the aircraft industry.  Manufacturers strive to build jet engines that never fail, but they knowingly release engines with defects; otherwise, they could never deliver a product to market.  Instead, they use testing to gain insight into the jet engine: its operational characteristics, performance profile, safe operating parameters, and defects.  This provides a rich understanding of the system, which they use to write operating guidelines and maintenance manuals and to mitigate operational failures.  They apply an engineering practice called Failure Mode, Effects, and Criticality Analysis (FMECA) to every defect to identify potential problems and define actions or compensating provisions to remedy their impact.

Likewise, we should not view testing narrowly as a means to eradicate defects.  Rather, testing is one aspect of a broader discipline, Quality Assurance (QA), which is the process of ensuring that the delivered system conforms to its target specifications (functional, non-functional, regulatory, business objectives, coding standards, …).  The process starts at project kickoff and continues through the system's operational lifecycle until decommissioning.  Testing is a quality control tool that measures compliance with specifications, but in and of itself it does not fix anything.  A good testing process reveals the developed system's behavior, problem areas, and specific defects: a quantitative view of the system's conformance to specification and verification that the system delivers the required capability.

But as an industry, are we investing enough in QA to mitigate potential future failures?  It appears not.  The recent NASDAQ Facebook IPO glitch, BATS's own IPO failure, Knight Capital's trading glitch, and the BATS order mispricing do not share a root cause; each is a serious failure in its own right.  All have received extensive press coverage, incurred (or may yet incur) financial losses, and caused reputational damage.

Potential causes for these failures:

   * Stakeholder expectations leading to ambiguous or conflicting requirements,

   * Insufficient test coverage and result analysis,

   * Rushing system release (especially with known defects),

   * Inadequate operational guidance to handle potential problem scenarios.

I can't definitively state that these deficiencies caused the above failures, but they are common QA shortcomings.  In particular, disparate expectations among stakeholders (business, regulators, operations, and users) and the demand for rapid change can lead to defects, which may manifest immediately or, more concerning, cause irreparable damage over time.  Components such as trading platforms, matching engines, order routing systems, and market distribution systems will at some point fail to operate properly, and when they do, we should have well-defined, vetted guidelines for handling these failures.  As with FMECA, we should identify potential problems and define actions and compensating provisions to remedy their impact, including at the market level.
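To make the FMECA-style approach concrete, here is a minimal sketch of a failure-mode register in Python.  Everything in it is an illustrative assumption: the FailureMode type, the 1-to-5 severity and likelihood scales, the simplified severity-times-likelihood score (a stand-in for a formal criticality analysis), and the sample entries, which are hypothetical and do not describe any of the incidents above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of a hypothetical failure-mode register."""
    component: str
    description: str
    severity: int    # assumed scale: 1 (minor) to 5 (market-wide impact)
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    response: str    # compensating provision agreed in advance

    @property
    def criticality(self) -> int:
        # Simplified score for illustration; a real FMECA would use
        # an agreed criticality method, not necessarily a plain product.
        return self.severity * self.likelihood


def rank_by_criticality(modes):
    """Return failure modes ordered from most to least critical."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)


# Hypothetical entries, for illustration only.
REGISTER = [
    FailureMode("matching engine", "opening auction fails for a symbol",
                severity=5, likelihood=2,
                response="halt the symbol and follow the manual-open runbook"),
    FailureMode("order router", "orders sent to an unintended venue",
                severity=5, likelihood=1,
                response="trigger the kill switch and cancel open orders"),
    FailureMode("market data feed", "quotes disseminated late",
                severity=3, likelihood=3,
                response="fail over to the backup feed and flag stale quotes"),
]

if __name__ == "__main__":
    for mode in rank_by_criticality(REGISTER):
        print(f"{mode.criticality:>2}  {mode.component}: "
              f"{mode.description} -> {mode.response}")
```

The value of such a register lies less in the scores than in forcing each failure mode to be paired with a response that has been vetted before it is needed.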

Yet even increased investment in QA will not guarantee the absence of faults.  As noted at the outset, bug-free software is a fallacy; testing can prove the existence of defects, but it cannot prove their absence.  We can increase investment in the QA function to improve system resiliency, but ultimately we need to test at the market level to ensure that component defects are controlled and contained.  Many years ago, trading systems were simpler and we regularly conducted industry-wide tests; this would be very difficult to accomplish today.  We should build to keep the market up and expect failures, but be ready to handle them quickly and effectively.  Achieving this goal requires a market-level QA process and matching investment to support it.
