OptionStream: An automated system for tracking derivative effects on equity prices

https://doi.org/10.1016/j.eswa.2005.06.015

Abstract

We present the design and development of a flow-based software system that became necessary for the study of a class of options-based price-estimators in the financial markets. Because a combination of factors (cost, reliability, uniformity and convention) made it virtually impossible to obtain historical options data, we developed a data-flow system to capture, process and analyze streaming data over a period of more than a year. The system utilizes distributed processing nodes with checkpointing to process input data streams and reliably compute a variety of estimator updates for studying relationships between equity prices and the functions of corresponding option-related variables. The flow-based architecture is designed to support the high-volume data characteristics of the options market, along with the severely taxing computational requirements of hypothetical option-pricing, so that experimental investigation of novel estimators becomes possible with current and historical data. Features such as high-volume stream-processing, intermediate checkpointing and load distribution make the design viable for more general streaming data processing applications.

Introduction

Financial engineering algorithms typically attempt to predict the price trajectory of some security or predict a portfolio's returns over time. More generally, research studies in this area are geared towards capturing relationships between the prices of financial trading instruments, market-impacting events and market-related variables, with a view towards profits under sound principles of risk-management. In this paper, we describe an automated flow-based software system for the study of relationships between financial derivatives and the pricing of their underlying over time. We demonstrate the idea with a prototype system developed to track the behavior of functions of option interest and prices alongside the underlying equity, so that aberrant, unexpected or interesting behavior may be exposed through transformed views of data over time. To demonstrate the utility of the idea, we utilize options (Natenberg, 1994) data to gauge the near-term price-behavior of a security or the near-term direction of the broad market (which is closely tied to the behavior of important tradable and optionable sector indexes), based on a phenomenon known as MaxPain in the trading folklore (Software, 2002).

This MaxPain hypothesis has apparently not been investigated in any systematic way but survives as one of a handful of myths that are routinely used by traders (Maximum-Pain Theory Options Analysis, 2004) to gauge price direction using the distribution of the number of outstanding option contracts for the next expiry period (Fullman, 1992). The implication is that option writers, i.e. option sellers, who may or may not be market-makers in options, are able to influence the direction of the underlying's share price either through concerted share buying and selling or through linkage with market-makers in the underlying shares. If a subclass of the class of option writers is able to influence share price at or near the time of option expiry (Fullman, 1992), it is conceivable that this influence is sufficient to make the price move towards a value that tends to minimize their potential payout (in terms of exercisable options) to option holders. The hypothesis relies on the explicit assumption that option holders retain their options through expiry instead of trading their options, though there is scant evidence to support this view. Nevertheless, the idea carries some weight in the trading community given that traders often refer (Maximum-Pain Theory Options Analysis, 2004) to MaxPain in estimating near-term price direction. The term MaxPain comes from the fact that the share price that brings minimum loss to the option writers also causes maximum pain, in terms of collective monetary loss, to the option holders.
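
The verbal definition above can be illustrated with a small calculation. The following is a minimal sketch, not the formal description given later in the paper: it assumes open interest is available as a strike-to-contracts mapping for calls and puts, that every option is held to expiry (the assumption the hypothesis itself relies on), and that each contract covers 100 shares. The names `writer_payout`, `max_pain` and `CONTRACT_SIZE` are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Assumption: 100 shares per contract, as for standard US equity options.
CONTRACT_SIZE = 100


def writer_payout(price: float,
                  call_oi: Dict[float, int],
                  put_oi: Dict[float, int]) -> float:
    """Total intrinsic value owed by writers if the share expires at `price`.

    call_oi / put_oi map strike -> open interest (number of contracts).
    """
    calls = sum(oi * max(price - strike, 0.0) for strike, oi in call_oi.items())
    puts = sum(oi * max(strike - price, 0.0) for strike, oi in put_oi.items())
    return CONTRACT_SIZE * (calls + puts)


def max_pain(call_oi: Dict[float, int],
             put_oi: Dict[float, int],
             candidates: Optional[Iterable[float]] = None) -> float:
    """Candidate expiry price that minimizes the writers' total payout.

    By default only the listed strikes are tried: the payout is piecewise
    linear and convex in the price, so its minimum is attained at a strike.
    """
    if candidates is None:
        candidates = sorted(set(call_oi) | set(put_oi))
    return min(candidates, key=lambda p: writer_payout(p, call_oi, put_oi))


# Illustrative, made-up open interest for one expiry period.
calls = {45: 100, 50: 300, 55: 200}
puts = {45: 200, 50: 300, 55: 100}
print(max_pain(calls, puts))  # 50
```

In this made-up example the writers owe 250,000 at an expiry price of 45 or 55 but only 100,000 at 50, so 50 is the price of least loss to writers and, equivalently, of maximum pain to holders.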

Flow-based software architectures exploit pipelined or distributed processing modules connected together by data flows, with processing triggered by the availability of data. Examples include the Stream Machine (Paul Barth) and the Basic Data-Flow processor (Dennis and Misunas, 1975); also, marginally related to our application is the Data-Flow Database machine (Boral & DeWitt, 1980), which attempts to exploit a data-flow processor for executing relational queries. Software architectures based on data flows range from simple graphs with basic functionality to full-fledged packet-switched networks with concurrency and general recursion. A key advantage of a flow-based system is that it can be made to scale with respect to the number and volume of input flows; processing can be distributed for load-balancing, with support for module addition/modification. Distribution of processing accommodates and encourages cooperative software development, where individuals with expertise in algorithmics for financial derivatives may contribute to experimental features or even use proprietary analysis at processing nodes. We adapt the general flow-based architecture to our needs: efficient processing, timing constraints, checkpointing and plugins for extending functionality.
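
To make the combination of data-driven processing and checkpointing concrete, here is a minimal single-process sketch. It is an illustration under stated assumptions, not the system described in this paper: the class `CheckpointedNode`, its JSON checkpoint file and the index-based resume are all hypothetical, and the sketch assumes the input stream can be replayed from its beginning after a restart.

```python
import json
import os
import tempfile
from typing import Any, Callable, Iterable


class CheckpointedNode:
    """Hypothetical processing node: applies `fn` to each arriving record,
    forwards the result to a sink, then records how far it has progressed."""

    def __init__(self, fn: Callable[[Any], Any], checkpoint_path: str) -> None:
        self.fn = fn
        self.checkpoint_path = checkpoint_path

    def _load(self) -> int:
        # Index of the first record not yet processed (0 if no checkpoint).
        try:
            with open(self.checkpoint_path) as f:
                return json.load(f)["next_index"]
        except FileNotFoundError:
            return 0

    def _save(self, next_index: int) -> None:
        # Write to a temporary file, then rename, so that a crash cannot
        # leave a half-written checkpoint behind.
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"next_index": next_index}, f)
        os.replace(tmp, self.checkpoint_path)

    def run(self, records: Iterable[Any], sink: Callable[[Any], None]) -> None:
        start = self._load()
        for i, record in enumerate(records):
            if i < start:
                continue  # handled before the last restart
            sink(self.fn(record))
            self._save(i + 1)  # checkpoint only after the result is delivered


# Demo: the second run replays the stream but only handles the new record.
path = os.path.join(tempfile.mkdtemp(), "node.ckpt")
results = []
node = CheckpointedNode(lambda quote: quote * 2, path)
node.run([1, 2, 3], results.append)
node.run([1, 2, 3, 4], results.append)
print(results)  # [2, 4, 6, 8]
```

Saving the checkpoint after the sink call means a record may be delivered twice if the node fails between the two steps, but never skipped. Nodes can be chained by passing a wrapper around one node's processing as the `sink` of another; a distributed deployment would replace the in-process call with a network transport.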

In this paper, we report on our experiences with the design and implementation of a flow-based system for processing the option-data streams, a choice dictated by the need for scalability and load distribution. Such an architecture also offers the benefit of enabling modification or addition of modules on the fly, features that we learned are necessary when interacting with data services over which we have no control, or when experimenting with different data-collection and mining algorithms. While the approach we take is applicable to more general real-time (RT) data collection and analysis, our focus is on a challenging computational problem involving options data in the financial markets. Here, we restrict our attention to the basic options-tracking and equity price-estimation problem and software system design; the results of a statistical study involving the MaxPain and related estimators are given in (Stef-Praun and Rego, 2005). In Section 2 we review options, and in Section 3 we provide the first formal description of the MaxPain problem, with an outline of options pricing. In Section 4 we present the flow-based system architecture in terms of data and computation flows, explaining how general flow-based computation is done, and tailoring the application to MaxPain-related estimators. We conclude briefly with our experiences in Section 5.

The options market

The term financial investment is broadly associated with stock trading and the popular financial exchanges. Over the past thirty years the development and growth of the derivative product industry (Smithson and Smith, 1998) has made a wide spectrum of prospective investments available to the average investor. In addition to trading stocks and proxies for indexes, it is now possible to trade derivatives on a large number of commodities including metals, energy, meats, grains and also interest

Market forces and options-based price estimators

The recent spate of problems on Wall Street, including the Enron and WorldCom debacles, general financial scandals, late-trading by hedge funds, alleged collusion by NYSE specialists and NASDAQ market-makers has made us aware that the potential for significant profits makes various forms of questionable market-manipulation a highly tempting occupation (Wall Street scandals at a glance, 2003). This is especially true of specialists and market-makers in lucrative and volatile markets (The

A flow-based system architecture

The characteristics of a flow-based architecture that specifically suits our needs include a capacity for high-volume input (ranging from RT streaming input to volumes of end-of-day (EOD) market data), highly intensive computation, intermediate checkpointing, and result dissemination. Because sources of data streams and types of data streams are highly fluid in a changing market environment, loosely coupled and easily modifiable modules are ideal. The goal is to be able to scale to handle multiple inputs

Experiences and conclusions

What began as a project to simply test the predictive capacity of certain estimators based on options data turned into a project requiring a software architecture for processing EOD web-based and RT streaming data. We encountered many problems such as the lack of uniformity and convention in the storage of raw historical data for options. Different data services tend to capture the complex options data streams differently and, because of an archaic naming convention based on rotating the same

References

  • F. Black et al. The pricing of options and corporate liabilities. Journal of Political Economy (1973)
  • Paul Barth, Scott Guthery, David Barstow. The stream machine: a data flow architecture for real-time applications....
  • W. Christie et al. Did Nasdaq market makers implicitly collude? Journal of Economic Perspectives (1995)
  • C. Clewlow et al. Implementing derivatives models (2001)
  • Cramer, J (1997). Cramer on Expiration Week Fun and Games, Street.com Commentary, December 17....
  • Cramer, J (1998). Cramer's Rewrite of His Column on the Options Firefight over Intel, Street.com Commentary, July 18....
  • J.B. Dennis et al. A preliminary architecture for a basic data-flow processor (1975)
  • G. Fontanills. The stock market course (2001)
  • S.H. Fullman. Options: A personal seminar (1992)
  • P. Gaughen. An analysis of odd-eighth quotation avoidance on the Nasdaq securities exchange: collusion and conspiracy. Social Sciences (1999)
  • L. Gitman et al. Fundamentals of investing (2004)
  • Haran Boral et al. Design considerations for data-flow database machines. Proceedings of the 1980 ACM SIGMOD International Conference on Management of Data (1980)
  • J. Hull. Options, futures and other derivatives (2003)