Predicting box office with and without markets: Do internet users know anything?

https://doi.org/10.1016/j.infoecopol.2013.05.001

Highlights

  • HSX and Derby game generate reliable estimates of opening weekend box office.

  • Participants over-predict low-earning films but under-predict high-earning films.

  • Prediction bias is greater in the Derby (non-market) mechanism.

  • Higher budget films, sequels and films featuring stars are more accurately predicted.

Abstract

This study investigates and compares predictions of opening weekend box office revenue from an online prediction game, the Derby, and an online prediction market, the Hollywood Stock Exchange (HSX), using a sample of 141 films released in 2007. Overall, both mechanisms provide accurate predictions of box office outcomes but tend to over-predict small-earning films and under-predict large-earning films. This bias is present across a number of sub-samples disaggregated by film-specific variables. The bias is consistently greater in the Derby game, suggesting that the market mechanism is superior to the non-market mechanism. There is also evidence that larger budget films, sequels and films featuring stars are predicted more accurately in both settings, and that individual-level predictions improve as films spend more time at the box office and as players gain experience.

Introduction

At least since the work of Galton (1907), economists and other social scientists have observed that collectives possess the ability to formulate accurate predictions about unknown future events in a variety of contexts. Sometimes referred to as the ‘wisdom-of-crowds’ or ‘collective-intelligence’ effect, these aggregated predictions have been shown to routinely outperform expert, statistical, and other predictive techniques. Hayek (1945) explicated the role markets may play in this process and, in more recent times, prediction markets have received much attention as powerful tools by which to aggregate widely dispersed information. The aggregation mechanism may take a variety of forms, ranging from a simple survey or poll to more exotic mechanisms such as a double auction (e.g. Chen and Plott, 2002) or pari-mutuel design (e.g. Plott et al., 2003; Axelrod et al., 2009).

But while most prediction mechanisms rely on offering real financial incentives, there are many examples of online games and betting markets which attract large numbers of dedicated and enthusiastic participants yet offer no financial incentives, for example fantasy sports betting or virtual stock markets such as the Hollywood Stock Exchange (HSX). Although some may argue there is little value in studying these so-called ‘artificial’ markets (due to the lack of real incentives), others have observed that predictions arising within them are similar to those from markets where real financial gains are attainable (Pennock et al., 2001; Servan-Schreiber et al., 2004). It would seem that individuals who participate in such markets and games do so for the implicit satisfaction they derive from simply competing against other participants.

Beyond questions pertaining to artificial vs. real markets, there is also an open question as to whether market-based mechanisms actually do provide better forecasts than other mechanisms such as statistical forecasting, polling or surveying. In the context of elections, Erikson and Wlezien (2008) and Rothschild (2009) show the superiority of polls under certain conditions. In the context of football, baseball and movies, Goel et al. (2010) point out that the benefits of prediction markets are often only marginal, and advocate weighing those benefits against costs before implementing a market in place of alternative, and often simpler, mechanisms.

This study investigates and compares predictions of opening weekend box office revenue from an online prediction game, the Derby, and an online prediction market, the Hollywood Stock Exchange (HSX), using a sample of 141 films released in 2007. Predicting box office returns is an important task for film producers and investors but has proven difficult in an industry characterised by extreme levels of uncertainty. Such uncertainty is captured in the famous quote of screenwriter William Goldman, who declared about the industry: “nobody knows anything” (Goldman, 1983). The causes and consequences of box office uncertainty have been explored by De Vany and Walls (1996, 1999, 2004), who attribute it to the inherent unpredictability of audience reception and bandwagon effects caused by consumer interaction, both features of markets for information goods such as movies.

While a number of studies have applied statistical techniques for predicting box office outcomes, statistical models are often complicated by data availability and/or reliability, endogeneity of key explanatory variables, and the heavy-tailed nature of box office revenue data. An alternative approach is to use prediction markets. A number of studies have shown that box office predictions arising from the HSX are accurate even though the market is artificial with no real incentives. Pennock et al. (2001) investigate the HSX and show a high degree of correlation between actual and predicted outcomes. Spann and Skiera (2003) similarly conclude that the HSX market has strong predictive power but find that the predictions are biased upwards for small-earning films and biased downwards for large-earning films. Elberse and Eliashberg (2003) use HSX prices as a measure of opening week expected revenue in a dynamic model of demand and supply. Elberse and Anand (2006) study advertising’s impact on HSX prices. Foutz and Jank (2010) find not only that closing HSX prices have strong predictive power, but also that predictive accuracy improves when the history of the price is included using functional shape analysis techniques. And, in a more general survey of prediction markets, Wolfers and Zitzewitz (2004) also note the strong correlation between HSX closing prices and actual revenues.

The primary purpose of this study is to evaluate another, and arguably simpler, prediction mechanism and compare it with the predictions of the HSX market. The Derby game, a discontinued internet-based game formerly hosted on the popular BoxOfficeMojo.com website, involves participants making weekly predictions of weekend box office revenue for a selection of nominated films. Both the Derby and HSX mechanisms share common features in that they are accessible to the entire internet community at zero cost, yet offer no financial incentives to participants. The mechanisms differ in that the HSX is based on a traditional stock market design with an automated market maker whereas the Derby mechanism simply aggregates participant revenue estimates.
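The difference between the two designs can be made concrete with a small sketch of non-market pooling. The text says only that the Derby "aggregates participant revenue estimates" and does not state the aggregation rule, so the use of a per-film mean and median below is an assumption, as are the function name `pool_predictions`, the film labels, and the numbers, which are invented purely for illustration.

```python
from collections import defaultdict
from statistics import mean, median

def pool_predictions(entries):
    """Pool individual revenue guesses into one forecast per film.

    `entries` is an iterable of (film, guess) pairs. The mean and median
    are assumed aggregation rules, not the Derby's documented method.
    """
    by_film = defaultdict(list)
    for film, guess in entries:
        by_film[film].append(guess)
    return {
        film: {"mean": mean(guesses), "median": median(guesses), "n": len(guesses)}
        for film, guesses in by_film.items()
    }

# Hypothetical guesses in $ millions (illustrative only, not the paper's data).
entries = [
    ("Film A", 10.0), ("Film A", 14.0), ("Film A", 30.0),
    ("Film B", 2.0), ("Film B", 4.0),
]
pooled = pool_predictions(entries)
print(pooled["Film A"])
```

The median is less sensitive than the mean to a single extreme guess, which is one reason a designer of such a game might prefer it; which rule suits a given setting is an empirical question.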

This study uses a set of linear regression specifications to compare predicted with actual US opening weekend box office revenues for a sample of 141 movies. The predictions are made the day before theatrical release. The value in studying prediction mechanisms in the context of movies derives principally from the potential they offer producers and investors for decision making at the various stages of the production and distribution processes. The very short-range data in this study are not useful for this purpose. However, the results can be useful in designing prediction mechanisms to assist in decision making over longer time horizons.

Consistent with previous studies, results show that the HSX market mechanism provides accurate predictions of opening weekend revenues but with some tendency towards under-prediction of high-earning films and over-prediction of low-earning films. The Derby game also generates reasonably accurate predictions with the same bias. The bias is consistently greater with the Derby game than with the HSX mechanism across a number of sub-sample analyses. There is also some evidence that large budget films, sequels and films featuring stars are more accurately predicted. Both the HSX and Derby mechanisms provide better forecasts than a simple regression of revenues on explanatory film covariates.

The nature of the Derby game data allows investigation of individual and pooled accuracy. The results show that individual participants typically make better predictions for higher budget films and for films playing at more theatres but are less accurate for films with stars. Further, individuals more accurately predict sequels, remakes, reissues and films further into their run. Individuals generally make more accurate predictions as they acquire more experience in the game. In terms of pooled accuracy, and consistent with the results of the comparative exercise, there is some evidence that large budget films are predicted more accurately. Not surprisingly, pooled predictions are also more accurate for films later in their theatrical run.

The remainder of the paper is structured as follows. Section 2 discusses the HSX and Derby prediction data as well as the institutional features of each prediction mechanism. It also presents basic summary statistics of predictive accuracy in relation to various film-specific covariates. Section 3 outlines the empirical methodology and results of three exercises: (i) aggregate-level comparison of the HSX and Derby prediction mechanisms; (ii) disaggregated comparison of the HSX and Derby mechanisms in relation to film-specific covariates; and (iii) analysis of the extended Derby data set of individual and pooled prediction accuracy. Finally, Section 4 discusses results and concludes the paper.

Section snippets

Data

The primary data used in this study are predicted and actual opening-weekend box office revenues for a sample of 141 films released in US cinemas during 2007. Data from the HSX market are at the film level, whereas data from the Derby include individual predictions of 2523 participants giving a total of 177,812 prediction data points. The predicted and actual revenues were augmented with a set of film-specific variables (Nielsen EDI) including production budgets and weekly theatres, as well as

Mechanism performance of the HSX and Derby

To investigate mechanism performance and the role of film-specific covariates, a number of regressions are estimated and reported in Table 3. In specification (1) actual revenue is regressed on HSX and Derby predictions, respectively.
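A regression of actual on predicted revenue can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's specification: it fits actual = a + b × predicted in levels by ordinary least squares with no covariates, and the function name `fit_actual_on_predicted` and all numbers are hypothetical. Under this set-up an unbiased forecast would give a = 0 and b = 1, while a negative intercept together with a slope above one corresponds to over-prediction of low-earning films and under-prediction of high-earning films.

```python
import numpy as np

def fit_actual_on_predicted(predicted, actual):
    """OLS fit of actual = a + b * predicted; returns (a, b, r_squared)."""
    x = np.asarray(predicted, dtype=float)
    y = np.asarray(actual, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # intercept column plus predictor
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    a, b = coef
    resid = y - X @ coef
    r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return float(a), float(b), float(r_squared)

# Synthetic forecasts in $ millions (illustrative only, not the paper's data).
predicted = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
# Actuals are constructed so that forecasts below 20 overshoot the outcome
# and forecasts above 20 undershoot it.
actual = -2.0 + 1.1 * predicted

a, b, r_squared = fit_actual_on_predicted(predicted, actual)
print(f"intercept={a:.2f}, slope={b:.2f}, R^2={r_squared:.3f}")
```

Running the same fit separately on each mechanism's forecasts, and comparing how far each intercept and slope sit from 0 and 1, is one simple way to compare the size of the bias across mechanisms.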

Conclusions and discussion

Information goods, such as movies, are typically characterised by high levels of uncertainty and statistical models often encounter difficulties owing to data availability/reliability, endogeneity of key variables, and heavy-tailed outcomes. An alternative to using statistical techniques is to use prediction markets, or other information aggregation mechanisms, in making predictions to assist decision making. As previously observed in relation to the HSX, a high degree of correlation between

References (26)

  • B. Axelrod et al.

    The design of improved parimutuel-type information aggregation mechanisms: inaccuracies and the long-shot bias as disequilibrium phenomena

    Journal of Economic Behavior and Organization

    (2009)
  • A. De Vany et al.

    Motion picture profit, the stable Paretian hypothesis and the curse of the superstar

    Journal of Economic Dynamics and Control

    (2004)
  • Arrow, K., Sunder, S., Forsythe, R., Litan, R., Zitzewitz, E., Gorham, M., Hahn, R., Hanson, R., Kahneman, D., Ledyard,...
  • Chen, K.Y., Plott, C.R., 2002, Information Aggregation Mechanisms: Concept, Design and Field Implementation. Social...
  • R.B. D’Agostino et al.

    A suggestion for using powerful and informative tests of normality

    American Statistician

    (1990)
  • A. De Vany et al.

    Bose-Einstein dynamics and adaptive contracting in the motion picture industry

    The Economic Journal

    (1996)
  • A. De Vany et al.

    Uncertainty in the movie industry: does star power reduce the terror of the box office?

    Journal of Cultural Economics

    (1999)
  • A. Elberse et al.

    Advertising and expectations: the effectiveness of pre-release advertising for motion pictures: an empirical investigation using a simulated market

    Information Economics and Policy

    (2006)
  • A. Elberse et al.

    Demand and supply dynamics for sequentially released products in international markets: the case of motion pictures

    Marketing Science

    (2003)
  • J. Eliashberg et al.

    The motion picture industry: critical issues in practice, current research, and new research directions

    Marketing Science

    (2006)
  • R. Erikson et al.

    Are political markets really superior to polls as election predictors?

    Public Opinion Quarterly

    (2008)
  • N. Foutz et al.

    The wisdom of crowds: pre-release forecasting via functional shape analysis of the online virtual stock market

    Marketing Science

    (2010)
  • F. Galton

    Vox Populi

    Nature

    (1907)

    I am grateful to Elina Gilbourd for research assistance in the early stages of this project. I am also extremely grateful for the comments of two anonymous referees and the editor, Lisa M. George. All remaining errors are my own.
