Abstract
Quantitative tools have been widely adopted to extract information from a wide variety of financial data. Mathematics, statistics and computer algorithms have never been more important to financial practitioners. Investment banks develop equilibrium models to evaluate financial instruments; mutual funds apply time series models to identify the risks in their portfolios; and hedge funds hope to extract market signals and statistical arbitrage opportunities from noisy market data. The rise of quantitative finance in the last decade has relied on the development of computing techniques that make processing large datasets possible. As more data becomes available at higher frequencies, more research in quantitative finance has shifted to the microstructure of financial markets. High frequency data is a typical example of big data, characterized by the 3V's: velocity, variety and volume. In addition, the signal-to-noise ratio in financial time series is usually very low, and high frequency datasets are more likely than low frequency ones to contain extreme values, jumps and errors. Specific data processing techniques and quantitative models have been designed to extract information from financial data efficiently. In this chapter, we present quantitative data analysis approaches in finance. First, we review the development of quantitative finance in the past decade. Then we discuss the characteristics of high frequency data and the challenges it brings. Quantitative data analysis consists of two basic steps: (i) data cleaning and aggregation; (ii) data modeling. We review the mathematical tools and computing technologies behind the two steps. The valuable information extracted from raw data is represented by a group of statistics. The most widely used statistics in finance are expected return and volatility, which are the foundations of modern portfolio theory. We further introduce some simple portfolio optimization strategies as an example of the application of financial data analysis. Big data has already changed the financial industry fundamentally, yet quantitative tools for handling massive financial data still have a long way to go. Adopting advanced statistics, information theory, machine learning and faster computing algorithms is inevitable if complicated financial markets are to be predicted. These topics are briefly discussed in the later part of this chapter.
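The two steps named in the abstract, followed by the estimation of expected return and volatility, can be illustrated with a minimal sketch. It is not the chapter's method: the median-based outlier rule, the function names (`clean_prices`, `aggregate_last`, `log_returns`), the parameters `window`, `max_rel_dev` and `bar_size`, and the tick values are all hypothetical choices made for illustration.

```python
import math
import statistics


def clean_prices(prices, window=2, max_rel_dev=0.05):
    """Drop ticks that deviate too far from the median of nearby ticks.

    Hypothetical rule: a tick is kept when it lies within max_rel_dev
    (as a fraction) of the median of up to `window` ticks on each side.
    """
    kept = []
    for i, p in enumerate(prices):
        lo, hi = max(0, i - window), min(len(prices), i + window + 1)
        neighbours = prices[lo:i] + prices[i + 1:hi]
        if not neighbours:  # a single tick has nothing to compare against
            kept.append(p)
            continue
        med = statistics.median(neighbours)
        if abs(p - med) <= max_rel_dev * med:
            kept.append(p)
    return kept


def aggregate_last(prices, bar_size):
    """Aggregate ticks into bars by keeping the last price of each block."""
    return [prices[i:i + bar_size][-1] for i in range(0, len(prices), bar_size)]


def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


# Made-up tick prices containing two obvious errors (250.0 and 0.01).
ticks = [100.0, 100.1, 100.2, 250.0, 100.3, 100.4,
         100.5, 100.6, 0.01, 100.7, 100.8, 100.9]

cleaned = clean_prices(ticks)                # step (i): cleaning
bars = aggregate_last(cleaned, bar_size=2)   # step (i): aggregation
returns = log_returns(bars)                  # input to step (ii)

mean_return = statistics.fmean(returns)      # sample estimate of expected return
volatility = statistics.stdev(returns)       # sample standard deviation of returns

print(f"mean return per bar: {mean_return:.6f}")
print(f"volatility per bar:  {volatility:.6f}")
```

Both estimates are per bar; annualizing them or feeding them into a portfolio optimizer would require further assumptions that this sketch does not make.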
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Shi, X., Zhang, P., Khan, S.U. (2017). Quantitative Data Analysis in Finance. In: Zomaya, A., Sakr, S. (eds) Handbook of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-49340-4_21
DOI: https://doi.org/10.1007/978-3-319-49340-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49339-8
Online ISBN: 978-3-319-49340-4
eBook Packages: Computer Science, Computer Science (R0)