Understanding the complexities of the fine structure of interest rates: a Wasserstein barycenter learning approach

  • Original Article
  • Neural Computing and Applications

Abstract

A novel methodology to investigate the fine structure of interest rates based on Machine Learning techniques is discussed. The aim is to capture in an unsupervised way the common stochastic structure that drives the dynamics of interest rates of different maturities. The proposed approach is based on the Wasserstein barycenter, a powerful tool of analysis that allows us to construct, from a set of assigned probability distributions, a single probability distribution that captures the essential features of the whole set. To identify common stochastic factors, a Gaussian Mixture Model is fitted to the Wasserstein barycenter by maximum likelihood using the Expectation-Maximization algorithm with an initialization strategy based on Graph Machine Learning techniques. A fine-tuning of single-maturity interest rates is discussed in an attempt to capture maturity-specific stochastic factors. The proposed analysis also gives us the opportunity to test the hypothesis of a market segmentation into a short-term segment, the money market, and a long-term segment, the capital market, each with its own segment-specific stochastic factors. The methodology is tested on the US zero-coupon Treasury yield curve. The results obtained seem to show that most of the stochastic nature of the dynamics of the US zero-coupon yield curve can be captured by a three-component Gaussian Mixture Model describing the Wasserstein barycenter of the short-term segment of the yield curve.
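
The pipeline outlined in the abstract (build a Wasserstein barycenter from a set of distributions, then fit a Gaussian Mixture Model to it by Expectation-Maximization) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: the input distributions are one-dimensional empirical samples, so the 2-Wasserstein barycenter reduces to an average of quantile functions; the data are synthetic Gaussian draws rather than Treasury yields; and the EM initialization is a simple quantile split standing in for the graph-based strategy the authors describe. The function names `wasserstein_barycenter_1d` and `fit_gmm_em` are hypothetical.

```python
import numpy as np


def wasserstein_barycenter_1d(samples, weights=None, n_quantiles=200):
    """2-Wasserstein barycenter of 1-D empirical distributions.

    In one dimension the barycenter's quantile function is the weighted
    average of the input quantile functions, so no solver is needed.
    Returns the barycenter evaluated on a grid of probability levels.
    """
    if weights is None:
        weights = np.full(len(samples), 1.0 / len(samples))
    weights = np.asarray(weights, dtype=float)
    # Midpoint levels avoid the extreme 0 and 1 quantiles.
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    quantiles = np.stack([np.quantile(s, levels) for s in samples])
    return weights @ quantiles


def fit_gmm_em(x, n_components=3, n_iter=200, tol=1e-8):
    """Plain EM for a 1-D Gaussian mixture (illustrative, not the authors' code)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialization: split the sorted data into equal-size chunks.
    chunks = np.array_split(np.sort(x), n_components)
    pi = np.array([len(c) / len(x) for c in chunks])
    mu = np.array([c.mean() for c in chunks])
    var = np.array([c.var() + 1e-6 for c in chunks])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log of each weighted component density at each point.
        log_p = (np.log(pi) - 0.5 * np.log(2.0 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # Log-likelihood of the parameters used in this E-step.
        ll = log_norm.sum()
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, var, ll


# Synthetic stand-ins for rate changes at four maturities (not real data).
rng = np.random.default_rng(0)
samples = [rng.normal(0.0, s, size=2000) for s in (0.5, 1.0, 1.5, 2.0)]

bary = wasserstein_barycenter_1d(samples)
pi, mu, var, ll = fit_gmm_em(bary, n_components=3)
print("mixture weights:", np.round(pi, 3))
print("component means:", np.round(mu, 3))
print("component variances:", np.round(var, 3))
```

For multivariate distributions or histograms on a shared grid, quantile averaging no longer applies and a dedicated optimal transport solver would be needed; the one-dimensional case is used here only because it keeps the barycenter step exact and dependency-free.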

Data Availability

The dataset comprising zero-coupon yields utilized in this study is openly accessible via the following URL: https://sites.google.com/view/jingcynthiawu/yield-data.

Acknowledgements

This research has been carried out within the Project ECS 0000024 Rome Technopole (CUP B83C22002820006), NRP Mission 4 Component 2 Investment 1.5, funded by the European Union - NextGenerationEU.

Author information

Corresponding author

Correspondence to Cristiano Baldassari.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

About this article

Cite this article

Mari, C., Baldassari, C. Understanding the complexities of the fine structure of interest rates: a Wasserstein barycenter learning approach. Neural Comput & Applic 36, 19291–19305 (2024). https://doi.org/10.1007/s00521-024-10202-5

  • DOI: https://doi.org/10.1007/s00521-024-10202-5
