
Probabilistic Compositional Semantics, Purely

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI-isAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13856)


Abstract

We provide a general framework for the integration of formal semantics with probabilistic reasoning. This framework is conservative, in the sense that it relies only on typed \(\lambda \)-calculus and is thus compatible with logical systems already in use. The framework is also presented modularly, in that it regards probabilistic effects (i.e., sampling and marginalization) as side effects, using continuations. We show how our framework may be used to build probabilistic programs compositionally within typed \(\lambda \)-calculus and then illustrate its use on two applications: semantic learning and pragmatic inference within the Rational Speech Act framework.
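
To give a concrete flavor of this encoding, here is a minimal Haskell sketch of probabilistic programs as continuations, in the spirit of the paper's approach; the names below (Prob, eta, bind, bernoulli, expect) are our own illustrative choices, not necessarily the paper's:

    -- A probabilistic program over 'a' assigns an expectation to every
    -- real-valued function of 'a' (its continuation).
    type Prob a = (a -> Double) -> Double

    -- Monadic unit: a point mass at x.
    eta :: a -> Prob a
    eta x = \k -> k x

    -- Monadic bind: sample x from m, then continue with f.
    bind :: Prob a -> (a -> Prob b) -> Prob b
    bind m f = \k -> m (\x -> f x k)

    -- A biased coin as a finite distribution.
    bernoulli :: Double -> Prob Bool
    bernoulli p = \k -> p * k True + (1 - p) * k False

    -- The expectation of a real-valued program; for instance,
    -- expect (bind (bernoulli 0.3) (\b -> eta (if b then 1 else 0)))
    -- evaluates to 0.3.
    expect :: Prob Double -> Double
    expect m = m id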


Notes

  1.

    The \(\lambda \)-homomorphisms that we employ map one higher-order language into another, preserving variables, abstractions, applications, pairing, and projection. They are accompanied by type-homomorphisms \(\overline{ \alpha }\) which, for us, preserve implication and products (i.e., \(\overline{ \alpha \rightarrow \beta } = \overline{ \alpha } \rightarrow \overline{\beta }\) and \(\overline{ \alpha \times \beta } = \overline{ \alpha } \times \overline{\beta }\)), but which may in principle affect base types. In general, if \(M : \alpha \), then \(\llparenthesis M\rrparenthesis : \overline{ \alpha }\). The motivation for these constraints is that they provide meanings to the constants of the source language, leaving the surrounding \(\lambda \)-calculus unaffected (as analogous to a traditional model-theoretic interpretation). In this case, both \(\llparenthesis \cdot \rrparenthesis \) and its associated type homomorphism are trivial, mapping both constants and base types onto themselves.
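
    As a rough illustration of these constraints (a sketch of ours, not the paper's), the following Haskell fragment interprets a toy term language homomorphically, so that the interpretation commutes with every term former and is determined entirely by its action on constants:

      -- A toy higher-order term language.
      data Term
        = Var String        -- variables
        | Lam String Term   -- abstraction
        | App Term Term     -- application
        | Pair Term Term    -- pairing
        | Fst Term          -- first projection
        | Snd Term          -- second projection
        | Con String        -- constants

      -- A λ-homomorphism determined by its action on constants.
      interp :: (String -> Term) -> Term -> Term
      interp _ (Var x)    = Var x
      interp c (Lam x t)  = Lam x (interp c t)
      interp c (App t u)  = App (interp c t) (interp c u)
      interp c (Pair t u) = Pair (interp c t) (interp c u)
      interp c (Fst t)    = Fst (interp c t)
      interp c (Snd t)    = Snd (interp c t)
      interp c (Con s)    = c s  -- only constants get nontrivial meanings

      -- The trivial homomorphism of this footnote maps constants to themselves.
      trivial :: Term -> Term
      trivial = interp Con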

  2.

    This representation of probabilistic programs has some precedent in the work of Mohammed Ismail and Shan [17], who describe a small typed probabilistic programming language and provide a denotational semantics for it in terms of continuations. Our formulation is chiefly inspired by the dependently typed language of Bernardy et al. [3]. See also Jansson et al. [13].

  3.

    Here, we leave \(\mathcal {N} : d_{tall} \times d_{tall} \rightarrow (d_{tall} \rightarrow r) \rightarrow r\) unanalyzed. In general, computing a continuous distribution \(\mathcal {D} : p_1 \times ... \times p_n \rightarrow (d \rightarrow r) \rightarrow r\) over \(d\) amounts to computing

    $$\lambda \langle p_1, ..., p_n\rangle , f.\int _{-\infty }^\infty \text {PDF}_{\mathcal {D}(p_1, ..., p_n)}(x) * f(x) dx$$

    where PDF\(_{\mathcal {D}(p_1, ..., p_n)}\) provides the probability density function associated with \(\mathcal {D}\) (given parameters \(p_1, ..., p_n\)). Such integrals don’t in general admit closed-form solutions, and so one must resort to approximations. We implement this via Markov chain Monte Carlo sampling in our Haskell implementation, using the library at https://github.com/jyp/ProbProg.
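
    For concreteness, the integral above can be approximated directly; the sketch below (our own, using a truncated Riemann sum rather than the MCMC sampling of the implementation) renders a normal distribution as such an expectation functional:

      -- Density of the normal distribution N(mu, sigma).
      normalPDF :: Double -> Double -> Double -> Double
      normalPDF mu sigma x =
        exp (negate ((x - mu) ^ 2) / (2 * sigma ^ 2)) / (sigma * sqrt (2 * pi))

      -- N(mu, sigma) as λf.∫ PDF(x) * f(x) dx, truncated to ±10 sigma
      -- and discretized with step dx.
      normal :: Double -> Double -> (Double -> Double) -> Double
      normal mu sigma f = sum [normalPDF mu sigma x * f x * dx | x <- grid]
        where
          dx   = 0.001
          grid = [mu - 10 * sigma, mu - 10 * sigma + dx .. mu + 10 * sigma]

      -- Sanity checks: normal 0 1 (const 1) ≈ 1, and normal 0 1 id ≈ 0.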

  4.

    Some may recognize it as akin to the \(guard\) function of Haskell’s MonadPlus and Alternative classes.

  5.

    Note that we define this posterior in terms of a joint prior distribution \(P_{L_1}(w, \theta )\). Lassiter and Goodman [14] assume the prior distributions over world states and linguistic parameters to be independent, with an effectively uniform prior over parameters.

  6.

    That is, \( observe ( \phi )(f) = factor (\mathbbm {1}( \phi ))(f)\).
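
    In the continuation encoding, both operations are one-liners; here is a sketch with our own names, repeating the representation type for self-containment:

      type Prob a = (a -> Double) -> Double

      -- Reweight the current execution by w.
      factor :: Double -> Prob ()
      factor w = \k -> w * k ()

      -- observe(φ) = factor(𝟙(φ)): keep the execution iff φ holds.
      observe :: Bool -> Prob ()
      observe phi = factor (if phi then 1 else 0)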

  7.

    An alternative, syntactically closer to the discrete case, relies on the Dirac \(\delta \) distribution, whose value is zero everywhere except at zero and whose total mass integrates to one. Thus we recover a non-zero result after integration:

    $$\text {PDF}_p = \lambda x.p(\lambda y. \delta (x - y)).$$

  8.

    More accurately, we would take \(U\) to be uniform over a finite set, \(S_U\). Thus we would define it as \(U = \lambda k.\varSigma _{u \in S_U}k(u)\).
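
    In code, this definition is a single line (a sketch with our names, left unnormalized exactly as written in the footnote):

      type Prob a = (a -> Double) -> Double

      -- U = λk. Σ_{u ∈ S_U} k(u): sum the continuation over a finite set.
      uniform :: [a] -> Prob a
      uniform sU = \k -> sum (map k sU)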

  9.

    To implement the definition of cost employed by RSA models, for example, \(U^*\) could be \(U \star \lambda u. factor (e^{- \alpha * C(u)}) \star \lambda {\diamond }. \eta (u)\), given some uniform distribution \(U\).
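
    A self-contained sketch of this cost-weighted prior, with the rationality parameter \( \alpha \) and the cost function \(C\) passed in explicitly (our choices of names and types):

      type Prob a = (a -> Double) -> Double

      eta :: a -> Prob a
      eta x = \k -> k x

      bind :: Prob a -> (a -> Prob b) -> Prob b
      bind m f = \k -> m (\x -> f x k)

      factor :: Double -> Prob ()
      factor w = \k -> w * k ()

      -- U* = U ⋆ λu. factor(e^(−α·C(u))) ⋆ λ◇. η(u)
      uStar :: Double -> (a -> Double) -> Prob a -> Prob a
      uStar alpha c u =
        u `bind` \utt ->
        factor (exp (negate (alpha * c utt))) `bind` \_ ->
        eta utt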

  10.

    Emerson [5] advocates yet a third approach to RSA, in which linguistic parameters are marginalized out in the listener model altogether.

  11.

    Systematically, if \( \alpha \) tends to \(\infty \); probabilistically, otherwise.

  12.

    Available at https://github.com/juliangrove/grove-bernardy-lenls18.

References

  1. Barker, C., Shan, C.C.: Continuations and Natural Language. Oxford Studies in Theoretical Linguistics, vol. 53. Oxford University Press (2014)


  2. Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Lappin, S., Maskharashvili, A.: Predicates as boxes in Bayesian semantics for natural language. In: Proceedings of the 22nd Nordic Conference on Computational Linguistics, Turku, Finland, pp. 333–337. Linköping University Electronic Press (2019). https://www.aclweb.org/anthology/W19-6137

  3. Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Maskharashvili, A.: Bayesian natural language semantics and pragmatics. In: Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Lappin, S., Maskharashvili, A. (eds.) Probabilistic Approaches to Linguistic Theory. CSLI Publications (2022)


  4. Charlow, S.: On the semantics of exceptional scope. Ph.D. thesis, NYU, New York (2014). https://semanticsarchive.net/Archive/2JmMWRjY

  5. Emerson, G.: Probabilistic lexical semantics: from Gaussian embeddings to Bernoulli fields. In: Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Lappin, S., Maskharashvili, A. (eds.) Probabilistic Approaches to Linguistic Theory. CSLI Publications (2022)


  6. Girard, J.Y.: Interprétation fonctionnelle et élimination des coupures de l’arithmétique d’ordre supérieur. Ph.D. thesis, Université Paris 7 (1972)


  7. Goodman, N.D., Frank, M.C.: Pragmatic language interpretation as probabilistic inference. Trends Cogn. Sci. 20(11), 818–829 (2016). ISSN 1364-6613. https://doi.org/10.1016/j.tics.2016.08.005. https://www.sciencedirect.com/science/article/pii/S136466131630122X

  8. Goodman, N.D., Lassiter, D.: Probabilistic semantics and pragmatics: uncertainty in language and thought. In: Lappin, S., Fox, C. (eds.) The Handbook of Contemporary Semantic Theory, pp. 655–686. Wiley (2015). ISBN 978-1-118-88213-9. https://doi.org/10.1002/9781118882139.ch21

  9. Goodman, N.D., Mansinghka, V.K., Roy, D., Bonawitz, K., Tenenbaum, J.B.: Church: a language for generative models. In: Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence, UAI 2008, Arlington, Virginia, USA, pp. 220–229. AUAI Press (2008). ISBN 978-0-9749039-4-1


  10. Goodman, N.D., Stuhlmüller, A.: Knowledge and implicature: modeling language understanding as social cognition. Top. Cogn. Sci. 5(1), 173–184 (2013). ISSN 1756-8765. https://doi.org/10.1111/tops.12007. https://onlinelibrary.wiley.com/doi/abs/10.1111/tops.12007

  11. Grice, H.P.: Logic and conversation. In: Cole, P., Morgan, J.L. (eds.) Syntax and Semantics. Speech Acts, vol. 3, pp. 41–58. Academic Press, New York (1975)


  12. Grove, J., Bernardy, J.P., Chatzikyriakidis, S.: From compositional semantics to Bayesian pragmatics via logical inference. In: Proceedings of the 1st and 2nd Workshops on Natural Logic Meets Machine Learning (NALOMA), Groningen, The Netherlands, pp. 60–70. Association for Computational Linguistics (2021). https://aclanthology.org/2021.naloma-1.8

  13. Jansson, P., Ionescu, C., Bernardy, J.P.: Probability theory. In: Domain Specific Languages of Mathematics. Texts in Computing, no. 24, pp. 223–246 (2022)


  14. Lassiter, D., Goodman, N.D.: Context, scale structure, and statistics in the interpretation of positive-form adjectives. Semant. Linguist. Theory 23(0), 587–610 (2013). ISSN 2163-5951. https://doi.org/10.3765/salt.v23i0.2658. https://journals.linguisticsociety.org/proceedings/index.php/SALT/article/view/2658

  15. Lassiter, D., Goodman, N.D.: Adjectival vagueness in a Bayesian model of interpretation. Synthese 194(10), 3801–3836 (2015). https://doi.org/10.1007/s11229-015-0786-1


  16. Lebedeva, E.: Expressing discourse dynamics through continuations. Ph.D. thesis, Université de Lorraine (2012). https://tel.archives-ouvertes.fr/tel-01749193

  17. Mohammed Ismail, W., Shan, C.C.: Deriving a probability density calculator (functional pearl). In: Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, pp. 47–59. Association for Computing Machinery, New York (2016). ISBN 978-1-4503-4219-3. https://doi.org/10.1145/2951913.2951922

  18. Shan, C.C.: Monads for natural language semantics. arXiv:cs/0205026 (2002). http://arxiv.org/abs/cs/0205026


Author information


Correspondence to Julian Grove.


Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Grove, J., Bernardy, J.P. (2023). Probabilistic Compositional Semantics, Purely. In: Yada, K., Takama, Y., Mineshima, K., Satoh, K. (eds.) New Frontiers in Artificial Intelligence. JSAI-isAI 2021. Lecture Notes in Computer Science, vol. 13856. Springer, Cham. https://doi.org/10.1007/978-3-031-36190-6_17


  • DOI: https://doi.org/10.1007/978-3-031-36190-6_17


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36189-0

  • Online ISBN: 978-3-031-36190-6

  • eBook Packages: Computer Science (R0)
