Abstract
We provide a general framework for the integration of formal semantics with probabilistic reasoning. This framework is conservative, in the sense that it relies only on typed \(\lambda \)-calculus and is thus compatible with logical systems already in use. The framework is also presented modularly, in that it regards probabilistic effects (i.e., sampling and marginalization) as side effects, using continuations. We show how our framework may be used to build probabilistic programs compositionally within typed \(\lambda \)-calculus and then illustrate its use on two applications: semantic learning and pragmatic inference within the Rational Speech Act framework.
Notes
- 1.
The \(\lambda \)-homomorphisms that we employ map one higher-order language into another, preserving variables, abstractions, applications, pairing, and projection. They are accompanied by type homomorphisms \(\overline{ \alpha }\) which, for us, preserve implication and products (i.e., \(\overline{ \alpha \rightarrow \beta } = \overline{ \alpha } \rightarrow \overline{\beta }\) and \(\overline{ \alpha \times \beta } = \overline{ \alpha } \times \overline{\beta }\)), but which may in principle affect base types. In general, if \(M : \alpha \), then \(\llparenthesis M\rrparenthesis : \overline{ \alpha }\). The motivation for these constraints is that such homomorphisms provide meanings to the constants of the source language while leaving the surrounding \(\lambda \)-calculus unaffected (analogously to a traditional model-theoretic interpretation). In this case, both \(\llparenthesis \cdot \rrparenthesis \) and its associated type homomorphism are trivial, mapping constants and base types onto themselves.
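For concreteness, here is a hypothetical Haskell sketch (the datatype and names are our own illustration, not the paper's) of why such a homomorphism is determined entirely by its action on constants:

```haskell
-- Terms of a small λ-calculus with pairing and projections, plus
-- constants of the source language (a hypothetical illustration).
data Term
  = Var String
  | Lam String Term
  | App Term Term
  | Pair Term Term
  | Fst Term
  | Snd Term
  | Con String

-- A λ-homomorphism commutes with variables, abstraction, application,
-- pairing, and projection, so it is fixed by an interpretation 'i' of
-- the constants alone.
hom :: (String -> Term) -> Term -> Term
hom i (Var x)    = Var x
hom i (Lam x m)  = Lam x (hom i m)
hom i (App m n)  = App (hom i m) (hom i n)
hom i (Pair m n) = Pair (hom i m) (hom i n)
hom i (Fst m)    = Fst (hom i m)
hom i (Snd m)    = Snd (hom i m)
hom i (Con c)    = i c
```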
- 2.
There is some precedent for this representation of probabilistic programs, by Mohammed Ismail and Shan [17], who describe a small typed probabilistic programming language and provide a denotational semantics for it in terms of continuations. Our formulation is chiefly inspired by the dependently typed language of Bernardy et al. [3]. See also Jansson et al. [13].
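To make the representation concrete, here is a minimal Haskell sketch (our own illustration; the names are assumptions, with eta and star corresponding to the \(\eta \) and \(\star \) of the main text):

```haskell
-- A probabilistic program over 'a', represented as a function from
-- continuations ("measurable functions" a -> Double) to expectations.
type Prob a = (a -> Double) -> Double

-- Return a value with certainty (the η of the text).
eta :: a -> Prob a
eta x = \k -> k x

-- Sequencing (the ⋆ of the text): run m, feed each result to f.
star :: Prob a -> (a -> Prob b) -> Prob b
star m f = \k -> m (\x -> f x k)

-- A discrete distribution given by explicit (unnormalized) weights.
categorical :: [(a, Double)] -> Prob a
categorical xs = \k -> sum [w * k x | (x, w) <- xs]
```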
- 3.
Here, we leave \(\mathcal {N} : d_{tall} \times d_{tall} \rightarrow (d_{tall} \rightarrow r) \rightarrow r\) unanalyzed. In general, computing a continuous distribution \(\mathcal {D} : p_1 \times \ldots \times p_n \rightarrow (d \rightarrow r) \rightarrow r\) over \(d\) amounts to computing
$$\lambda \langle p_1, \ldots , p_n\rangle , f.\int _{-\infty }^\infty \text {PDF}_{\mathcal {D}(p_1, \ldots , p_n)}(x) * f(x)\, dx$$
where \(\text {PDF}_{\mathcal {D}(p_1, \ldots , p_n)}\) provides the probability density function associated with \(\mathcal {D}\) (given parameters \(p_1, \ldots , p_n\)). Such integrals don't in general admit closed-form solutions, and so one must resort to approximations. We implement this via Markov chain Monte Carlo sampling in our Haskell implementation, using the library at https://github.com/jyp/ProbProg.
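As a gloss on what such an approximation looks like, here is a naive Haskell sketch using plain Monte Carlo with a direct sampler (our own illustration only; the actual implementation uses the MCMC machinery of the ProbProg library, whose API we do not reproduce here):

```haskell
import System.Random (randomRIO)  -- from the 'random' package

-- Approximate ∫ PDF(x) · f(x) dx by averaging f over n draws from the
-- distribution.
monteCarlo :: Int -> IO Double -> (Double -> Double) -> IO Double
monteCarlo n sample f = do
  xs <- sequence (replicate n sample)
  pure (sum (map f xs) / fromIntegral n)

-- A rough sampler for N(mu, sigma), using the classical trick that the
-- sum of 12 independent uniforms on [0,1] has mean 6 and variance 1.
normalSample :: Double -> Double -> IO Double
normalSample mu sigma = do
  us <- sequence (replicate 12 (randomRIO (0, 1)))
  pure (mu + sigma * (sum us - 6))
```

For instance, monteCarlo 10000 (normalSample 180 10) k approximates \(\mathcal {N}(180, 10)(k)\) (the parameters here are, of course, made up).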
- 4.
Some may recognize it as akin to the \(guard\) function of Haskell’s MonadPlus and Alternative classes.
- 5.
Note that we define this posterior in terms of a joint prior distribution \(P_{L_1}(w, \theta )\). Lassiter and Goodman [14] assume the prior distributions over world states and linguistic parameters to be independent, with an effectively uniform prior over parameters.
- 6.
That is, \(observe(\phi )(f) = factor(\mathbbm {1}(\phi ))(f)\).
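In the representation sketched in note 2, both effects come out as one-liners (assumed definitions, consistent with the equation above); note that observe False zeroes out its branch, much as guard False makes a branch fail:

```haskell
type Prob a = (a -> Double) -> Double

-- Reweight the current execution path by w.
factor :: Double -> Prob ()
factor w = \k -> w * k ()

-- Keep the path iff φ holds, i.e., factor by the indicator 𝟙(φ).
observe :: Bool -> Prob ()
observe phi = factor (if phi then 1 else 0)
```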
- 7.
An alternative, syntactically closer to the discrete case, relies on the Dirac \( \delta \) distribution, whose value is zero everywhere except at zero, and whose total mass integrates to one. Thus we recover a non-zero result after integration:
$$\text {PDF}_p = \lambda x.p(\lambda y. \delta (x - y))$$
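In the discrete case, the analogous recipe is directly computable: the indicator function of \(\{x\}\) plays the role of \( \delta \). A minimal sketch in the representation of note 2:

```haskell
type Prob a = (a -> Double) -> Double

-- Probability mass at x: apply the program to the indicator of {x},
-- the computable analogue of integrating against a Dirac δ at x.
pmf :: Eq a => Prob a -> a -> Double
pmf p x = p (\y -> if x == y then 1 else 0)
```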
- 8.
More accurately, we would take \(U\) to be uniform over a finite set, \(S_U\). Thus we would define it as \(U = \lambda k.\sum _{u \in S_U}k(u)\).
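Transcribed into the Haskell sketch of note 2, this definition is one line (as in the note, the weights are left unnormalized):

```haskell
type Prob a = (a -> Double) -> Double

-- U = λk. Σ_{u ∈ S_U} k(u), for a finite set of utterances S_U.
uniform :: [a] -> Prob a
uniform su = \k -> sum (map k su)
```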
- 9.
To implement the definition of cost employed by RSA models, for example, \(U^*\) could be \(U \star \lambda u. factor (e^{- \alpha * C(u)}) \star \lambda \diamond . \eta (u)\), given some uniform distribution \(U\).
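A sketch of this \(U^*\) in the Haskell representation of the earlier notes (our own illustration; alpha and the cost function cost are assumed RSA parameters):

```haskell
type Prob a = (a -> Double) -> Double

eta :: a -> Prob a
eta x = \k -> k x

star :: Prob a -> (a -> Prob b) -> Prob b
star m f = \k -> m (\x -> f x k)

factor :: Double -> Prob ()
factor w = \k -> w * k ()

uniform :: [a] -> Prob a
uniform su = \k -> sum (map k su)

-- U ⋆ λu. factor(e^{-α·C(u)}) ⋆ λ⋄. η(u): draw an utterance uniformly,
-- then downweight it exponentially by its cost.
uStar :: Double -> (a -> Double) -> [a] -> Prob a
uStar alpha cost us =
  uniform us `star` \u ->
  factor (exp (negate alpha * cost u)) `star` \_ ->
  eta u
```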
- 10.
Emerson [5] advocates yet a third approach to RSA, in which linguistic parameters are marginalized out in the listener model altogether.
- 11.
Systematically, if \( \alpha \) tends to \(\infty \); probabilistically, otherwise.
- 12.
Available at https://github.com/juliangrove/grove-bernardy-lenls18.
References
Barker, C., Shan, C.C.: Continuations and Natural Language. Oxford Studies in Theoretical Linguistics, vol. 53. Oxford University Press, Oxford (2014)
Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Lappin, S., Maskharashvili, A.: Predicates as boxes in Bayesian semantics for natural language. In: Proceedings of the 22nd Nordic Conference on Computational Linguistics, Turku, Finland, pp. 333–337. Linköping University Electronic Press (2019). https://www.aclweb.org/anthology/W19-6137
Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Maskharashvili, A.: Bayesian natural language semantics and pragmatics. In: Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Lappin, S., Maskharashvili, A. (eds.) Probabilistic Approaches to Linguistic Theory. CSLI Publications (2022)
Charlow, S.: On the semantics of exceptional scope. Ph.D. thesis, NYU, New York (2014). https://semanticsarchive.net/Archive/2JmMWRjY
Emerson, G.: Probabilistic lexical semantics: from Gaussian embeddings to Bernoulli fields. In: Bernardy, J.P., Blanck, R., Chatzikyriakidis, S., Lappin, S., Maskharashvili, A. (eds.) Probabilistic Approaches to Linguistic Theory. CSLI Publications (2022)
Girard, J.Y.: Interprétation fonctionnelle et élimination des coupures de l’arithmétique d’ordre supérieur. Ph.D. thesis, Université Paris 7 (1972)
Goodman, N.D., Frank, M.C.: Pragmatic language interpretation as probabilistic inference. Trends Cogn. Sci. 20(11), 818–829 (2016). ISSN 1364-6613. https://doi.org/10.1016/j.tics.2016.08.005. https://www.sciencedirect.com/science/article/pii/S136466131630122X
Goodman, N.D., Lassiter, D.: Probabilistic semantics and pragmatics: uncertainty in language and thought. In: Lappin, S., Fox, C. (eds.) The Handbook of Contemporary Semantic Theory, pp. 655–686. Wiley (2015). ISBN 978-1-118-88213-9. https://doi.org/10.1002/9781118882139.ch21
Goodman, N.D., Mansinghka, V.K., Roy, D., Bonawitz, K., Tenenbaum, J.B.: Church: a language for generative models. In: Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence, UAI 2008, Arlington, Virginia, USA, pp. 220–229. AUAI Press (2008). ISBN 978-0-9749039-4-1
Goodman, N.D., Stuhlmüller, A.: Knowledge and implicature: modeling language understanding as social cognition. Top. Cogn. Sci. 5(1), 173–184 (2013). ISSN 1756-8765. https://doi.org/10.1111/tops.12007. https://onlinelibrary.wiley.com/doi/abs/10.1111/tops.12007
Grice, H.P.: Logic and conversation. In: Cole, P., Morgan, J.L. (eds.) Syntax and Semantics. Speech Acts, vol. 3, pp. 41–58. Academic Press, New York (1975)
Grove, J., Bernardy, J.P., Chatzikyriakidis, S.: From compositional semantics to Bayesian pragmatics via logical inference. In: Proceedings of the 1st and 2nd Workshops on Natural Logic Meets Machine Learning (NALOMA), Groningen, The Netherlands, pp. 60–70. Association for Computational Linguistics (2021). https://aclanthology.org/2021.naloma-1.8
Jansson, P., Ionescu, C., Bernardy, J.P.: Probability theory. In: Domain-Specific Languages of Mathematics. Texts in Computing, no. 24, pp. 223–246 (2022)
Lassiter, D., Goodman, N.D.: Context, scale structure, and statistics in the interpretation of positive-form adjectives. Semant. Linguist. Theory 23(0), 587–610 (2013). ISSN 2163-5951. https://doi.org/10.3765/salt.v23i0.2658. https://journals.linguisticsociety.org/proceedings/index.php/SALT/article/view/2658
Lassiter, D., Goodman, N.D.: Adjectival vagueness in a Bayesian model of interpretation. Synthese 194(10), 3801–3836 (2015). https://doi.org/10.1007/s11229-015-0786-1
Lebedeva, E.: Expressing discourse dynamics through continuations. Ph.D. thesis, Université de Lorraine (2012). https://tel.archives-ouvertes.fr/tel-01749193
Mohammed Ismail, W., Shan, C.C.: Deriving a probability density calculator (functional pearl). In: Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, pp. 47–59. Association for Computing Machinery, New York (2016). ISBN 978-1-4503-4219-3. https://doi.org/10.1145/2951913.2951922
Shan, C.C.: Monads for natural language semantics. arXiv:cs/0205026 (2002). http://arxiv.org/abs/cs/0205026