skip to main content
10.1145/2541940.2541958acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Uncertain<T>: a first-order type for uncertain data

Published: 24 February 2014 Publication History

Abstract

Emerging applications increasingly use estimates such as sensor data (GPS), probabilistic models, machine learning, big data, and human data. Unfortunately, representing this uncertain data with discrete types (floats, integers, and booleans) encourages developers to pretend it is not probabilistic, which causes three types of uncertainty bugs. (1) Using estimates as facts ignores random error in estimates. (2) Computation compounds that error. (3) Boolean questions on probabilistic data induce false positives and negatives.
This paper introduces Uncertain<T>, a new programming language abstraction for uncertain data. We implement a Bayesian network semantics for computation and conditionals that improves program correctness. The runtime uses sampling and hypothesis tests to evaluate computation and conditionals lazily and efficiently. We illustrate with sensor and machine learning applications that Uncertain<T> improves expressiveness and accuracy.
Whereas previous probabilistic programming languages focus on experts, Uncertain<T> serves a wide range of developers. Experts still identify error distributions. However, both experts and application writers compute with distributions, improve estimates with domain knowledge, and ask questions with conditionals. The Uncertain<T> type system and operators encourage developers to expose and reason about uncertainty explicitly, controlling false positives and false negatives. These benefits make Uncertain<T> a compelling programming model for modern applications facing the challenge of uncertainty.

References

[1]
D. Barbara, H. Garcia-Molina, and D. Porter. The management of probabilistic data. phIEEE Transactions on Knowledge and Data Engineering, 4 (5): 487--502, 1992.
[2]
O. Benjelloun, A. D. Sarma, A. Halevy, and J. Widom. ULDBs: Databases with uncertainty and lineage. In phACM International Conference on Very Large Data Bases (VLDB), pages 953--964, 2006.
[3]
E. R. Berlekamp, J. H. Conway, and R. K. Guy. phWinning Ways for Your Mathematical Plays, volume 4. A K Peters, 2004.
[4]
S. Bhat, A. Agarwal, R. Vuduc, and A. Gray. A type theory for probability density functions. In phACM Symposium on Principles of Programming Languages (POPL), pages 545--556, 2012.
[5]
C. M. Bishop. phPattern Recognition and Machine Learning. Springer, 2006.
[6]
et al.(2011)Borgström, Gordon, Greenberg, Margetson, and Van Gael}BGG
[7]
:11J. Borgström, A. D. Gordon, M. Greenberg, J. Margetson, and J. Van Gael. Measure transformer semantics for Bayesian machine learning. In phEuropean Symposium on Programming (ESOP), pages 77--96, 2011.
[8]
J. Bornholt. phAbstractions and techniques for programming with uncertain data. Honors thesis, Australian National University, 2013.
[9]
G. E. P. Box and M. E. Muller. A note on the generation of random normal deviates. phThe Annals of Mathematical Statistics, 29 (2): 610--611, 1958.
[10]
M. Carbin, S. Misailovic, and M. C. Rinard. Verifying quantitative reliability of programs that execute on unreliable hardware. In phACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), pages 33--52, 2013.
[11]
A. T. Chaganty, A. V. Nori, and S. K. Rajamani. Efficiently sampling probabilistic programs via program analysis. In phProceedings of the 16th international conference on Artificial Intelligence and Statistics, AISTATS 2013, Scottsdale, AZ, USA, April 29 - May 1, 2013, pages 153--160. JMLR, 2013.
[12]
N. Dalvi and D. Suciu. Management of probabilistic data: Foundations and challenges. In phACM Symposium on Principles of Database Systems (PODS), pages 1--12, 2007.
[13]
H. Esmaeilzadeh, A. Sampson, L. Ceze, and D. Burger. Neural acceleration for general-purpose approximate programs. In phIEEE/ACM International Symposium on Microarchitecture (MICRO), pages 449--460, 2012.
[14]
W. R. Gilks, A. Thomas, and D. J. Spiegelhalter. A language and program for complex Bayesian modelling. phJournal of the Royal Statistical Society. Series D (The Statistician), 43 (1): 169--177, 1994.
[15]
M. Giry. A categorical approach to probability theory. In phCategorical Aspects of Topology and Analysis, volume 915 of phLecture Notes in Mathematics, pages 68--85. 1982.
[16]
N. D. Goodman, V. K. Mansinghka, D. M. Roy, K. Bonawitz, and J. B. Tenenbaum. Church: A language for generative models. In phConference in Uncertainty in Artificial Intelligence (UAI), pages 220--229, 2008.
[17]
S. Jaroszewicz and M. Korzen. Arithmetic operations on independent random variables: A numerical approach. phSIAM Journal on Scientific Computing, 34: A1241--A1265, 2012.
[18]
C. Jennison and B. W. Turnbull. phGroup sequential methods with applications to clinical trials. Chapman & Hall, 2000.
[19]
KNR:13A. Kumar, F. Niu, and C. Ré. Hazy: Making it Easier to Build and Maintain Big-data Analytics. phACM Queue, 11 (1): 30--46, 2013.
[20]
R. E. Moore. phInterval analysis. Prentice-Hall, 1966.
[21]
R. M. Neal. phBayesian learning for neural networks. PhD thesis, University of Toronto, 1994.
[22]
P. Newson and J. Krumm. Hidden Markov map matching through noise and sparseness. In phACM International Conference on Advances in Geographic Information Systems (GIS), pages 336--343, 2009.
[23]
A. Papoulis and S. U. Pillai. phProbability, random variables, and stochastic processes. New York, NY, 4th edition, 2000.
[24]
S. Park, F. Pfenning, and S. Thrun. A probabilistic language based on sampling functions. In phACM Symposium on Principles of Programming Languages (POPL), pages 171--182, 2005.
[25]
A. Pfeffer. IBAL: a probabilistic rational programming language. In phInternational Joint Conference on Artificial Intelligence (IJCAI), pages 733--740, 2001.
[26]
N. Ramsey and A. Pfeffer. Stochastic lambda calculus and monads of probability distributions. In phACM Symposium on Principles of Programming Languages (POPL), pages 154--165, 2002.
[27]
A. Sampson, W. Dietl, E. Fortuna, D. Gnanapragasam, L. Ceze, and D. Grossman. EnerJ: Approximate data types for safe and general low-power computation. In phACM Conference on Programming Language Design and Implementation (PLDI), pages 164--174, 2011.
[28]
S. Sankaranarayanan, A. Chakarov, and S. Gulwani. Static analysis for probabilistic programs: inferring whole program properties from finitely many paths. In phACM Conference on Programming Language Design and Implementation (PLDI), pages 447--458, 2013.
[29]
J. Schwarz, J. Mankoff, and S. E. Hudson. Monte Carlo methods for managing interactive state, action and feedback under uncertainty. In phACM Symposium on User Interface Software and Technology (UIST), pages 235--144, 2011.
[30]
R. Thompson. Global positioning system: the mathematics of GPS receivers. phMathematics Magazine, 71 (4): 260--269, 1998.
[31]
S. Thrun. Towards programming tools for robots that integrate probabilistic computation and learning. In phIEEE International Conference on Robotics and Automation (ICRA), pages 306--312, 2000.
[32]
1970)}T:70F. Topsøe. On the Glivenko-Cantelli theorem. phZeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 14: 239--250, 1970.
[33]
A. Wald. Sequential Tests of Statistical Hypotheses. phThe Annals of Mathematical Statistics, 16 (2): 117--186, 1945.

Cited By

View all
  • (2024) Deciding to Stop Early or Continue the Experiment After Checking p -Values at Interim Points: Introducing Group Sequential Designs to UI-Based Comparative Studies International Journal of Human–Computer Interaction10.1080/10447318.2024.2407662(1-10)Online publication date: 8-Oct-2024
  • (2024)Probabilistic Access Policies with Automated Reasoning SupportComputer Aided Verification10.1007/978-3-031-65633-0_20(443-466)Online publication date: 24-Jul-2024
  • (2023)Approximate Computing: Hardware and Software Techniques, Tools and Their ApplicationsJournal of Circuits, Systems and Computers10.1142/S021812662430001033:04Online publication date: 20-Sep-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
February 2014
780 pages
ISBN:9781450323055
DOI:10.1145/2541940
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 February 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. estimates
  2. probabilistic programming
  3. uncertain data

Qualifiers

  • Research-article

Conference

ASPLOS '14

Acceptance Rates

ASPLOS '14 Paper Acceptance Rate 49 of 217 submissions, 23%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)43
  • Downloads (Last 6 weeks)3
Reflects downloads up to 17 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024) Deciding to Stop Early or Continue the Experiment After Checking p -Values at Interim Points: Introducing Group Sequential Designs to UI-Based Comparative Studies International Journal of Human–Computer Interaction10.1080/10447318.2024.2407662(1-10)Online publication date: 8-Oct-2024
  • (2024)Probabilistic Access Policies with Automated Reasoning SupportComputer Aided Verification10.1007/978-3-031-65633-0_20(443-466)Online publication date: 24-Jul-2024
  • (2023)Approximate Computing: Hardware and Software Techniques, Tools and Their ApplicationsJournal of Circuits, Systems and Computers10.1142/S021812662430001033:04Online publication date: 20-Sep-2023
  • (2022)GOAL: Supporting General and Dynamic Adaptation in Computing SystemsProceedings of the 2022 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software10.1145/3563835.3567655(16-32)Online publication date: 29-Nov-2022
  • (2022)Conditional Quantitative Program AnalysisIEEE Transactions on Software Engineering10.1109/TSE.2020.301677848:4(1212-1227)Online publication date: 1-Apr-2022
  • (2021)An Adaptive Application Framework with Customizable Quality MetricsACM Transactions on Design Automation of Electronic Systems10.1145/347742827:2(1-33)Online publication date: 2-Nov-2021
  • (2021)The Laplace Microarchitecture for Tracking Data Uncertainty and Its Implementation in a RISC-V ProcessorMICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture10.1145/3466752.3480131(1254-1269)Online publication date: 18-Oct-2021
  • (2020)Warp: A Hardware Platform for Efficient Multimodal Sensing With Adaptive ApproximationIEEE Micro10.1109/MM.2019.295100440:1(57-66)Online publication date: 1-Jan-2020
  • (2020)Path Planning Under Malicious Injections and Removals of Perceived Obstacles: A Probabilistic Programming ApproachIEEE Robotics and Automation Letters10.1109/LRA.2020.30213825:4(6884-6891)Online publication date: Oct-2020
  • (2020)Emerging Hardware Technologies for IoT Data ProcessingIntelligent Internet of Things10.1007/978-3-030-30367-9_9(433-471)Online publication date: 22-Jan-2020
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media