research-article

Uncertain<T>: a first-order type for uncertain data

Authors:

James Bornholt,

Todd Mytkowicz,

Kathryn S. McKinley

ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

Pages 51 - 66

https://doi.org/10.1145/2541940.2541958

Published: 24 February 2014

Abstract

Emerging applications increasingly use estimates such as sensor data (GPS), probabilistic models, machine learning, big data, and human data. Unfortunately, representing this uncertain data with discrete types (floats, integers, and booleans) encourages developers to pretend it is not probabilistic, which causes three types of uncertainty bugs. (1) Using estimates as facts ignores random error in estimates. (2) Computation compounds that error. (3) Boolean questions on probabilistic data induce false positives and negatives.
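The bug classes above can be made concrete with a small simulation. The Python sketch below is illustrative only: the scenario (a true speed of 57 mph, a 60 mph limit, Gaussian error with a standard deviation of 4 mph) and the function names are assumptions, not taken from the paper. It measures how often a naive `reading > limit` test fires although the true speed is under the limit, and how subtracting two noisy readings widens the error.

```python
import random
import statistics


def false_positive_rate(true_speed, limit, sigma, trials, seed=0):
    """Fraction of noisy readings above `limit` when `true_speed` is below it."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(true_speed, sigma) > limit for _ in range(trials))
    return hits / trials


def spread_of_difference(sigma, trials, seed=0):
    """Standard deviation of the difference of two independent noisy readings."""
    rng = random.Random(seed)
    diffs = [rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma) for _ in range(trials)]
    return statistics.pstdev(diffs)


# Hypothetical numbers: the driver is under the limit, yet the naive test
# reports "speeding" for a sizeable share of readings.
fp_rate = false_positive_rate(57.0, 60.0, 4.0, 100_000)

# Computation compounds error: the difference of two readings is noisier
# than either reading alone (sigma * sqrt(2) for independent Gaussians).
diff_sigma = spread_of_difference(4.0, 100_000)
```

Under these assumptions the analytic false positive probability is about 0.23, and the spread of the difference is about 5.7 mph rather than 4 mph.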

This paper introduces Uncertain<T>, a new programming language abstraction for uncertain data. We implement a Bayesian network semantics for computation and conditionals that improves program correctness. The runtime uses sampling and hypothesis tests to evaluate computation and conditionals lazily and efficiently. We illustrate with sensor and machine learning applications that Uncertain<T> improves expressiveness and accuracy.
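A minimal Python sketch of the idea, under stated assumptions, may help. It is not the authors' implementation: the class layout, the method names `sample`, `lift`, and `pr`, and the default error rates are all hypothetical. Operators only build a graph of random variables, sampling happens when a question is asked, and a conditional is answered with a sequential hypothesis test (here a Wald-style sequential probability ratio test) that stops as soon as the evidence is strong enough.

```python
import math
import random


class Uncertain:
    """A random variable represented as a node in a lazily evaluated graph."""

    def __init__(self, draw, parents=()):
        self._draw = draw        # draw(rng, *parent_values) -> one sample
        self._parents = parents  # upstream Uncertain nodes

    def sample(self, rng, memo=None):
        """Draw one joint sample; memo makes shared nodes agree with themselves."""
        memo = {} if memo is None else memo
        key = id(self)
        if key not in memo:
            args = [p.sample(rng, memo) for p in self._parents]
            memo[key] = self._draw(rng, *args)
        return memo[key]

    @staticmethod
    def lift(value):
        """Wrap a plain constant so it can take part in the graph."""
        if isinstance(value, Uncertain):
            return value
        return Uncertain(lambda rng: value)

    def __sub__(self, other):
        return Uncertain(lambda rng, a, b: a - b, (self, Uncertain.lift(other)))

    def __gt__(self, other):
        return Uncertain(lambda rng, a, b: a > b, (self, Uncertain.lift(other)))

    def pr(self, threshold, rng, eps=0.05, alpha=0.05, beta=0.05,
           max_samples=10_000):
        """Sequential test: does P(self is True) exceed `threshold`?"""
        p0 = max(threshold - eps, 1e-9)      # H0: probability is below threshold
        p1 = min(threshold + eps, 1 - 1e-9)  # H1: probability is above threshold
        accept_h1 = math.log((1 - beta) / alpha)
        accept_h0 = math.log(beta / (1 - alpha))
        llr = 0.0  # running log-likelihood ratio of H1 against H0
        for _ in range(max_samples):
            if self.sample(rng):
                llr += math.log(p1 / p0)
            else:
                llr += math.log((1 - p1) / (1 - p0))
            if llr >= accept_h1:
                return True
            if llr <= accept_h0:
                return False
        return llr > 0  # sample budget exhausted: fall back to the likelier side


def gaussian(mu, sigma):
    """Hypothetical helper: an Uncertain value with Gaussian error."""
    return Uncertain(lambda rng: rng.gauss(mu, sigma))


rng = random.Random(1)
speed = gaussian(57.0, 4.0)   # assumed estimate, not data from the paper
speeding = speed > 60.0       # builds a node; nothing is sampled yet
more_likely_than_not = speeding.pr(0.5, rng)
no_error_left = (speed - speed).sample(rng)  # shared node, so exactly 0.0
```

With these assumed parameters the probability of exceeding 60 is about 0.23, so the question "is speeding more likely than not?" is expected to come back `False` after a few dozen samples. The `memo` dictionary is the design choice that gives the graph its Bayesian network flavour: a variable that appears twice in an expression is drawn once per joint sample, which is why `speed - speed` is exactly zero.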

Whereas previous probabilistic programming languages focus on experts, Uncertain<T> serves a wide range of developers. Experts still identify error distributions. However, both experts and application writers compute with distributions, improve estimates with domain knowledge, and ask questions with conditionals. The Uncertain<T> type system and operators encourage developers to expose and reason about uncertainty explicitly, controlling false positives and false negatives. These benefits make Uncertain<T> a compelling programming model for modern applications facing the challenge of uncertainty.
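One standard way to improve an estimate with domain knowledge is Bayes' rule with a prior. The sketch below uses the textbook conjugate update for a Gaussian prior and a Gaussian reading; the walking-speed numbers and the function name are assumptions for illustration, and the mechanism in the paper may differ.

```python
import math


def gaussian_posterior(prior_mean, prior_sigma, reading, reading_sigma):
    """Combine a Gaussian prior with one Gaussian reading (conjugate update)."""
    prior_precision = 1.0 / prior_sigma ** 2
    reading_precision = 1.0 / reading_sigma ** 2
    precision = prior_precision + reading_precision
    mean = (prior_mean * prior_precision + reading * reading_precision) / precision
    return mean, math.sqrt(1.0 / precision)


# Hypothetical scenario: people usually walk at about 3 mph (sigma 1 mph),
# but a noisy sensor reports 7 mph with an error sigma of 3 mph.
mean, sigma = gaussian_posterior(3.0, 1.0, 7.0, 3.0)
```

Here the posterior mean is 3.4 mph with a standard deviation of about 0.95 mph: the implausible reading is pulled toward the prior, and the combined estimate is tighter than either source alone.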

Index Terms

Software and its engineering
  Software notations and tools
    General programming languages
      Language features
        Data types and structures

Published In

ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

February 2014

780 pages

ISBN:9781450323055

DOI:10.1145/2541940

General Chairs: Rajeev Balasubramonian (University of Utah), Al Davis (University of Utah)
Program Chair: Sarita Adve (University of Illinois at Urbana-Champ)

ACM SIGARCH Computer Architecture News, Volume 42, Issue 1
ASPLOS '14
March 2014
729 pages
ISSN:0163-5964
DOI:10.1145/2654822
Editor: Doug DeGroot

ACM SIGPLAN Notices, Volume 49, Issue 4
ASPLOS '14
April 2014
729 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2644865
Editors: Mark W. Bailey (Hamilton College, Clinton, NY), Rajeev Balasubramonian (University of Utah), Al Davis (University of Utah), Sarita Adve (University of Illinois at Urbana-Champ)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Sponsors

In-Cooperation

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ASPLOS '14: Architectural Support for Programming Languages and Operating Systems

March 1 - 5, 2014

Salt Lake City, Utah, USA

Acceptance Rates

ASPLOS '14 Paper Acceptance Rate 49 of 217 submissions, 23%;

Overall Acceptance Rate 535 of 2,713 submissions, 20%
