SliceLens: Guided Exploration of Machine Learning Datasets
SliceLens is a tool for exploring labeled, tabular, machine learning datasets. To explore a dataset, the user selects combinations of features in the dataset that they are interested in. The tool splits those features into bins and then visualizes the ...
Camera-First Form Filling: Reducing the Friction in Climate Hazard Reporting
The effective reporting of climate hazards, such as flash floods, hurricanes, and earthquakes, is critical. To quickly and correctly assess the situation and deploy resources, emergency services often rely on citizen reports that must be timely, ...
Raven: Accelerating Execution of Iterative Data Analytics by Reusing Results of Previous Equivalent Versions
Using GUI-based workflows for data analysis is an iterative process. During each iteration, an analyst makes changes to the workflow to improve it, generating a new version each time. The results produced by executing these versions are materialized to ...
Overlay Spreadsheets
Efforts to scale spreadsheets either follow a 'virtual' strategy that layers a spreadsheet interface on top of an existing database engine or a 'materialized' strategy based on re-engineering a spreadsheet engine. Because databases are not optimized for ...
A Human-in-the-loop Workflow for Multi-Factorial Sensitivity Analysis of Algorithmic Rankers
Algorithmic rankers are ubiquitously applied in automated decision systems such as hiring, admission, and loan-approval systems. Without appropriate explanations, decision-makers often cannot audit or trust algorithmic rankers' outcomes. In recent years,...
Facilitating Dependency Exploration in Computational Notebooks
Computational notebooks promote exploration by structuring code, output, and explanatory text into cells. The input code and rich outputs help users iteratively investigate ideas as they explore or analyze data. The links between these cells--how the ...
DIG: The Data Interface Grammar
Building interactive data interfaces is hard because the design of an interface depends on the data processing needs for the underlying analysis task, yet we do not have a good representation for analysis tasks. To fill this gap, this paper advocates ...
Aggregation Consistency Errors in Semantic Layers and How to Avoid Them
Analysts often struggle with analyzing data from multiple tables in a database due to their lack of knowledge on how to join and aggregate the data. To address this, data engineers pre-specify "semantic layers" which include the join conditions and "...
VALUE: Visual Analytics driven Linked data Utility Evaluation
The widespread adoption of open datasets across various domains has emphasized the significance of joining and computing their utility. However, the interplay between computation and human interaction is vital for informed decision-making. To address ...
Visualizing a Tabular Data Repository to Facilitate Descriptive Tag Augmentation for New Tables
Many online tabular datasets are maintained in centralized repositories and annotated with descriptive tags. These tags are helpful for data practitioners to search and understand tables. However, manually annotating descriptive tags for new tables ...
Approximate Query Answering over Open Data
Open knowledge, including open data and publicly available knowledge bases, offers a rich opportunity for data scientists for analysis and query answering, but comes with big obstacles due to the diverse, noisy, and incomplete nature of its data eco-...
Data Makes Better Data Scientists
With the goal of identifying common practices in data science projects, this paper proposes a framework for logging and understanding incremental code executions in Jupyter notebooks. This framework aims to allow reasoning about how insights are ...
Interactive Data Cleaning for Real-Time Streaming Applications
The importance of data cleaning systems has continuously grown in recent years. Especially for real-time streaming applications, it is crucial to identify and possibly remove anomalies in the data on the fly before further processing. The main ...
Index Terms
- Proceedings of the Workshop on Human-In-the-Loop Data Analytics