Abstract
The recent profusion of sensors has given consumers and researchers the ability to collect significant amounts of data. However, understanding sensor data can be a challenge, because it is voluminous, multi-sourced, and unintelligible. Nonetheless, intelligent systems, such as activity recognition, require pattern analysis of sensor data streams to produce compelling results; machine learning (ML) applications enable this type of analysis. However, the number of ML experts able to proficiently classify sensor data is limited, and there remains a lack of interactive, usable tools to help intermediate users perform this type of analysis. To learn which features these tools must support, we interviewed intermediate users of ML and conducted two probe-based studies with a prototype ML and visual analytics system, Gimlets. Our system implements ML applications for sensor-based time-series data as a novel domain-specific prototype that integrates interactive visual analytic features into the ML pipeline. We identify future directions for usable ML systems based on sensor data that will enable intermediate users to build systems that have so far been prohibitively difficult to build.