Abstract
We focus on enhancing the user experience by predicting entries as a form is filled in, based on past interactions. The purpose of a predictive model of form filling is to reduce the time required to fill in a form, and thus the fatigue and repetitiveness associated with this common task. Predictive models generally ignore the values the user has entered in the other fields of the form and focus only on the value being entered in the current field, which limits their capabilities. We aim instead to predict the sequence of entries in a form, rather than the value of single fields in isolation. This is done by inference over a Bayesian network, which computes the a posteriori probability that the remaining fields will assume certain values, given the set of values entered so far. The model structure and parameters can be learned from a dataset of past entries. The paper investigates computational and convergence issues under both the closed-world and the open-world assumptions. As a case study, we considered the forms used at Poste Italiane for the online payment of money orders, and we used this approach to prototype two different solutions for desktop and mobile applications. Experiments with a user test group show that the proposed approach provides effective and appreciated support in filling in a form.
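The core idea, estimating the probability of the remaining fields given the values entered so far, can be illustrated with a minimal Python sketch. It is not the Bayesian network described in the paper: it replaces network inference with relative frequencies over matching past records, which corresponds to a closed-world setting where only previously seen values can be suggested. The field names, the sample records, and the helper functions `posterior` and `suggest` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical past entries; field names and values are illustrative only.
PAST_ENTRIES = [
    {"payee": "ACME Energy", "account": "1001", "amount": "45.00"},
    {"payee": "ACME Energy", "account": "1001", "amount": "52.30"},
    {"payee": "ACME Energy", "account": "1001", "amount": "45.00"},
    {"payee": "City Water", "account": "2002", "amount": "30.00"},
]


def posterior(records, evidence, target):
    """Estimate P(target = v | evidence) by relative frequency.

    Only records that agree with every value entered so far are counted.
    Returns an empty dict when no past record matches the evidence.
    """
    matching = [
        r for r in records
        if all(r.get(field) == value for field, value in evidence.items())
    ]
    counts = Counter(r[target] for r in matching)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}


def suggest(records, evidence, fields):
    """Suggest the most probable value for each field not yet filled in."""
    suggestions = {}
    for field in fields:
        if field in evidence:
            continue  # the user has already entered this field
        dist = posterior(records, evidence, field)
        if dist:
            suggestions[field] = max(dist, key=dist.get)
    return suggestions


if __name__ == "__main__":
    entered = {"payee": "ACME Energy"}
    print(posterior(PAST_ENTRIES, entered, "amount"))
    print(suggest(PAST_ENTRIES, entered, ["payee", "account", "amount"]))
```

Conditioning on every entered value at once keeps the sketch short, but it needs many matching records to give stable estimates. A Bayesian network addresses this by factoring the joint distribution into smaller conditional tables.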
Notes
It is a theoretical limit, as it does not take into account typing errors, control keystrokes (e.g., Alt, Tab, Ctrl) and the cognitive overhead of accepting a prediction.
This is also related to the definition of keystroke savings, as it does not consider the effect of wrong predictions, which entail keystrokes for deleting mistakes.
Ethics declarations
Conflict of interest
None.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Troiano, L., Birtolo, C. & Armenise, R. Modeling and predicting the user next input by Bayesian reasoning. Soft Comput 21, 1583–1600 (2017). https://doi.org/10.1007/s00500-015-1870-7
DOI: https://doi.org/10.1007/s00500-015-1870-7