Mining the Web for Intelligent Problem Solving for Programmers

Published: 08 February 2016

Abstract

Programming can be hard to learn and master. Novice programmers often struggle with terminology, concepts, or competing solutions to the same problem, with little guidance on how to choose the best one. Professional programmers often spend considerable time learning to use third-party libraries, APIs, or an unfamiliar piece of code. Although programmers can turn to search engines or question-and-answer websites for help, the problem-solving process often takes multiple iterations and can be time-consuming. An integrated system that can recognize a programmer's difficulties and provide contextualized solutions is thus desirable, as it may significantly reduce the manual effort required in the troubleshooting loop.
Ideally, a programmer should be able to interact with such an intelligent system using natural language, in a way similar to how they document code or communicate with peers. However, using automatic natural language processing techniques to address programming questions is very difficult, mainly for the following reasons: (1) terms and common expressions vary greatly across domains and individual programmers, making it difficult to associate related concepts; (2) the solution to a user's programming problem often requires multiple steps or different resources, which demands a deep understanding of the relations and dependencies among the possible solutions, as well as of the user's own ability to carry out those solutions; (3) the documents in the training data usually mix general-domain expressions with mentions of variables, functions, and classes, as well as source code, making low-level text processing difficult; (4) evaluating the system generally requires skilled experts to provide ground truth, which is expensive and often unreliable.
We address the above difficulties and build an intelligent programming helper system by mining the massive amount of programming-related data available online, including question-and-answer websites, tutorials, blogs, and code repositories. Specifically, the study involves three components. First, we use information extraction techniques to extract common programming tasks, issues, and solutions from the Web data, and establish connections between these extracted elements by leveraging their discrete or distributed representations (e.g., using neural embedding models). Such techniques have been shown to help general users solve problems that require interacting with a complex software application through a natural language interface. Second, we study how to handle complicated problems that require multiple steps to solve. The existing troubleshooting instances documented online are collectively modeled as a heterogeneous network, on which random walk paths can be exploited to recommend solutions. Third, we study how to personalize the problem-solving process for users with varying levels of skill and background knowledge. In particular, each user's past adoptions of technologies and the adoption behavior in their social community can be jointly leveraged to provide appropriate recommendations of technologies, and may even promote innovations (e.g., new algorithms) in the process. Collectively, these three components form an integral solution to computer-assisted problem solving for programmers driven by big data, and may have an impact on various domains, including information extraction, language modeling, natural language understanding, automatic problem solving, and social network analysis.
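The second component, recommending solutions through random walks on a heterogeneous network, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the walk model, so the sketch uses a plain random walk with restart, and the toy graph, node labels, restart probability, and function names are hypothetical rather than taken from the authors' system.

```python
from collections import defaultdict


def random_walk_with_restart(edges, start, restart=0.15, iterations=50):
    """Approximate visiting probabilities of a walk that restarts at `start`."""
    # Build an undirected adjacency list from (node, node) pairs.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    if start not in neighbors:
        raise ValueError(f"unknown start node: {start!r}")

    # All probability mass begins on the start node.
    scores = {node: 0.0 for node in neighbors}
    scores[start] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in neighbors}
        for node, mass in scores.items():
            # Spread the non-restarting mass evenly over the neighbors.
            share = (1.0 - restart) * mass / len(neighbors[node])
            for neighbor in neighbors[node]:
                updated[neighbor] += share
        # The remaining mass jumps back to the start node.
        updated[start] += restart
        scores = updated
    return scores


def recommend_solutions(edges, issue, top_k=2):
    """Rank nodes typed "solution" by their walk score from `issue`."""
    scores = random_walk_with_restart(edges, issue)
    ranked = sorted(
        (item for item in scores.items() if item[0][0] == "solution"),
        key=lambda item: item[1],
        reverse=True,
    )
    return [node[1] for node, _ in ranked[:top_k]]


# Hypothetical toy network: nodes are (type, label) pairs of mixed types.
MISSING = ("issue", "ModuleNotFoundError")
DENIED = ("issue", "PermissionError during install")
INSTALL = ("task", "install a package")
PIP = ("solution", "pip install <package>")
VENV = ("solution", "create a virtual environment")

EDGES = [
    (MISSING, INSTALL),
    (MISSING, PIP),
    (INSTALL, PIP),
    (INSTALL, VENV),
    (DENIED, VENV),
]

if __name__ == "__main__":
    print(recommend_solutions(EDGES, MISSING))
```

In this sketch, a solution that is reachable from the queried issue by many short paths accumulates more probability mass, so it ranks ahead of solutions that are only reachable through several intermediate steps.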




    Published In

    WSDM '16: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
    February 2016
    746 pages
    ISBN:9781450337168
    DOI:10.1145/2835776
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. human-computer interaction
    2. language modeling
    3. software engineering

    Qualifiers

    • Abstract

    Conference

    WSDM 2016
    WSDM 2016: Ninth ACM International Conference on Web Search and Data Mining
    February 22 - 25, 2016
    San Francisco, California, USA

    Acceptance Rates

    WSDM '16 Paper Acceptance Rate 67 of 368 submissions, 18%;
    Overall Acceptance Rate 498 of 2,863 submissions, 17%
