Mining Unit Tests for Discovery and Migration of Math APIs

Published: 07 October 2014


Today's programming languages are supported by powerful third-party APIs. For a given application domain, it is common to have many competing APIs that provide similar functionality. Programmer productivity therefore depends heavily on the programmer's ability to discover suitable APIs, both during the initial coding phase and during software maintenance.
The aim of this work is to support the discovery and migration of math APIs. Math APIs are at the heart of many application domains, ranging from machine learning to scientific computations. Our approach, called MathFinder, combines executable specifications of mathematical computations with unit tests (operational specifications) of API methods. Given a math expression, MathFinder mines the unit tests of API methods to synthesize pseudo-code, composed of those methods, that computes the expression. We present a sequential version of our unit test mining algorithm and also design a more scalable data-parallel version.
We perform an extensive evaluation of MathFinder (1) for API discovery, where math algorithms are to be implemented from scratch, and (2) for API migration, where client programs utilizing a math API are to be migrated to another API. We evaluated the precision and recall of MathFinder on a diverse collection of math expressions, culled from algorithms used in a wide range of application areas such as control systems and structural dynamics. In a user study to evaluate the productivity gains obtained by using MathFinder for API discovery, the programmers who used MathFinder finished their programming tasks twice as fast as their counterparts who used the usual techniques, such as web and code search, IDE code completion, and manual inspection of library documentation. For the problem of API migration, as a case study, we used MathFinder to migrate Weka, a popular machine learning library. Overall, our evaluation shows that MathFinder is easy to use, provides highly precise results across several math APIs and application domains even with a small number of unit tests per method, and scales to large collections of unit tests.
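To make the idea of mining unit tests concrete, here is a minimal sketch. It is not the algorithm from the paper. It assumes each unit test has already been reduced to a record of (method name, inputs, expected output), and that the math expression is available as an executable function. The names `UnitTestRecord`, `rank_methods`, and the `Lib.*` methods are hypothetical. The sketch runs the executable specification on the inputs of each recorded test and ranks methods by the fraction of their tests whose expected output the specification reproduces.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class UnitTestRecord:
    """One mined unit test (hypothetical format, not from the paper)."""
    method: str       # name of the API method the test exercises
    inputs: tuple     # arguments the test passed to the method
    expected: float   # value the test asserted as the result


def rank_methods(spec: Callable[..., float],
                 records: Sequence[UnitTestRecord],
                 tol: float = 1e-9) -> list:
    """Rank methods by the fraction of their tests that `spec` reproduces."""
    passed, total = {}, {}
    for r in records:
        total[r.method] = total.get(r.method, 0) + 1
        try:
            ok = abs(spec(*r.inputs) - r.expected) <= tol
        except (TypeError, ValueError, ZeroDivisionError):
            ok = False  # arity mismatch or domain error: not a match
        passed[r.method] = passed.get(r.method, 0) + int(ok)
    scores = {m: passed[m] / total[m] for m in total}
    # Best score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy index of unit tests for three made-up API methods.
records = [
    UnitTestRecord("Lib.hypot", (3.0, 4.0), 5.0),
    UnitTestRecord("Lib.hypot", (5.0, 12.0), 13.0),
    UnitTestRecord("Lib.add", (3.0, 4.0), 7.0),
    UnitTestRecord("Lib.add", (1.0, 2.0), 3.0),
    UnitTestRecord("Lib.mul", (3.0, 4.0), 12.0),
]

# Executable specification of the query expression sqrt(a^2 + b^2).
ranking = rank_methods(lambda a, b: math.sqrt(a * a + b * b), records)
```

In this toy run, `Lib.hypot` ranks first with a score of 1.0 and the other two methods score 0.0. A realistic miner would also have to handle matrix-valued data, argument reordering, and compositions of several methods, none of which this sketch attempts.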


Richard John Botting

Unit tests are not just about testing [1]. A unit test is a piece of code that executes a part of a program (the unit) and checks whether it worked. The test therefore documents a way to use the unit. At least two teams are independently exploring ways to use this information. This paper shows how a tool (MathFinder) can use unit tests to help a programmer select a math library suitable for an algorithm. Once a library is selected, the tool proposes detailed code for the algorithm using the library's application programming interface (API). The paper describes experiments showing that a typical maintenance project within a Java/JUnit/Eclipse plus Scilab environment is done faster when programmers use MathFinder. The results may generalize to other environments. The key idea is to specify requirements first as unit tests in a very high-level interpreted language, and second as queries that search an index of unit tests written in a lower-level language plus API. MathFinder acts as a partial compiler and produces a list of possible sequences of function calls that pass the tests. Apparently, 90 percent of the time the top of the list is a suitable piece of code to implement the given algorithm. This is a typical research paper in the software engineering field and will interest fellow researchers. Meanwhile, a quarter of the way round the world, another team [2] (not referred to here) is also starting to mine unit tests to recommend code to programmers.

Online Computing Reviews Service



Published: 07 October 2014
Accepted: 01 May 2014
Revised: 01 January 2014
Received: 01 August 2013
Published in TOSEM Volume 24, Issue 1


