10:30 - 11:00
Exploring Variability-Aware Execution for Testing Plugin-Based Web Applications
In plugin-based systems, plugin conflicts may occur when two or more plugins interfere with one another, changing their expected behaviors. Detecting plugin conflicts is highly challenging due to the exponential explosion of plugin combinations (i.e., configurations). In this paper, we address the challenge of executing a test case over many configurations. Leveraging the fact that many executions of a test are similar, our variability-aware execution runs common code once; only when it encounters values that differ across specific configurations does the execution split to run each of them. To evaluate the scalability of variability-aware execution in a large real-world setting, we built a prototype PHP interpreter called Varex and ran it on the popular WordPress blogging Web application. The results show that while plugin interactions exist, there is a significant amount of sharing that allows variability-aware execution to scale to 2^50 configurations within seven minutes of running time. During our study, Varex detected two plugin conflicts: one had recently been reported on the WordPress forum, and the other had not been previously discovered.
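To make the split-on-demand idea concrete, the following Python sketch models a conditional value that holds different results under different plugin configurations (VValue and vmap are hypothetical names invented for illustration; Varex realizes the idea inside a PHP interpreter, not as a library):

```python
# Minimal sketch of a "conditional value" and shared execution
# (hypothetical names; Varex implements this inside a PHP interpreter).

from dataclasses import dataclass

@dataclass
class VValue:
    """Maps a configuration context (set of enabled plugins) to a value."""
    alternatives: dict  # e.g. {frozenset({"pluginA"}): "Howdy", frozenset(): "Hello"}

def vmap(f, v):
    """Apply f once to shared data, or once per distinct alternative."""
    if not isinstance(v, VValue):
        return f(v)  # common case: one execution covers all configurations
    return VValue({ctx: f(x) for ctx, x in v.alternatives.items()})

# Shared data: processed once for all 2^n configurations.
title = vmap(str.upper, "hello world")

# Plugin-dependent data: execution splits only over the distinct alternatives.
greeting = VValue({frozenset({"pluginA"}): "Howdy", frozenset(): "Hello"})
loud_greeting = vmap(str.upper, greeting)
```

Because vmap does work per distinct alternative rather than per configuration, execution cost tracks how much the plugins actually disagree rather than the full space of possible configurations.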

Hung Viet Nguyen, Christian Kästner, and Tien N. Nguyen
Iowa State University, USA; Carnegie Mellon University, USA
|
|
11:00 - 11:30
A Study of Equivalent and Stubborn Mutation Operators using Human Analysis of Equivalence
Though mutation testing has been widely studied for more than thirty years, the prevalence and properties of equivalent mutants remain largely unknown. We report on the causes and prevalence of equivalent mutants and their relationship to stubborn mutants (those that remain undetected by a high-quality test suite, yet are non-equivalent). Our results, based on manual analysis of 1,230 mutants from 18 programs, reveal a highly uneven distribution of equivalence and stubbornness. For example, the ABS class and half of the UOI class generate many equivalent and almost no stubborn mutants, while the LCR class generates many stubborn and few equivalent mutants. We conclude that previous test-effectiveness studies based on fault seeding could be skewed, and that developers of mutation testing tools should prioritise the operators that we found to generate disproportionately many stubborn (and few equivalent) mutants.
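To illustrate why operator classes differ so sharply, here is a small Python sketch (ABS and LCR are operator classes named in the paper; the mutants themselves are invented examples, not subjects from the study):

```python
# Illustrative mutants (ABS and LCR are operator classes from the paper;
# the functions below are invented examples, not taken from the study).

def original(x):
    return x * x

def abs_mutant(x):
    return abs(x) * abs(x)  # ABS: insert abs(); equivalent here, since x*x == |x|*|x|

def guard(a, b):
    return a and b

def lcr_mutant(a, b):
    return a or b  # LCR: replace 'and' with 'or'; non-equivalent,
                   # but only inputs with a != b can kill it

# No test input can distinguish the ABS mutant from the original:
assert all(original(x) == abs_mutant(x) for x in range(-100, 101))
# The LCR mutant is killable; if a strong suite still misses it, it is stubborn:
assert guard(True, False) != lcr_mutant(True, False)
```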

Xiangjuan Yao, Mark Harman, and Yue Jia
China University of Mining and Technology, China; University College London, UK
|
|
11:30 - 12:00
Cross-Checking Oracles from Intrinsic Software Redundancy
Despite the recent advances in automatic test generation, testers must still write test oracles manually. If formal specifications are available, it might be possible to use decision procedures derived from those specifications. We present a technique that is based on a form of specification but also leverages more information from the system under test. We assume that the system under test is somewhat redundant, in the sense that some operations are designed to behave like others but their executions are different. Our experience in this and previous work indicates that this redundancy exists and is easily documented. We then generate oracles by cross-checking the execution of a test with the same test in which we replace some operations with redundant ones. We develop this notion of cross-checking oracles into a generic technique to automatically insert oracles into unit tests. An experimental evaluation shows that cross-checking oracles, used in combination with automatic test generation techniques, can be very effective in revealing faults, and that they can even improve good hand-written test suites.
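A minimal Python sketch of the cross-checking idea, assuming a hypothetical harness and an illustrative redundant pair (the authors' technique inserts such oracles into unit tests automatically):

```python
# Minimal sketch of a cross-checking oracle (hypothetical harness; the
# redundant pair below is an invented illustration of intrinsic redundancy).

def cross_check(test, op, redundant_op):
    """Oracle: the same test must yield the same observable result when an
    operation is swapped for its intrinsically redundant variant."""
    return test(op) == test(redundant_op)

# Two operations designed to behave alike, with different executions:
def with_extend(xs):
    acc = []
    acc.extend(xs)           # original operation
    return acc

def with_appends(xs):
    acc = []
    for x in xs:
        acc.append(x)        # redundant variant of extend
    return acc

test = lambda op: op([1, 2, 3])
assert cross_check(test, with_extend, with_appends)  # disagreement would reveal a fault
```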

Antonio Carzaniga, Alberto Goffi, Alessandra Gorla, Andrea Mattavelli, and Mauro Pezzè
University of Lugano, Switzerland; Saarland University, Germany
|
|
12:00 - 12:30
Mind the Gap: Assessing the Conformance of Software Traceability to Relevant Guidelines
Many guidelines for safety-critical industries such as aeronautics, medical devices, and railway communications specify that traceability must be used to demonstrate that a rigorous process has been followed and to provide evidence that the system is safe for use. In practice, there is a gap between what guidelines prescribe and what projects implement, making it difficult for organizations and certifiers to fully evaluate the safety of the software system. In this paper we present an approach that parses a guideline to extract a Traceability Model depicting software artifact types and their prescribed traces, and then analyzes the traceability data within a project to identify areas of traceability failure. Missing traceability paths, redundant and/or inconsistent data, and other problems are highlighted. We used our approach to evaluate the traceability of seven safety-critical software systems and found that none of the evaluated projects contained traceability that fully conformed to its relevant guidelines.
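As a rough illustration, conformance checking against such a Traceability Model might look like the following sketch (the artifact types and trace data are hypothetical; the real approach extracts the model by parsing guideline text):

```python
# Illustrative conformance check against an extracted Traceability Model
# (artifact types and trace data are invented for this sketch).

# Prescribed traces: source artifact type -> required target artifact type.
traceability_model = {
    "SystemRequirement": "SoftwareRequirement",
    "SoftwareRequirement": "DesignElement",
    "DesignElement": "SourceCode",
}

# Trace links actually recorded in the project.
project_traces = {("SR-1", "SWR-1"), ("SWR-1", "DE-1")}  # DE-1 -> code is missing
artifact_types = {
    "SR-1": "SystemRequirement",
    "SWR-1": "SoftwareRequirement",
    "DE-1": "DesignElement",
    "main.c": "SourceCode",
}

# Flag every artifact whose prescribed outgoing trace is absent.
for artifact, a_type in artifact_types.items():
    required = traceability_model.get(a_type)
    if required and all(src != artifact for src, _ in project_traces):
        print(f"Missing {a_type} -> {required} trace from {artifact}")
# Prints: Missing DesignElement -> SourceCode trace from DE-1
```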

Patrick Rempel, Patrick Mäder, Tobias Kuschke, and Jane Cleland-Huang
TU Ilmenau, Germany; DePaul University, USA