Jabier Martinez, Sorbonne Université, CNRS, LIP6, France
Nicolas Ordoñez, Universidad Nacional de Colombia, Colombia
Xhevahire Tërnava, Sorbonne Université, CNRS, LIP6, France
Tewfik Ziadi, Sorbonne Université, CNRS, LIP6, France
Jairo Aponte, Universidad Nacional de Colombia, Colombia
Eduardo Figueiredo, Universidade Federal de Minas Gerais, Brazil
Marco Tulio Valente, Universidade Federal de Minas Gerais, Brazil
Feature location is a traceability recovery activity that identifies the implementation elements associated with a characteristic of a system. Besides its relevance for the maintenance of a single system, feature location in a collection of systems has received considerable attention as a first step in re-engineering system variants (created through clone-and-own) into a Software Product Line (SPL). In this context, the objective is to unambiguously identify the boundaries of a feature inside a family of systems, to later create reusable assets from these implementation elements. Among all the case studies in the SPL literature, the variants derived from ArgoUML SPL stand out as the most used. However, the use of different settings, or the omission of relevant information (e.g., the exact configurations of the variants or the way the metrics are calculated), makes it difficult to reproduce or benchmark the different feature location techniques, even when the same ArgoUML SPL is used. To foster research on feature location, we provide a set of common scenarios using ArgoUML SPL, together with utilities to compute metrics from the results of existing and novel feature location techniques.
Prior to this challenge paper, the case study had already been referenced many times.
The ArgoUML SPL case study has its own entry in the ESPLA catalog (a catalog of case studies on extractive SPL adoption).
Daniel Cruz et al., "A Literature Review and Comparison of Three Feature Location Techniques using ArgoUML-SPL," VaMoS 2019.
Among the contributions of this work, several are directly related to their use of the ArgoUML-SPL benchmark:
A characterization of the generated variants regarding the textual information that they contain. This information is relevant because it is the primary source for text-based information retrieval techniques.
A comparison of three text-based information retrieval techniques (Paragraph Vectors, Latent Dirichlet Allocation, and Latent Semantic Indexing), using only the feature names as queries and applying document pre-processing.
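To illustrate the Latent Semantic Indexing part of such a comparison, the following is a minimal, self-contained sketch using raw term counts and a toy corpus; the documents, query, and latent dimensionality are hypothetical and do not reproduce the cited study's actual pipeline or pre-processing:

```python
import numpy as np

def lsi_rank(docs, query, k=2):
    """Rank documents against a query with Latent Semantic Indexing."""
    vocab = sorted({t for d in docs for t in d.split()})
    index = {t: i for i, t in enumerate(vocab)}
    # Term-document matrix of raw term counts (rows: terms, cols: docs)
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in d.split():
            A[index[t], j] += 1
    # Truncated SVD: keep the k largest singular triplets
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]
    # Fold the query vector into the k-dimensional latent space
    q = np.zeros(len(vocab))
    for t in query.split():
        if t in index:
            q[index[t]] += 1
    q_k = np.diag(1 / sk) @ Uk.T @ q
    # Cosine similarity between the query and each document (columns of Vtk)
    sims = [float(q_k @ Vtk[:, j] /
                  (np.linalg.norm(q_k) * np.linalg.norm(Vtk[:, j]) + 1e-12))
            for j in range(len(docs))]
    return sims

# Hypothetical "documents" (e.g., identifiers extracted from classes)
docs = [
    "activity diagram node edge",
    "sequence diagram message lifeline",
    "logging logger log level",
]
# Feature name used as the query, as in the feature-name-only setting
sims = lsi_rank(docs, "activity diagram", k=2)
# The diagram-related documents rank above the unrelated logging one
```

In a real feature location setting, each document would be the pre-processed text of one implementation element (class or method), and the ranking would be thresholded or inspected to decide which elements belong to the feature.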
The results suggest that Latent Semantic Indexing (LSI) outperforms the other two techniques in this benchmark. However, precision, recall, and F-measure have very low values (0.16, 0.19, and 0.079, respectively). This suggests that text-based information retrieval techniques should be combined with other techniques to form hybrid feature location approaches, and that LSI seems to be a good candidate for the text-based information retrieval part.
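For reference, precision, recall, and F-measure in feature location are typically computed per feature by comparing the set of retrieved implementation elements against a ground truth. A minimal sketch (the element names below are hypothetical, and reported benchmark figures may additionally be averaged across features):

```python
def feature_location_metrics(retrieved, ground_truth):
    """Precision, recall, and F-measure for one feature's located elements."""
    retrieved, ground_truth = set(retrieved), set(ground_truth)
    tp = len(retrieved & ground_truth)  # correctly located elements
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Hypothetical feature: elements located by a technique vs. the ground truth
p, r, f = feature_location_metrics(
    {"ClassA", "ClassB", "ClassC"},            # retrieved
    {"ClassB", "ClassC", "ClassD", "ClassE"},  # ground truth
)
# p ≈ 0.67, r = 0.50, f ≈ 0.57
```

With per-feature averaging, the overall F-measure need not equal the harmonic mean of the averaged precision and recall, which is one reason reporting the exact calculation method matters for reproducibility.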