This repository presents a benchmark for the view-based Retriever approach to reverse engineering software architecture models. This approach enables the reverse engineering of software architecture models from heterogeneous artifacts by extracting structural and behavioral views from existing software artifacts using rule-based processes. The views are then refined to ensure completeness and consistency, and model-driven composition connects the different views. These views provide a comprehensive understanding of software systems and support model-driven analysis and quality prediction.
Each benchmark project is structured as follows:
- The .retriever.yml file contains the configuration for running the Retriever approach.
  - The repository value is the ID of a GitHub repository.
  - The current_version value is the latest version of the Retriever approach used to build the architectural models.
  - The rules values are the rules used to build the architectural models.
- The model_re folder contains the architectural model of the system that is automatically generated by the Retriever approach.
  - The pcm folder contains the Palladio Component Model (PCM) of the system.
  - The uml folder contains the PlantUML model.
- The model_gs folder contains our manual gold standards for the system.

The easiest way to use our approach is to use the CLI that we have released. Here’s an example of how to call the CLI application with the given parameters:
./eclipse -i /path/to/input/directory -o /path/to/output/directory -r supported_rules
Here, replace /path/to/input/directory with the path to the root directory of the project you want to reverse engineer, /path/to/output/directory with the path to the output directory where you want to store the generated models, and supported_rules with the rules you want to use for reverse engineering.
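For scripted use, the call above can be wrapped in a few lines of Python. This is a minimal sketch and not part of the released CLI: the function names build_retriever_command and run_retriever are hypothetical, the executable path ./eclipse and the flags -i, -o, and -r are copied from the example above, and the rules value is passed through unchanged because the format for multiple rules is not specified here.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_retriever_command(
    input_dir: str | Path,
    output_dir: str | Path,
    rules: str,
    executable: str = "./eclipse",  # path taken from the example call
) -> list[str]:
    """Assemble the argument list for the CLI call shown above."""
    return [executable, "-i", str(input_dir), "-o", str(output_dir), "-r", rules]


def run_retriever(input_dir: str | Path, output_dir: str | Path, rules: str) -> None:
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_retriever_command(input_dir, output_dir, rules), check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.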
This GitHub workflow is designed for reverse-engineering case studies using the Palladio Reverse-Engineering Retriever. It automates the process of collecting information, generating the PCM (Palladio Component Model) and committing these results back to the repository. It ensures that the latest version is always used and that results are consistently integrated back into the repository.
The workflow can be dispatched manually. A manual run can override the benchmark parameter for all projects, which forces the retriever action to measure the execution time with the hyperfine benchmark.
In addition to that, the workflow is automatically triggered every day at 2:00 AM UTC to keep the results up-to-date.
Since the workflow commits the analysis results back to the repository, it requires the contents: write permission.
The workflow consists of 3 jobs:
collect_info
This job determines the benchmark projects by searching for .retriever.yml files. It has the following outputs:
- array: A list of directories that contain a .retriever.yml file.
- latest_version: The latest version tag of the Palladio-ReverseEngineering-Retriever on GitHub. It is used to determine whether the retriever action needs to be run for a project.
The job finds the .retriever.yml files and outputs an array of these directories.

generate_pcm
This job performs the actual benchmark on the projects found in the previous job.
It is a matrix job that takes the output array of the collect_info job as an input (list of project directories).
The job is executed in parallel on the projects and if one project fails, the others still continue.
This ensures that the results of one project don’t affect other projects.
The job reads .retriever.yml using yq and handles the overrideBenchmark input if the workflow is executed manually. It also updates .retriever.yml to the latest version that was used for the analysis.
Steps 3-8 are only executed if the current_version in the .retriever.yml file doesn’t match the latest_version that was determined in the previous job. This avoids unnecessary execution as the workflow is scheduled every day.
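The project discovery and the version check described above can be sketched in Python. This only illustrates the logic and is not the workflow's actual implementation, which runs as GitHub Actions steps and reads the configuration with yq. The function names are hypothetical, and the naive line scan assumes that current_version sits on its own unindented line, as in the example configuration.

```python
from __future__ import annotations

from pathlib import Path

CONFIG_NAME = ".retriever.yml"


def find_projects(root: Path) -> list[str]:
    """Return the sorted directories (relative to root) that contain a .retriever.yml file."""
    return sorted(
        config.parent.relative_to(root).as_posix()
        for config in root.rglob(CONFIG_NAME)
        if config.is_file()
    )


def read_current_version(config: Path) -> str | None:
    """Read the top-level current_version value with a naive line scan.

    Assumption: the key is unindented and on its own line. A real
    implementation would use a YAML parser instead.
    """
    for line in config.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep and key == "current_version":
            return value.strip().strip("'\"") or None
    return None


def needs_analysis(config: Path, latest_version: str) -> bool:
    """A project is analyzed only if its recorded version differs from the latest one."""
    return read_current_version(config) != latest_version
```

A project without a recorded current_version is always analyzed, because None never equals a version tag.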
commit_results
This job commits the results back to the benchmark repository. It works with the output of the generate_pcm job, the README.md, and the UML diagrams for all benchmarked projects.

To add a new project to the benchmark repository, create a folder for the project and add the following .retriever.yml file:
repository: [GitHub repository id]
current_version: v5.2.0.202402260843
rules:
- org.palladiosimulator.retriever.extraction.rules.maven
- org.palladiosimulator.retriever.extraction.rules.spring
benchmark: 'false'
rules_path: '.'
After a successful run, the benchmark sets the current_version parameter to the Retriever version that was used. This avoids repeated analysis runs when the version hasn’t changed.
The parameters rules, benchmark, and rules_path are mapped to the parameters of the retriever action. Only the rules parameter is required.
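The mapping from a project configuration to the action parameters can be illustrated with a short Python sketch. It assumes the configuration has already been parsed into a dictionary. The function name action_inputs is hypothetical, and rejecting an empty rules list is an assumption, since the text only states that rules is required.

```python
from __future__ import annotations

# Optional keys that the text says are mapped to retriever action parameters.
OPTIONAL_KEYS = ("benchmark", "rules_path")


def action_inputs(config: dict) -> dict:
    """Select the configuration values that are passed to the retriever action.

    Raises ValueError if the required rules entry is missing or empty.
    repository and current_version are left out because they are used by
    the workflow itself, not by the action.
    """
    rules = config.get("rules")
    if not rules:
        raise ValueError("rules is required and must not be empty")
    inputs = {"rules": rules}
    for key in OPTIONAL_KEYS:
        if key in config:
            inputs[key] = config[key]
    return inputs
```

Optional keys that are absent are simply omitted, so any default values remain the responsibility of the action.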