This repository presents a benchmark for the view-based Retriever approach to reverse engineering software architecture models. This approach enables the reverse engineering of software architecture models from heterogeneous artifacts by extracting structural and behavioral views from existing software artifacts using rule-based processes. The views are then refined to ensure completeness and consistency, and model-driven composition connects the different views. These views provide a comprehensive understanding of software systems and support model-driven analysis and quality prediction.
Each benchmark project is structured as follows:
.retriever.yml file contains the configuration for running the retriever approach.
  repository value is the ID of a GitHub repository.
  current_version value is the latest version of the Retriever approach used to build the architectural models.
  rules values are the rules used to build the architectural models.
model_re folder contains the architectural model of the system that is automatically generated by the Retriever approach.
  pcm folder contains the Palladio Component Model (PCM) of the system.
  uml folder contains the PlantUML model.
model_gs folder contains our manual gold standards for the system.

The easiest way to use our approach is to use the CLI that we have released. Here’s an example of how to call the CLI application with the given parameters:
./eclipse -i /path/to/input/directory -o /path/to/output/directory -r supported_rules
Here, replace /path/to/input/directory with the path to the root directory of the project you want to reverse engineer, /path/to/output/directory with the path to the output directory where you want to store the generated models, and supported_rules with the rules you want to use for reverse engineering.
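When the CLI is called from a script, the same invocation can be assembled programmatically. The following is a minimal Python sketch, not part of the released tooling: the helper name build_retriever_command is hypothetical, and only the -i, -o and -r flags shown above are used. The rules value is passed through unchanged, because the format for passing several rules is not described here.

```python
def build_retriever_command(eclipse_binary, input_dir, output_dir, rules):
    """Assemble the argument list for the CLI call shown above.

    Hypothetical helper: it only arranges the -i, -o and -r flags and
    does not start the application itself.
    """
    if not rules:
        raise ValueError("a rules value must be given")
    return [
        str(eclipse_binary),
        "-i", str(input_dir),    # root directory of the project to reverse engineer
        "-o", str(output_dir),   # directory that receives the generated models
        "-r", str(rules),        # rules to use, passed through unchanged
    ]


command = build_retriever_command(
    "./eclipse",
    "/path/to/input/directory",
    "/path/to/output/directory",
    "supported_rules",
)
print(" ".join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) to start the analysis and fail loudly on a non-zero exit code.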
This GitHub workflow is designed for reverse-engineering case studies using the Palladio Reverse-Engineering Retriever. It automates the process of collecting information, generating the PCM (Palladio Component Model) and committing these results back to the repository. It ensures that the latest version is always used and that results are consistently integrated back into the repository.
The workflow can be dispatched manually. A manual run can override the benchmark parameter for all projects, which forces the retriever action to use the hyperfine benchmark to measure the execution time.
In addition to that, the workflow is automatically triggered every day at 2:00 AM UTC to keep the results up-to-date.
Since the workflow commits the analysis results back to the repository, it requires the contents: write permission.
The workflow consists of 3 jobs:
collect_info: This job determines the benchmark projects by searching for .retriever.yml files.
Outputs:
  array: A list of directories that contain a .retriever.yml file.
  latest_version: The latest version tag of the Palladio-ReverseEngineering-Retriever on GitHub. Used to determine whether the retriever action needs to be run for a project.
Its steps include one that locates the .retriever.yml files and outputs an array of these directories.

generate_pcm: This job performs the actual benchmark on the projects found in the previous job.
It is a matrix job that takes the output array of the collect_info job as an input (list of project directories).
The job is executed in parallel on the projects and if one project fails, the others still continue.
This ensures that the results of one project don’t affect other projects.
Among its steps, the job reads the project's settings from .retriever.yml using yq and handles the overrideBenchmark input if the workflow is executed manually. After the analysis, it updates .retriever.yml to the latest version that was used for the analysis.
Steps 3-8 are only executed if the current_version in the .retriever.yml file doesn’t match the latest_version that was determined in the previous job.
This avoids unnecessary execution as the workflow is scheduled every day.
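The project discovery done by collect_info and the version check that gates generate_pcm can be sketched in Python as follows. This is an illustration under assumptions, not the workflow's actual implementation: the function names are hypothetical, and a simple line scan stands in for the yq call that the workflow uses to read .retriever.yml.

```python
import os

CONFIG_NAME = ".retriever.yml"


def find_project_dirs(root):
    """Return, sorted, every directory below root that holds a .retriever.yml.

    Paths are relative to root; root itself would be reported as ".".
    """
    found = []
    for current_dir, _subdirs, files in os.walk(root):
        if CONFIG_NAME in files:
            found.append(os.path.relpath(current_dir, root))
    return sorted(found)


def read_current_version(config_text):
    """Extract the current_version value from the text of a .retriever.yml.

    Simplification: scans for a top-level "current_version:" line instead
    of parsing YAML. Returns None if the key is missing or empty.
    """
    for line in config_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "current_version":
            return value.strip().strip("'\"") or None
    return None


def needs_run(config_text, latest_version):
    """True if the recorded version differs from the latest version tag."""
    return read_current_version(config_text) != latest_version


sample_config = "\n".join([
    "current_version: v5.2.0.202402260843",
    "rules:",
    "- org.palladiosimulator.retriever.extraction.rules.maven",
])
print(needs_run(sample_config, "v5.2.0.202402260843"))
```

A project without a recorded current_version is always analyzed in this sketch, because None never equals a version tag.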
commit_results: This job commits the results back to the benchmark repository. It works on the output of the generate_pcm job and covers README.md and the UML diagrams for all benchmarked projects.

To add a new project to the benchmark repository, create a folder for the project and add the following .retriever.yml file:
repository: [GitHub repository id]
current_version: v5.2.0.202402260843
rules:
- org.palladiosimulator.retriever.extraction.rules.maven
- org.palladiosimulator.retriever.extraction.rules.spring
benchmark: 'false'
rules_path: '.'
After a successful run, the benchmark sets the parameter current_version to the version of the retriever that was used. This avoids repeated analysis runs when the version hasn’t changed.
The parameters rules, benchmark and rules_path are mapped to the parameters of the used retriever action.
Only the parameter rules is required.
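A new project's configuration can also be generated by a script. The sketch below is hypothetical and not part of the benchmark tooling: the function name render_retriever_config and the placeholder repository ID owner/project are assumptions for illustration. It writes the keys in the order shown above, rejects a missing rules list, and emits benchmark and rules_path only when they are given.

```python
def render_retriever_config(repository, current_version, rules,
                            benchmark=None, rules_path=None):
    """Render the text of a .retriever.yml file for a new benchmark project.

    Hypothetical helper. Of the parameters passed on to the retriever
    action, only rules is required; benchmark and rules_path are written
    only if a value is supplied.
    """
    if not rules:
        raise ValueError("rules is required and must not be empty")
    lines = [
        f"repository: {repository}",
        f"current_version: {current_version}",
        "rules:",
    ]
    lines += [f"- {rule}" for rule in rules]
    if benchmark is not None:
        # The example above stores the flag as a quoted string.
        lines.append(f"benchmark: '{str(benchmark).lower()}'")
    if rules_path is not None:
        lines.append(f"rules_path: '{rules_path}'")
    return "\n".join(lines) + "\n"


config_text = render_retriever_config(
    repository="owner/project",  # placeholder, not a real benchmark project
    current_version="v5.2.0.202402260843",
    rules=["org.palladiosimulator.retriever.extraction.rules.maven"],
    benchmark=False,
    rules_path=".",
)
print(config_text)
```

Saving the returned text as .retriever.yml in the new project folder gives the workflow a file to discover on its next run.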