-
Notifications
You must be signed in to change notification settings - Fork 8
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
3 changed files
with
114 additions
and
29 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,27 +1,112 @@ | ||
# Linking Sketches and Software Architecture (LiSSA) | ||
|
||
The LiSSA approach aims to connect sketches and informal diagrams (such as class diagrams, component diagrams, ...) with | ||
formal models like component models. | ||
|
||
The following diagram shows the pipeline that is planned for the LiSSA approach. | ||
|
||
```mermaid | ||
stateDiagram-v2 | ||
DiagramDetection | ||
TextPreprocessing | ||
ArchitectureModel | ||
TextExtraction | ||
EntityRecognition | ||
RecommendationGeneration | ||
ConnectionGeneration | ||
InconsistencyDetection | ||
DiagramDetection --> RecommendationGeneration | ||
TextPreprocessing --> TextExtraction | ||
ArchitectureModel --> RecommendationGeneration | ||
TextExtraction --> EntityRecognition | ||
DiagramDetection --> EntityRecognition | ||
EntityRecognition --> RecommendationGeneration | ||
RecommendationGeneration --> ConnectionGeneration | ||
ConnectionGeneration --> InconsistencyDetection | ||
``` | ||
# LiSSA: A Framework for Generic Traceability Link Recovery | ||
|
||
Welcome to the LiSSA project! | ||
This framework leverages Large Language Models (LLMs) enhanced through Retrieval-Augmented Generation (RAG) to establish traceability links across various software artifacts. | ||
|
||
## Overview | ||
|
||
In software development and maintenance, numerous artifacts such as requirements, code, and architecture documentation are produced. | ||
Understanding the relationships between these artifacts is crucial for tasks like impact analysis, consistency checking, and maintenance. | ||
LiSSA aims to provide a generic solution for Traceability Link Recovery (TLR) by utilizing LLMs in combination with RAG techniques. | ||
|
||
The concept and evaluation of LiSSA are detailed in our paper: | ||
|
||
> Fuchß, D., Hey, T., Keim, J., Liu, H., Ewald, N., Thirolf, T., & Koziolek, A. (2025). LiSSA: Toward Generic Traceability Link Recovery through Retrieval-Augmented Generation. In Proceedings of the IEEE/ACM 47th International Conference on Software Engineering, Ottawa, Canada. | ||
You can access the paper [here](https://ardoco.de/c/icse25). | ||
|
||
## Features | ||
|
||
- **Generic Applicability**: LiSSA is designed to recover traceability links across various types of software artifacts, including: | ||
- [Requirements to code](https://ardoco.de/c/icse25) | ||
- [Documentation to code](https://ardoco.de/c/icse25) | ||
- [Architecture documentation to architecture models](https://ardoco.de/c/icse25) | ||
|
||
- **Retrieval-Augmented Generation**: By combining LLMs with RAG, LiSSA enhances the accuracy and relevance of the recovered traceability links. | ||
|
||
## Getting Started | ||
|
||
To get started with LiSSA, follow these steps: | ||
|
||
1. **Clone the Repository**: | ||
```bash | ||
git clone https://github.com/ArDoCo/LiSSA-RATLR | ||
cd LiSSA-RATLR | ||
``` | ||
|
||
2. **Install Dependencies**: | ||
Ensure you have Java JDK 21 or later installed. Then, build the project using Maven: | ||
```bash | ||
mvn clean package | ||
``` | ||
|
||
3. **Run LiSSA**: | ||
Execute the main application: | ||
```bash | ||
java -jar target/ratlr-*-jar-with-dependencies.jar eval -c config.json | ||
``` | ||
|
||
### Configuration | ||
|
||
1. Create a configuration you want to use for evaluation / execution. E.g., you can find configurations [here](https://github.com/ArDoCo/ReplicationPackage-ICSE25_LiSSA-Toward-Generic-Traceability-Link-Recovery-through-RAG/tree/main/LiSSA-RATLR-V2/lissa/configs/req2code-significance). You can also provide a directory containing multiple configurations. | ||
2. Configure your OpenAI API key and organization in a `.env` file. You can use the provided template file as a template `env-template`. | ||
3. LiSSA caches requests in order to be reproducible. The cache is located in the cache folder that can be specified in the configuration. | ||
4. Run `java -jar target/ratlr-*-jar-with-dependencies.jar eval -c configs/....` to run the evaluation. You can provide a JSON or a directory containing JSON configurations. | ||
5. The results will be printed to the console and saved to a file in the current directory. The name is also printed to the console. | ||
|
||
### Results of Evaluation / Execution | ||
The results will be stored as markdown files. | ||
A result file can look like below. | ||
It contains the configuration and the results of the evaluation. | ||
Additionally, the LiSSA generate CSV files that contain the traceability links as pairs of identifiers. | ||
|
||
<details> | ||
<summary>Example Result</summary> | ||
|
||
```json | ||
## Configuration | ||
{ | ||
"cache_dir" : "./cache-r2c/dronology-dd--102959883", | ||
"gold_standard_configuration" : { | ||
"hasHeader" : false, | ||
"path" : "./datasets/req2code/dronology-dd/answer.csv" | ||
}, | ||
"... other configuration parameters ..." | ||
} | ||
|
||
## Stats | ||
* # TraceLinks (GS): 740 | ||
* # Source Artifacts: 211 | ||
* # Target Artifacts: 423 | ||
## Results | ||
* True Positives: 283 | ||
* False Positives: 1286 | ||
* False Negatives: 457 | ||
* Precision: 0.18036966220522627 | ||
* Recall: 0.3824324324324324 | ||
* F1: 0.24512776093546992 | ||
``` | ||
|
||
</details> | ||
|
||
## Evaluation | ||
|
||
LiSSA has been empirically evaluated on three different TLR tasks: | ||
|
||
- Requirements to code | ||
- Documentation to code | ||
- Architecture documentation to architecture models | ||
- Requirements to requirements | ||
|
||
The results indicate that the RAG-based approach can significantly outperform state-of-the-art methods in code-related tasks. | ||
However, further research is needed to enhance its performance for broader applicability. | ||
|
||
## Acknowledgments | ||
|
||
LiSSA is developed by researchers from the Modelling for Continuous Software Engineering (MCSE) group of KASTEL - Institute of Information Security and Dependability at the Karlsruhe Institute of Technology (KIT). | ||
|
||
For more information about the project and related research, visit our [website](https://ardoco.de/). | ||
|
||
--- | ||
|
||
*Note: This README provides a brief overview of the LiSSA project. For comprehensive details, please refer to the [repository](https://github.com/ArDoCo/LiSSA-RATLR)* |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters