From c920b567542aa9e4783a3c538ba0a988dafc4514 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dominik=20Fuch=C3=9F?=
Date: Mon, 17 Feb 2025 15:55:36 +0100
Subject: [PATCH] Update LiSSA-RATLR in documentation

---
 docs/Home.md     |   2 +-
 docs/LiSSA.md    | 139 ++++++++++++++++++++++++++++++++++++++---------
 docs/_Sidebar.md |   2 +-
 3 files changed, 114 insertions(+), 29 deletions(-)

diff --git a/docs/Home.md b/docs/Home.md
index cd5ac064a..e36e14d23 100644
--- a/docs/Home.md
+++ b/docs/Home.md
@@ -31,7 +31,7 @@ To get to know the project, please read the following pages:
   * [TLR](https://github.com/ArDoCo/TLR): Traceability Link Recovery (TLR) Modules
   * [StanfordCoreNLP-Provider-Service](https://github.com/ArDoCo/StanfordCoreNLP-Provider-Service): RESTful web service for text preprocessing
   * [InconsistencyDetection](https://github.com/ArDoCo/InconsistencyDetection): Inconsistency Detection (ID) Modules
-  * [LiSSA](https://github.com/ArDoCo/LiSSA): Linking Sketches and Software Architecture Modules
+  * [LiSSA-RATLR](https://github.com/ArDoCo/LiSSA-RATLR): LiSSA - A Framework for Generic Traceability Link Recovery
 * Testing and Evaluation
   * [IntegrationTests](https://github.com/ArDoCo/IntegrationTests): Integration Tests
   * [Benchmark](https://github.com/ArDoCo/Benchmark): Benchmarks
diff --git a/docs/LiSSA.md b/docs/LiSSA.md
index daf7e7347..d793f640d 100644
--- a/docs/LiSSA.md
+++ b/docs/LiSSA.md
@@ -1,27 +1,112 @@
-# Linking Sketches and Software Architecture (LiSSA)
-
-The LiSSA approach aims to connect sketches and informal diagrams (such as class diagrams, component diagrams, ...) with
-formal models like component models.
-
-The following diagram shows the pipeline that is planned for the LiSSA approach.
-
-```mermaid
-stateDiagram-v2
-    DiagramDetection
-    TextPreprocessing
-    ArchitectureModel
-    TextExtraction
-    EntityRecognition
-    RecommendationGeneration
-    ConnectionGeneration
-    InconsistencyDetection
-
-    DiagramDetection --> RecommendationGeneration
-    TextPreprocessing --> TextExtraction
-    ArchitectureModel --> RecommendationGeneration
-    TextExtraction --> EntityRecognition
-    DiagramDetection --> EntityRecognition
-    EntityRecognition --> RecommendationGeneration
-    RecommendationGeneration --> ConnectionGeneration
-    ConnectionGeneration --> InconsistencyDetection
-```
\ No newline at end of file
+# LiSSA: A Framework for Generic Traceability Link Recovery
+
+Welcome to the LiSSA project!
+This framework leverages Large Language Models (LLMs) enhanced through Retrieval-Augmented Generation (RAG) to establish traceability links across various software artifacts.
+
+## Overview
+
+In software development and maintenance, numerous artifacts such as requirements, code, and architecture documentation are produced.
+Understanding the relationships between these artifacts is crucial for tasks like impact analysis, consistency checking, and maintenance.
+LiSSA aims to provide a generic solution for Traceability Link Recovery (TLR) by utilizing LLMs in combination with RAG techniques.
+
+The concept and evaluation of LiSSA are detailed in our paper:
+
+> Fuchß, D., Hey, T., Keim, J., Liu, H., Ewald, N., Thirolf, T., & Koziolek, A. (2025). LiSSA: Toward Generic Traceability Link Recovery through Retrieval-Augmented Generation. In Proceedings of the IEEE/ACM 47th International Conference on Software Engineering, Ottawa, Canada.
+
+You can access the paper [here](https://ardoco.de/c/icse25).
+
+## Features
+
+- **Generic Applicability**: LiSSA is designed to recover traceability links across various types of software artifacts, including:
+  - [Requirements to code](https://ardoco.de/c/icse25)
+  - [Documentation to code](https://ardoco.de/c/icse25)
+  - [Architecture documentation to architecture models](https://ardoco.de/c/icse25)
+
+- **Retrieval-Augmented Generation**: By combining LLMs with RAG, LiSSA enhances the accuracy and relevance of the recovered traceability links.
+
+## Getting Started
+
+To get started with LiSSA, follow these steps:
+
+1. **Clone the Repository**:
+   ```bash
+   git clone https://github.com/ArDoCo/LiSSA-RATLR
+   cd LiSSA-RATLR
+   ```
+
+2. **Install Dependencies**:
+   Ensure you have Java JDK 21 or later installed. Then, build the project using Maven:
+   ```bash
+   mvn clean package
+   ```
+
+3. **Run LiSSA**:
+   Execute the main application:
+   ```bash
+   java -jar target/ratlr-*-jar-with-dependencies.jar eval -c config.json
+   ```
+
+### Configuration
+
+1. Create the configuration you want to use for evaluation or execution. Example configurations are available [here](https://github.com/ArDoCo/ReplicationPackage-ICSE25_LiSSA-Toward-Generic-Traceability-Link-Recovery-through-RAG/tree/main/LiSSA-RATLR-V2/lissa/configs/req2code-significance). You can also provide a directory containing multiple configurations.
+2. Configure your OpenAI API key and organization in a `.env` file. You can use the provided `env-template` file as a template.
+3. LiSSA caches requests so that runs are reproducible. The cache is stored in the cache folder specified in the configuration.
+4. Run `java -jar target/ratlr-*-jar-with-dependencies.jar eval -c configs/....` to run the evaluation. You can provide a JSON file or a directory containing JSON configurations.
+5. The results are printed to the console and saved to a file in the current directory. The file name is also printed to the console.
+
+### Results of Evaluation / Execution
+The results are stored as Markdown files.
+A result file may look like the example below.
+It contains the configuration and the results of the evaluation.
+Additionally, LiSSA generates CSV files that contain the traceability links as pairs of identifiers.
+
+
+Example Result
+
+```json
+## Configuration
+{
+  "cache_dir" : "./cache-r2c/dronology-dd--102959883",
+  "gold_standard_configuration" : {
+    "hasHeader" : false,
+    "path" : "./datasets/req2code/dronology-dd/answer.csv"
+  },
+  "... other configuration parameters ..."
+}
+
+## Stats
+* # TraceLinks (GS): 740
+* # Source Artifacts: 211
+* # Target Artifacts: 423
+## Results
+* True Positives: 283
+* False Positives: 1286
+* False Negatives: 457
+* Precision: 0.18036966220522627
+* Recall: 0.3824324324324324
+* F1: 0.24512776093546992
+```
+
+
+
+## Evaluation
+
+LiSSA has been empirically evaluated on the following TLR tasks:
+
+- Requirements to code
+- Documentation to code
+- Architecture documentation to architecture models
+- Requirements to requirements
+
+The results indicate that the RAG-based approach can significantly outperform state-of-the-art methods in code-related tasks.
+However, further research is needed to enhance its performance for broader applicability.
+
+## Acknowledgments
+
+LiSSA is developed by researchers from the Modelling for Continuous Software Engineering (MCSE) group of KASTEL - Institute of Information Security and Dependability at the Karlsruhe Institute of Technology (KIT).
+
+For more information about the project and related research, visit our [website](https://ardoco.de/).
+
+---
+
+*Note: This README provides a brief overview of the LiSSA project. For comprehensive details, please refer to the [repository](https://github.com/ArDoCo/LiSSA-RATLR).*
diff --git a/docs/_Sidebar.md b/docs/_Sidebar.md
index f3f660e84..f2c809422 100644
--- a/docs/_Sidebar.md
+++ b/docs/_Sidebar.md
@@ -12,4 +12,4 @@
    1. [SAM-Code](traceability-link-recovery#sam-code)
    1. [SAD-SAM-Code](traceability-link-recovery#sad-sam-code)
 7. [Inconsistency Detection (ID)](Inconsistency-Detection)
-8. [LiSSA](lissa)
\ No newline at end of file
+8. [LiSSA-RATLR](lissa)
\ No newline at end of file