Refining Intelligence, Optimizing Performance Seamlessly
Large Language Models (LLMs) have proven to be remarkably accurate and effective for tasks such as summarisation, language translation, question answering, and many others. However, to expand their capabilities and performance, these models have progressively increased in size. This growth has prompted research in two key areas: model compression and fine-tuning. Compression techniques like pruning trim redundant parameters and connections, decreasing both memory usage and inference time. Fine-tuning then tailors the model's parameters to excel in designated domains or tasks, leveraging pre-trained natural language knowledge. This synergy optimises efficiency while minimising the impact on performance, addressing the challenges of computational demands and task-specific proficiency. We seek to find the optimal ordering of this synergy, reporting our results on well-known LLM benchmarks. This study discusses a methodology for model compression and performance regeneration via Wanda pruning and LoRA fine-tuning. We investigate and quantify the impact on task-specific metrics of the ordering of pruning and fine-tuning for a compressed model, showing that 'Order Matters'.
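For context on the pruning method: Wanda scores each weight by the product of its magnitude and the L2 norm of the corresponding input activation, then removes the lowest-scoring weights. The sketch below is a minimal NumPy illustration of that scoring rule (the function name, the per-row comparison group, and the 50% unstructured sparsity are illustrative assumptions, not the repo's exact implementation; the repo uses the original Wanda code as a submodule):

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Zero out the lowest-scoring weights in each output row.

    Wanda's importance score is |W_ij| * ||X_j||_2: weight magnitude
    scaled by the L2 norm of the corresponding input feature, computed
    here from a calibration batch X of shape [n_samples, d_in].
    """
    feat_norm = np.linalg.norm(X, axis=0)   # per-feature norm, shape [d_in]
    score = np.abs(W) * feat_norm           # shape [d_out, d_in]

    k = int(W.shape[1] * sparsity)          # weights to remove per row
    if k == 0:
        return W.copy()
    # Row-wise threshold: the k-th smallest score in each row.
    cut = np.partition(score, k - 1, axis=1)[:, k - 1:k]
    mask = score > cut                      # keep strictly above threshold
    return W * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))                 # toy weight matrix
X = rng.normal(size=(16, 8))                # toy calibration activations
W_pruned = wanda_prune(W, X, sparsity=0.5)  # half the weights in each row zeroed
```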
- prune: prune the model once, with no fine-tuning.
- prune_finetune: fine-tune -> prune (the left pipeline in the left figure, with L = 1).
- finetune_prune: prune -> fine-tune (the right pipeline in the left figure, with L = 1).
- iter_pf: (prune -> fine-tune) x L (the left pipeline in the right figure).
- iter_fp: (fine-tune -> prune) x L (the right pipeline in the right figure).
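The orderings above amount to dispatching the same two steps in different sequences. The stub below sketches that dispatch logic only; `prune` and `finetune` are placeholders for the repo's Wanda-pruning and LoRA fine-tuning steps, and here they merely record the call order:

```python
def prune(history):
    # Placeholder for the Wanda pruning step.
    history.append("prune")

def finetune(history):
    # Placeholder for the LoRA fine-tuning step.
    history.append("finetune")

def run_pipeline(name, L=1):
    """Return the step sequence for each of the five pipeline variants."""
    h = []
    if name == "prune":
        prune(h)
    elif name == "prune_finetune":          # fine-tune -> prune
        finetune(h); prune(h)
    elif name == "finetune_prune":          # prune -> fine-tune
        prune(h); finetune(h)
    elif name == "iter_pf":                 # (prune -> fine-tune) x L
        for _ in range(L):
            prune(h); finetune(h)
    elif name == "iter_fp":                 # (fine-tune -> prune) x L
        for _ in range(L):
            finetune(h); prune(h)
    return h

print(run_pipeline("iter_pf", L=2))  # ['prune', 'finetune', 'prune', 'finetune']
```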
└── Prune-Finetune-LLM/
├── wanda
├── factoid_qa
│ ├── __init__.py
│ ├── freebase_qa.py
│ └── FreebaseQA-eval.json
├── plots
│ ├── [plot1].png
│ ├── [plot2].png
│ └── ...
├── main.py
├── eval.py
├── process.py
├── utils.py
├── constant.py
├── experiments.py
├── run.sh
├── plot_comparison_weights.py
├── plots_comparison_metrics.py
├── README.md
└── requirements.txt
File | Summary |
---|---|
main.py | Coordinates model operations, managing the pruning, fine-tuning, and evaluation of a large language model (LLM). This serves as the main entry point for executing model operations. |
eval.py | Evaluates large language models (LLMs) across various datasets and metrics. |
process.py | Defines pruning and fine-tuning of the model. |
utils.py | Serves as a utility module providing functions for model layer identification, language model response generation, response parsing, model loading, response validation, and results management. |
constant.py | Defines critical paths for the various pipeline stages in the repository, as well as the path of the Python interpreter. |
experiments.py | Contains a series of experiments that combine pruning and fine-tuning operations on pre-trained models under different pipelines, by executing main.py. |
plot_comparison_weights.py | Visualizes statistical distributions and differences of weights of LLMs, in order to compare different pipelines. |
plots_comparison_metrics.py | Generates comparative visualizations of performance metrics across different pruning and fine-tuning pipelines. |
requirements.txt | Dependencies for this repo. |
run.sh | Executes multiple experiment pipelines, leveraging the experiments.py script. |
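For reference on the fine-tuning side: LoRA keeps the pre-trained weight frozen and learns a low-rank update, so the effective weight is W + (alpha/r)·B·A. The NumPy sketch below illustrates this forward pass (shapes, the zero-initialisation of B, and all names are illustrative; the repo's actual fine-tuning lives in process.py):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Compute x @ (W + (alpha/r) * B @ A).T without merging the weights:
    a frozen path plus a scaled low-rank path."""
    r = A.shape[0]                        # LoRA rank
    scale = alpha / r
    return x @ W.T + scale * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2
W = rng.normal(size=(d_out, d_in))        # frozen pre-trained weight
A = rng.normal(size=(r, d_in))            # trainable, random init
B = np.zeros((d_out, r))                  # trainable, zero init
x = rng.normal(size=(3, d_in))            # a batch of 3 inputs

y = lora_forward(x, W, A, B)
# With B initialised to zero, the LoRA path contributes nothing yet,
# so the output matches the frozen model exactly.
assert np.allclose(y, x @ W.T)
```

Only A and B (2·(8+6) parameters here) are trained, versus 48 for the full matrix, which is why LoRA makes per-pipeline fine-tuning cheap enough to repeat L times.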
wanda

This directory contains the Wanda pruning method, based on https://github.com/locuslab/wanda .

factoid_qa

This directory contains the factoid QA metric, an accuracy measure assessing the model's ability to store factual knowledge, based on https://github.com/kelvin-jiang/FreebaseQA .

plots

This directory contains visualization results.

System Requirements:
- Python:
version 3.10.12
- Clone the Prune-Finetune-LLM repository and submodules:
$ git clone --recurse-submodules https://github.com/kangchengX/Prune-Finetune-LLM.git
- Install venv
$ apt install python3-venv
- Create virtual environment
$ python -m venv venv
- Activate the virtual environment
$ source venv/bin/activate
- Change to the project directory:
$ cd Prune-Finetune-LLM
- Install the dependencies:
$ pip install -r requirements.txt
Thanks to all 5 members of our team:
Alvaro, Fernandez; Aung, Htet; Carlos, Diez; Filippo, Fiocchi; Xu, Kangcheng