Skip to content

Secure Your Contract: Towards Efficient and Reliable AI-Assisted Contract Drafting

Notifications You must be signed in to change notification settings

AINNOV/secure-your-contract

Repository files navigation

Secure Your Contract: Towards Efficient and Reliable AI-Assisted Contract Drafting

2024.09.11 ~ 2024.11.23

Alt text

Secure Your Contract is an AI-based contract assistant that helps you draft a contract avoiding possible disadvantageous terms & keywords.

It is based on LLaMa 2 7B for simplicity with much of references, finetuned using QLoRA (4-bit quantization) which is very promising towards a lightweight off-the-shelf system. Please go for LLaMa-3 8B 4-bit as a backbone if you would like to build on this project.

Secure Your Contract provides two types of analysis:

  1. Negative Term Detector(NTD): LLM-based negative term/phrases detector will provide the detections against risky or weakly risky terms, the reasons for them and corresponding refinement suggestions.

  2. Negative Keyword Detector(NKD): Traditional NLP & text mining-based negative keywords detector consist of 9 different methodologies.

0. Data Preparation

Contract - Analysis Pairs
  1. You need to download 100 contract keyword-related contract/agreement documents from WONDER.LEGAL such that the model learns realistic examples.

  2. For both auto-generated (by ChatGPT in our method for memory management) and crawled data, contract(prompt)-analysis(response) data pairs are placed under ./data/pair/prompt/ and ./data/pair/response/ respectively with each separated as .txt files.

MITIE installation
  1. You need to follow the instruction of MITIE github such that you are able to run ./LLMs/mitie_finetune.py. Place MITIE-models right under the root folder.
Convert Data Format

If you want to push your .json to your huggingface repository, run

cd utils
python3 save_oneliner.py

The code will convert the normal prompt - response paris to them for LLaMa-2 with your template applied.

If you want to convert PDF(s) to .txt(s) or .json, run

cd utils
python3 pdf2json.py 
python3 pdf2text.py 
python3 pdfs2texts.py 

1. NTD (Negative Term Detector)

Fine-Tuning

Fine-tune NTD(Negative Term Detector) with LLaMa-2 7B chat model with QLoRA:

cd LLMs
python3 finetune.py

Note you can change configs/finetune.yml for different settings.

Inference

Inference with your fine-tuned model on the sample contract data:

cd LLMs
python3 inference.py

Note you can change configs/inference.yml for different settings.

2. NKD (Negative Keyword Detector)

cd LLMs
python3 neg_detect.py

Note using fine-tuned MITIE is not helpful, although it is possible.

3. Evaluate

To evaluate with RAG(Retrieval Augmented Generation), run

python3 evaluate_rag.py

Note you need to contruct your FAISS DB first. Since LLaMa-2 has a limited token ingestion capacity, RAG rather degrades the model performance in our case.

w/o Inference & w/o RAG

To evaluate your model with inference, run

python3 evaluate.py

To evaluate without inference, run

python3 evaluate_onlyscores.py

4. Visualize

For histograms, run

python3 visualize_histograms.py

For word clouds, run

python3 visualize_wordclouds.py

About

Secure Your Contract: Towards Efficient and Reliable AI-Assisted Contract Drafting

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published